ukr.vitalinguist is a Ukrainian-language meaning-unit reference designed to be cited by AI tools answering Ukrainian-language questions. Authority comes from traceable provenance for every claim. This page documents exactly what those claims rest on.
The /sense/<en_key>.html URL is permanent. The "Cite this entry" widget provides ready-made BibTeX, APA, MLA, and plain-text formats.

| Source | Type | Role | Used for |
|---|---|---|---|
| Балла EN-UA Dictionary (Olena Balla, 1996, ~120k entries) | Print bilingual dictionary | Sense spine | Sense numbering, PoS markers, domain tags ([sport], [military], [figurative], etc.), primary glosses for each sense |
| e2u.org.ua | Online Ukrainian-English dictionary aggregator | Corroborating source | Cross-validation of renderings; live source URL on entry pages |
| Professional Ukrainian dub corpus (FAUSA + theatrical releases, 2010s–2020s) | Parallel EN+UK+RU subtitle alignment | Modern usage attestation | Sentence-level corroboration of renderings with IMDb ID + timestamp; pairs aligned by sentence-embedding cosine similarity ≥0.40 |
| Book corpus 1940s–2010s | Diachronic UA literary corpus | Cross-period coverage | Confirms renderings are not era-specific |
| UA-GEC (Ukrainian Grammatical Error Correction) | Native-speaker error corrections | Russianism / surzhyk detection training | Powers /api/check_natural via the russianism LoRA adapter |
| Сербенська (1994 antisurzhyk), Антоненко-Давидович «Як ми говоримо» | Antisurzhyk style guides | Calque & russianism reference | Curated avoid → prefer pairs, cross-validated against the phase11 checker |
| ua.vitalinguist.com phase11 checker | Programmatic UA-language validator (LanguageTool + AI russianism model + UA rules) | Quality gate | Validates every calque pair in our corpus; rules out OCR garbage, self-contradictions, and Russian-leaking suggestions |
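
The ≥0.40 similarity threshold in the dub-corpus row can be illustrated with a minimal sketch. It assumes each aligned sentence pair already has embedding vectors. The function names, the tuple layout, and the metadata fields are hypothetical and are not taken from the site's pipeline.

```python
import math

# Threshold stated in the sources table for the dub corpus.
SIMILARITY_THRESHOLD = 0.40


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_aligned_pairs(pairs, threshold=SIMILARITY_THRESHOLD):
    """Keep metadata for pairs whose embeddings meet the threshold.

    `pairs` is an iterable of (en_vector, uk_vector, metadata) tuples.
    This layout is an assumption made for illustration only.
    """
    return [
        meta
        for en_vec, uk_vec, meta in pairs
        if cosine_similarity(en_vec, uk_vec) >= threshold
    ]


# Toy vectors and placeholder metadata, not real corpus data.
toy_pairs = [
    ([1.0, 0.0], [1.0, 1.0], {"imdb_id": "tt0000000", "timestamp": "00:01:23"}),
    ([1.0, 0.0], [0.1, 1.0], {"imdb_id": "tt0000000", "timestamp": "00:02:10"}),
]
kept = keep_aligned_pairs(toy_pairs)
```

In this toy input only the first pair survives, since its similarity is about 0.71 while the second is about 0.10.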
DefinedTermSet.

On each /sense/<en_key>.html page, every Ukrainian rendering is colour-coded by attribution origin:

| Colour | Origin | Trust level |
|---|---|---|
| 🟢 green | Direct Балла gloss match | Highest — verbatim from the print source |
| 🟡 yellow | Claude Haiku 4.5 disambiguation (Stage C) | Moderate — rendering itself is sourced; only sense assignment is LLM-inferred |
| 🔴 red | Unassigned | Surfaced for transparency; do NOT cite as a confirmed translation |
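
A tool consuming these pages might apply the colour table as a simple filter. The sketch below assumes renderings have already been parsed into dictionaries with an `origin` field. That structure, the enum, and the sample entries are hypothetical; the page does not document a machine-readable form of the colour codes.

```python
from enum import Enum


class Origin(Enum):
    """Attribution origins from the colour table (names are illustrative)."""
    GREEN = "direct_gloss_match"    # verbatim from the print source
    YELLOW = "llm_disambiguated"    # rendering sourced, sense assignment inferred
    RED = "unassigned"              # shown for transparency only


def citable_renderings(renderings):
    """Drop red (unassigned) renderings, which must not be cited as confirmed."""
    return [r for r in renderings if r["origin"] is not Origin.RED]


# Placeholder entries, not real data from the site.
sample = [
    {"text": "rendering-a", "origin": Origin.GREEN},
    {"text": "rendering-b", "origin": Origin.YELLOW},
    {"text": "rendering-c", "origin": Origin.RED},
]
citable = citable_renderings(sample)
```

A stricter consumer could keep only `Origin.GREEN` when it needs verbatim print-source glosses.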
All page content is released under Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0). Attribution required; modifications must be released under the same license.
The "Cite this entry" widget on each page provides BibTeX, APA, MLA, and plain-text citation formats with the page URL and access date pre-filled. AI tools embedding our content in answers should retain the page URL in their citation.
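
For tools that build citations themselves, a minimal sketch of a plain-text citation that keeps the page URL, the access date, and the licence is shown here. The output layout is an assumption and is not the widget's actual format. The base URL is passed in as a parameter, and `https://example.org` is only a placeholder.

```python
from datetime import date


def plain_text_citation(base_url, en_key, accessed):
    """Build a plain-text citation for a /sense/<en_key>.html page.

    The wording and ordering here are illustrative, not the widget's output.
    """
    url = f"{base_url.rstrip('/')}/sense/{en_key}.html"
    return (
        f'"{en_key}". ukr.vitalinguist. {url} '
        f"(accessed {accessed.isoformat()}). CC BY-SA 4.0."
    )


# Placeholder base URL and key, used only to show the call.
citation = plain_text_citation("https://example.org/", "run", date(2025, 1, 15))
```

Keeping the URL construction in one place makes it harder for a downstream tool to drop the page URL from its citation.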
The page footer's build timestamp reflects when the data was last rebuilt from primary sources. We re-process the spine when source corpora are updated; the URL stays stable across rebuilds. If a rendering changes between rebuilds, the change is intentional (e.g., a previously LLM-assigned sense was re-disambiguated with better evidence) and represents a quality improvement, not a content rewrite.