https://bugs.kde.org/show_bug.cgi?id=434995
Bug ID: 434995
Summary: wish: Check for simple string features and compare original text and translation
Product: lokalize
Version: 20.12.3
Platform: Archlinux Packages
OS: Linux
Status: REPORTED
Severity: wishlist
Priority: NOR
Component: editor
Assignee: sdepi...@gmail.com
Reporter: war...@gmx.de
CC: sha...@ukr.net
Target Milestone: ---

Dear devs,

I just finished reviewing a bigger po file which had lots of discrepancies between original and translated language in terms of how the strings ended. In particular, there were many inconsistencies regarding the full stop in many tooltip texts. This made me think that this is a rather easy thing to spot programmatically and write up a wish about it.

Especially in larger files with either many strings (in large applications) or with huge texts (think handbooks) checking such details is tedious work and thus is usually not done in one session. I can imagine it would be of great assistance if there were an automated process that gathers some simple string metrics and compares them between the two languages in a file. Some metrics I can think of:

- the ending of the string (punctuation mark)
- number of sentences
- number of line breaks
- the number of placeholders ("%" + number)
- presence of plural forms
- presence and count of HTML tags (not necessarily entities such as &kde;, but structural things like <param>, <span> or <strong>)

Ideally, those metrics can all be toggled by the user, because not all are applicable to every use case. For example, comparing the number of sentences is not useful when translating long paragraphs of a documentation handbook, because sometimes the text needs to be restructured due to language peculiarities.

Once a metrics mismatch between a string’s two languages is found, the entry is marked in some way. Perhaps as a new status. A string that currently would be shown as Finished, but has a mismatch, is instead marked as Caution (or a better-fitting term).
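To illustrate the idea, here is a rough sketch in Python. It is not Lokalize code: the names `string_metrics`, `mismatches` and `review_status` are hypothetical, the regular expressions are deliberately simplistic assumptions (sentence counting in particular is only a heuristic), and the plural-form check is left out because it depends on the po entry rather than on a single string.

```python
import re

# Simplistic patterns, assumed for illustration only.
PLACEHOLDER_RE = re.compile(r"%\d+")                       # "%" + number
TAG_RE = re.compile(r"<\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")   # opening tags, not entities
SENTENCE_END_RE = re.compile(r"[.!?…]+(?=\s|$)")           # rough sentence boundary


def string_metrics(text):
    """Collect the simple, language-independent features of one string."""
    stripped = text.rstrip()
    last = stripped[-1] if stripped else ""
    return {
        "ending": last if last in ".!?:;,…" else "",
        "sentences": len(SENTENCE_END_RE.findall(text)),
        "line_breaks": text.count("\n"),
        "placeholders": len(PLACEHOLDER_RE.findall(text)),
        "tags": sorted(name.lower() for name in TAG_RE.findall(text)),
    }


def mismatches(source, translation, enabled=None):
    """Return the names of the enabled metrics that differ between the two texts."""
    src = string_metrics(source)
    dst = string_metrics(translation)
    names = list(src) if enabled is None else enabled
    return [name for name in names if src[name] != dst[name]]


def review_status(status, source, translation, enabled=None):
    """Downgrade a Finished entry to the (hypothetical) Caution status on a mismatch."""
    if status == "Finished" and mismatches(source, translation, enabled):
        return "Caution"
    return status
```

The `enabled` argument stands in for the per-metric toggles: passing `["ending", "placeholders"]`, for example, would skip the sentence count for handbook paragraphs. Entries in any status other than Finished pass through unchanged.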
For any translation status other than Finished, I think the user has to look at the string anyway, so in such cases there is no need to point him towards the metrics.

I also don’t think it’s necessary to show this in the project overview, first and foremost because of the performance impact. But also because, as I said, the result depends on the context: those metrics are not really a quantitative measure, but rather a qualitative one for a quick overview or for a final sanity check before committing the file. Thus it would suffice to do the comparison only in the editor tab of an individual file.

What do you think?

-- 
You are receiving this mail because:
You are watching all bug changes.