[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=113558 --- Additional comments from o...@openoffice.org Mon Aug 2 06:32:56 + 2010 --- *** Issue 113568 has been marked as a duplicate of this issue. *** - Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments. http://qa.openoffice.org/issue_handling/project_issues.html#notification - To unsubscribe, e-mail: issues-unsubscr...@framework.openoffice.org For additional commands, e-mail: issues-h...@framework.openoffice.org - To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org For additional commands, e-mail: allbugs-h...@openoffice.org
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
User of changed the following:

What       |Old value              |New value
CC         |'fedorafonts,majukr05' |'fedorafonts,majukr05,mru'
Assigned to|tm                     |writerneedsconfirm
Component  |framework              |Word processor
QA contact |iss...@framework       |iss...@sw

--- Additional comments from o...@openoffice.org Mon Aug 2 06:33:50 + 2010 ---
reassigned
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
User majukr05 changed the following:

What|Old value     |New value
CC  |'fedorafonts' |'fedorafonts,majukr05'
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
User nmailhot changed the following:

What|Old value |New value
CC  |''        |'fedorafonts'
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
--- Additional comments from j...@openoffice.org Sun Aug 1 04:25:39 + 2010 ---
One more comment...

After posting this report yesterday, I started playing with the new user dictionary interface in M85 (the default for new user dictionary files has changed from binary to UTF-8). There are bugs there, too, which may be related to the casing errors. So please don't treat the following as a separate bug report (it's in the wrong place for that, I know), but as a clue to the possible cause of the casing errors.

In short, when I add to a user dictionary a word that contains a multi-byte character (e.g. a letter combined with an unusual accent, such as a dot underneath), or if the user dictionary already contains such words, things start getting buggy: in some cases, an *incomplete* copy of the last word in the list gets appended to the dictionary; in other cases, the word is not added to the selected dictionary at all, but to another one.

Again, I'm not a programmer, but if I were to bet on it, I'd guess that both sets of errors are caused by a bug in a text parsing library used by both the casing and user dictionary routines. My reason for saying this is that all the errors, casing and dictionary alike, appear to involve miscalculating text bounds. The parallel is particularly compelling when comparing Capitalize Every Word's mangling of text with ligatures (which could be counted as one, two or more characters) to the dictionary parser's mangling of user dictionaries that contain non-precomposed characters with combining accents (which could also be counted as one, two or more characters).

Has something recently changed in a text parsing component?
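To illustrate the "one, two or more characters" point in the comment above, here is a minimal Python sketch. It is not OpenOffice.org code; the helper name `describe` and the sample letter (a with dot below) are assumptions chosen only for illustration. It shows that the same visible letter has a different length depending on whether it is stored precomposed or with a combining accent, and on whether the count is in code points, UTF-8 bytes or UTF-16 units.

```python
import unicodedata


def describe(text: str) -> dict:
    """Return the different 'lengths' a parser might compute for the same text."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


# Hypothetical sample: "a with dot below" as a single precomposed code point.
precomposed = "\u1ea1"
# The same letter as "a" followed by the combining dot below (U+0323).
decomposed = unicodedata.normalize("NFD", precomposed)

print(describe(precomposed))
print(describe(decomposed))
```

A routine that measures a word in one of these units and then indexes the text in another would read or write past the intended bounds, which is at least consistent with the truncated dictionary entries described above.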
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
--- Additional comments from j...@openoffice.org Sat Jul 31 05:12:59 + 2010 ---
Created an attachment (id=70901)
Examples (input / expected / actual output)
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
Issue #|113558
Summary|Change Case broken by language tags and/or ligatures
Component|framework
Version|OOO330m1
Platform|PC
URL|
OS/Version|Windows, all
Status|UNCONFIRMED
Status whiteboard|
Keywords|
Resolution|
Issue type|DEFECT
Priority|P2
Subcomponent|code
Assigned to|tm
Reported by|jurf

--- Additional comments from j...@openoffice.org Sat Jul 31 05:10:08 + 2010 ---
Casing options broken by language tags and/or ligatures

Issue 1601 (http://qa.openoffice.org/issues/show_bug.cgi?id=1601), marked Fixed and with CWS tl74 included in OOo-dev300m85 (tested) and OOO330m2 (not tested, but likely identical), implements three new and welcome options in Format | Change case, namely:

Sentence case
Capitalize Every Word
tOGGLE cASE

Whilst I've not tested tOGGLE cASE (it's not something I need), I have spent a good while poking Sentence case and Capitalize Every Word with a stick. Both functions are, unfortunately, very buggy. The implementation of Capitalize Every Word is especially bad, with a high probability of data loss (disappearing text, with no guarantee that Undo works properly).

So far, I've seen the bugs triggered by either language mark-up or ligatures (the latter not necessarily inside the text selection), which are the only conditions I've been testing for. It's therefore likely that there are other triggers, too.

The data loss is particularly troubling because the undo function, even if given sufficient steps, does not necessarily restore the original text correctly. And even that assumes that the user is half-expecting trouble.

The issue is present in both Writer and Calc (other applications not tested), and in both cases it is severe.

I'm attaching an ODT file to this issue. It contains several examples you can try out yourself, together with mock-ups of expected and actual results.
** ISSUE DESCRIPTION

In brief, the main problems I've found so far are:

Sentence case
- The presence of language mark-up within the selected text confuses the parser, causing it to treat the marked-up section as a new sentence and so capitalize two or more words in the middle of a sentence.

Capitalize Every Word
- Language mark-up causes similar miscalculations, but more exaggerated, potentially causing data loss (see attached file).
- The presence of ligatures, either within the selected text or before it (but in the same paragraph), causes similar problems.
- Applying Capitalize Every Word to multiple selections further exacerbates the problem.

** POSSIBLE CAUSES

I'm not a programmer, but I think the primary cause of the bugs in both functions is a miscalculation of selection bounds, which at times leads to extremely severe offset errors, both in the selection area and in the bounds of the text itself. Among the causes would appear to be:

1. the parser gives language declarations a width (apparently two characters for each tag: one for the opening, another for the closure);
2. the parser miscounts the length of ligatures (Unicode U+FB00 to U+FB06) whether or not they're selected, which causes both the selections and the words actually processed to expand to the right - if there's no room at the end of the paragraph for this expansion, text disappears;
3. multiple selections are incorrectly handled (it appears as though errors in one selection block are carried over to the next, and so on). This may simply be symptomatic of the first two potential causes, but it may also be compounded by buffers not being cleared. Or something (TM).

The problem was exacerbated, I think, by the original test case (http://quaste.services.openoffice.org/index.php?option=com_tcstask=tcs_showtcsid=3116), which is just plain text: no formatting, no language tags, no awkward characters such as non-diphthong ligatures (ff, fi, fl, etc.)
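The ligature hypothesis can be sketched in a few lines of Python. This is a hypothetical model of the failure, not the actual OpenOffice.org code: the function names, the sample text and the use of upper-casing as a stand-in for the case change are all assumptions. The point is only that case-converting a ligature such as U+FB01 can yield more characters than it started with, so a routine that writes the result back at the old offsets spills over the following text and loses whatever no longer fits.

```python
def apply_case_naive(text, spans, convert):
    """Hypothetical buggy version: assumes convert() never changes length."""
    buf = list(text)
    for start, end in spans:
        new = convert(text[start:end])
        # Bug: writes len(new) characters starting at `start`, so an expanded
        # word overwrites the characters that follow it.
        buf[start:start + len(new)] = new
    # The paragraph is assumed to keep its original length; the overflow is cut.
    return "".join(buf)[:len(text)]


def apply_case_tracked(text, spans, convert):
    """Rebuild the text piece by piece so expanded words cannot overwrite anything."""
    out = []
    pos = 0
    for start, end in sorted(spans):
        out.append(text[pos:start])              # untouched text before the word
        out.append(convert(text[start:end]))     # converted word, any length
        pos = end
    out.append(text[pos:])                       # untouched tail
    return "".join(out)


# Sample text (an assumption): two words that each start with a ligature,
# U+FB01 (fi) and U+FB02 (fl). Spans are word bounds in the original text.
sample = "\ufb01sh \ufb02y"
words = [(0, 3), (4, 6)]

print(apply_case_naive(sample, words, str.upper))    # FISHFL
print(apply_case_tracked(sample, words, str.upper))  # FISH FLY
```

In the naive version the space between the words is overwritten and the final letter is dropped, which resembles the "expansion to the right" and disappearing text described in cause 2.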
** EXAMPLE

The following is a simple example of the buggy behaviour of Sentence case, to give you an idea of the type of problem. See the attached file for many more examples (all different) of both Sentence case and Capitalize Every Word:

Input: the rapide brown fox [with rapide marked as French]
Expected: The rapide brown fox
Output: The Rapide Brown fox

The underlying code (from contents.xml) is this, where T3 is default format, and T4 is French:

    <text:p text:style-name="Standard"><text:span text:style-name="T3">The </text:span><text:span text:style-name="T4">Rapide BroWn Fox-Like Creat</text:span><text:span text:style-name="T3">ure</text:span></text:p>
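For comparison, the expected Sentence case result can be modelled with a short Python sketch. It is an illustration under stated assumptions, not the Writer implementation: the paragraph is represented as a hypothetical list of (text, language) runs, and the function name `sentence_case_runs` is made up. The key idea is that the "start of sentence" state is carried across run boundaries, so a change of language tag alone never starts a new sentence.

```python
def sentence_case_runs(runs):
    """Apply sentence case to a list of (text, lang) runs.

    The sentence-start flag persists across runs, so a language change
    inside a sentence does not trigger a new capital letter.
    """
    result = []
    at_sentence_start = True
    for text, lang in runs:
        out = []
        for ch in text:
            if ch.isalpha():
                out.append(ch.upper() if at_sentence_start else ch.lower())
                at_sentence_start = False
            else:
                out.append(ch)
                if ch in ".!?":
                    at_sentence_start = True
        result.append(("".join(out), lang))
    return result


# The example from the report: 'rapide' is marked as French.
paragraph = [("the ", "en"), ("rapide", "fr"), (" brown fox", "en")]
print(sentence_case_runs(paragraph))
```

The run boundaries and language tags come back unchanged; only the first letter of the sentence is capitalized, matching the "Expected" line in the example.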
[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures
--- Additional comments from j...@openoffice.org Sat Jul 31 05:35:39 + 2010 ---
My apologies, I merged two examples into one in my post. It should have been one of these two examples:

1. SENTENCE CASE

Input: the rapide brown fox [with 'rapide' marked as French]
Output: The Rapide Brown fox

Underlying code:

    <text:p text:style-name="P1">The <text:span text:style-name="T4">Rapide</text:span> Brown fox</text:p>

2. Capitalize Every Word

Input: the rapide brown fox-like creature [with 'rapide' marked as French]
Output: The Rapide BroWn Fox-Like Creature [with everything from 'Rapide' to 'Creat' inclusive marked French]

Underlying code:

    <text:p text:style-name="Body">The <text:span text:style-name="T4">Rapide BroWn Fox-Like Creat</text:span>ure</text:p>