[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-08-02 Thread of
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558





--- Additional comments from o...@openoffice.org Mon Aug  2 06:32:56 + 
2010 ---
*** Issue 113568 has been marked as a duplicate of this issue. ***

-
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

-
To unsubscribe, e-mail: issues-unsubscr...@framework.openoffice.org
For additional commands, e-mail: issues-h...@framework.openoffice.org


-
To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org
For additional commands, e-mail: allbugs-h...@openoffice.org



[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-08-02 Thread of
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558


User of changed the following:

       What|Old value             |New value
         CC|'fedorafonts,majukr05'|'fedorafonts,majukr05,mru'
Assigned to|tm                    |writerneedsconfirm
  Component|framework             |Word processor
 QA contact|iss...@framework      |iss...@sw





--- Additional comments from o...@openoffice.org Mon Aug  2 06:33:50 + 
2010 ---
reassigned




[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-08-01 Thread majukr05
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558


User majukr05 changed the following:

What|Old value    |New value
  CC|'fedorafonts'|'fedorafonts,majukr05'








[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-07-31 Thread nmailhot
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558


User nmailhot changed the following:

What|Old value|New value
  CC|''       |'fedorafonts'








[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-07-31 Thread jurf
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558





--- Additional comments from j...@openoffice.org Sun Aug  1 04:25:39 + 
2010 ---
One more comment...

After posting this report yesterday, I started playing with the new user
dictionary interface in M85 (the default for new user dictionary files has
changed from binary to UTF-8). There are bugs there, too, which may possibly be
related to the casing errors. So, please don't treat the following as a separate
bug report (it's in the wrong place for that, I know), but instead as a clue to
the possible cause of the casing errors.

In short, when I add a word to a user dictionary that contains a double-byte
character (eg a letter combined with an unusual accent, such as dot underneath),
or if the user dictionary already contains such words, things start getting
buggy: in some cases, an *incomplete* copy of the last word in the list gets
appended to the dictionary; in other cases, the word is not added to the
selected dictionary at all, but to another one.

Again, I'm not a programmer, but if I were to bet on it, I'd guess there's a
possibility that both sets of errors are caused by a bug in a text parsing
library used by both the casing and user dictionary routines.

Reason for saying this is that all the errors - casing and dictionary - appear
to involve miscalculating text bounds.

The parallel is particularly compelling when comparing Capitalize Every Word's
mangling of text with ligatures (which could be counted as one, two or more
characters), to the dictionary parser's mangling of user dictionaries that
contain non-precomposed characters with combining accents (which could also be
counted as one, two or more characters).

Has something recently changed in a text parsing component?
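
To illustrate why such text "could be counted as one, two or more characters",
here is a minimal Python sketch. It is not taken from the OpenOffice.org code
base; the function name count_units is made up for this illustration, and the
"base_characters" figure is only a rough stand-in for user-perceived
characters.

```python
import unicodedata


def count_units(text: str) -> dict:
    """Return several plausible 'lengths' of the same piece of text."""
    nfc = unicodedata.normalize("NFC", text)  # precomposed form where one exists
    nfd = unicodedata.normalize("NFD", text)  # fully decomposed form
    # Code points that are not combining marks: a rough approximation of
    # what a user would call "characters" (real segmentation is more involved).
    base = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    return {
        "code_points": len(text),
        "nfc": len(nfc),
        "nfd": len(nfd),
        "utf8_bytes": len(text.encode("utf-8")),
        "base_characters": base,
    }


# 'd' followed by COMBINING DOT BELOW (U+0323), i.e. a letter with a dot underneath
print(count_units("d\u0323"))
```

A routine that mixes any two of these counts when computing word or selection
bounds would produce the kind of offset error described above.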




[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-07-30 Thread jurf
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558





--- Additional comments from j...@openoffice.org Sat Jul 31 05:12:59 + 
2010 ---
Created an attachment (id=70901)
Examples (input / expected / actual output)





[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-07-30 Thread jurf
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558
          Issue #|113558
          Summary|Change Case broken by language tags and/or ligatures
        Component|framework
          Version|OOO330m1
         Platform|PC
              URL|
       OS/Version|Windows, all
           Status|UNCONFIRMED
Status whiteboard|
         Keywords|
       Resolution|
       Issue type|DEFECT
         Priority|P2
     Subcomponent|code
      Assigned to|tm
      Reported by|jurf





--- Additional comments from j...@openoffice.org Sat Jul 31 05:10:08 + 
2010 ---
Casing options broken by language tags and/or ligatures

Issue 1601 (http://qa.openoffice.org/issues/show_bug.cgi?id=1601), marked Fixed
and with CWS tl74 included in OOo-dev300m85 (tested) and OOO330m2 (not tested,
but likely identical), implements three new and welcome options in Format |
Change case, namely:

Sentence case
Capitalize Every Word
tOGGLE cASE

Whilst I've not tested tOGGLE cASE (it's not something I need), I have spent a
good while poking Sentence case and Capitalize Every Word with a stick. Both
functions are, unfortunately, very buggy. The implementation of Capitalize Every
Word is especially bad, with a high probability of data loss (disappearing text
with no guarantee that Undo works properly). So far, I've seen the bugs be
triggered by either language mark-up or ligatures (the latter not necessarily in
text selections), which are actually the only conditions I've been testing for.
As such, it's likely there are other triggers, too.

The data loss is particularly troubling as the undo function, even if given
sufficient steps, does not necessarily restore the original text correctly. And
even that assumes that the user is half-expecting trouble.

Issue present in both Writer and Calc (not tested others), and in both cases is
severe.

I'm attaching an ODT file to this issue. It contains several examples you can
try out yourself, together with mock-ups of expected and actual results.


**

ISSUE DESCRIPTION

In brief, the main problems I've found so far are:

Sentence case
- The presence of language mark-up within selected text confuses the parser,
causing it to consider the marked-up section as a new sentence, thus
capitalizing two or more words in the middle of a sentence.

Capitalize Every Word
- Language mark-up causes similar miscalculations, but more exaggerated,
potentially causing data loss (see attached file)
- The presence of ligatures, either within selected text, or before it (but in
the same paragraph) causes similar problems.
- Applying Capitalize Every Word to multiple selections further exacerbates the
problem.
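
As a point of comparison for the Sentence case behaviour, the following Python
sketch shows the expected result when language mark-up is treated as having no
width and no effect on sentence boundaries. The representation of a paragraph
as a list of (language, text) runs and the function name sentence_case are
assumptions made for this illustration; they say nothing about how Writer
stores or processes text internally.

```python
from typing import List, Tuple

Run = Tuple[str, str]  # (language tag, text): a hypothetical representation


def sentence_case(runs: List[Run]) -> List[Run]:
    """Capitalize the first letter of each sentence across all runs.

    The sentence-start state is carried from one run to the next, so a
    change of language in mid-sentence does not start a new sentence.
    """
    out = []
    at_sentence_start = True
    for lang, text in runs:
        chars = []
        for ch in text:
            if ch.isalpha() and at_sentence_start:
                chars.append(ch.upper())
                at_sentence_start = False
            else:
                chars.append(ch)
                # Naive boundary rule; abbreviations such as "e.g." are not handled.
                if ch in ".!?":
                    at_sentence_start = True
        out.append((lang, "".join(chars)))
    return out


runs = [("en", "the "), ("fr", "rapide"), ("en", " brown fox")]
print("".join(text for _, text in sentence_case(runs)))
```

With the runs above, only the first letter of the paragraph is capitalized and
the run boundaries stay where they were.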

**

POSSIBLE CAUSES

I'm not a programmer, but I think the primary cause of the bugs in either
function is a miscalculation of selection bounds, which leads to at times
extremely severe offset errors both as regards the selection area and the bounds
of the text itself. Among the causes would appear to be:
1. the parser gives language declarations a width (two characters for each
tag, apparently, being one for the opening, another for the closure);
2. the parser miscounts the length of ligatures (Unicode U+FB00 to U+FB06) whether
or not they're selected, which causes both selections and actual words processed
to expand to the right - if there's no room at the end of the paragraph for this
expansion, text disappears;
3. multiple selections are incorrectly handled (it appears as though errors in
one selection block are carried over to the next, and so on). This may simply be
symptomatic of the first two potential causes, but it may also be compounded by
buffers not being cleared. Or something (TM).
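
The ligature case in point 2 can be sketched in Python, whose str.upper()
applies full Unicode case mappings and so expands a ligature into several
letters. This is only an illustration of the length change and of the offset
bookkeeping it requires; the function name upper_with_offsets is made up, and
nothing here is claimed to match the actual OpenOffice.org implementation.

```python
def upper_with_offsets(text: str, start: int, end: int):
    """Uppercase text[start:end] and return (new_text, new_start, new_end).

    The end offset is recomputed from the converted text, because case
    conversion can change the number of characters.
    """
    before, selected, after = text[:start], text[start:end], text[end:]
    converted = selected.upper()
    new_end = start + len(converted)
    return before + converted + after, start, new_end


# "efficient" written with the single ligature character U+FB03 (ffi)
text = "an e\ufb03cient fox"
new_text, new_start, new_end = upper_with_offsets(text, 3, 10)
print(len(text), len(new_text))
print(new_text[new_start:new_end])
```

If the original end offset were reused after the conversion, the selection
would stop two characters short of the converted word. Any later selection in
the same paragraph would also need shifting by the same amount, which fits the
carried-over errors described in point 3.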

The problem was exacerbated, I think, by the original test case
(http://quaste.services.openoffice.org/index.php?option=com_tcs&task=tcs_show&tcsid=3116),
which is just plain text: no formatting, no language tags, no awkward characters
such as non-diphthong ligatures (ff, fi, fl, etc.)

**

EXAMPLE

The following is a simple example of the buggy behaviour of Sentence case, to
give you an idea of the type of problem. See the attached file for many more
examples (all different) of both Sentence case and Capitalize Every Word:

Input:  the rapide brown fox [with rapide marked as French]
Expected:   The rapide brown fox
Output: The Rapide Brown fox

The underlying code (from content.xml) is this, where T3 is default format, and
T4 is French:

<text:p text:style-name="Standard">
<text:span text:style-name="T3">The </text:span>
<text:span text:style-name="T4">Rapide BroWn Fox-Like Creat</text:span>
<text:span text:style-name="T3">ure</text:span>
</text:p>



[framework-issues] [Issue 113558] Change Case broken by language tags and/or ligatures

2010-07-30 Thread jurf
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113558





--- Additional comments from j...@openoffice.org Sat Jul 31 05:35:39 + 
2010 ---
My apologies, I merged two examples into one in my post. It should have been two
separate examples:

1. SENTENCE CASE

Input:
the rapide brown fox [with 'rapide' marked as French]

Output:
The Rapide Brown fox

Underlying code:
<text:p text:style-name="P1">The <text:span
text:style-name="T4">Rapide</text:span> Brown fox</text:p>


2. Capitalize Every Word

Input:
the rapide brown fox-like creature [with 'rapide' marked as French]

Output:
The Rapide BroWn Fox-Like Creature [with everything from 'Rapide' to 'Creat'
inclusive marked French]

Underlying code:
<text:p text:style-name="Body">The <text:span text:style-name="T4">Rapide BroWn
Fox-Like Creat</text:span>ure</text:p>
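
For anyone wanting to check where the span boundaries end up after a case
change, here is a small Python sketch that lists the runs of a paragraph taken
from content.xml. The fragment is the Capitalize Every Word output from
example 2, with the ODF text namespace declared inline so that it parses on
its own (in a real content.xml the declaration sits on the root element). The
function name list_runs is made up for this illustration, and the code only
handles a flat paragraph with non-nested spans.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the ODF "text" prefix
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

FRAGMENT = (
    f'<text:p xmlns:text="{TEXT_NS}" text:style-name="Body">The '
    '<text:span text:style-name="T4">Rapide BroWn Fox-Like Creat</text:span>'
    "ure</text:p>"
)


def list_runs(paragraph_xml: str):
    """Return (style name, text) pairs in document order for one paragraph."""
    p = ET.fromstring(paragraph_xml)
    style_attr = f"{{{TEXT_NS}}}style-name"
    runs = []
    if p.text:
        runs.append((p.get(style_attr), p.text))
    for span in p:
        runs.append((span.get(style_attr), span.text or ""))
        if span.tail:
            # Text after a span belongs to the enclosing paragraph style
            runs.append((p.get(style_attr), span.tail))
    return runs


for style, text in list_runs(FRAGMENT):
    print(style, repr(text))
```

Run against the fragment above, this shows the T4 (French) run stretching from
'Rapide' to 'Creat', where it should cover 'rapide' alone.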
