[framework-issues] [Issue 113568] Silent corruption of user dictionaries

jurf Sun, 01 Aug 2010 00:43:09 -0700

To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=113568
                 Issue #|113568
                 Summary|Silent corruption of user dictionaries
               Component|framework
                 Version|DEV300m84
                Platform|PC
                     URL|
              OS/Version|All
                  Status|UNCONFIRMED
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|code
             Assigned to|tm
             Reported by|jurf

------- Additional comments from j...@openoffice.org Sun Aug  1 07:43:05 +0000 2010 -------
OOo-dev330m85 and probably OOO330m2 may corrupt .dic user dictionaries when the
user adds words. The buggy behaviour is not present in 3.2.1, hence a regression.

DESCRIPTION

When adding words to a user dictionary (right-click menu), one of four things
may happen:

1. word is added normally, speller behaves as expected;
2. word is added to *another* user dictionary (not the one selected, and not
necessarily active);
3. only part of the word is added, appended at the end of the .dic (tested on
UTF-8-encoded .dic files);
4. word is not added at all, but part of the last word in a .dic (not
necessarily the one selected) is repeated as the new last entry.

The buggy behaviour was first noticed on an updated install (using settings,
including user dictionaries, created in older versions of OOo), but also
occurred on a clean installation with default user settings. In that case, words
I'd supposedly added to standard.dic were actually added to the ignore list.

Having seen similar bugs in a variety of programs (garbage appearing at the end
of incorrectly processed files, paragraphs or text selections, or even in other
non-selected paragraphs or files), I'd guess the problem is caused by a
text/bound parser that isn't counting straight when it encounters anything other
than single-byte characters, which in turn causes miscalculated insert positions
and data corruption while sorting fields etc. I'm quite confident about this
guess as the same build I was testing (OOo-dev330m85) also has data-corrupting
flaws in its new text-casing options, which also rely on calculating text
bounds - see Issue 113558.
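
To make the guess above concrete, here is a minimal Python sketch of the kind
of miscount I mean. It is NOT OpenOffice.org code: the function names are
hypothetical, and it assumes the test word given at the end of this report is
stored with the precomposed character U+1E63 and written as UTF-8. A routine
that uses the character count where a byte count is needed would cut the word
short, much like symptom 3.

```python
# Hypothetical illustration only; this is not OpenOffice.org code.
# Assumption: the test word uses the precomposed U+1E63 (s with dot below).
TEST_WORD = "aaa\u1e63bbb"


def append_with_wrong_length(existing: bytes, word: str) -> bytes:
    """Append a word, wrongly using its character count as a byte count."""
    encoded = word.encode("utf-8")
    wrong_length = len(word)  # counts code points, not bytes
    return existing + encoded[:wrong_length] + b"\n"


def append_correctly(existing: bytes, word: str) -> bytes:
    """Append the whole UTF-8 encoding of the word."""
    return existing + word.encode("utf-8") + b"\n"


char_count = len(TEST_WORD)                  # code points in the word
byte_count = len(TEST_WORD.encode("utf-8"))  # bytes once encoded as UTF-8
damaged = append_with_wrong_length(b"", TEST_WORD).decode("utf-8")
intact = append_correctly(b"", TEST_WORD).decode("utf-8")

print(char_count, byte_count)
print(repr(damaged), repr(intact))
```

With pure ASCII words the two counts agree, which would explain why the
problem only shows up once a multi-byte character is involved.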

For the user dictionary routine, I noticed the problem surfaced when multi-byte
characters (such as letters with unusual accents, for which there are no
precomposed versions in Unicode) were present, either in the word being added,
or - fatally - already included in a user dictionary (not necessarily the one
selected to receive the word).
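
As a rough illustration of why such letters are awkward to count, the Python
sketch below uses the standard unicodedata module to compare the composed and
decomposed forms of one accented letter. The letter chosen (s with dot below)
does have a precomposed form, so it only stands in for the general case; a
letter with no precomposed form would always be stored the decomposed way.
The variable names are mine and nothing here is taken from OOo.

```python
import unicodedata

# Assumption for illustration: s with dot below, which exists both as the
# precomposed U+1E63 and as "s" followed by combining U+0323.
composed = "\u1e63"
decomposed = unicodedata.normalize("NFD", composed)

# One visible letter, but the counts differ depending on what is measured.
composed_points = len(composed)
decomposed_points = len(decomposed)
composed_bytes = len(composed.encode("utf-8"))
decomposed_bytes = len(decomposed.encode("utf-8"))

# Normalising back to NFC recombines the pair into the single code point.
recomposed = unicodedata.normalize("NFC", decomposed)

print(composed_points, decomposed_points)
print(composed_bytes, decomposed_bytes)
print(recomposed == composed)
```

A parser that assumes one letter equals one code point, or one code point
equals one byte, would get a different length for the same visible letter in
each case.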

I'm afraid I got fed up with M85 and reverted to 3.2.1 before completing
testing. As such, I'm only guessing that the bug might also be triggered by the
presence of ligatures, or by trailing periods (OOo's spell-checker normally
ignores trailing periods, such that "A.D." is entered as "A.D", but this appears
to have been buggy for a while), or indeed by anything else that might trip up a
mathematically-challenged parser.

If you need a quick and dirty test word, copy/paste aaaṣbbb (I know, not a word,
but it will at least appear at the top of your word lists to save you a scroll -
the s has a dot underneath, by the way).
