Hello Andrew.

    I'm aware of the issues with parsing Japanese text. I'm the
developer of Furigana Injector, which uses the MeCab library. It uses
a much larger dictionary (IPADIC or UniDic), saved in a binary format
as a trie data structure. A trie is the bee's knees for this sort of
text search, as you've worked out. What I've read about other
people's trie classes written in JavaScript, however, is that they
don't perform very well, and you seem to be confirming that. The
memory being consumed for the dictionary is many times (10x?) what
the entire browser process required at startup.
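
    To make the memory point concrete, here is a minimal sketch of the
kind of object-per-node trie being discussed. It is not MeCab's format
or anyone's actual class; the names makeNode, trieInsert, prefixMatches
and countNodes are made up for illustration. Every character on every
path gets its own node object and its own Map, which is where the
overhead is likely to come from.

```javascript
// Naive trie: one object (plus one Map) per node. Illustrative only.
function makeNode() {
  return { children: new Map(), value: null };
}

// Store `value` under `word`, creating one node per character as needed.
function trieInsert(root, word, value) {
  let node = root;
  for (const ch of word) {
    if (!node.children.has(ch)) node.children.set(ch, makeNode());
    node = node.children.get(ch);
  }
  node.value = value;
}

// Return every dictionary word that is a prefix of `text`, shortest first.
// This is the lookup a parser needs when it scans forward from a position.
function prefixMatches(root, text) {
  const found = [];
  let node = root;
  let prefix = "";
  for (const ch of text) {
    node = node.children.get(ch);
    if (!node) break;
    prefix += ch;
    if (node.value !== null) found.push([prefix, node.value]);
  }
  return found;
}

// Count node objects, as a rough proxy for how many allocations the trie costs.
function countNodes(node) {
  let n = 1;
  for (const child of node.children.values()) n += countNodes(child);
  return n;
}
```

With a dictionary the size of IPADIC, the node count runs far beyond
the number of entries, so per-object overhead dominates the actual
text being stored.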

    I can't see a way around this though, if using JavaScript. The
Rikaichan method of using one huge string and an index that contains
offsets into the string looked clunky to me when I first saw it, but
maybe that is the best optimization: avoid creating JavaScript
objects as much as possible.
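
    A rough sketch of that idea, as I understand it, follows. This is
not Rikaichan's actual code or file format. The sample entries, the
tab/newline layout, and the names buildPacked, keyAt and lookup are all
assumptions for illustration. The whole dictionary becomes one string
plus one typed array of offsets, and lookup is a binary search over the
offsets, so no per-entry objects are created.

```javascript
// Hypothetical sample data: [word, reading] pairs.
const entries = [
  ["日本", "にほん"],
  ["日本語", "にほんご"],
  ["漢字", "かんじ"],
];

// Pack all entries into one string ("word\treading\n" per entry, an assumed
// layout) and record where each entry starts. Entries are sorted by UTF-16
// code unit order so that lookup can binary-search with plain < comparisons.
function buildPacked(pairs) {
  const sorted = [...pairs].sort((a, b) =>
    a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0
  );
  const offsets = new Uint32Array(sorted.length);
  let data = "";
  sorted.forEach(([word, reading], i) => {
    offsets[i] = data.length;
    data += word + "\t" + reading + "\n";
  });
  return { data, offsets };
}

// Read the word of the i-th entry straight out of the big string.
function keyAt(packed, i) {
  const start = packed.offsets[i];
  return packed.data.slice(start, packed.data.indexOf("\t", start));
}

// Binary search over the offset index; returns the reading, or null if absent.
function lookup(packed, word) {
  let lo = 0;
  let hi = packed.offsets.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const key = keyAt(packed, mid);
    if (key === word) {
      const start = packed.offsets[mid] + key.length + 1;
      return packed.data.slice(start, packed.data.indexOf("\n", start));
    }
    if (key < word) lo = mid + 1;
    else hi = mid - 1;
  }
  return null;
}
```

Usage would be something like lookup(buildPacked(entries), "日本語").
In a real extension the string and index would presumably be built
ahead of time and loaded as files rather than assembled at startup.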

   I'm going off to crbug.com to see if there's anything about binary
dictionaries ...

Akira
