Eleonora,

Yes, I used a different dictionary than yours. The hu_HU.dic I used has 96,461 lines. Apparently the Hungarian dictionary available through DicOO isn't the latest.

Perhaps your hardware is faster than mine. On my slower(?) hardware, building the hash table takes noticeably longer for large dictionaries than for smaller ones. Many users have complained about OOo "getting stuck" while the dictionaries load, so I think it would be useful if the Hunspell developers could improve performance here.

Alan

ge wrote:

Alan,

The size of the 2nd Hungarian dictionary is:

   lines    words  characters  file
   22068   124931      622546  hu_HU.aff
  873355   873348    26481165  hu_HU.dic
  895423   998279    27103711  total

The dic contains 873378 words; it is 8 times larger than Hebrew.
The aff is roughly twice as big as Hebrew.

I assume you used the 1st Hungarian one, with the small word count, for your test.

I use the 2nd all the time, and it loads in less than 1 second for me.
Therefore I do not understand the effect you describe.

-eleonora


Hi Marcin, Janis, Eleonora,

I did some debugging in the hunspell code, and found that the size of
the Hebrew dictionaries was the cause of the delay, similar to Janis's
problem in Latvian. The files are read line by line, and he_IL.dic has
329,326 entries, which is far more than the other dictionaries I tried.
The main bottleneck was not in reading the files from the disk, but in
building the hash tables in hashmgr.cxx in add_word(). When I shortened
he_IL.dic to the size of the Hungarian dictionary, it took the same
amount of time to load Hebrew and Hungarian. Same with Hebrew and
English US.
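
For anyone who wants to reproduce this kind of measurement outside of
Hunspell, here is a minimal sketch in Python. It is not the code in
hashmgr.cxx: the function names (read_entries, build_table, time_phases),
the fixed table size, and the use of Python's built-in hash() are all
assumptions made for illustration. It only shows how to time the file
read separately from the table build, so the two phases can be compared.

```python
import time


def read_entries(path):
    """Read a .dic-style word list.

    Assumes the first line is the entry count and that each following
    line is a word optionally followed by '/' and affix flags.
    Escaped slashes are not handled in this simplified sketch.
    """
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    return [line.split("/", 1)[0] for line in lines[1:] if line]


def build_table(words, table_size):
    """Insert every word into a fixed-size chained hash table.

    This is a generic illustration, not Hunspell's add_word().
    """
    table = [[] for _ in range(table_size)]
    for word in words:
        table[hash(word) % table_size].append(word)
    return table


def time_phases(path, table_size):
    """Return timings for the read phase and the build phase, plus the table."""
    t0 = time.perf_counter()
    words = read_entries(path)
    t1 = time.perf_counter()
    table = build_table(words, table_size)
    t2 = time.perf_counter()
    stats = {"entries": len(words), "read_s": t1 - t0, "build_s": t2 - t1}
    return stats, table
```

Calling time_phases() on the full dictionary and again on a copy cut
down to fewer lines should show how each phase scales with the number
of entries on a given machine.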

To Hunspell developers out there: is there any way to make the building
of the hash tables more efficient?

Alan


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

