On Thu, 29 Jan 2009 12:19:38 +0100 "arne anka" <openm...@ginguppin.de> said:
> > This dictionary would have hundreds of millions of rows even if you take
> > only reasonable user inputs.
>
> why would that be? colloquial language (and that's what is to be
> considered) contains only several thousand words, still a lot but far
> away from millions.
>
> > But what to do if the user inputs something
> > that's not in the dictionary?
>
> but that's a problem with every dictionary -- you never can contain every
> possible word.
>
> i don't use the keyboard and i do not follow the discussion closely, but
> what always struck me as odd was the use of a text file.
> why not use a db? it would enable learning, too.

sheer simplicity and dependencies. a db would mean selecting one. gdbm is gpl.
libdb is fine - but they love to break the db format every few releases and
that'd royally suck. also these lean towards key/value pairs - and that means
you need to GENERATE all possible permutations of the input and look each one
up (which is prohibitively expensive). with my own format the dict also drives
the lookup: you simply avoid generating permutations you know will never have
any matches (ie nothing starts with qz... so never worry about all the qz*
permutations).

the best suggestion is a trie - but i need a format i can access really
quickly - and a library that isn't license-restricted or otherwise restricted,
is easy to use, doesn't eat much ram at all, and is fast. invariably you never
get all of that - it either eats ram or it is slow, or something else.

so what i did is just use a simple format that is easy to generate with a
small 1 liner shell command, and index it on the fly for quick lookups with a
tiny 2 level index. it of course is not incredibly fast - but it uses a tiny
amount of precious ram. making it a text file opens the gate to easy
generation of new dicts - and i wanted to keep that as easy as possible.
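to make the idea above concrete, here is a rough sketch in python. it is not the actual keyboard code (which this mail does not show); the names build_index and candidates, the one-word-per-line layout and the choice of "first letter, then second letter" as the two index levels are all assumptions made for illustration. it assumes the text dict is sorted, so words sharing a 2-letter prefix sit next to each other.

```python
def build_index(text):
    """Build a small 2 level index over a sorted, one-word-per-line dict.

    Hypothetical layout: index[first_letter][second_letter] -> (start, end)
    character offsets into `text`. Only the offsets are kept in memory.
    """
    index = {}
    pos = 0
    for line in text.splitlines(keepends=True):
        word = line.strip().lower()
        if word:
            first = word[0]
            second = word[1] if len(word) > 1 else ""
            level2 = index.setdefault(first, {})
            # Keep the start of the first word in this bucket, extend the end.
            start, _ = level2.get(second, (pos, pos))
            level2[second] = (start, pos + len(line))
        pos += len(line)
    return index


def candidates(text, index, choices):
    """Return dict words matching a sequence of fuzzy key presses.

    `choices` holds one string per key press, listing the letters that
    press might have meant (e.g. "hq" for a press between h and q).
    Prefixes with no bucket in the index are skipped outright, so none
    of the permutations below them are ever generated.
    """
    if not choices:
        return []
    found = []
    seconds = choices[1] if len(choices) > 1 else [""]
    for first in choices[0]:
        level2 = index.get(first)
        if level2 is None:
            continue  # no word starts with this letter
        for second in seconds:
            span = level2.get(second)
            if span is None:
                continue  # e.g. no "qz" bucket: skip every qz* permutation
            start, end = span
            # Scan only the small slice of the text file for this prefix.
            for word in text[start:end].split():
                lowered = word.lower()
                if len(lowered) == len(choices) and all(
                    ch in opts for ch, opts in zip(lowered, choices)
                ):
                    found.append(word)
    return found
```

usage would look something like `idx = build_index(text)` once at load time, then `candidates(text, idx, ["hq", "oz", "t"])` per word typed. the ram cost is one small table of offsets; the price is a linear scan inside each bucket, which is the "not incredibly fast" part.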
-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com

_______________________________________________
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community