On Tue, Apr 03, 2007 at 03:55:28AM +0200, Jin wrote:
> Jens Kraemer wrote:
> > On Mon, Apr 02, 2007 at 07:58:35AM +0200, Jin wrote:
> >> Does anybody know whether ferret or acts_as_ferret supports
> >> wordsegment search?
> >
> > I don't know what you mean by this. Could you give an example?
> > Strangely enough, I only seem to find Chinese documents when
> > googling - looks like that's a feature useful when analyzing
> > Chinese text...
> >
> Yep. The search system is now built, but more features would be
> better. The customer's intention is that if he enters 'designpattern',
> with no space between 'design' and 'pattern', the search should divide
> it into the two words, just like Google, and return results for both
> 'designpattern' and 'design pattern'. That is it.

Interesting, that could be useful for analyzing German text, too - we
have lots of composite words like this :-)

> > If Lucene can do it, Ferret will most probably be able to do it, too :-)
> >
> > Maybe it's just a matter of implementing a custom analyzer, I guess I
> > found something like that there: http://kingcat1234.spaces.live.com/
> > (search for wordSegement).
> I have checked the website you gave, thank you. But what they have done
> is divide Chinese sentences into words, and it is not very accurate,
> e.g. 'iamyourfriend' may be divided into 'iam', 'amyou', 'yourfriend',
> 'friend', something like this.

Ah, ok.

> This kind of thing needs vocabulary support.

Yeah.

> If a similar implementation existed, I would prefer to use it rather
> than create it :)

At least I couldn't find one. Are you sure Lucene has an analyzer that
can split composite words? If yes, porting it to Ruby should be
relatively easy :-)

Jens

-- 
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[EMAIL PROTECTED] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
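
To make the "vocabulary support" idea from the thread concrete, here is a
minimal sketch in plain Ruby of a dictionary-based splitter. It is only an
illustration under stated assumptions, not something taken from Ferret or
Lucene: the VOCABULARY word list and the method names segment and
expand_query are hypothetical, and a real word list would be far larger.
The splitter accepts a split only if every piece is a known word, which
avoids overlapping fragments such as 'iam' or 'amyou'.

```ruby
require 'set'

# Hypothetical vocabulary for illustration; a real one would be loaded
# from a word list file.
VOCABULARY = Set.new(%w[design pattern i am your friend]).freeze

# Splits +text+ into vocabulary words. Returns an array of words, or nil
# if the text cannot be covered completely by vocabulary words.
# Dynamic programming over end positions; prefers the split with the
# fewest (and therefore longest) words.
def segment(text, vocabulary = VOCABULARY)
  best = Array.new(text.length + 1) # best[i] = words covering text[0, i]
  best[0] = []
  (1..text.length).each do |stop|
    (0...stop).each do |start|
      next if best[start].nil?
      word = text[start...stop]
      next unless vocabulary.include?(word)
      candidate = best[start] + [word]
      if best[stop].nil? || candidate.length < best[stop].length
        best[stop] = candidate
      end
    end
  end
  best[text.length]
end

# Returns the terms to search for: the original term, plus its split
# form when the term decomposes into two or more vocabulary words.
def expand_query(term, vocabulary = VOCABULARY)
  words = segment(term.downcase, vocabulary)
  return [term] if words.nil? || words.length < 2
  [term, words.join(' ')]
end
```

With this vocabulary, expand_query('designpattern') yields both
'designpattern' and 'design pattern', which matches the behaviour the
customer asked for. The resulting terms could then be combined into a
single query; how to wire that into a custom analyzer is left open here.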

