It's probably about 100,000 entries per "thing that it would care about
at once".
-----Original Message-----
From: Karl Wettin [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 17, 2008 3:17 PM
To: [email protected]
Subject: Re: Word split problems
Max Metral wrote:
>
> Lululemon Athletica
> I'd like any of these search terms to work for this:
> Lulu lemon
> Lu Lu Lemon
> Lululemon
> What strategy would be optimal for this kind of thing (of course keeping
> in mind negative matches are also bad)?
How large is your corpus? I suggest you look at NGramTokenizer.
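
To illustrate why character n-grams can help with the "Lulu lemon" case, here is a rough sketch in plain Python. It is not Lucene code and does not reproduce what NGramTokenizer does internally. The function names, the choice of trigrams, and the step that strips whitespace before gramming are all assumptions made for the example.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of text, ignoring case and whitespace."""
    # Assumption: collapse all whitespace so "Lu Lu Lemon" and "Lululemon"
    # normalize to the same string before gramming.
    normalized = "".join(text.lower().split())
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_score(query, name, n=3):
    """Fraction of the query's n-grams that also occur in the name (0.0 to 1.0)."""
    query_grams = char_ngrams(query, n)
    if not query_grams:
        return 0.0
    return len(query_grams & char_ngrams(name, n)) / len(query_grams)


business = "Lululemon Athletica"
for query in ("Lulu lemon", "Lu Lu Lemon", "Lululemon", "Lemon Tree"):
    print(f"{query!r}: {ngram_score(query, business):.2f}")
```

The three spellings from the question all score 1.0 against "Lululemon Athletica", while an unrelated name such as "Lemon Tree" still shares a few grams and gets a partial score. That partial overlap is where the negative-match concern shows up, so a real setup would probably need a score cutoff or a requirement that most grams match.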
karl
--
In our app, we search for businesses. So here's an example:
Lululemon Athletica
I'd like any of these search terms to work for this:
Lulu lemon
Lu Lu Lemon
Lululemon
What strategy would be optimal for this kind of thing (of course keeping
in mind negative matches are also bad)?