Hmmm, would Synonyms work for your case? If you set
expand=false

and your synonyms file contains:
quick brown => quickbrown

then the two-word phrase should come out as the single token
"quickbrown", which might do what you want....
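
To make the intended effect concrete, here is a minimal sketch in plain
Python. It is not Lucene code and does not claim to reproduce how the
synonym filter is implemented; it only illustrates collapsing a
dictionary phrase into one token after stopwords have been removed. The
function name merge_phrases, the mappings dict, and the replacement
token "brownfox" are all made up for illustration.

```python
def merge_phrases(tokens, mappings):
    """Replace each dictionary phrase found in `tokens` with one token.

    `mappings` maps a tuple of words to a single replacement token,
    loosely mirroring a synonyms line such as "brown fox => brownfox".
    At each position the longest matching phrase wins.
    """
    # Longest phrase length in the dictionary (1 if the dictionary is empty).
    max_len = max((len(phrase) for phrase in mappings), default=1)
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in mappings:
                out.append(mappings[key])
                i += n
                break
        else:
            # No phrase starts here, so keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical dictionary entry and an already stopword-stripped sentence.
mappings = {("brown", "fox"): "brownfox"}
tokens = ["quick", "brown", "fox", "jumps", "lazy", "dog"]
merged = merge_phrases(tokens, mappings)
print(merged)
```

In a real analyzer chain the same mapping would live in the synonyms
file rather than in code, one line per dictionary phrase.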

Best
Erick

On Sun, Aug 21, 2011 at 3:53 PM, Xiyang Chen <[email protected]> wrote:
> Hi,
>
> I have a dictionary of multi-word phrases and I'd like to analyze documents 
> such that anything that appears in the dictionary will be treated as one 
> single token.
> For example, if the dictionary contains "brown fox", then the sentence
> The quick brown fox jumps over the lazy dog.
>
> will be tokenized as (with stopwords stripped):
> quick | brown fox | jumps | lazy | dog
>
> What is the best way to achieve this?
>
> Thanks,
> Xiyang
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

