Re: analyzer settings for breaking up words on hyphens

2014-10-27 Thread Nikolas Everett
Or you could cheat and use a character filter to turn hyphens into spaces
before the whitespace tokenizer runs.  Lots of ways to skin a cat.
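
A minimal sketch of the character filter idea, written as a Python dict that could be serialized to JSON for index settings. It assumes a `pattern_replace` char filter in front of the `whitespace` tokenizer; the names `hyphen_to_space` and `whitespace_plus_hyphen` are made up for illustration and do not come from this thread. The `simulate` helper only approximates the result locally and is not how Elasticsearch runs analysis.

```python
# Hypothetical index settings: a char filter rewrites "-" to " " before the
# whitespace tokenizer sees the text. Filter and analyzer names are
# illustrative assumptions.
INDEX_SETTINGS = {
    "analysis": {
        "char_filter": {
            "hyphen_to_space": {
                "type": "pattern_replace",
                "pattern": "-",
                "replacement": " ",
            }
        },
        "analyzer": {
            "whitespace_plus_hyphen": {
                "type": "custom",
                "char_filter": ["hyphen_to_space"],
                "tokenizer": "whitespace",
            }
        },
    }
}


def simulate(text):
    """Local approximation: replace hyphens with spaces, then split on whitespace."""
    return text.replace("-", " ").split()


print(simulate("wi-fi router  setup"))  # ['wi', 'fi', 'router', 'setup']
```

The real tokens are best checked against a running cluster with the analyze API rather than trusted from the local helper.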

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPmjWd1oEgb55Y0tVU6VNzDXEF6RJQRRFZ%3DW2_iKrRmJBMVW2Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: analyzer settings for breaking up words on hyphens

2014-10-27 Thread Mike Topper
Thanks!  I'll go ahead and try the pattern tokenizer route.




Re: analyzer settings for breaking up words on hyphens

2014-10-27 Thread Ivan Brusic
You can either use a pattern tokenizer with a pattern that matches
whitespace and hyphens, or further decompose your tokens after tokenization
with the word delimiter token filter, which is much harder to use (and might
be overkill for your use case).

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html
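
A rough sketch of the pattern tokenizer route, again as a Python dict of index settings. It assumes a `pattern` tokenizer whose regex `[\s-]+` matches the separators (runs of whitespace or hyphens); the names `whitespace_or_hyphen` and `split_on_hyphen` are hypothetical. The `simulate` helper uses Python's `re` module, whose syntax differs from Java's in places, so treat it as an approximation of what the tokenizer would emit.

```python
import re

# Hypothetical index settings: the pattern tokenizer splits wherever the
# regex matches, so the pattern describes separators, not tokens.
INDEX_SETTINGS = {
    "analysis": {
        "tokenizer": {
            "whitespace_or_hyphen": {
                "type": "pattern",
                "pattern": "[\\s-]+",
            }
        },
        "analyzer": {
            "split_on_hyphen": {
                "type": "custom",
                "tokenizer": "whitespace_or_hyphen",
            }
        },
    }
}

# Reuse the same pattern string for the local approximation.
SPLIT_RE = re.compile(
    INDEX_SETTINGS["analysis"]["tokenizer"]["whitespace_or_hyphen"]["pattern"]
)


def simulate(text):
    """Split on runs of whitespace or hyphens and drop empty pieces."""
    return [token for token in SPLIT_RE.split(text) if token]


print(simulate("state-of-the-art search"))
# ['state', 'of', 'the', 'art', 'search']
```

Unlike the word delimiter filter, this keeps no original hyphenated token, which is usually fine when the goal is only to split on hyphens.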

Cheers,

Ivan

On Mon, Oct 27, 2014 at 7:55 AM, Mike Topper  wrote:

> Hello,
>
> I have a field that is using the whitespace tokenizer, but I also want to
> tokenize on hyphens (-) like the standard analyzer does.  I'm having
> trouble figuring out what additional custom settings I would have to put in
> there in order to be able to tokenize off of hyphens as well.
>
> Thanks,
> Mike