On Wed, Nov 12, 2014 at 11:13 AM, Alessandro Bonfanti <bnf....@gmail.com>
wrote:

>  On 12/11/2014 15:25, Nikolas Everett wrote:
>
>
>
> On Wed, Nov 12, 2014 at 8:15 AM, Alessandro Bonfanti <bnf....@gmail.com>
> wrote:
>
>> Hi, I'm very new to ElasticSearch.
>> I'm trying to index a set of biological data. Some fields, like
>> 'gene_id' or 'gene_shortname', should be processed as literal strings.
>> When I search for 'ZNF6092' in a field containing
>> 'linc-ZNF6092-6', I can't find anything. When I search for 'linc',
>> however, I do find the correct document.
>> This looks like a problem with the ES analyzer, but when I set the
>> fields to not be analyzed, nothing seems to change.
>> I tried with:
>>
>>  curl -XPOST 'localhost:9200/a3' -d @tracking_map.json
>>
>> where tracking_map.json is
>>
>>  {
>>   "mappings": {
>>     "tracking": {
>>       "properties": {
>>         "tracking_id" : {
>>           "type": "string",
>>           "index":"not_analyzed"
>>         },
>>         "nearest_ref_id" : {
>>           "type": "string",
>>           "index":"not_analyzed"
>>         },
>>         "gene_id" : {
>>           "type": "string",
>>           "index":"not_analyzed"
>>         },
>>         "gene_short_name" : {
>>           "type": "string",
>>           "index":"not_analyzed"
>>         }
>>       }
>>     }
>>   }
>> }
>>
>>
>>
>> Then I re-indexed all the documents. It still fails, but where did I go wrong?
>> Thanks in advance,
>>
>> Alessandro
>>
>
>  It's an analyzer problem, certainly.  You've turned off analysis with
> "index":"not_analyzed".  What you probably want is for gene_short_name
> to be analyzed so that dashes are treated as word separators.  If you do
> that, you can find linc-ZNF6092-6 by performing a simple_query_string (or
> match) search for "ZNF6092" or "ZNF6092 6" or
> "6" or "linc".  Have a look at
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
> and go from there.  You may also want to use a lowercase filter so you can
> search for "znf6092" and still find it.
>
>  This is also a good read on how to change the mapping:
> http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/
>  Even if you don't need all the information in there, it is nice to know.
>
> Nik
>   --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/Y6I2qNZxR-s/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAPmjWd06sKTVS6JC8q7x7R37gUEnsHEiuar0-yy_ZdOJQhKYzQ%40mail.gmail.com
> <https://groups.google.com/d/msgid/elasticsearch/CAPmjWd06sKTVS6JC8q7x7R37gUEnsHEiuar0-yy_ZdOJQhKYzQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
> Many thanks for your answer.
> What I want is for ES to store the fields as literals, so that I can find
> ZNF6092 with a wildcard search (*ZNF6092*, for example).
> I tried setting "pattern" to "*" for testing (* doesn't occur in
> gene_shortname, so I suppose the entire string is stored), but I still
> find nothing.
>
>
You'd have to post your queries for me to help more, but in general it is
better to analyze the content up front and perform basic match queries than
to search with wildcards.  Wildcards are way, way, way slower.
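
As a rough sketch of what "analyze up front" could look like for this
index: the Python below builds an index body that could be saved as
tracking_map.json, and imitates the analyzer locally so the resulting
tokens are visible. The names "gene_name" and "dash_tokenizer" are made up
for this example, and emulate_gene_name is only a plain-Python stand-in for
what Elasticsearch does, not a call into Elasticsearch itself.

```python
import json
import re

# Index body that could be saved as tracking_map.json.
# "gene_name" and "dash_tokenizer" are hypothetical names chosen for this
# sketch; "pattern" and "lowercase" are the built-in tokenizer and token
# filter types mentioned earlier in the thread.
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "dash_tokenizer": {"type": "pattern", "pattern": "-"}
            },
            "analyzer": {
                "gene_name": {
                    "type": "custom",
                    "tokenizer": "dash_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "tracking": {
            "properties": {
                "gene_short_name": {"type": "string", "analyzer": "gene_name"}
            }
        }
    },
}


def emulate_gene_name(text):
    """Rough local stand-in for the analyzer: split on '-', then lowercase."""
    analysis = index_body["settings"]["analysis"]
    pattern = analysis["tokenizer"]["dash_tokenizer"]["pattern"]
    # Drop empty pieces (e.g. from a leading or trailing dash).
    return [tok.lower() for tok in re.split(pattern, text) if tok]


if __name__ == "__main__":
    # The JSON to put in tracking_map.json before creating the index.
    print(json.dumps(index_body, indent=2))
    # Tokens the emulated analyzer yields for the example value.
    print(emulate_gene_name("linc-ZNF6092-6"))
```

With tokens like these in the index, a plain match query for ZNF6092 on
gene_short_name should go through the same analyzer at search time and hit
the znf6092 token, so no wildcard is needed. The mapping has to be in place
before the documents are indexed.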

Nik

