Elasticsearch 'Script Filter'

2014-07-28 Thread thale jacobs
I have an ES search question. I think it can be solved using ES 
scripting, but I was not able to solve it, and there may be a better way.

The index has 3 document types: 'province', 'city', and 'neighborhood'.
Here is how the index is created:

curl -s -XPUT 'localhost:9200/test/province/1' -d '{ "province": "Ontario" }'

curl -s -XPUT 'localhost:9200/test/city/2' -d '{ "province": "Ontario", "city": "Toronto" }'
curl -s -XPUT 'localhost:9200/test/city/3' -d '{ "province": "Ontario", "city": "Ontario City" }'

curl -s -XPUT 'localhost:9200/test/neighborhood/4' -d '{ "province": "Ontario", "city": "Ontario City", "neighborhood": "Waterfront" }'
curl -s -XPUT 'localhost:9200/test/neighborhood/5' -d '{ "province": "Ontario", "city": "Ontario City", "neighborhood": "Midtown Ontario" }'




The incoming search is for "Ontario". If the document type in the index is 
'province', I want to search the 'province' field; if the document type is 
'city', I want to search the 'city' field; and if the document type is 
'neighborhood', I want to search the 'neighborhood' field. So for the search 
of 'Ontario', the desired results would be documents 1, 3, and 5:

1 is desired because the document type is 'province' and 'Ontario' matched 
the 'province' field, 3 is desired because 'Ontario' matched the 'city' 
field, and 5 is desired because 'Ontario' matched the 'neighborhood' field.

Here is a simple search that does not produce the desired results:

curl -s -XGET 'localhost:9200/test/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "Ontario",
            "fields": [ "province", "city", "neighborhood" ]
          }
        }
      ]
    }
  }
}'



The problem is that the search for 'Ontario' gets a match on all the docs. 
Is it possible to search a specific field based on the document type? 
Based on some examples, it seems like this might be a good use for 
"scripting" with a "script filter"?
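
One way to get this behaviour without scripting might be to pair each field with its document type inside a bool query, so that a document only matches when it is of type X and its field X matches. The following is a minimal Python sketch under stated assumptions: it only builds the request body (the helper name per_type_query is hypothetical), and it assumes that a term query on the _type field is accepted by the Elasticsearch version in use. The simulate function is a rough in-memory stand-in for the intended logic on the five sample documents, not how Elasticsearch evaluates the query.

```python
import json

TYPES = ["province", "city", "neighborhood"]

# Sample documents from the post: (id, type, source)
DOCS = [
    (1, "province", {"province": "Ontario"}),
    (2, "city", {"province": "Ontario", "city": "Toronto"}),
    (3, "city", {"province": "Ontario", "city": "Ontario City"}),
    (4, "neighborhood", {"province": "Ontario", "city": "Ontario City",
                         "neighborhood": "Waterfront"}),
    (5, "neighborhood", {"province": "Ontario", "city": "Ontario City",
                         "neighborhood": "Midtown Ontario"}),
]


def per_type_query(text, types=TYPES):
    """Build a bool query with one 'should' clause per document type.

    Each clause requires both the document type and a match on the
    field that has the same name as the type (assumption: a term query
    on _type is supported by the cluster).
    """
    clauses = []
    for doc_type in types:
        clauses.append({
            "bool": {
                "must": [
                    {"term": {"_type": doc_type}},
                    {"match": {doc_type: text}},
                ]
            }
        })
    return {"query": {"bool": {"should": clauses}}}


def simulate(text, docs=DOCS):
    """Local stand-in: lowercase word match on the type's own field only."""
    word = text.lower()
    return [doc_id for doc_id, doc_type, source in docs
            if word in source.get(doc_type, "").lower().split()]


if __name__ == "__main__":
    print(json.dumps(per_type_query("Ontario"), indent=2))
    print(simulate("Ontario"))
```

The printed body could be sent with curl in place of the multi_match query. On the sample data the local check selects the province document, the city "Ontario City", and the neighborhood "Midtown Ontario" (document 5), while "Toronto" and "Waterfront" are skipped even though their province field contains "Ontario".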

Thanks for your help.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/55fe24a5-86e9-42cb--29dcee3557e0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Efficiency of search vs get

2014-07-16 Thread thale jacobs
From the example:
client.prepareSearch(indexName).setRouting(routingStr).setQuery(
QueryBuilders.termQuery("_routing", routingStr)).execute().actionGet();



For clarification, can someone verify that the routing needs to be 
specified via setRouting(routingStr) as well as via 
setQuery(QueryBuilders.termQuery("_routing", routingStr))? I am 
having a difficult time finding documentation on the Java client API as it 
pertains to routing. Thanks for the help.
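
To see why the example uses both calls, it may help to separate what each one does: the routing value only selects the shard to search, and several routing values can land on the same shard, so the term query on _routing is what narrows the hits to documents indexed with that exact value. The sketch below simulates this in plain Python. The CRC32-modulo placement, the shard count, and all function names are illustrative assumptions, not Elasticsearch's actual hashing or API.

```python
import zlib

NUM_SHARDS = 2  # assumed shard count, for illustration only


def shard_for(routing, num_shards=NUM_SHARDS):
    """Stand-in for shard selection: hash the routing value, take modulo."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def index_docs(docs, num_shards=NUM_SHARDS):
    """Place (doc_id, routing) pairs on shards according to their routing."""
    shards = {number: [] for number in range(num_shards)}
    for doc_id, routing in docs:
        shards[shard_for(routing, num_shards)].append((doc_id, routing))
    return shards


def search(shards, routing, term_on_routing, num_shards=NUM_SHARDS):
    """Search only the shard chosen by routing; optionally filter on _routing."""
    hits = shards[shard_for(routing, num_shards)]
    if term_on_routing:
        hits = [hit for hit in hits if hit[1] == routing]
    return [doc_id for doc_id, _ in hits]


if __name__ == "__main__":
    docs = [
        ("a_b_c_1", "1234_5678_90123"),
        ("a_b_c_2", "1234_5678_90123"),
        ("other_1", "1111_2222_33333"),
        ("other_2", "4444_5555_66666"),
        ("other_3", "7777_8888_99999"),
    ]
    shards = index_docs(docs)
    # Routing alone: everything stored on the selected shard is a candidate.
    print(search(shards, "1234_5678_90123", term_on_routing=False))
    # Routing plus term query on _routing: only documents with that value.
    print(search(shards, "1234_5678_90123", term_on_routing=True))
```

In this model, dropping the term query returns every document that happens to share the shard, which is why the routing parameter alone is not a substitute for the query.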






On Friday, September 23, 2011 7:15:43 PM UTC-4, kimchy wrote:
>
> Replace setFilter with setQuery(QueryBuilders.termQuery("_routing", 
> routingStr)), as the filter is mainly used to filter the results of the query 
> you execute (mainly used with faceting).
>
> On Fri, Sep 23, 2011 at 11:18 PM, Per Steffensen wrote:
>
>> Shay Banon wrote:
>>
>> Get is as fast as you can go to retrieve a single document. A search 
>> against a single field (term query) that uses routing to direct the search 
>> request to a single shard will be almost as fast, but not the same. I don't 
>> have actual numbers to say how much slower it will be. 
>>
>>  Regarding a combined index, there is no option to do that in 
>> elasticsearch. You can do a boolean query, with several must clauses 
>> including term query against a, b, and c. This will be slower (since now 
>> you are not searching on a single field, but 3).
>>
>>  On the other hand, the _routing field is automatically indexed (not 
>> analyzed). So, based on the sample below, you can simply do a term query 
>> against the _routing field with the routing value.
>>  
>> Thanks. Will the following code do the trick?
>>
>> client.prepareSearch(indexName).setRouting(routingStr).setFilter(new 
>> TermFilterBuilder("_routing", routingStr)).execute().actionGet();
>>
>>  
>>  Of course, you might get several documents with the search request, but 
>> I think you factored that in (a_b_c_1, and a_b_c_2).
>>
>> On Thu, Sep 15, 2011 at 10:06 PM, Per Steffensen wrote:
>>
>>> curl -X PUT "localhost:9200/mytest/abc/_mapping" -d '{
>>> "abc" : {
>>>  "_routing" : {
>>>  "required" : true
>>>  },
>>>  "properties" : {
>>>  "idx" : {"type" : "string", "index" : "not_analyzed"},
>>>  "a" : {"type" : "string"},
>>>  "b" : {"type" : "string"},
>>>  "c" : {"type" : "integer"},
>>>  "txt" : {"type" : "string", "null_value" : "na"}
>>>  }
>>> }
>>> }'
>>>
>>> Lots of abc documents are indexed into the mytest index, among others this one:
>>> curl -XPUT "localhost:9200/mytest/abc/1234_5678_90123?routing=1234_5678_90123" -d '{
>>> "sms" :
>>> {
>>>  "a" : "1234",
>>>  "b" : "5678",
>>>  "c" : 90123,
>>>  "txt" : "Hello World"
>>> }
>>> }'
>>>
>>> I expect this "get" will be very efficient:
>>> curl -XGET "http://localhost:9200/mytest/abc/1234_5678_90123?routing=1234_5678_90123"
>>>
>>> I have cheated a little in the code above, when I indicate that I can 
>>> make an id consisting of the values of a, b and c. It is only almost true - 
>>> sometimes (but very very seldom) there will be documents with the same 
>>> values for a, b and c. Therefore I cannot make id's like this (will have to 
>>> make a_b_c_X id's or just GUID id's instead), and therefore I cannot "find" 
>>> the document(s) using the "get" above.
>>>
>>> Question: If I know that there will never be more than a few documents 
>>> with concrete values for a, b and c, can I create a "search" finding those 
>>> documents, a search that is just (or almost) as efficient (with respect to 
>>> searchtime and resources used) as the "get" above? Note that I am using 
>>> routing so I should at least be able to hit the right shard in such a 
>>> search.
>>>
>>> In an RDBMS I would make a combined index of a, b and c and use the query 
>>> "select * from abc where a='1234' and b='5678' and c=90123" (the "search") 
>>> instead of "select * from abc where id='1234_5678_90123'" (the "get"), and 
>>> that would be just as efficient (if the RDBMS uses the combined index, or 
>>> else I will force it by hinting).
>>>
>>> Thanks!
>>>
>>> Regards, Per Steffensen
>>>
>>>  
>>   
>>  
>



Re: Corruption error after upgrade to 1.0

2014-05-12 Thread thale jacobs
Did you ever get this resolved and if so, how was it resolved?  I am 
experiencing the same issue...  

On Monday, February 17, 2014 4:25:00 PM UTC-5, Mo wrote:
>
> After upgrading to 1.0 I am unable to index any documents. I get the 
> following error. Could somebody help?
>  
>
> [Aardwolf] Message not fully read (response) for [0] handler 
> future(org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler$1@5c6e3b4c),
>  
> error [true], resetting
>
> [Aardwolf] failed to get node info for 
> [#transport#-1][inet[/10.80.140.59:9300]], disconnecting...
>
> org.elasticsearch.transport.RemoteTransportException: Failed to 
> deserialize exception response from stream
>
> Caused by: org.elasticsearch.transport.TransportSerializationException: 
> Failed to deserialize exception response from stream
>
> at 
> org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:168)
>
> at 
> org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:122)
>
> at 
> org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>
> at 
> org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>
> at 
> org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>
> at 
> org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>
> at 
> org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
>
> at 
> org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
>
> at 
> org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>
> at 
> org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>
> at 
> org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>
> at 
> org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>
> at 
> org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>
> at 
> org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>
> at 
> org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>
> at 
> org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>
> at 
> org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>
> at 
> org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>
> at 
> org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>
> at 
> org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>
> at 
> org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>
> at java.lang.Thread.run(Unknown Source)
>
> Caused by: java.io.StreamCorruptedException: unexpected end of block data
>
> at java.io.ObjectInputStream.readObject0(Unknown Source)
>
> at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>
> at java.io.ObjectInputStream.defaultReadObject(Unknown Source)
>
> at java.lang.Throwable.readObject(Throwable.java:913)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
> at java.lang.reflect.Method.invoke(Unknown Source)
>
> at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
>
> at java.io.ObjectInputStream.readSerialData(Unknown Source)
>
> at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>
> at java.io.ObjectInputStream.readObject0(Unknown Source)
>
> at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>  



Re: Exact phrase match - city names example

2014-02-27 Thread thale jacobs
I get the same results as you using your example. Thanks for posting 
it. I am not sure why my original example does not work, but that is for 
me to figure out! Thanks again.

On Thursday, June 14, 2012 2:02:28 PM UTC-4, Greg Silin wrote:
>
> Hi,
> One of our fields in the index stores city names, and we need to ensure 
> that the term is matched exactly.
>
> So if we have "san francisco" indexed, we need to ensure that *only* the 
> term "san francisco" matches; "san" or "francisco" or "south san francisco" 
> should all be misses.
>
> In particular, I don't have a solution on how to make sure "san francisco" 
> does not match against "south san francisco"
>
> Thanks
> -greg
>



Re: Exact phrase match - city names example

2014-02-27 Thread thale jacobs
Thanks for the reply Prashy - I tried performing a term query like you 
suggested; I get the same results (all documents containing "Main" are 
returned: E Main St, W Main St...). Do you only get one document returned 
using the example I provided above (doc id 9/"Main")?

On Thursday, February 27, 2014 2:25:09 AM UTC-5, Prashy wrote:
>
> Try using the term query, as the term query is not analyzed, so it might 
> search the exact term only. 
>
> { 
> "query" : { 
> "term" : { "street" : "xxx" } 
> } 
> } 
>
>
>
> -- 
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Exact-phrase-match-city-names-example-tp4019310p4050604.html
>  
> Sent from the ElasticSearch Users mailing list archive at Nabble.com. 
>



Re: Exact phrase match - city names example

2014-02-26 Thread thale jacobs
Thanks for the reply Binh Ly - I think the mappings in your example are 
almost like the example I posted, and I believe they are functionally 
equivalent. But my query against the not_analyzed field returns all the 
docs with the word "Main" in them. From the query side I also thought I 
could specify "analyzer" : "keyword", but I get the same results. But 
yes, you are correct that something seems off: the case of the search 
term does not seem to impact the results, so that is telling me a 
search analyzer is being used?
 

On Wednesday, February 26, 2014 5:12:02 PM UTC-5, Binh Ly wrote:

> Thale,
>
> Can you double check the mapping. Something seems off to me. Should be 
> something like this:
>
> {
>   "mappings": {
> "name": {
>   "properties": {
> "street": {
>   "type": "string",
>   "index" : "not_analyzed"
> }
>   }
> }
>   }
> }
>
> And don't forget, not_analyzed means case-sensitive matches, fyi. :)
>
> On Wednesday, February 26, 2014 4:51:40 PM UTC-5, thale jacobs wrote:
>>
>> I am having a similar problem too.  Here is how I set up the 
>> test index:
>>
>> Create the index:
>> curl -s -XPUT 'localhost:9200/test' -d '{
>> "mappings": {
>> "properties": {
>> "name": {
>> "street": {
>> "type": "string",
>> "index_analyzer": "not_analyzed",
>> "search_analyzer": "not_analyzed",
>> "index" : "not_analyzed"
>> }
>> }
>> }
>> }
>> }'
>>
>>
>>
>> Insert some data:
>> curl -s -XPUT 'localhost:9200/test/name/5' -d '{ "street": ["E Main 
>> St"]}'
>> curl -s -XPUT 'localhost:9200/test/name/6' -d '{ "street": ["W Main St"] 
>> }'
>> curl -s -XPUT 'localhost:9200/test/name/7' -d '{ "street": ["East Main 
>> Rd"] }'
>> curl -s -XPUT 'localhost:9200/test/name/8' -d '{ "street": ["West Main 
>> Rd"] }'
>> curl -s -XPUT 'localhost:9200/test/name/9' -d '{ "street": ["Main"] }'
>> curl -s -XPUT 'localhost:9200/test/name/10' -d '{ "street": ["Main St"] 
>> }'
>>
>>
>>
>>
>> --Now attempt to search for "Main"... Not "Main St", Not "East Main 
>> Rd"...I only want to return doc #9 - "Main"
>> curl -s -XGET 'localhost:9200/test/_search?pretty=true' -d '{
>>"query":{
>>   "bool":{
>>  "must":[
>> {
>>"match":{
>>   "street":{
>>  "query":"main",
>>  "type":"phrase",
>>  "analyzer" : "keyword"
>>   }
>>}
>> }
>>  ]
>>   }
>>}
>> }';
>>
>> The best document returned is "Main", but I don't know how to filter out 
>> the others that are not exact matches (although they contain matching 
>> terms).
>> ...
>> Here are the results from my example above:
>>   "_score" : 0.2876821, "_source" : { "street": ["Main"] }
>>   "_score" : 0.25316024, "_source" : { "street": ["East Main Rd"] }
>>   "_score" : 0.25316024, "_source" : { "street": ["W Main St"] }
>>   "_score" : 0.25316024, "_source" : { "street": ["E Main St"]}
>>   "_score" : 0.1805489, "_source" : { "street": ["Main St"] }
>>   "_score" : 0.14638957, "_source" : { "street": ["West Main Rd"] }
>>
>>
>>
>>
>>
>> On Thursday, June 14, 2012 3:38:31 PM UTC-4, Colin Dellow wrote:
>>>
>>> Does "index": "not_analyzed" not work for you (
>>> http://www.elasticsearch.org/guide/reference/mapping/core-types.html) ?
>>>
>>>
>>> On Thursday, 14 June 2012 14:02:28 UTC-4, Greg Silin wrote:
>>>>
>>>> Hi,
>>>> One of our fields in the index stores city names, and we need to ensure 
>>>> that the term is matched exactly.
>>>>
>>>> So if we have "san francisco" indexed, we need to ensure that *only* 
>>>> the term "san francisco" matches; "san" or "francisco" or "south san 
>>>> francisco" should all be misses.
>>>>
>>>> In particular, I don't have a solution on how to make sure "san 
>>>> francisco" does not match against "south san francisco"
>>>>
>>>> Thanks
>>>> -greg
>>>>
>>>



Re: Exact phrase match - city names example

2014-02-26 Thread thale jacobs
I am having a similar problem too.  Here is how I set up the 
test index:

Create the index:
curl -s -XPUT 'localhost:9200/test' -d '{
  "mappings": {
    "properties": {
      "name": {
        "street": {
          "type": "string",
          "index_analyzer": "not_analyzed",
          "search_analyzer": "not_analyzed",
          "index": "not_analyzed"
        }
      }
    }
  }
}'



Insert some data:
curl -s -XPUT 'localhost:9200/test/name/5' -d '{ "street": ["E Main St"] }'
curl -s -XPUT 'localhost:9200/test/name/6' -d '{ "street": ["W Main St"] }'
curl -s -XPUT 'localhost:9200/test/name/7' -d '{ "street": ["East Main Rd"] }'
curl -s -XPUT 'localhost:9200/test/name/8' -d '{ "street": ["West Main Rd"] }'
curl -s -XPUT 'localhost:9200/test/name/9' -d '{ "street": ["Main"] }'
curl -s -XPUT 'localhost:9200/test/name/10' -d '{ "street": ["Main St"] }'




--Now attempt to search for "Main"... Not "Main St", Not "East Main Rd"...I 
only want to return doc #9 - "Main"
curl -s -XGET 'localhost:9200/test/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "street": {
              "query": "main",
              "type": "phrase",
              "analyzer": "keyword"
            }
          }
        }
      ]
    }
  }
}'

The best document returned is "Main", but I don't know how to filter out 
the others that are not exact matches (although they contain matching 
terms).
...
Here are the results from my example above:
  "_score" : 0.2876821, "_source" : { "street": ["Main"] }
  "_score" : 0.25316024, "_source" : { "street": ["East Main Rd"] }
  "_score" : 0.25316024, "_source" : { "street": ["W Main St"] }
  "_score" : 0.25316024, "_source" : { "street": ["E Main St"]}
  "_score" : 0.1805489, "_source" : { "street": ["Main St"] }
  "_score" : 0.14638957, "_source" : { "street": ["West Main Rd"] }
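
The result list above is consistent with the street field still being analyzed, i.e. split into lowercase words, rather than stored as a single untouched term. A minimal Python sketch of the difference follows. Here analyzed_terms is a rough stand-in for a standard-style analyzer (lowercase, split on whitespace), not Elasticsearch's implementation, and all function names are hypothetical.

```python
# Sample streets from the post, keyed by document id
STREETS = {
    5: "E Main St",
    6: "W Main St",
    7: "East Main Rd",
    8: "West Main Rd",
    9: "Main",
    10: "Main St",
}


def analyzed_terms(value):
    """Approximate an analyzed field: lowercase, one term per word."""
    return value.lower().split()


def not_analyzed_terms(value):
    """Approximate a not_analyzed field: the whole value is a single term."""
    return [value]


def term_search(term, docs, to_terms):
    """Return ids of documents whose indexed terms contain the exact term."""
    return sorted(doc_id for doc_id, value in docs.items()
                  if term in to_terms(value))


if __name__ == "__main__":
    # Analyzed field: every street contains the word "main".
    print(term_search("main", STREETS, analyzed_terms))
    # Not analyzed: only the document whose whole value is "Main".
    print(term_search("Main", STREETS, not_analyzed_terms))
    # Not analyzed is case-sensitive, so lowercase "main" finds nothing.
    print(term_search("main", STREETS, not_analyzed_terms))
```

If the real index behaves like the first call, retrieving the mapping that Elasticsearch actually stored for the type and comparing it with the one that was sent is probably the quickest check.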





On Thursday, June 14, 2012 3:38:31 PM UTC-4, Colin Dellow wrote:
>
> Does "index": "not_analyzed" not work for you (
> http://www.elasticsearch.org/guide/reference/mapping/core-types.html) ?
>
>
> On Thursday, 14 June 2012 14:02:28 UTC-4, Greg Silin wrote:
>>
>> Hi,
>> One of our fields in the index stores city names, and we need to ensure 
>> that the term is matched exactly.
>>
>> So if we have "san francisco" indexed, we need to ensure that *only* the 
>> term "san francisco" matches; "san" or "francisco" or "south san francisco" 
>> should all be misses.
>>
>> In particular, I don't have a solution on how to make sure "san 
>> francisco" does not match against "south san francisco"
>>
>> Thanks
>> -greg
>>
>



Re: Best way to search/index the data - with and without whitespace

2014-02-06 Thread thale jacobs
This is how I set up the mappings:

curl -s -XPUT 'localhost:9200/test' -d '{
  "mappings": {
    "properties": {
      "name": {
        "street": {
          "type": "string",
          "index_analyzer": "index_ngram",
          "search_analyzer": "search_ngram"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "desc_ngram": {
          "type": "edgeNGram",
          "min_gram": 3,
          "max_gram": 20
        }
      },
      "analyzer": {
        "index_ngram": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "desc_ngram", "lowercase" ]
        },
        "search_ngram": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      }
    }
  }
}'


This is how I built the index:

curl -s -XPUT 'localhost:9200/test/name/1' -d '{ "street": "Lakeshore Dr" }'
curl -s -XPUT 'localhost:9200/test/name/2' -d '{ "street": "Sunnyshore Dr" }'
curl -s -XPUT 'localhost:9200/test/name/3' -d '{ "street": "Lake View Dr" }'
curl -s -XPUT 'localhost:9200/test/name/4' -d '{ "street": "Shore Dr" }'

If a user attempts to search  for "Lake Shore Dr", I want to only match to 
document 1/"Lakeshore Dr"
If a user attempts to search for "Lakeview Dr", I want to only match to 
document 3/"Lake View Dr"

Here is an example of the query that is not working correctly:

curl -s -XGET 'localhost:9200/test/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "street": {
              "query": "lake shore dr",
              "type": "boolean"
            }
          }
        }
      ]
    }
  }
}'


So is the issue with how I am setting up the mappings (tokenizer? edge 
ngrams vs ngrams? size of ngrams?) or with the query? I have tried things 
like setting minimum_should_match and the analyzer to use, but I have not 
been able to get the desired results.
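
To reason about which of these is at fault, the analysis chain in the settings above can be approximated locally. The sketch assumes the index_ngram and search_ngram analyzers are actually applied to the street field and imitates them in plain Python (keyword tokenizer, edge n-grams of 3 to 20 characters, lowercase). The function names are hypothetical and the real filters may differ in detail.

```python
STREETS = {
    1: "Lakeshore Dr",
    2: "Sunnyshore Dr",
    3: "Lake View Dr",
    4: "Shore Dr",
}


def edge_ngrams(text, min_gram=3, max_gram=20):
    """Prefixes of the whole value, from min_gram up to max_gram characters."""
    longest = min(max_gram, len(text))
    return [text[:size] for size in range(min_gram, longest + 1)]


def index_terms(value):
    """Imitate index_ngram: keyword tokenizer, edge n-grams, then lowercase."""
    return {gram.lower() for gram in edge_ngrams(value)}


def search_term(value):
    """Imitate search_ngram: keyword tokenizer plus lowercase, one single term."""
    return value.lower()


def matches(query, docs=STREETS):
    """Ids of documents whose indexed prefixes contain the search term."""
    term = search_term(query)
    return sorted(doc_id for doc_id, street in docs.items()
                  if term in index_terms(street))


if __name__ == "__main__":
    print(matches("lake shore dr"))  # the query from the post
    print(matches("lakeshore dr"))
    print(matches("lake"))
```

In this model the search analyzer turns "lake shore dr" into one term that still contains the space, and that term is not a prefix of any indexed street, so nothing matches regardless of the n-gram sizes. A prefix such as "lake" matches both "Lakeshore Dr" and "Lake View Dr".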

Thanks all.







On Thursday, February 6, 2014 10:16:40 AM UTC-5, Binh Ly wrote:
>
> Thale, you are correct - ngrams are usually used at index-time only, but 
> given your case and requirements, you might want to experiment with both 
> index and search time. I'd probably just increase the edge min ngram size to 
> something reasonable like maybe 4(?) and see if that works or not.
>



Re: Best way to search/index the data - with and without whitespace

2014-02-06 Thread thale jacobs
Hello Binh Ly - Thanks for the reply.  I thought I had read that ngram 
searching should only be used at either index time or search time, but not 
both...  Is that not the case?  Thanks again.  Thale

On Wednesday, January 29, 2014 6:49:10 PM UTC-5, Binh Ly wrote:
>
> Thale, I would try edge ngrams (both index and search) and see how that 
> works. I don't see why it wouldn't work for your 2 cases - just make your 
> queries into match queries and use the "AND" operator. Good luck!
>



Best way to search/index the data - with and without whitespace

2014-01-29 Thread thale jacobs
Hello - I am having a problem indexing and searching for words that may or 
may not contain whitespace. Below is an example.

Here is how the index is created:

curl -s -XPUT 'localhost:9200/test/name/1' -d '{ "street": "Lakeshore Dr" }'
curl -s -XPUT 'localhost:9200/test/name/2' -d '{ "street": "Sunnyshore Dr" }'
curl -s -XPUT 'localhost:9200/test/name/3' -d '{ "street": "Lake View Dr" }'
curl -s -XPUT 'localhost:9200/test/name/4' -d '{ "street": "Shore Dr" }'


If I want to query for record 1/"Lakeshore Dr", I can use the following 
query:

curl -s -XGET 'localhost:9200/test/name/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "street": {
              "query": "lakeshore dr",
              "type": "phrase"
            }
          }
        }
      ]
    }
  }
}'

This returns the desired result of document id 1.  But if a user searches 
for "Lake Shore Dr" (a space between Lake and Shore), it is still desired 
to return document id 1.

And the inverse of this problem is if a user searches for "Lakeview Dr" 
(but indexed as "Lake View Dr"):
curl -s -XGET 'localhost:9200/test/name/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "street": {
              "query": "lakeview dr",
              "type": "phrase"
            }
          }
        }
      ]
    }
  }
}'

The search matches no documents.  If the search is changed to a 
boolean search instead of a phrase search, many docs will match on "dr", 
but doc #3, "Lake View Dr", is not necessarily returned as the top match.


NGrams at index time??  Ngrams at search time??  Remove whitespace at index 
time/search time??

Any suggestions would be appreciated.  Thanks.
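
The last idea, removing whitespace at both index time and search time, can be checked against the four sample streets before touching any mapping. The Python sketch below is only a local model of that normalization (lowercase, drop all whitespace, compare whole values). The function names are hypothetical, and in Elasticsearch the equivalent would have to be configured as a custom analyzer, which is not shown here.

```python
import re

STREETS = {
    1: "Lakeshore Dr",
    2: "Sunnyshore Dr",
    3: "Lake View Dr",
    4: "Shore Dr",
}


def squash(value):
    """Normalize a street name: remove all whitespace and lowercase it."""
    return re.sub(r"\s+", "", value).lower()


def find(query, docs=STREETS):
    """Ids of streets whose normalized form equals the normalized query."""
    key = squash(query)
    return sorted(doc_id for doc_id, street in docs.items()
                  if squash(street) == key)


if __name__ == "__main__":
    print(find("Lake Shore Dr"))  # stored as "Lakeshore Dr"
    print(find("Lakeview Dr"))    # stored as "Lake View Dr"
    print(find("Shore Dr"))
```

Because the comparison is on the whole normalized value, "Shore Dr" does not pull in "Lakeshore Dr" or "Sunnyshore Dr". The trade-off in this model is that partial input no longer matches anything, so it would likely need to be combined with some prefix or n-gram handling if partial searches matter.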







Re: 0.90.9 in elasticsearch - indexing with ngrams, search returns 0 results

2014-01-16 Thread thale jacobs
Actually, I take back what I said about it not working.  The example you 
posted up on gist is working.  Thanks!...I just need to figure out the 
difference between the original example I posted up which is not working 
and your working post.  Thanks again.  
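
When comparing the two setups, it can help to know what the analyzers are expected to produce. Assuming the index_ngram and search_ngram analyzers from the mapping quoted below are actually applied to the desc field, the Python sketch here imitates them (whole description as one token, all substrings of 3 to 8 characters, lowercase) and checks the three 8-character phrases from the search. The names are hypothetical and this is a local approximation, not Elasticsearch code.

```python
HOMES = {
    1: "This is a lovely home with a large yard.",
    2: "This large fixer-upper has a gravelly yard.",
    3: "A colonial mansion with a large yard and a pool.",
}


def ngrams(text, min_gram=3, max_gram=8):
    """All substrings of the text with min_gram to max_gram characters."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


def index_terms(desc):
    """Imitate index_ngram: keyword tokenizer, ngram filter, then lowercase."""
    return {gram.lower() for gram in ngrams(desc)}


def must_all(queries, docs=HOMES):
    """Ids of documents whose n-grams contain every lowercased query string."""
    terms = [query.lower() for query in queries]
    return sorted(doc_id for doc_id, desc in docs.items()
                  if all(term in index_terms(desc) for term in terms))


if __name__ == "__main__":
    print(must_all(["large ya", "arge yar", "rge yard"]))
```

Each search phrase is exactly 8 characters long, the configured max_gram, so in this model it can appear among the indexed n-grams; a longer search string could not match at all. If a real index returns nothing for these phrases, one thing worth checking is whether the settings and mapping were really applied when the index was created.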

On Thursday, January 16, 2014 12:06:54 PM UTC-5, thale jacobs wrote:
>
> Hi David - Thanks for the reply.  I just tried it on 0.90.5, 0.90.9, 0.90.10, 
> and 1.0.0.RC1.  No results were returned in my searches.  I forgot to 
> include that I have ES running on Ubuntu, if that makes a difference.
>
> Thale
>
> On Thursday, January 16, 2014 10:14:20 AM UTC-5, David Pilato wrote:
>>
>> Hey!
>>
>> Just tested it with es 1.0.0.RC1 and it's working fine.
>> See https://gist.github.com/dadoonet/8456535
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet <https://twitter.com/dadoonet> | 
>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>
>>
>> On 16 January 2014 at 14:25:02, thale jacobs (thale...@gmail.com) wrote:
>>
>> Hello - I am attempting to run an ngram test using Elasticsearch 0.90.9.  I 
>> was able to replicate the problem I am having in our production system by 
>> following the example from here:
>>
>> http://blog.rnf.me/2013/exact-substring-search-in-elasticsearch.html
>>
>> The mappings look like this:
>>  
>> {
>> "mappings": {
>> "homes": {
>> "properties": {
>> "desc": {
>> "type": "string",
>> "index_analyzer": "index_ngram",
>> "search_analyzer": "search_ngram"
>> }
>> }
>> }
>> },
>> "settings": {
>> "analysis": {
>> "filter": {
>> "desc_ngram": {
>> "type": "ngram",
>> "min_gram": 3,
>> "max_gram": 8
>> }
>> },
>> "analyzer": {
>> "index_ngram": {
>> "type": "custom",
>> "tokenizer": "keyword",
>> "filter": [ "desc_ngram", "lowercase" ]
>> },
>> "search_ngram": {
>> "type": "custom",
>> "tokenizer": "keyword",
>> "filter": "lowercase"
>> }
>> }
>> }
>> }
>> }
>>
>> The mapping is set up as follows:
>>
>> curl -s -XPUT 'localhost:9200/listings' -d @settings.json
>>
>> Three records are inserted into the index as follows:
>>
>>
>> curl -s -XPUT 'localhost:9200/listings/homes/1' -d '{ "desc": "This is a lovely home with a large yard." }'
>> curl -s -XPUT 'localhost:9200/listings/homes/2' -d '{ "desc": "This large fixer-upper has a gravelly yard." }'
>> curl -s -XPUT 'localhost:9200/listings/homes/3' -d '{ "desc": "A colonial mansion with a large yard and a pool." }'
>>
>>  
>>
>> The Search looks like this:
>> curl -s -XGET 'localhost:9200/listings/homes/_search?pretty=true' -d '{
>>   "query": {
>>     "bool": {
>>       "must": [
>>         { "match": { "desc": { "query": "large ya", "type": "phrase" } } },
>>         { "match": { "desc": { "query": "arge yar", "type": "phrase" } } },
>>         { "match": { "desc": { "query": "rge yard", "type": "phrase" } } }
>>       ]
>>     }
>>   }
>> }'
>>
>>  
>>
>> The problem is no results are returned.  Any input that can be given for 
>> a solution would be appreciated.
>>
>> Thank you!
>>
>>
>>
>>
>>
>>
>>  
>>
>>
>>



Re: 0.90.9 in elasticsearch - indexing with ngrams, search returns 0 results

2014-01-16 Thread thale jacobs
Hi David - Thanks for the reply.  I just tried it on 0.90.5, 0.90.9, 0.90.10, and 
1.0.0.RC1.  No results were returned in my searches.  I forgot to include 
that I have ES running on Ubuntu, if that makes a difference.

Thale

On Thursday, January 16, 2014 10:14:20 AM UTC-5, David Pilato wrote:
>
> Hey!
>
> Just tested it with es 1.0.0.RC1 and it's working fine.
> See https://gist.github.com/dadoonet/8456535
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> On 16 January 2014 at 14:25:02, thale jacobs (thale...@gmail.com) wrote:
>
> Hello - I am attempting to run a ngram test using elastics 0.90.9.  I was 
> able to replicate the problem I am having in our production system by 
> following the example from here:
>
> http://blog.rnf.me/2013/exact-substring-search-in-elasticsearch.html
>
> The Mappings looks like this:
>  
> {
> "mappings": {
> "homes": {
> "properties": {
> "desc": {
> "type": "string",
> "index_analyzer": "index_ngram",
> "search_analyzer": "search_ngram"
> }
> }
> }
> },
> "settings": {
> "analysis": {
> "filter": {
> "desc_ngram": {
> "type": "ngram",
> "min_gram": 3,
> "max_gram": 8
> }
> },
> "analyzer": {
> "index_ngram": {
> "type": "custom",
> "tokenizer": "keyword",
> "filter": [ "desc_ngram", "lowercase" ]
> },
> "search_ngram": {
> "type": "custom",
> "tokenizer": "keyword",
> "filter": "lowercase"
> }
> }
> }
> }
> }
>
> The mapping is set up as follows:
>
> curl -s -XPUT 'localhost:9200/listings' -d @settings.json
>
> Three records are inserted into the index as follows:
>
>
>  curl -s -XPUT 'localhost:9200/listings/homes/1' -d '{ "desc": "This is a 
> lovely home with a large yard." }'
> curl -s -XPUT 'localhost:9200/listings/homes/2' -d '{ "desc": "This large 
> fixer-upper has a gravelly yard." }'
> curl -s -XPUT 'localhost:9200/listings/homes/3' -d '{ "desc": "A colonial 
> mansion with a large yard and a pool." }'
>
>  
>
> The Search looks like this:
>  curl -s -XGET 'localhost:9200/listings/homes/_search?pretty=true' -d '{ 
> "query" : { "bool" : { "must" : [ { "match" : { "desc" : { "query" : "large 
> ya", "type" : "phrase" } } }, { "match" : { "desc" : { "query" : "arge 
> yar", "type" : "phrase" } } }, { "match" : { "desc" : { "query" : "rge 
> yard", "type" : "phrase" } } } ] } } }';
>
>  
>
> The problem is no results are returned.  Any input that can be given for a 
> solution would be appreciated.
>
> Thank you!
>
>
>
>
>
>
>  
>



0.90.9 in elasticsearch - indexing with ngrams, search returns 0 results

2014-01-16 Thread thale jacobs
Hello - I am attempting to run an ngram test using Elasticsearch 0.90.9.  I 
was able to replicate the problem I am having in our production system by 
following the example from here:

http://blog.rnf.me/2013/exact-substring-search-in-elasticsearch.html

The mappings and settings look like this:

{
  "mappings": {
    "homes": {
      "properties": {
        "desc": {
          "type": "string",
          "index_analyzer": "index_ngram",
          "search_analyzer": "search_ngram"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "desc_ngram": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 8
        }
      },
      "analyzer": {
        "index_ngram": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "desc_ngram", "lowercase" ]
        },
        "search_ngram": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      }
    }
  }
}
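
For reference, here is a rough offline sketch in Python of what I expect these two analyzers to produce. It is only a simplified model and not how Elasticsearch or Lucene implement it: it ignores token positions, scoring, and the order in which grams are emitted, and the function names are made up for illustration.

```python
def keyword_tokenize(text):
    # The keyword tokenizer emits the whole input as one single token.
    return [text]


def ngram_filter(tokens, min_gram=3, max_gram=8):
    # Emit every substring of each token with a length in [min_gram, max_gram].
    grams = []
    for token in tokens:
        for size in range(min_gram, max_gram + 1):
            for start in range(len(token) - size + 1):
                grams.append(token[start:start + size])
    return grams


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def index_ngram(text):
    # Model of the "index_ngram" analyzer: keyword -> desc_ngram -> lowercase.
    return set(lowercase_filter(ngram_filter(keyword_tokenize(text))))


def search_ngram(text):
    # Model of the "search_ngram" analyzer: keyword -> lowercase.
    return lowercase_filter(keyword_tokenize(text))


def matching_docs(docs, chunks):
    # A document matches when every analyzed chunk is one of its indexed grams.
    hits = []
    for doc_id, desc in docs.items():
        indexed = index_ngram(desc)
        if all(search_ngram(chunk)[0] in indexed for chunk in chunks):
            hits.append(doc_id)
    return hits


docs = {
    1: "This is a lovely home with a large yard.",
    2: "This large fixer-upper has a gravelly yard.",
    3: "A colonial mansion with a large yard and a pool.",
}
chunks = ["large ya", "arge yar", "rge yard"]

print(matching_docs(docs, chunks))
```

Under this simplified model, documents 1 and 3 contain all three chunks and document 2 does not, which is the result I was hoping to get from the real index.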

The index is created with these mappings and settings as follows:

curl -s -XPUT 'localhost:9200/listings' -d @settings.json
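
One sanity check that can be run on the file before sending it: a small Python sketch that parses the JSON and lists any analyzer a field refers to that is not defined under settings.analysis.analyzer. The helper name is hypothetical and the JSON is inlined here only so the snippet runs on its own. It would also flag built-in analyzers such as "standard", so treat it as a hint rather than a validator.

```python
import json

# Inlined copy of settings.json so the sketch is self-contained.
SETTINGS_JSON = """
{
  "mappings": {"homes": {"properties": {"desc": {
    "type": "string",
    "index_analyzer": "index_ngram",
    "search_analyzer": "search_ngram"}}}},
  "settings": {"analysis": {
    "filter": {"desc_ngram": {"type": "ngram", "min_gram": 3, "max_gram": 8}},
    "analyzer": {
      "index_ngram": {"type": "custom", "tokenizer": "keyword",
                      "filter": ["desc_ngram", "lowercase"]},
      "search_ngram": {"type": "custom", "tokenizer": "keyword",
                       "filter": "lowercase"}}}}
}
"""


def undefined_analyzers(body):
    # Collect the custom analyzer names defined in the settings section.
    defined = set(body.get("settings", {}).get("analysis", {}).get("analyzer", {}))
    missing = []
    for type_name, mapping in body.get("mappings", {}).items():
        for field, props in mapping.get("properties", {}).items():
            for key in ("analyzer", "index_analyzer", "search_analyzer"):
                name = props.get(key)
                # Note: built-in analyzers are not in "defined" and get flagged too.
                if name is not None and name not in defined:
                    missing.append((type_name, field, key, name))
    return missing


body = json.loads(SETTINGS_JSON)  # raises ValueError if the JSON is malformed
print(undefined_analyzers(body))
```

An empty list means every analyzer referenced by the mapping is defined in the same file.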

Three records are inserted into the index as follows:

curl -s -XPUT 'localhost:9200/listings/homes/1' -d '{ "desc": "This is a lovely home with a large yard." }'
curl -s -XPUT 'localhost:9200/listings/homes/2' -d '{ "desc": "This large fixer-upper has a gravelly yard." }'
curl -s -XPUT 'localhost:9200/listings/homes/3' -d '{ "desc": "A colonial mansion with a large yard and a pool." }'



The search looks like this:

curl -s -XGET 'localhost:9200/listings/homes/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "desc": { "query": "large ya", "type": "phrase" } } },
        { "match": { "desc": { "query": "arge yar", "type": "phrase" } } },
        { "match": { "desc": { "query": "rge yard", "type": "phrase" } } }
      ]
    }
  }
}'
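
The three phrases in that query are the 8-character sliding windows over "large yard", since max_gram is 8 and a longer search string cannot match a single indexed gram. A minimal Python sketch of how such a query body could be generated for any search string is below. The function names and the default field are assumptions for illustration, not part of any Elasticsearch client.

```python
import json


def sliding_chunks(text, size=8):
    # Strings no longer than one gram are searched as-is.
    if len(text) <= size:
        return [text]
    # Otherwise take every window of exactly `size` characters.
    return [text[i:i + size] for i in range(len(text) - size + 1)]


def build_query(text, field="desc", size=8):
    # One phrase-type match clause per window; all of them must match.
    clauses = [
        {"match": {field: {"query": chunk, "type": "phrase"}}}
        for chunk in sliding_chunks(text, size)
    ]
    return {"query": {"bool": {"must": clauses}}}


print(json.dumps(build_query("large yard"), indent=2))
```

For "large yard" this yields the same three clauses as the hand-written query, so the output can be passed to curl with -d.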



The problem is that no results are returned.  Any suggestions for a solution 
would be appreciated.

Thank you!







