enabling filter cache

2015-04-22 Thread Ed Kim
Hi, I have a dynamic query built via the Java API that assembles a filtered 
query depending on the parameter input. I have about a dozen filters 
(mostly term filters) that may or may not be used, and I had a couple of 
questions:

1. Is it OK to simply set the parent boolFilterBuilder cache setting to 
true, or do I need to set cache=true for each filter?
2. Would it be good practice to execute a dummy query with all the 
filters to preemptively warm the filter cache before it's released for actual 
use? (A rough sketch of both caching options follows below.)
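
For illustration, here is a rough sketch of the two approaches against the 1.x
Java API (the client, index, and field names are hypothetical):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.BoolFilterBuilder;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;

// Option A: cache each leaf filter individually
BoolFilterBuilder perFilter = FilterBuilders.boolFilter()
    .must(FilterBuilders.termFilter("status", "active").cache(true))
    .must(FilterBuilders.termFilter("region", "us-west").cache(true));

// Option B: cache only at the parent bool filter level
BoolFilterBuilder parentOnly = FilterBuilders.boolFilter()
    .must(FilterBuilders.termFilter("status", "active"))
    .must(FilterBuilders.termFilter("region", "us-west"))
    .cache(true);

// 'client' is assumed to be an existing Client instance
SearchResponse resp = client.prepareSearch("my_index")
    .setQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), parentOnly))
    .execute().actionGet();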

Thanks in advance for your time



Re: real time match analysis

2015-01-15 Thread Ed Kim
I was able to identify which field matched via explain, but couldn't see 
any information on which token filter was the reason for the match. I've 
tried specifying the analyzer name that the field uses as well as not 
specifying. If the explain is supposed to provide this data, I will give it 
another go and set up a test index with simpler analyzer setups.

Also, in order to do this, I will need to run the explain separate from the 
search itself. My ultimate goal is to be able to do this within 
milliseconds (less than 10). Is this feasible with explain?
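
For what it's worth, a minimal sketch of requesting explain output from the
Java API (the client, index, and field names are hypothetical):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

// 'client' is assumed to be an existing Client instance
SearchResponse resp = client.prepareSearch("my_index")
    .setQuery(QueryBuilders.matchQuery("title", "quick brown fox"))
    .setExplain(true)  // attach a Lucene explanation tree to every hit
    .execute().actionGet();

for (SearchHit hit : resp.getHits()) {
    // the explanation shows which terms and fields contributed to the score,
    // though not which token filter produced a given matching term
    System.out.println(hit.getExplanation());
}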

On Wednesday, January 14, 2015 at 12:51:15 PM UTC-8, Nikolas Everett wrote:
>
> What about explain?
>
> On Wed, Jan 14, 2015 at 3:24 PM, Ed Kim wrote:
>
>> Just a friendly bump to see if anyone has any feedback. :)
>>
>>
>> On Saturday, January 10, 2015 at 10:38:34 PM UTC-8, Ed Kim wrote:
>>>
>>> Hello all, I was wondering if anyone could offer some feedback on 
>>> whether there is a way to determine how a document matched in real time. I 
>>> currently use custom analyzers at index time to allow a broad array of 
>>> matches for a given text field. I try to match based on phrases, synonyms, 
>>> substrings, stemming, etc. of a given phrase, and I would like to be able to 
>>> figure out at search time which analyzer was responsible for causing the 
>>> match. 
>>>
>>> Currently, I've gotten around this by creating child documents where the 
>>> fields are fanned out to their respective analyzer types. So I have a child 
>>> document where the field only applies stemming, another that uses only 
>>> synonyms, etc. However, due to the growing number of fields that require 
>>> analysis and the growth of my data set, I'd much prefer to have fewer 
>>> documents (and less complex ones, too). I was hoping there would be a way to tag 
>>> tokens at the analysis phase that could be used at the search phase to 
>>> quickly determine my match level, but I was not able to find anything like 
>>> this.
>>>
>>> Having said that, has anyone else ever tried to figure this out, or have 
>>> any thoughts on how to leverage ES at a lower level to determine match? 
>>>



Re: real time match analysis

2015-01-14 Thread Ed Kim
Just a friendly bump to see if anyone has any feedback. :)

On Saturday, January 10, 2015 at 10:38:34 PM UTC-8, Ed Kim wrote:
>
> Hello all, I was wondering if anyone could offer some feedback on whether 
> there is a way to determine how a document matched in real time. I 
> currently use custom analyzers at index time to allow a broad array of 
> matches for a given text field. I try to match based on phrases, synonyms, 
> substrings, stemming, etc. of a given phrase, and I would like to be able to 
> figure out at search time which analyzer was responsible for causing the 
> match. 
>
> Currently, I've gotten around this by creating child documents where the 
> fields are fanned out to their respective analyzer types. So I have a child 
> document where the field only applies stemming, another that uses only 
> synonyms, etc. However, due to the growing number of fields that require 
> analysis and the growth of my data set, I'd much prefer to have fewer 
> documents (and less complex ones, too). I was hoping there would be a way to tag 
> tokens at the analysis phase that could be used at the search phase to 
> quickly determine my match level, but I was not able to find anything like 
> this.
>
> Having said that, has anyone else ever tried to figure this out, or have 
> any thoughts on how to leverage ES at a lower level to determine match? 
>



Re: scalability questions

2015-01-14 Thread Ed Kim
The shard identification/routing is completely arbitrary. For instance, 
users whose usernames start with A-F can be routed to shard 1, G-M to shard 
2, etc. So you can imagine, the data for users Ed, Cindy, and David can live in 
shard 1, while user Greg will have his data in shard 2.
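
For reference, a rough sketch of user-based routing with the 1.x Java API
(note that ES actually routes by hashing the routing value, so the A-F
grouping above is just an illustration; index, type, and field names here are
hypothetical):

import org.elasticsearch.index.query.QueryBuilders;

// 'client' is assumed to be an existing Client instance
String username = "greg";

// all documents indexed with the same routing value land on the same shard
client.prepareIndex("users", "activity")
    .setRouting(username)
    .setSource("user", username, "action", "login")
    .execute().actionGet();

// searches with the same routing value only touch that one shard
client.prepareSearch("users")
    .setRouting(username)
    .setQuery(QueryBuilders.termQuery("user", username))
    .execute().actionGet();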

On Wednesday, January 14, 2015 at 12:14:50 PM UTC-8, Cindy wrote:
>
> Hi David,
>
>  
>
> The documentation you pointed out is exactly what I am looking for. It is 
> really helpful and demonstrates the uniqueness of Elasticsearch on 
> scalability :-)
>
>  
>
> I like the tips in "faking index per user with aliases" very much, but 
> since it basically routes the request to a single shard, I just want to 
> double check with you whether multiple users can share the same shard. 
>
> Thanks,
> Cindy
>
>
> On Wednesday, 14 January 2015 06:23:07 UTC-5, David Pilato wrote:
>>
>> I think I would start reading this: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/kagillion-shards.html
>> This 
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/user-based.html
>> and this 
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/faking-it.html
>>
>> Actually the full chapter: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scale.html
>>  :)
>>
>> HTH
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet | @elasticsearchfr | @scrutmydocs
>>
>>
>>  
>> On 14 Jan 2015, at 02:04, 'Cindy' via elasticsearch <
>> elasti...@googlegroups.com> wrote:
>>
>> Hello
>>
>> We are using another search engine and are considering moving to 
>> Elasticsearch. After doing quite a lot of reading, I am still not quite sure 
>> what the optimal approach would be in our case, especially after I read that 
>> the number of shards can NOT be changed once the index is created.
>>
>>  
>> In our situation, our product is hosted in a cloud environment and has a 
>> rapidly growing number of users, and each user is given a varying amount of disk 
>> space (several gigabytes to hundreds of gigabytes) to import their datasets. We 
>> index these datasets with a fixed number of fields and the fields are all the 
>> same for certain purposes. Each user can only search their own imported 
>> datasets for security reasons (segregated). So there is no need to query 
>> against the entire index, and query time is much more important than 
>> indexing time. Our current query time is about 10 to 40 ms.
>>
>>  
>> It is crucial for us to be able to scale out horizontally and smoothly.
>>
>>  
>> If everything is added into one index with one type, I worry that 
>> indexing/search will get slower and slower as the size of 
>> the indices grows. 
>>
>>  
>> So I plan to split the indices to speed up queries, and here are some 
>> options:
>>
>>1. Use one index and create a type for each user, such that the query 
>>from one user goes directly against his own type. But since the number of 
>>users can be over a million, can Elasticsearch handle a million 
>>types in one index? 
>>2. Group users into different indices, such that indexing/queries can 
>>be dispatched to different indices, giving a smaller index to query 
>>from. But this means our application has to handle the complexity of 
>>horizontal scale-out. 
>>
>>  
>> Is either option doable? Which option would you recommend?
>>
>>  
>> Besides, could you please tell me how many shards one index should have 
>> as a best practice? Do too many shards also incur a performance hit?
>>
>> Many thanks,
>> Cindy
>>



Re: How to get all docs in a family ?

2015-01-14 Thread Ed Kim
I don't know how you access Elasticsearch, so the best I can do is 
point you to the resources on how to manipulate the payload returned for your 
request:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html

Both guides will show you how to exclude/include fields from _source, which 
you can use to get only the fields you want. Afterwards, check whatever 
client you are using to see if it supports field inclusion/exclusion. For 
example, there's a 'setFetchSource' method that allows you to set 
exclusions in the Java client API for ES.
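
For example, a small sketch with the Java API's setFetchSource (the client,
index, and field names are hypothetical):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;

// 'client' is assumed to be an existing Client instance
SearchResponse resp = client.prepareSearch("my_index")
    .setQuery(QueryBuilders.matchAllQuery())
    .setFetchSource(new String[] { "title", "summary" },  // fields to include
                    new String[] { "body" })              // fields to exclude
    .execute().actionGet();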


On Wednesday, January 14, 2015 at 12:04:56 PM UTC-8, buddarapu nagaraju 
wrote:
>
> Okay, I am new to Elasticsearch; please help me understand.
>
> So you mean that with nested objects, you can return only the inner 
> objects if required, or return the entire object?
>
> If we can do this with nested objects, can you send me any sample code or 
> an example?
>
> My application's relations deal only with one parent and multiple children, 
> and a child can have its own children, and so on.
>
> Do you recommend using nested objects for this kind of document relation?
>
>
> Actually, I am now thinking that I can have a fake document type which holds 
> the documents in a family, so I don't have to bother about how deep the 
> relation hierarchy is. 
>
>
>
> Regards
> Nagaraju
> 908 517 6981
>
> On Wed, Jan 14, 2015 at 2:54 PM, Ed Kim wrote:
>
>> If the relationship is very simple, I don't see why not. We originally 
>> decided to denormalize part of the parent document because we wanted to 
>> minimize the payload of the returning documents. A little while later, we 
>> found out we could omit fields from the document payload, but at that 
>> point, we had a smoothly running app already, so we just noted this change 
>> in our research queue.
>>
>>
>> On Wednesday, January 14, 2015 at 11:39:42 AM UTC-8, buddarapu nagaraju 
>> wrote:
>>>
>>> My understanding is that nested objects (including nested filter and join) 
>>> also don't help in this regard; correct me if I am wrong. 
>>>
>>> On Wednesday, 14 January 2015 14:31:22 UTC-5, Ed Kim wrote:
>>>>
>>>> I'm not sure if you can fetch both parent/child. You can certainly try 
>>>> by querying against the entire index (and therefore querying against your 
>>>> parent/child types), but we've never had any success with this. What we 
>>>> did 
>>>> to get around this was to denormalize the child documents, and let the 
>>>> parent be used only for matching purposes.
>>>>
>>>> On Wednesday, January 14, 2015 at 7:36:38 AM UTC-8, buddarapu nagaraju 
>>>> wrote:
>>>>>
>>>>> If someone has some ideas, please let me know.
>>>>>
>>>>> On Wednesday, 14 January 2015 04:16:00 UTC-5, buddarapu nagaraju wrote:
>>>>>>
>>>>>> All I will know in the API request is a bool param indicating whether 
>>>>>> to get family docs or not, and the query that the user entered. So I need to 
>>>>>> construct one query that gets all family documents for the qualified 
>>>>>> documents if the bool param is true, and another query that just gets the 
>>>>>> qualified documents. 
>>>>>>
>>>>>> On Wednesday, 14 January 2015 01:51:47 UTC-5, buddarapu nagaraju 
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>> I have a question about document relations.
>>>>>>>
>>>>>>> One document can have multiple children, and now I have to support the 
>>>>>>> searches below and achieve the expected results.
>>>>>>>
>>>>>>> 1)
>>>>>>> search on any child documents (meaning a query that qualifies 
>>>>>>> child documents) and retrieve the child, the parent, and all of the children 
>>>>>>> of the parent
>>>>>>>
>>>>>>> 2)
>>>>>>>
>>>>>>> search on any parent documents and retrieve parent and all of the 
>>>>>>> child documents
>>>>>>>
>>>>>>> 3) search on any parent documents and just retrieve the qualified 
>>>>>>> parent doc

Re: How to count all occurrences in ES via search API call?

2015-01-14 Thread Ed Kim
Perhaps use nested documents for your tag list?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-nested-aggregation.html

I don't think you can count multiple occurrences within a single field, since a 
normal terms aggregation counts matching documents rather than individual term 
occurrences. By having nested (or parent/child) documents, you now have a document 
for each tag that you can aggregate against.
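
Roughly, assuming col2 were remodeled as a nested field of tag objects, the
aggregation could look like this (the mapping and all names here are
hypothetical):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.aggregations.AggregationBuilders;

// assumes documents shaped like: { "col1": "111", "col2": [ { "name": "tag1" }, ... ] }
// 'client' is assumed to be an existing Client instance
SearchResponse resp = client.prepareSearch("my_index")
    .setSize(0)  // we only care about the aggregation
    .addAggregation(
        AggregationBuilders.nested("tags").path("col2")
            .subAggregation(
                AggregationBuilders.terms("tag_counts").field("col2.name")))
    .execute().actionGet();
// each nested tag object is its own Lucene document, so tag1 appearing twice
// in one parent document counts twice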

On Wednesday, January 14, 2015 at 11:35:07 AM UTC-8, Bhumir Jhaveri wrote:
>
> Hey,
>
> I have data stored in ES(1.4.2) as 
> 1st document :
> {
> "col1":"123","col2":"tag1,tag2,tag4"
> }
> 2nd...
> {
> "col1":"333","col2":"tag1,tag4,tag5"
> }
> 3rd...
> {
> "col1":"111","col2":"tag1,tag1,tag5,tag5"
> }
>
> Now when I am searching via the search API call for tag1, 
> it returns a count of 3, whereas I am looking for 4, since the 3rd 
> document has tag1 two times. 
> I am generating the stats data via Kibana (3.1.2), so that is the client 
> that makes the call to ES.
>
> Is there an analyzer or tokenizer that I should be using while creating 
> the index?
> Right now I am not using any special tokenizer or anything while creating 
> the index; I am indexing both col1 and col2.
>
> Any workaround on this?
>
>
>



Re: How to get all docs in a family ?

2015-01-14 Thread Ed Kim
If the relationship is very simple, I don't see why not. We originally 
decided to denormalize part of the parent document because we wanted to 
minimize the payload of the returning documents. A little while later, we 
found out we could omit fields from the document payload, but at that 
point, we had a smoothly running app already, so we just noted this change 
in our research queue.

On Wednesday, January 14, 2015 at 11:39:42 AM UTC-8, buddarapu nagaraju 
wrote:
>
> My understanding is that nested objects (including nested filter and join) 
> also don't help in this regard; correct me if I am wrong. 
>
> On Wednesday, 14 January 2015 14:31:22 UTC-5, Ed Kim wrote:
>>
>> I'm not sure if you can fetch both parent/child. You can certainly try by 
>> querying against the entire index (and therefore querying against your 
>> parent/child types), but we've never had any success with this. What we did 
>> to get around this was to denormalize the child documents, and let the 
>> parent be used only for matching purposes.
>>
>> On Wednesday, January 14, 2015 at 7:36:38 AM UTC-8, buddarapu nagaraju 
>> wrote:
>>>
>>> If someone has some ideas, please let me know.
>>>
>>> On Wednesday, 14 January 2015 04:16:00 UTC-5, buddarapu nagaraju wrote:
>>>>
>>>> All I will know in the API request is a bool param indicating whether to 
>>>> get family docs or not, and the query that the user entered. So I need to 
>>>> construct one query that gets all family documents for the qualified documents 
>>>> if the bool param is true, and another query that just gets the qualified 
>>>> documents. 
>>>>
>>>> On Wednesday, 14 January 2015 01:51:47 UTC-5, buddarapu nagaraju wrote:
>>>>>
>>>>> Hi,
>>>>> I have a question about document relations.
>>>>>
>>>>> One document can have multiple children, and now I have to support the 
>>>>> searches below and achieve the expected results.
>>>>>
>>>>> 1)
>>>>> search on any child documents (meaning a query that qualifies 
>>>>> child documents) and retrieve the child, the parent, and all of the children 
>>>>> of the parent
>>>>>
>>>>> 2)
>>>>>
>>>>> search on any parent documents and retrieve parent and all of the 
>>>>> child documents
>>>>>
>>>>> 3) search on any parent documents and just retrieve the qualified 
>>>>> parent documents
>>>>>
>>>>> 4) search on any child documents and just retrieve the qualified 
>>>>> child documents
>>>>>
>>>>> Is there any existing feature that helps in achieving this? Any ideas 
>>>>> or thoughts would be very useful. 
>>>>> Also, please provide some sample code if possible.
>>>>>
>>>>



Re: How to get all docs in a family ?

2015-01-14 Thread Ed Kim
I'm not sure if you can fetch both parent/child. You can certainly try by 
querying against the entire index (and therefore querying against your 
parent/child types), but we've never had any success with this. What we did 
to get around this was to denormalize the child documents, and let the 
parent be used only for matching purposes.

On Wednesday, January 14, 2015 at 7:36:38 AM UTC-8, buddarapu nagaraju 
wrote:
>
> If someone has some ideas, please let me know.
>
> On Wednesday, 14 January 2015 04:16:00 UTC-5, buddarapu nagaraju wrote:
>>
>> All I will know in the API request is a bool param indicating whether to 
>> get family docs or not, and the query that the user entered. So I need to 
>> construct one query that gets all family documents for the qualified documents 
>> if the bool param is true, and another query that just gets the qualified 
>> documents. 
>>
>> On Wednesday, 14 January 2015 01:51:47 UTC-5, buddarapu nagaraju wrote:
>>>
>>> Hi,
>>> I have a question about document relations.
>>>
>>> One document can have multiple children, and now I have to support the 
>>> searches below and achieve the expected results.
>>>
>>> 1)
>>> search on any child documents (meaning a query that qualifies child 
>>> documents) and retrieve the child, the parent, and all of the children of the 
>>> parent
>>>
>>> 2)
>>>
>>> search on any parent documents and retrieve parent and all of the child 
>>> documents
>>>
>>> 3) search on any parent documents and just retrieve the qualified 
>>> parent documents
>>>
>>> 4) search on any child documents and just retrieve the qualified child 
>>> documents
>>>
>>> Is there any existing feature that helps in achieving this? Any ideas 
>>> or thoughts would be very useful. 
>>> Also, please provide some sample code if possible.
>>>
>>



Re: Recommended number of master eligible nodes in a cluster

2015-01-14 Thread Ed Kim
Three should be fine, and they don't even have to be beefy machines. 
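
For reference, a typical way to split the roles in elasticsearch.yml (these
are 1.x settings; the minimum_master_nodes value of 2 assumes exactly three
master-eligible nodes):

# on the three dedicated master-eligible nodes
node.master: true
node.data: false

# on all other (data) nodes
node.master: false
node.data: true

# on every node: quorum of master-eligible nodes, 3/2 + 1 = 2
discovery.zen.minimum_master_nodes: 2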

On Wednesday, January 14, 2015 at 1:39:47 AM UTC-8, Darshat Shah wrote:
>
> To clarify, the guideline I'm looking for is how many nodes to set apart 
> as dedicated master-eligible nodes in a cluster of size N. That is not 
> N/2+1.
>
> On Wednesday, January 14, 2015 at 2:24:29 PM UTC+5:30, Darshat Shah wrote:
>>
>> That's the minimum number. Is it recommended to have more, or will 
>> elections take longer if we have more than this minimum?
>>
>> On Wednesday, January 14, 2015 at 2:16:27 PM UTC+5:30, drjz wrote:
>>>
>>> The formula is N/2+1, so in your case that would be 46 if there are 90 
>>> nodes.
>>>
>>> /JZ
>>>
>>> On Wednesday, January 14, 2015 at 9:38:01 AM UTC+1, Darshat Shah wrote:

 Hi,
 What is the guideline on the recommended number of master-eligible nodes in 
 a cluster?

 We have a big cluster with 90+ nodes, currently all are eligible to be 
 masters. However I do see master election take very long after a cluster 
 restart - most nodes are still trying to ping the old master and I think 
 there is no way to force a re-election.

 Will re-election be faster if the number of master-eligible nodes is smaller? 
 Is there a rule of thumb for how many of N nodes should be configured to be 
 eligible to be masters?

 Thanks
 Darshat

>>>



Re: Filters vs. Queries (ES Version 1.4+)

2015-01-13 Thread Ed Kim
Filters work by caching bitsets to provide a fast lookup of documents 
that match the criteria. The query cache, by contrast, stores the actual 
hits returned within the node. For now, the query cache will not help with 
fetching documents (the docs say count/aggregation/suggestion requests only), 
and when/if it does come to store full search results, it is not optimized for 
frequent updates on the index (since any update invalidates the cached 
result). Filters, on the other hand, are updated aggressively by ES, so new 
documents get incrementally added to the bitset.

So to answer the question, they both have similar goals, but they 
accomplish them differently, so you will have to decide which 
implementation fits your needs better.
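
As a concrete point of comparison, the filter side of this looks roughly like
the following with the 1.x Java API (the client, index, and field names are
hypothetical):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;

// the bool filter below is cached as a bitset and kept up to date as the
// index changes; the scoring part of the query is still executed each time
// 'client' is assumed to be an existing Client instance
SearchResponse resp = client.prepareSearch("my_index")
    .setQuery(QueryBuilders.filteredQuery(
        QueryBuilders.matchQuery("title", "elasticsearch"),
        FilterBuilders.boolFilter()
            .must(FilterBuilders.termFilter("status", "published"))
            .cache(true)))
    .execute().actionGet();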

On Tuesday, January 13, 2015 at 1:49:06 PM UTC-8, AndrewK wrote:
>
> Hallo Adrien,
>
> Many thanks for the clarification. 
>
> But won't the query cache (1.4.0-Beta)
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-shard-query-cache.html
>
> partially "close the gap" in performance? The cache won't be as 
> lightweight as the filter cache (the key is the entire JSON query rather 
> than a bitset, and it operates at shard level), but it is a cache nonetheless.
>
> Or am I comparing two completely different things?
>
>



Re: How can I input data from java to ES in real time?

2015-01-13 Thread Ed Kim
Yea, since you mentioned real time, I thought you were concerned about data 
coming in 'today.' 

For older data, the only aspect of real time that applies is how soon you 
can search against ES once the data is loaded. ES supports near real time, so this 
isn't really an issue, and considering you have 10-year-old data, I had assumed 
it's OK if the older data isn't readily available until the 
entire data set is loaded, especially since the intent for these types of 
things is to get an overall view of all the data points.

Anyway, one thing I did in the past was to partition the historical data 
(in Oracle) and then index the data into ES in time intervals. For example, 
store one year's worth of data in a single index. Doing it this way, you 
will have 10 indices, and for query purposes, you can set aliases on the 
indices and control your query using aliases. For instance, if most of your 
queries for analysis are for 'the last year', you can query against the 
indices that contain data for 2014 and 2015. This way you don't query one 
giant index that can go back 10 years. Also, by breaking the data down into 
chunks, any indexing failure would not force me to re-index the entire thing, 
which could take a very long time.
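
A rough sketch of the alias part with the Java API (index and alias names are
made up):

import org.elasticsearch.index.query.QueryBuilders;

// one index per year: logs-2005 ... logs-2015
// 'client' is assumed to be an existing Client instance
client.admin().indices().prepareAliases()
    .addAlias("logs-2014", "logs-last-year")
    .addAlias("logs-2015", "logs-last-year")
    .execute().actionGet();

// recent-analysis queries hit only the two small indices behind the alias
client.prepareSearch("logs-last-year")
    .setQuery(QueryBuilders.matchAllQuery())
    .execute().actionGet();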

Best wishes on your project!

On Tuesday, January 13, 2015 at 1:07:31 PM UTC-8, Marian Valero wrote:
>
> Thanks Ed.
>
> Yes, but I have data from 10 years ago, and I have to input this data too, 
> to analyze those logs together with the data that is inserted every day. 
> There are millions of logs.
>
> 2015-01-13 16:22 GMT-04:30 Ed Kim:
>
>> Not sure if this is an option for you, but if an application is feeding 
>> that log data into oracle, you could consider having that application also 
>> index into ElasticSearch. 
>>
>> On Wednesday, January 7, 2015 at 5:46:18 AM UTC-8, Marian Valero wrote:
>>>
>>> I'm reading data from my Oracle database into Java with JDBC, but I want to 
>>> know how I can input the data that I'm getting into Elasticsearch in real 
>>> time.
>>>
>>> Thanks.
>>>



Re: How can I input data from java to ES in real time?

2015-01-13 Thread Ed Kim
Not sure if this is an option for you, but if an application is feeding 
that log data into Oracle, you could consider having that application also 
index into Elasticsearch. 
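
Something along these lines, assuming you control the application doing the
Oracle inserts (all names here are hypothetical):

import org.elasticsearch.client.Client;

// called right after the JDBC insert succeeds, so Oracle and ES stay in step;
// for higher volumes, batching via the bulk API would be the usual approach
void indexLogEntry(Client client, String rowId, String message, long timestamp) {
    client.prepareIndex("logs", "entry", rowId)
        .setSource("message", message, "timestamp", timestamp)
        .execute().actionGet();
}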

On Wednesday, January 7, 2015 at 5:46:18 AM UTC-8, Marian Valero wrote:
>
> I'm reading data from my Oracle database into Java with JDBC, but I want to 
> know how I can input the data that I'm getting into Elasticsearch in real 
> time.
>
> Thanks.
>



Re: Correct way to use TransportClient connection object

2015-01-13 Thread Ed Kim
Did you declare the listener in your web.xml?
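
For reference, the declaration would look something like this (the package
name here is a placeholder for wherever the class below actually lives):

<listener>
    <!-- package name is hypothetical -->
    <listener-class>com.example.ESClientFactory</listener-class>
</listener>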

On Tuesday, January 13, 2015 at 3:27:15 AM UTC-8, Subhadip Bagui wrote:
>
> Hi Jorg,
>
> Sorry to open the thread again. But the issue is that I'm currently getting an 
> OOM error in Tomcat and the web application is crashing. As you suggested, I'm 
> obtaining the static TransportClient instance from contextInitialized() and 
> shutting it down with contextDestroyed(). But it seems the methods are not 
> getting called. I am trying it as below.
> Can you please check?
>
> import javax.servlet.ServletContextEvent;
> import javax.servlet.ServletContextListener;
>
> import org.apache.log4j.Logger;
> import org.elasticsearch.client.Client;
> import org.elasticsearch.client.transport.TransportClient;
> import org.elasticsearch.common.settings.ImmutableSettings;
> import org.elasticsearch.common.transport.InetSocketTransportAddress;
>
> // MessageTranslator and LogImpl are application classes (not shown)
> public class ESClientFactory implements ServletContextListener {
>
>     /** The logger. */
>     private static Logger logger = Logger.getLogger(ESClientFactory.class);
>
>     /** The shared client instance. */
>     public static TransportClient instance;
>
>     // The container needs a public no-arg constructor to instantiate the
>     // listener; a private constructor prevents contextInitialized() and
>     // contextDestroyed() from ever being called.
>     public ESClientFactory() {
>     }
>
>     /**
>      * Gets the single instance of the client, creating it on first use.
>      */
>     public static synchronized Client getInstance() {
>         String ipAddress = MessageTranslator.getMessage("es.cluster.ip");
>         String clusterName = MessageTranslator.getMessage("es.cluster.name");
>         int transportClientPort;
>         try {
>             transportClientPort = Integer.parseInt(
>                     MessageTranslator.getMessage("es.transportclient.port"));
>         } catch (Exception e) {
>             transportClientPort = 9300;
>             LogImpl.setWarning(ESClientFactory.class, e);
>         }
>
>         logger.debug("got the client ip as: " + ipAddress + " and port: "
>                 + transportClientPort);
>         if (instance == null) {
>             logger.debug("the client instance is null, creating a new instance");
>             ImmutableSettings.Builder settings = ImmutableSettings.settingsBuilder();
>             settings.put("cluster.name", clusterName);
>             instance = new TransportClient(settings.build())
>                     .addTransportAddress(new InetSocketTransportAddress(
>                             ipAddress, transportClientPort));
>             logger.debug("returning the newly created client instance...");
>             return instance;
>         }
>         logger.debug("returning the existing transport client connection.");
>         return instance;
>     }
>
>     @Override
>     public void contextInitialized(ServletContextEvent sce) {
>         logger.debug("initializing the ServletContextListener... TransportClient");
>         getInstance();
>     }
>
>     @Override
>     public void contextDestroyed(ServletContextEvent sce) {
>         logger.debug("closing the servlet context");
>         shutdown();
>         logger.debug("successfully shut down the threadpool");
>     }
>
>     public synchronized void shutdown() {
>         if (instance != null) {
>             logger.debug("shutdown started");
>             instance.close();
>             instance.threadPool().shutdown();
>             instance = null;
>             logger.debug("shutdown complete");
>         }
>     }
> }
>



Re: Design a system that maintains historical view of user activity

2015-01-13 Thread Ed Kim
The short answer is yes. I've leveraged ES to store events and analyze them 
in time-based chunks. It's actually a very powerful tool for this type of 
application. However, you will have to decide on how to model your data to 
get the most out of it. The first question I would ask is why do you 
require so many joins? What is the purpose for each join operation, and do 
you necessarily need to join to everything up front? Can you denormalize 
some of the data to get what you need in the first pass, and then drill 
down afterwards? Take into account user experience, and how your 
queries/model will support that experience.
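
As a rough illustration of the first-pass idea, with flat per-activity
documents instead of parent-child (all names here are hypothetical, and for
millions of matching users you would likely page through results or use
scan/scroll rather than read one aggregation):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;

// documents shaped like: { "user_id": "u123", "type": "item_view", "item_id": "A", "ts": ... }
// 'ts' is assumed to be a date field; 'client' is an existing Client instance
SearchResponse resp = client.prepareSearch("activity")
    .setSize(0)
    .setQuery(QueryBuilders.filteredQuery(
        QueryBuilders.matchAllQuery(),
        FilterBuilders.boolFilter()
            .must(FilterBuilders.termFilter("type", "item_view"))
            .must(FilterBuilders.termFilter("item_id", "A"))
            .must(FilterBuilders.rangeFilter("ts").gte("now-6M"))))
    .addAggregation(AggregationBuilders.terms("users").field("user_id"))
    .execute().actionGet();
// second pass: run the same shape of query for type=email_click and
// intersect the two user lists in the application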


On Monday, January 12, 2015 at 4:51:55 PM UTC-8, Chen Wang wrote:
>
> Hey Guys,
> I am seeking advice on designing a system that maintains a historical view of 
> a user's activities over the past year. Each user can have different 
> activities: email_open, email_click, item_view, add_to_cart, purchase, etc. 
> The query I would like to do is, for example,
>
> Find all customers who browsed item A in the past 6 months and also clicked 
> an email,
> and I would like the query to be done in a reasonable time frame (for 
> example, within 30 minutes to retrieve 10 million such users).
>
> Is ES a good candidate for such a problem? I am thinking of creating an index 
> for each user, but that would mean too many indices (millions). I also tried 
> to index each activity (userid, activity_type, item_id, timestamp, etc.) as an 
> individual document in ES, but it involves join operations, which turn out 
> not to be so efficient (I am using parent-child).
>
> Have any of you had experience designing a similar system? I think 
> this is a rather common problem that needs to be solved. (Of course, we can 
> do it in MapReduce.)
> Any suggestion is appreciated.
>
> Thanks in advance.
> Chen
>



Re: Join between two different sources using Kibana 4

2015-01-13 Thread Ed Kim
Without parent/child, you'll need an extra layer to execute 2 queries and 
merge the results yourself. 
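
A stripped-down sketch of that extra layer with the Java API (the index,
field, and value names come from Gregory's example; the merge itself is plain
application code):

import org.elasticsearch.action.search.MultiSearchResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;

// run both queries in one round trip, then join in the application
// 'client' is assumed to be an existing Client instance
MultiSearchResponse msr = client.prepareMultiSearch()
    .add(client.prepareSearch("jobs")
        .setQuery(QueryBuilders.termQuery("exitstatus", -3002)))
    .add(client.prepareSearch("license")
        .setQuery(QueryBuilders.termQuery("result", "DENIED")))
    .execute().actionGet();

SearchResponse jobs = msr.getResponses()[0].getResponse();
SearchResponse licenses = msr.getResponses()[1].getResponse();
// merge step: key the license hits by (user, host) and keep those whose
// time falls between the matching job's starttime and finishtime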

On Monday, January 12, 2015 at 2:10:54 PM UTC-8, Gregory Touretsky wrote:
>
> Is there a way to manage it via the Kibana interface just at query time?
> Something like Splunk's "transaction" statement, which allows grouping 
> events into transactions.
>
> On Monday, January 12, 2015 at 9:38:56 PM UTC+2, Itamar Syn-Hershko wrote:
>>
>> You can either use parent/child: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/parent-child.html
>>
>> Or index denormalized data in the first place
>>
>> Elasticsearch isn't meant to be used with the same models as relational 
>> databases.
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko 
>> Freelance Developer & Consultant
>> Author of RavenDB in Action 
>>
>> On Mon, Jan 12, 2015 at 9:36 PM, Gregory Touretsky wrote:
>>
>>> Hi, 
>>>  
>>> What would be the right way to join between two data sources using the 
>>> Kibana 4 interface?
>>> Assume 2 data sources:
>>> 1. source=jobs,  fields = {jobid, user, host, exitstatus, 
>>> starttime,finishtime}
>>> Sample record:
>>>  type = jobs;  jobid = 1234; user = john; host = myhost; exitstatus 
>>> = -3002; starttime = 01/01/2015 01:01; finishtime = 01/01/2015  01:15
>>> 2. source=license, fields = {host, user, time, feature, result}
>>> Sample records:
>>>  type = license;  user = john; host = myhost; time = 01/01/2015 
>>> 01:05; feature = AAA; result = DENIED
>>>  type = license;  user = john; host = myhost; time = 01/01/2015 
>>> 01:07; feature = BBB; result = APPROVED
>>>
>>> I’d like to create a dashboard in Kibana 4 which would show a joint 
>>> table combining both sources.
>>> Using pseudo-SQL code, it should do something like:
>>>
>>> select 
>>> jobs.jobid,jobs.user,jobs.host,license.feature,license.result,count(license.time)
>>>  
>>> from jobs
>>> LEFT JOIN license
>>> WHERE jobs.exitstatus=-3002 AND license.user=jobs.user AND 
>>> license.host=jobs.host AND license.time>=jobs.starttime AND 
>>> license.time<=jobs.finishtime
>>> GROUP BY jobs.jobid,jobs.user,jobs.host
>>>
>>> Thanks in advance,
>>>Gregory
>>>



Re: elasticsearch client is giving OutOfMemoryError once connection is lost to elaticsearch server

2015-01-12 Thread Ed Kim
It's hard to say what's truly causing it, but when your server goes down, 
you should also see NoNodeAvailableException. Also, I don't think it 
matters too much whether you are doing index or search requests, as all 
requests queue handlers (Runnable) to the generic threadpool. What is your 
query per second like when searching? What timeout have you set? I'm 
wondering if TransportClient is flooding the threadpool with a whole bunch 
of requests and causing your server to crash as a result. In this case 
though, I might also expect to see timeout exceptions as well, but since 
you haven't mentioned them, I'm not really sure.

For the threadpool shutdown, I typically include threadpool.shutdown() 
whenever I call close() since my intention is to completely shutdown the 
transportclient. I wouldn't suggest simply killing the threadpool only.

Having said that, have you tried protecting TransportClient from 
handling/executing requests when NoNodeAvailableException occurs? We've set 
up a wrapper class around TransportClient to queue up requests until 
TransportClient recovers. We will periodically check to see if 
TransportClient can communicate with ES, and then execute the queued 
requests if things are back to normal. Worst case, we will drop the queued 
ES requests and return an error message back to the client.
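
A stripped-down sketch of that wrapper idea (this is hypothetical, not our
actual class; only indexing is shown, and the scheduler that periodically
calls checkAndFlush() is left out):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.transport.NoNodeAvailableException;
import org.elasticsearch.client.transport.TransportClient;

public class GuardedClient {
    private final TransportClient client;
    private final Queue<IndexRequest> pending = new ConcurrentLinkedQueue<IndexRequest>();
    private volatile boolean healthy = true;

    public GuardedClient(TransportClient client) { this.client = client; }

    public void index(IndexRequest req) {
        if (!healthy) { pending.add(req); return; }  // don't flood a dead cluster
        try {
            client.index(req).actionGet();
        } catch (NoNodeAvailableException e) {
            healthy = false;  // stop sending; queue until ES recovers
            pending.add(req);
        }
    }

    // called periodically from a scheduler; flush the backlog once ES answers
    public void checkAndFlush() {
        try {
            client.admin().cluster().prepareHealth().execute().actionGet();
            healthy = true;
            IndexRequest req;
            while ((req = pending.poll()) != null) client.index(req).actionGet();
        } catch (Exception e) {
            healthy = false;  // still down; worst case, drop the queue and report an error
        }
    }
}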



On Monday, January 12, 2015 at 4:38:09 AM UTC-8, Subhadip Bagui wrote:
>
> Hi Ed,
>
> In my case, I have created a singleton TransportClient and query ES with it 
> at frequent intervals to get data. I'm not doing any bulk 
> operations, only searching an index. The threadPool needs to be shut down once 
> Tomcat stops, I guess, but when my web application is up and running, is there 
> any need to shut down the threadPool? I'm getting this error recently. 
> Previously, if the ES server went down, I used to get a 
> NoNodeAvailableException, and that was OK with me. But now the whole Tomcat 
> instance goes down and no app works, showing OutOfMemoryError. Please advise.
>



Re: Writing custom scripts for indexing data in Elasticsearch

2015-01-11 Thread Ed Kim
It executes once. You could consider running that script on a schedule and 
doing incremental updates using timestamps. 

On Sunday, January 11, 2015 at 9:24:28 PM UTC-8, Amtul Nazneen wrote:
>
> Thank you. I have a doubt, though: once I run the script, the river plugin 
> is started and the data gets indexed into Elasticsearch. I want to know if 
> the plugin keeps running after that, or whether it stops once the script 
> execution comes to an end.
>
>
>>



Re: elasticsearch client is giving OutOfMemoryError once connection is lost to elaticsearch server

2015-01-11 Thread Ed Kim
Yes, but in addition I would also call client.threadPool().shutdown() to 
make sure you free up the threadpool.
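
In code, the shutdown sequence looks roughly like this ('client' being the
TransportClient instance):

// close the client first, then release the 'generic' pool threads
client.close();
client.threadPool().shutdown();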

On Sunday, January 11, 2015 at 7:22:20 PM UTC-8, Jason Wee wrote:
>
> Hi Ed,
>
> A soft restart as in client.close()? I did that as well; it does not seem to 
> work, hence I resort to a Tomcat restart instead... 
>
> Jason
>
> On Sun, Jan 11, 2015 at 3:03 PM, Ed Kim wrote:
>
>> Other members can correct me if I'm wrong, but I notice that when you 
>> lose connection with the server, the transportclient queues retries of 
>> whatever operations you try to execute, and it starts to queue listeners 
>> into a 'generic' threadpool (which I read somewhere is unbounded). 
>> We've seen this problem when we thrash ES until it eventually stops 
>> responding, and our bulk requests start to back up and eventually cause the 
>> application to halt due to OOM.
>>
>> I don't know exactly what your application is doing when your ES node(s) 
>> go down, but perhaps you can proactively stop requests to ES servers once 
>> your application sees the no node exception error (which you should get 
>> when ES goes down). You could also close the transportclient and shutdown 
>> its threadpool and reconnect/instantiate after a timed delay to clean up 
>> whatever is floating around in the transportclient. We have been able to 
>> solve most of our native thread issues by protecting our use of 
>> transportclient and doing a soft restart of this client. 
>>
>>
>> On Saturday, January 10, 2015 at 9:29:56 AM UTC-8, Subhadip Bagui wrote:
>>>
>>> Hi,
>>>
>>> I'm using Elasticsearch via TransportClient for multiple operations. 
>>> The issue I'm facing now is that if my ES server goes down, my client-side app 
>>> gets an OutOfMemoryError. I'm getting the exception below, and I have to 
>>> restart my Tomcat every time after this to bring my application back up. Can 
>>> someone please suggest how to prevent this? 
>>>
>>>
>>> Jan 9, 2015 5:38:44 PM org.apache.catalina.core.StandardWrapperValve invoke
>>> SEVERE: Servlet.service() for servlet [spring] in context with path
>>> [/aricloud] threw exception [Handler processing failed; nested exception is
>>> java.lang.OutOfMemoryError: unable to create new native thread] with root cause
>>> java.lang.OutOfMemoryError: unable to create new native thread
>>> at java.lang.Thread.start0(Native Method)
>>> at java.lang.Thread.start(Thread.java:640)
>>> at java.util.concurrent.ThreadPoolExecutor.addThread(ThreadPoolExecutor.java:681)
>>> at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
>>> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
>>> at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker.start(DeadLockProofWorker.java:38)
>>> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:349)
>>> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:100)
>>> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:52)
>>> at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45)
>>> at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
>>> at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
>>>
>>>
>>> Thanks,
>>> Subhadip
>>>



Re: elasticsearch client is giving OutOfMemoryError once connection is lost to elaticsearch server

2015-01-10 Thread Ed Kim
Other members can correct me if I'm wrong, but I notice that when you lose 
connection with the server, the transportclient queues retries of whatever 
operations you try to execute, and it starts to queue listeners into a 
'generic' threadpool (which I read somewhere is unbounded). We've 
seen this problem when we thrash ES until it eventually stops responding, 
and our bulk requests start to back up and eventually cause the application 
to halt due to OOM.

I don't know exactly what your application is doing when your ES node(s) go 
down, but perhaps you can proactively stop requests to ES servers once your 
application sees the no node exception error (which you should get when ES 
goes down). You could also close the transportclient and shutdown its 
threadpool and reconnect/instantiate after a timed delay to clean up 
whatever is floating around in the transportclient. We have been able to 
solve most of our native thread issues by protecting our use of 
transportclient and doing a soft restart of this client. 


On Saturday, January 10, 2015 at 9:29:56 AM UTC-8, Subhadip Bagui wrote:
>
> Hi,
>
> I'm using Elasticsearch via TransportClient for multiple operations. The 
> issue I'm facing now is that if my ES server goes down, my client-side app 
> gets an OutOfMemoryError. I'm getting the exception below, and I have to 
> restart my Tomcat every time after this to bring my application back up. Can 
> someone please suggest how to prevent this? 
>
>
> Jan 9, 2015 5:38:44 PM org.apache.catalina.core.StandardWrapperValve invoke
> SEVERE: Servlet.service() for servlet [spring] in context with path
> [/aricloud] threw exception [Handler processing failed; nested exception is
> java.lang.OutOfMemoryError: unable to create new native thread] with root cause
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:640)
> at java.util.concurrent.ThreadPoolExecutor.addThread(ThreadPoolExecutor.java:681)
> at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
> at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker.start(DeadLockProofWorker.java:38)
> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:349)
> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:100)
> at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:52)
> at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45)
> at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
> at org.elasticsearch.common.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
>
>
> Thanks,
> Subhadip
>



real time match analysis

2015-01-10 Thread Ed Kim
Hello all, I was wondering if anyone could offer some feedback on whether 
there is a way to determine how a document matched in real time. I 
currently use custom analyzers at index time to allow a broad array of 
matches for a given text field. I try to match based on phrases, synonyms, 
substrings, stemming, etc. of a given phrase, and I would like to be able to 
figure out at search time which analyzer was responsible for causing the 
match. 

Currently, I've gotten around this by creating child documents where the 
fields are fanned out to their respective analyzer types. So I have a child 
document where the field only applies stemming, another that uses only 
synonyms, etc. However, due to the growing number of fields that require 
analysis and the growth of my data set, I'd much prefer to have fewer 
documents (and less complex ones, too). I was hoping there would be a way to tag 
tokens at the analysis phase that could be used at the search phase to 
quickly determine my match level, but I was not able to find anything like 
this.

Having said that, has anyone else ever tried to figure this out, or have 
any thoughts on how to leverage ES at a lower level to determine match? 
