Bug in Blog

2015-03-13 Thread Zennet Wheatcroft
There was a useful blog post about D3 and aggregations:
http://webcache.googleusercontent.com/search?q=cache:QhzvfcKiM70J:www.elasticsearch.org/blog/data-visualization-elasticsearch-aggregations/

Now I get this page instead, which is mostly broken:
https://www.elastic.co/blog/data-visualization-elasticsearch-aggregations/




Re: Unexpected behavior from nested - filter - nested aggregation

2014-10-24 Thread Zennet Wheatcroft
I'm also running into this, and it is not what I expected. I tried parent/child 
and got the same behavior. I expect the filtering to narrow down the results 
with each filter: when I filter on a child (or nested) document that has 
property=p and then go back to aggregate on the parent, I get all the results 
again, as if the filter were not applied.

I can include sample data, mapping, and queries if someone wants to comment.

I'm trying to do clickstream analysis on session events and user actions. 
In the data models I am trying, the session is the parent event and the user 
actions are the child (or nested) events. I've tried several different models 
and have not found one that works well.
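
To make the shape concrete, the nested -> filter -> back-to-parent aggregation 
I am describing looks roughly like the sketch below (the index name, nested 
path, and field names are placeholders for illustration, not my real mapping). 
My expectation is that the terms buckets under reverse_nested would only count 
sessions whose nested actions matched the filter:

curl -XPOST 'localhost:9200/sessions/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "actions": {
      "nested": { "path": "actions" },
      "aggs": {
        "only_p": {
          "filter": { "term": { "actions.property": "p" } },
          "aggs": {
            "back_to_session": {
              "reverse_nested": {},
              "aggs": {
                "by_session_field": { "terms": { "field": "sessionField" } }
              }
            }
          }
        }
      }
    }
  }
}'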




Re: Can I use Elasticsearch as my primary store?

2014-10-24 Thread Zennet Wheatcroft
I have heard from the source: do not use Elasticsearch as a primary data 
store. But some people do, and it works OK. I would recommend that you use 
the snapshot and restore features, and back up your source JSON data so you 
can re-index in case your index gets corrupted. And be careful when 
upgrading, especially across breaking versions.
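
For reference, snapshot and restore is driven through the _snapshot API; a 
minimal sketch (the repository name, filesystem location, and snapshot name 
below are only examples) looks like this:

# Register a shared filesystem repository (the location must be reachable from every node).
curl -XPUT 'localhost:9200/_snapshot/my_backup' -d '{
  "type": "fs",
  "settings": { "location": "/mount/backups/my_backup" }
}'

# Take a snapshot of all indices and wait for it to complete.
curl -XPUT 'localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

# Restore from it later if the live index is lost or corrupted.
curl -XPOST 'localhost:9200/_snapshot/my_backup/snapshot_1/_restore'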


On Friday, October 24, 2014 2:32:56 PM UTC-7, Akram Hussein wrote:

 Is it a use case today to use Elasticsearch as a primary store, basically 
 using it similarly to MongoDB? Is that a use case the product is moving 
 towards, or is it mostly just for search?




Re: nested aggregation against key value pairs

2014-10-24 Thread Zennet Wheatcroft
Have you tried the usual sub-aggregations? It looks like they should do 
exactly what you want. If you have tried them, why did that not work? Can 
you include some sample data and the queries you have tried, so that we can 
index it and try your queries?

"Bucketing aggregations can have sub-aggregations (bucketing or metric). 
The sub-aggregations will be computed for the buckets which their parent 
aggregation generates."
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html
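
With the mapping you posted below, a sketch of that (untested, and assuming 
the key/value fields are not_analyzed or single-token, so the terms buckets 
line up with whole keys and values) would be a nested aggregation with a 
terms agg on the key and a terms sub-agg on the value:

curl -XPOST 'localhost:9200/index123/type123/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "kv_pairs": {
      "nested": { "path": "authInput.uIDExtensionFields" },
      "aggs": {
        "keys": {
          "terms": { "field": "authInput.uIDExtensionFields.key" },
          "aggs": {
            "values": {
              "terms": { "field": "authInput.uIDExtensionFields.value" }
            }
          }
        }
      }
    }
  }
}'

Each key bucket then carries value buckets with doc counts, which is close to 
the Key - Value: DocCount output you describe.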


On Friday, October 24, 2014 12:17:04 PM UTC-7, Jay Hilden wrote:

 I have an ES type with a nested KeyValuePair type.  What I'm trying to do 
 is a terms aggregation on both the key and value fields such that I'd get 
 the following results:

 Key1 - Value1: DocCount = 10
 Key1 - Value2: DocCount = 9
 Key2 - Value3: DocCount = 4

 Here is my mapping:
 {
   "index123": {
     "mappings": {
       "type123": {
         "properties": {
           "authEventID": {
             "type": "long"
           },
           "authInput": {
             "properties": {
               "uIDExtensionFields": {
                 "type": "nested",
                 "properties": {
                   "key": {
                     "type": "string"
                   },
                   "value": {
                     "type": "string"
                   }
                 }
               }
             }
           }
         }
       }
     }
   }
 }

 Is there a way to do this?

 Thank you.




Re: Unexpected behavior from nested - filter - nested aggregation

2014-10-24 Thread Zennet Wheatcroft
Quoting from the "managing relations" blog post: "Lastly, it is not possible 
to “cross reference” between nested documents. One nested doc cannot “see” 
another nested doc’s properties. For example, you are not able to filter on 
“A.name” but facet on “B.age”. You can get around this by using 
`include_in_root`, which effectively copies the nested docs into the root, 
but this gets you back to the problems of inner objects."
http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/

Perhaps this is our answer?
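
If we go that route, the flag sits on the nested mapping itself; a minimal 
sketch (the index, type, and field names here are made up for illustration) 
would be:

curl -XPUT 'localhost:9200/sessions/_mapping/session' -d '
{
  "session": {
    "properties": {
      "actions": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "name": { "type": "string" },
          "age":  { "type": "integer" }
        }
      }
    }
  }
}'

With include_in_root the nested values are also indexed flattened onto the 
root document, so a root-level filter can see them, but the association 
between fields of the same nested object is lost, which is exactly the 
inner-object problem the post warns about.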


On Friday, October 24, 2014 3:31:11 PM UTC-7, Zennet Wheatcroft wrote:

 I'm also running into this, and it is not what I expected. I tried parent/child 
 and got the same behavior. I expect the filtering to narrow down the results 
 with each filter: when I filter on a child (or nested) document that has 
 property=p and then go back to aggregate on the parent, I get all the results 
 again, as if the filter were not applied.

 I can include sample data, mapping, and queries if someone wants to 
 comment.

 I'm trying to do clickstream analysis on session events and user actions. 
 In the data models I am trying, the session is the parent event and the user 
 actions are the child (or nested) events. I've tried several different models 
 and have not found one that works well.






Re: ElasticSearch Having-Clause?

2014-06-23 Thread Zennet Wheatcroft

As the open issue you quote suggests, ES currently has no support for an 
equivalent to SQL's HAVING clause.
Here's another reference which supports that:
https://groups.google.com/forum/#!msg/elasticsearch/UsrCG2Abj-A/IDO9DX_PoQwJ

What I did as a workaround was to get all the results in an intermediate 
layer and then loop through them, leaving out the ones not meeting my 
boolean criteria (COUNT(*) = x). But that is not really a solution to your 
problem of too many results. I had 200,000 results, which worked fine, but 
if I had 200M that would not work so well. And it won't work for any of the 
aggregation functions (sum, min, max, avg) other than count, as far as I 
can tell.

Have you considered the 'min_doc_count: 50' feature?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_minimum_document_count
I used it to filter out all the groups that have less than x documents and 
then manually removed the groups with more than x.
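For example, applied to the placement terms aggregation from your query 
below, it would look something like this (only the outer level shown):

curl -XPOST //_search?pretty=true -d '
{
  "size": 0,
  "aggs": {
    "placement": {
      "terms": {
        "field": "placement",
        "min_doc_count": 50
      }
    }
  }
}'
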
In your case it looks like you want to filter on something like HAVING 
SUM(impression) > 49, and I don't think there is even a workaround for that, 
because the functions, and even script filters, are applied to documents, 
not to the aggregations. At least as far as I can tell; it would be great 
if someone showed me otherwise.

Zennet 


On Wednesday, February 5, 2014 7:50:01 AM UTC-8, tke...@rundsp.com wrote:

 Hi guys,
 I’m using elasticsearch 1.0.0RC2 and wondering if there is an 
 equivalent to SQL’s “having-clause” for the aggregation framework there. 
 Below is an example query and a link to a ticket that describes the issue 
 well. The part of the query that's highlighted doesn't work, and is there 
 purely to give an idea of what I'm after. This query (omitting the 
 highlighted portion) gives impression counts for every 
 placement-referer-device-date combo. This is fine but the output is HUGE! I 
 was wondering if there was a way (like a having clause or filter) to reduce 
 the amount of results based off some logic (in this case, impressions 
 counts greater than 50). Thanks all!

 - Trev

 https://github.com/elasticsearch/elasticsearch/issues/4404


 curl -XPOST //_search?pretty=true -d '
 {
   "size": 0,
   "query": {
     "filtered": {
       "query": {
         "range": {
           "date_time": {
             "from": "ZZZ",
             "to": ,
             "include_lower": true,
             "include_upper": true
           }
         }
       }
     }
   },
   "aggs": {
     "placement": {
       "terms": { "field": "placement" },
       "aggs": {
         "device": {
           "terms": { "field": "device" },
           "aggs": {
             "referer": {
               "terms": { "field": "referer" },
               "aggs": {
                 "totals": {
                   "date_histogram": { "field": "date_time", "interval": "day" },
                   "aggs": {
                     "impression": {
                       "sum": { "field": "impression" },
                       "having": { "from": 50 }
                     }
                   }
                 }
               }
             }
           }
         }
       }
     }
   }
 }
 '




Re: Securing Data in Elasticsearch

2014-06-18 Thread Zennet Wheatcroft
If we want to use Kibana we will run into the same issue. I heard Shay say 
that Kibana was not really developed for the use case of exposing it to 
external customers, but he did not elaborate on that. What I was thinking of 
doing is wrapping ES in a simple web app that forwards GET requests from 
Kibana on to ES (keeping the same API) but blocks DELETE, PUT, and POST 
requests, returning 501 Not Implemented. Do you think that would work for 
maintaining functionality while disallowing updates and deletes? Would that 
work for your requirements?

Zennet


On Thursday, June 12, 2014 7:48:47 AM UTC-7, Harvii Dent wrote:

 Hello,

 I'm planning to use Elasticsearch with Logstash for logs management and 
 search, however, one thing I'm unable to find an answer for is making sure 
 that the data cannot be modified once it reaches Elasticsearch.

 action.destructive_requires_name prevents deleting all indices at once, 
 but they can still be deleted. Are there any options to prevent deleting 
 indices altogether? 

 And on the document level, is it possible to disable 'delete' *AND* 
 'update' operations without setting the entire index as read-only (ie. 
 'index.blocks.read_only')?

 Lastly, does setting 'index.blocks.read_only' ensure that the index files 
 on disk are not changed (so they can be monitored using a file integrity 
 monitoring solution)? Many regulatory and compliance bodies have 
 requirements for ensuring log integrity.

 Thanks





Re: Inter-document Queries

2014-06-10 Thread Zennet Wheatcroft
I simplified the actual problem in order to avoid explaining the domain 
specific details. Allow me to add back more detail.

We want to be able to search for multiple points of user action, towards a 
conversion funnel, and condition on multiple fields. Let's add another 
field (response) to the above model:
{.., "path": "/promo/A", "response": 200, ..}
{.., "path": "/page/1", "response": 401, ..}
{.., "path": "/promo/D", "response": 200, ..}
{.., "path": "/page/23", "response": 301, ..}
{.., "path": "/page/2", "response": 418, ..}
Let's say we define three points through the conversion funnel:
A: Visited path=/page/1
B: Got response=401 from some path
C: Exited at path=/sale/C

And we want to know how many users did steps A-B-C in that order. If we add 
an array prev_response like we did for prev_path, then we can use term 
filters to find documents with path=/sale/C, prev_path=/page/1, and 
prev_response=401. But this will not distinguish between A-B-C and B-A-C. 
Perhaps I could use a script filter for the last mile: starting from the 
term-filtered results, throw out the B-A-C cases; that should run more 
quickly because of the reduced document set.
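
The term-filter part would look roughly like the sketch below (the index name 
is a placeholder, and it assumes path, prev_path, and prev_response are 
indexed not_analyzed; the ordering check would have to be a script filter 
added as a fourth clause):

curl -XPOST 'localhost:9200/events/_search?pretty' -d '
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "path": "/sale/C" } },
            { "term": { "prev_path": "/page/1" } },
            { "term": { "prev_response": 401 } }
          ]
        }
      }
    }
  },
  "aggs": {
    "unique_users": { "cardinality": { "field": "userid" } }
  }
}'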

Is there another way to implement this query?

Zennet


On Wednesday, June 4, 2014 5:01:19 PM UTC-7, Itamar Syn-Hershko wrote:

 You need to be able to form buckets that can be reduced again, either 
 using the aggregations framework or a query. One model that will allow you 
 to do that is something like this:

 { "userid": "xyz", "path": "/sale/B", "previous_paths": [...], 
 "tstamp": ..., ... }

 So whenever you add a new path, you denormalize and add previous paths 
 that could be relevant. This might bloat your storage a bit and be slower 
 on writes, but it is very optimized for reads since now you can do an 
 aggregation that queries for the desired path and buckets on the user. To 
 check the condition of the previous path you should be able to bucket again 
 using a script, or maybe even with a query on a nested type.

 This is just from the top of my head but should definitely work if you can 
 get to that model

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 2:36 AM, Zennet Wheatcroft zwhea...@atypon.com wrote:

 Yes. I can re-index the data or transform it in any way to make this 
 query efficient. 

 What would you suggest?



 On Wednesday, June 4, 2014 2:14:09 PM UTC-7, Itamar Syn-Hershko wrote:

 This model is not efficient for this type of querying. You cannot do 
 this in one query using this model, and the pre-processing work you do now 
 + traversing all documents is very costly.

 Is it possible for you to index the data (even as a projection) into 
 Elasticsearch using a different model, so you can use ES properly using 
 queries or the aggregations framework?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 12:04 AM, Zennet Wheatcroft zwhea...@atypon.com 
 wrote:

  Hi,

 I am looking for an efficient way to do inter-document queries in 
 Elasticsearch. Specifically, I want to count the number of users that went 
 through an exit point B after visiting point A.

 In general terms, say we have some event log data about users actions 
 on a website:
 
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/A", "country": "US", "tstamp": "2013-04-01 00:01:01"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/1", "country": "CN", "tstamp": "2013-04-01 00:02:11"}
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/D", "country": "US", "tstamp": "2013-04-01 00:06:31"}
 {"userid": "abc", "machineid": 110527022, "path": "/page/23", "country": "DE", "tstamp": "2013-04-01 00:08:00"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/2", "country": "CN", "tstamp": "2013-04-01 00:08:55"}
 {"userid": "xyz", "machineid": 110530745, "path": "/sale/B", "country": "US", "tstamp": "2013-04-01 00:09:46"}
 {"userid": "abc", "machineid": 110527022, "path": "/promo/A", "country": "DE", "tstamp": "2013-04-01 00:10:46"}
 
 And we have 500+M such entries.

 We want a count of the number of userids that visited path=/sale/B 
 after visiting path=/promo/A.

 What I did was to preprocess the data, sorting by userid and tstamp, then 
 compacting all events with the same userid into the same document. Then I 
 wrote a script filter which traverses the path array per document and 
 returns true if it finds /sale/B occurring after /promo/A. This, however, is 
 inefficient. Most of our queries take 1 or 2 seconds on 100+M events; this 
 script filter query takes over 300 seconds. Specifically, it can process 
 about 400K events per second. By comparison, I wrote a naive program that 
 does a linear pass over the un-compacted data and processes 11M events per 
 second. From which I conclude that Elasticsearch does not do well on this 
 type of query.

 I am hoping someone can indicate a more efficient way to do this query in 
 ES. Or else confirm that ES cannot do inter-document queries well.

Re: Exposing elastic search query APIs at a public endpoint

2014-06-10 Thread Zennet Wheatcroft
Hi Pradeep,

We are in the middle of doing the same thing, designing a system for 
reporting, and I want to create a middle API layer for the reasons you 
suggest and for other reasons. I would like to exchange notes with you in a 
private message, if you want. You have to create some middle layer, right? 
You don't want to let users issue requests like DELETE 
http://yourhost:9200/_all/.

Zennet


On Tuesday, June 10, 2014 12:03:08 AM UTC-7, Pradeep Narayan wrote:

 Hi - We are designing a system for reporting and are planning to use 
 Elasticsearch as a backend. We want to expose reporting in such a way that 
 users can build custom reports on top of their data without us getting in 
 their way. One way to do this is to expose the Elasticsearch query APIs 
 through our public endpoints. The other option is to use an abstraction 
 language which gets translated to Elasticsearch queries in the middle tier. 
 The latter option allows us to control what runs on ES, but it can become 
 restrictive in terms of how much of ES's rich query mechanism we expose 
 through the abstraction layer. I would like to know if there is a known 
 design pattern to solve this. How have users of Elasticsearch addressed 
 flexibility vs. control?

 Regards,
 Pradeep




Re: Inter-document Queries

2014-06-09 Thread Zennet Wheatcroft
Thank you Itamar and Jörg for your replies.

I followed your suggestion, Itamar, and it works. Queries that took 300+ 
seconds now take about 400 ms.
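
For reference, the query against the denormalized model is roughly of this 
shape (a sketch; the index name is a placeholder and it assumes path and 
previous_paths are indexed not_analyzed). The cardinality aggregation 
returns an approximate distinct count of userids:

curl -XPOST 'localhost:9200/events/_search?pretty' -d '
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "path": "/sale/B" } },
            { "term": { "previous_paths": "/promo/A" } }
          ]
        }
      }
    }
  },
  "aggs": {
    "unique_users": { "cardinality": { "field": "userid" } }
  }
}'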

However, this model increases storage space complexity to O(N^2) in the 
session length, which is usually not acceptable, so I would not consider 
this a general method. It works here because the median length of a user 
session is about 3, but we have sessions with hundreds of events. If the 
median session length were 1000, this method would no longer work.

Any other ideas or refinements? Or is this the best we can do with 
Elasticsearch?

Zennet


On Wednesday, June 4, 2014 5:01:19 PM UTC-7, Itamar Syn-Hershko wrote:

 You need to be able to form buckets that can be reduced again, either 
 using the aggregations framework or a query. One model that will allow you 
 to do that is something like this:

 { "userid": "xyz", "path": "/sale/B", "previous_paths": [...], 
 "tstamp": ..., ... }

 So whenever you add a new path, you denormalize and add previous paths 
 that could be relevant. This might bloat your storage a bit and be slower 
 on writes, but it is very optimized for reads since now you can do an 
 aggregation that queries for the desired path and buckets on the user. To 
 check the condition of the previous path you should be able to bucket again 
 using a script, or maybe even with a query on a nested type.

 This is just from the top of my head but should definitely work if you can 
 get to that model

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 2:36 AM, Zennet Wheatcroft zwhea...@atypon.com wrote:

 Yes. I can re-index the data or transform it in any way to make this 
 query efficient. 

 What would you suggest?



 On Wednesday, June 4, 2014 2:14:09 PM UTC-7, Itamar Syn-Hershko wrote:

 This model is not efficient for this type of querying. You cannot do 
 this in one query using this model, and the pre-processing work you do now 
 + traversing all documents is very costly.

 Is it possible for you to index the data (even as a projection) into 
 Elasticsearch using a different model, so you can use ES properly using 
 queries or the aggregations framework?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 12:04 AM, Zennet Wheatcroft zwhea...@atypon.com 
 wrote:

  Hi,

 I am looking for an efficient way to do inter-document queries in 
 Elasticsearch. Specifically, I want to count the number of users that went 
 through an exit point B after visiting point A.

 In general terms, say we have some event log data about users actions 
 on a website:
 
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/A", "country": "US", "tstamp": "2013-04-01 00:01:01"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/1", "country": "CN", "tstamp": "2013-04-01 00:02:11"}
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/D", "country": "US", "tstamp": "2013-04-01 00:06:31"}
 {"userid": "abc", "machineid": 110527022, "path": "/page/23", "country": "DE", "tstamp": "2013-04-01 00:08:00"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/2", "country": "CN", "tstamp": "2013-04-01 00:08:55"}
 {"userid": "xyz", "machineid": 110530745, "path": "/sale/B", "country": "US", "tstamp": "2013-04-01 00:09:46"}
 {"userid": "abc", "machineid": 110527022, "path": "/promo/A", "country": "DE", "tstamp": "2013-04-01 00:10:46"}
 
 And we have 500+M such entries.

 We want a count of the number of userids that visited path=/sale/B 
 after visiting path=/promo/A.

 What I did was to preprocess the data, sorting by userid and tstamp, then 
 compacting all events with the same userid into the same document. Then I 
 wrote a script filter which traverses the path array per document and 
 returns true if it finds /sale/B occurring after /promo/A. This, however, is 
 inefficient. Most of our queries take 1 or 2 seconds on 100+M events; this 
 script filter query takes over 300 seconds. Specifically, it can process 
 about 400K events per second. By comparison, I wrote a naive program that 
 does a linear pass over the un-compacted data and processes 11M events per 
 second. From which I conclude that Elasticsearch does not do well on this 
 type of query.

 I am hoping someone can indicate a more efficient way to do this query 
 in ES. Or else confirm that ES cannot do inter-document queries well. 

 Thanks,
 Zennet



Re: Inter-document Queries

2014-06-04 Thread Zennet Wheatcroft
Yes. I can re-index the data or transform it in any way to make this query 
efficient. 

What would you suggest?


On Wednesday, June 4, 2014 2:14:09 PM UTC-7, Itamar Syn-Hershko wrote:

 This model is not efficient for this type of querying. You cannot do this 
 in one query using this model, and the pre-processing work you do now + 
 traversing all documents is very costly.

 Is it possible for you to index the data (even as a projection) into 
 Elasticsearch using a different model, so you can use ES properly using 
 queries or the aggregations framework?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 12:04 AM, Zennet Wheatcroft zwhea...@atypon.com wrote:

 Hi,

 I am looking for an efficient way to do inter-document queries in 
 Elasticsearch. Specifically, I want to count the number of users that went 
 through an exit point B after visiting point A.

 In general terms, say we have some event log data about users actions on 
 a website:
 
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/A", "country": "US", "tstamp": "2013-04-01 00:01:01"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/1", "country": "CN", "tstamp": "2013-04-01 00:02:11"}
 {"userid": "xyz", "machineid": 110530745, "path": "/promo/D", "country": "US", "tstamp": "2013-04-01 00:06:31"}
 {"userid": "abc", "machineid": 110527022, "path": "/page/23", "country": "DE", "tstamp": "2013-04-01 00:08:00"}
 {"userid": "pdq", "machineid": 110519774, "path": "/page/2", "country": "CN", "tstamp": "2013-04-01 00:08:55"}
 {"userid": "xyz", "machineid": 110530745, "path": "/sale/B", "country": "US", "tstamp": "2013-04-01 00:09:46"}
 {"userid": "abc", "machineid": 110527022, "path": "/promo/A", "country": "DE", "tstamp": "2013-04-01 00:10:46"}
 
 And we have 500+M such entries.

 We want a count of the number of userids that visited path=/sale/B after 
 visiting path=/promo/A.

 What I did was to preprocess the data, sorting by userid and tstamp, then 
 compacting all events with the same userid into the same document. Then I 
 wrote a script filter which traverses the path array per document and 
 returns true if it finds /sale/B occurring after /promo/A. This, however, is 
 inefficient. Most of our queries take 1 or 2 seconds on 100+M events; this 
 script filter query takes over 300 seconds. Specifically, it can process 
 about 400K events per second. By comparison, I wrote a naive program that 
 does a linear pass over the un-compacted data and processes 11M events per 
 second. From which I conclude that Elasticsearch does not do well on this 
 type of query.

 I am hoping someone can indicate a more efficient way to do this query in 
 ES. Or else confirm that ES cannot do inter-document queries well. 

 Thanks,
 Zennet





