Re: Elasticsearch and MongoDB without River

2015-05-24 Thread Michael Sick
I'd use Apache Storm, especially if it was used elsewhere in your
organization. --Mike
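
Whatever moves the data (a Storm bolt, a cron job, or application code), it ends up emitting Elasticsearch bulk requests. Below is a minimal sketch of that last step only, assuming MongoDB documents arrive as plain dicts with an "_id" key. The function name and the index/type names are hypothetical, and nothing here is Storm-specific.

```python
import json


def to_bulk_lines(docs, index, doc_type):
    """Build the newline-delimited body of an Elasticsearch _bulk request."""
    lines = []
    for doc in docs:
        source = dict(doc)               # copy so the caller's dict is untouched
        doc_id = str(source.pop("_id"))  # reuse Mongo's _id as the ES document id
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(source, sort_keys=True))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(to_bulk_lines([{"_id": 7, "name": "a"}], "people", "person"), end="")
```

The resulting string would be POSTed to the cluster's /_bulk endpoint. Reusing the MongoDB id keeps re-indexing idempotent.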

On Sat, May 23, 2015 at 4:28 AM, sriharshakiran 
kiran.srihar...@imomentous.com wrote:

 Hi All

 Now that rivers are deprecated, I need to index data into ES from MongoDB.
 Can anyone suggest an approach?

 We currently have a system in place with mongo river. I just need to know
 what should be the plan of action to get data into ES without Rivers.

 Thanks
 Sri Harsha



 --
 View this message in context:
 http://elasticsearch-users.115913.n3.nabble.com/Elasticsearch-and-MongoDB-without-River-tp4074854.html
 Sent from the Elasticsearch Users mailing list archive at Nabble.com.

 --
 Please update your bookmarks! We have moved to https://discuss.elastic.co/
 ---
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/1432369692249-4074854.post%40n3.nabble.com
 .
 For more options, visit https://groups.google.com/d/optout.




Re: Elasticsearch 1.1.0 - Optimize broken?

2014-04-04 Thread Michael Sick
Have you tried max_num_segments=1 on your optimize?
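
As a rough sketch of that suggestion: in Elasticsearch 1.x the optimize call is a POST to the index's _optimize endpoint with max_num_segments as a query parameter. The helper below only builds the URL. The host and index name are placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def optimize_url(base_url, index, max_num_segments=1):
    """Return the ES 1.x _optimize URL for one index."""
    query = urlencode({"max_num_segments": max_num_segments})
    return "{}/{}/_optimize?{}".format(base_url.rstrip("/"), index, query)


if __name__ == "__main__":
    # Send this with an HTTP POST, e.g. curl -XPOST "<url>"
    print(optimize_url("http://localhost:9200", "myindex"))
```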

On Fri, Apr 4, 2014 at 11:27 AM, Elliott Bradshaw ebradsh...@gmail.com wrote:

 Any thoughts on this?  I've run optimize several more times, and the
 number of segments falls each time, but I'm still over 1000 segments per
 shard.  Has anyone else run into something similar?


 On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw wrote:

 OK.  Optimize finally returned, so I suppose something was happening in
 the background, but I'm still seeing over 6500 segments.  Even after
 setting max_num_segments=5.  Does this seem right?  Queries are a little
 faster (350-400ms) but still not great.  Bigdesk is still showing a fair
 amount of file IO.

 On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw wrote:

 Hi All,

 I've recently upgraded to Elasticsearch 1.1.0.  I've got a 4 node
 cluster, each with 64G of ram, with 24G allocated to Elasticsearch on
 each.  I've batch loaded approximately 86 million documents into a single
 index (4 shards) and have started benchmarking cross_field/multi_match
 queries on them.  The index has one replica and takes up a total of 111G.
 I've run several batches of warming queries, but queries are not as fast as
 I had hoped, approximately 400-500ms each.  Given that top (on
 CentOS) shows 5-8 GB of free memory on each server, I would assume that the
 entire index has been paged into memory (I had worried about disk
 performance previously, as we are working in a virtualized environment).

 A stats query on the index in question shows that the index is composed
 of 7000 segments.  This seemed high to me, but maybe it's appropriate.
 Regardless, I dispatched an optimize command, but I am not seeing any
 progress and the command has not returned.  Current merges remains at zero,
 and the segment count is not changing.  Checking out hot threads in
 ElasticHQ, I initially saw an optimize call in the stack that was blocked
 on a waitForMerge call.  This however has disappeared, and I'm seeing no
 evidence that the optimize is occurring.

 Does any of this seem out of the norm or unusual?  Has anyone else had
 similar issues?  This is the second time I have tried to optimize an index
 since upgrading.  I've gotten the same result both times.

 Thanks in advance for any help/tips!

 - Elliott



Re: Advice for implementing a secure graph index with ElasticSearch

2014-03-05 Thread Michael Sick
https://github.com/thinkaurelius/titan/wiki

Titan is a distributed graph database
(http://en.wikipedia.org/wiki/Graph_database) optimized for storing and
querying graphs (http://en.wikipedia.org/wiki/Graph_(mathematics))
represented over a cluster of machines. The cluster can elastically scale
to support a growing dataset and user base. Titan has a pluggable storage
architecture which allows it to build on proven database technology such as
Apache Cassandra (http://cassandra.apache.org/), Apache HBase
(http://hbase.apache.org/), or Oracle BerkeleyDB
(http://www.oracle.com/technetwork/database/berkeleydb/). Furthermore, the
pluggable indexing architecture supports ElasticSearch
(http://elasticsearch.com/) and Lucene (http://lucene.apache.org/).

I did some basic research for ES + graph and found the Titan project
interesting. Titan separates storage from indexing and only currently
supports ES for the latter. I'm sure that you could implement a storage
engine based on ES too (which makes more sense now that ES 1.x supports
backup/restore). Didn't look into security at all but this might be a good
starting point. Hope it's helpful. --Mike


On Wed, Mar 5, 2014 at 12:10 PM, Jeff Kunkle kunkl...@gmail.com wrote:

 I've been trying to figure out how I can index a graph data structure
 using ElasticSearch and could really use some advice from someone more
 knowledgeable than me. First, let me explain the challenge. The graph model
 has individual access controls at the vertex (node), edge (relationship),
 and property level. I'd like my users to be able to search the graph for
 vertices or edges containing matching properties, with two caveats:

    1. They should not get vertex or edge results they don't have
       permission to see.
    2. Properties a user does not have access to see should not be
       evaluated in the query.

 My first thought was to index properties as either nested or child
 documents of a vertex/edge and use a custom filter to remove properties a
 user didn't have access to. The first problem I run into is when I try a
 boolean query across properties. For example, assume I want to query a
 person vertex by first name and date of birth. Since these properties are
 indexed as separate documents there is never a match.

 What I essentially need is the ability to query across nested or child
 documents and return the parent only when there are matches across the
 child documents. For example, assume a parent vertex with one property
 document called full_name set to "Barack Obama" and another property
 document named political_party set to "Democrat". Is there any way for me
 to query for the parent document of these two properties by asking for one
 property with full_name="Barack Obama" and another property with
 political_party="Democrat"?
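
One way to picture the query being asked for, if the properties are indexed as nested documents, is a bool query with one nested clause per required property, so the parent matches only when every clause finds its own nested document. This is a sketch under assumptions, not a confirmed solution from the thread. The nested path "properties" and its "name" and "value" fields are hypothetical, and "name" is assumed to be not_analyzed.

```python
def property_clause(name, value, path="properties"):
    """Match one nested property document by its name and value."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"term": {path + ".name": name}},
                        {"match_phrase": {path + ".value": value}},
                    ]
                }
            },
        }
    }


def parent_query(required):
    """Require every (name, value) pair to match some nested property."""
    clauses = [property_clause(name, value) for name, value in required]
    return {"query": {"bool": {"must": clauses}}}


if __name__ == "__main__":
    import json
    wanted = [("full_name", "Barack Obama"), ("political_party", "Democrat")]
    print(json.dumps(parent_query(wanted), indent=2))
```

A per-property access check could, in principle, be added as an extra clause inside each nested bool so that properties the user cannot see never contribute a match.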



Re: Advice for implementing a secure graph index with ElasticSearch

2014-03-05 Thread Michael Sick
Hi Jeff,

Lumify looks very interesting - I'll have to take a serious look later.
While I didn't get very far down the Titan path, my starting point would be
to use Titan for all its graphing features and add some type of query pre-
and post-processing to insert ACL information into the ES queries and
indexing statements. Not sure if Titan offers any hooks - but it seems they
could be added.

I'd start with the Aurelius folks and see how something like this could be
added with low impact to the existing interfaces. As far as the ES part, I
have added ACLs to documents before by storing the IDs as an array field.
This worked in the simple case I needed because the owners of the documents
changed infrequently (so there was not much reindexing load) and I didn't
have to think it through more than that.

--Mike
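
The ACL-as-array idea above might look roughly like the sketch below. Each document carries an array of IDs allowed to see it, and every search is wrapped in an ES 1.x filtered query with a terms filter on that array. The field name "acl" and both helper names are made up for illustration.

```python
def with_acl(doc, allowed_ids):
    """Return a copy of doc carrying the IDs allowed to see it."""
    secured = dict(doc)
    secured["acl"] = sorted(set(allowed_ids))
    return secured


def acl_filtered_search(query, user_ids):
    """Wrap a query so only documents sharing an ID with the user match."""
    return {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"terms": {"acl": list(user_ids)}},
            }
        }
    }


if __name__ == "__main__":
    doc = with_acl({"title": "report"}, ["group-7", "user-42"])
    search = acl_filtered_search({"match": {"title": "report"}}, ["user-42"])
    print(doc)
    print(search)
```

The trade-off is the one mentioned in the message: any change to who may see a document means reindexing that document.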

On Wed, Mar 5, 2014 at 1:08 PM, Jeff Kunkle kunkl...@gmail.com wrote:

 Hi Mike,

 Thanks for the reply. We actually started with Titan and it's a very good
 project, but we couldn't easily add the needed security constraints on top
 of it. That's why I'm exploring this topic. It would be rather
 straightforward to implement the index on ElasticSearch if all the data was
 open to everyone. I'd be able to consolidate all of a vertex's or edge's
 properties in a single document. Unfortunately, that's not the case. The
 project I'm working on is at http://lumify.io if that's helpful in any
 way.

 Thanks Again,
 Jeff




Re: Insert later feature

2014-02-23 Thread Michael Sick
Also, if there are no other clients wanting a faster refresh, you can
set index.refresh_interval to a higher value than the 1s default either in
general for your index or just during the times when you're doing your bulk
updates.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules.html
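
A small sketch of what that could look like around a bulk load, using the index update-settings API (PUT /<index>/_settings). The helper names, the index name and the "60s" value are illustrative choices, not recommendations from the thread.

```python
import json


def refresh_settings(interval):
    """JSON body that sets index.refresh_interval (e.g. "30s", or "-1" to disable)."""
    return json.dumps({"index": {"refresh_interval": interval}})


def bulk_window(index, bulk_interval="60s", normal_interval="1s"):
    """Requests to send before and after a bulk load, as (method, path, body)."""
    path = "/{}/_settings".format(index)
    before = ("PUT", path, refresh_settings(bulk_interval))
    after = ("PUT", path, refresh_settings(normal_interval))
    return before, after


if __name__ == "__main__":
    for method, path, body in bulk_window("logs"):
        print(method, path, body)
```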


On Sun, Feb 23, 2014 at 8:28 AM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Best method to achieve this would be to implement this in front of ES so
 the bulk indexing client runs only at the time it should run.

 For the gathering plugin which I am working on, I plan to separate the two
 phases of gathering documents and indexing documents. With a scheduling
 option, it will be possible to index (or even reindex) gathered documents
 at a later time. For example, documents are continuously collected from
 various sources, like JDBC, web, or file system, and then indexed at some
 later time (for example at night). Such collected documents will be stored
 in an archive format at each gatherer node, like the archive formats
 supported in the knapsack plugin.

 Jörg



 On Sun, Feb 23, 2014 at 6:52 AM, vineeth mohan
 vm.vineethmo...@gmail.com wrote:

 Hi,

 I am doing a lot of bulk inserts into Elasticsearch and at the same time
 doing lots of reads on another index.

 Because of the bulk inserts, my searches on the other index are slow.

 It is not urgent that these bulk inserts actually get indexed and
 become immediately searchable.

 Is there any way I can ask Elasticsearch to receive the bulk inserts but
 do the actual indexing (which should be the CPU-consuming part) later?

 I figured out that Elasticsearch waits for 1 second before making
 the documents searchable.
 What is it waiting for here? Is it indexing the document or reopening
 the IndexWriter?
 Will it help me if I configure this 1 second to be 1 hour?
 If so, which parameter should I tweak?

 Kindly let me know if there are any other similar features out there
 which can be of any help.

 Thanks
   Vineeth



Re: Problem with keeping in sync Elasticsearch across two data centers

2014-02-22 Thread Michael Sick
Hi Amit,

Ivan is correct. You might also check out TribeNodes
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/modules-tribe.html
and see if they fit your needs for cross-DC replication.

--Mike

On Sat, Feb 22, 2014 at 1:32 PM, Amit Soni amitson...@gmail.com wrote:

 Hello Michael - Understood that ES is not built to maintain consistent
 cluster state across data centers. What I am wondering is whether there is
 a way for ElasticSearch to continue to replicate data onto a different data
 center (with some delay of course) so that when the primary center fails,
 the failover data center still has most of the data (maybe except for the
 last few seconds/minutes/hours).

 Overall I am looking for the right way to implement a cross data center
 deployment of Elasticsearch!

 -Amit.





Re: Problem with keeping in sync Elasticsearch across two data centers

2014-02-21 Thread Michael Sick
Dario,

I believe that you're looking for TribeNodes
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/modules-tribe.html

ES is not built to consistently cluster across DCs / larger network lags.

On Fri, Feb 21, 2014 at 11:24 AM, Dario Rossi dario...@gmail.com wrote:

 Hi,
 I have the following problem: our application publishes content to an
 Elasticsearch cluster. We then use local data-less nodes for querying
 Elasticsearch, so we don't use HTTP REST and the local nodes act as the
 load balancer. Now we have been given the requirement of replicating the
 cluster to another data center too (and in the future maybe yet another)
 for resilience.

 At the very beginning we thought of having one large cluster that goes
 across data centers (crazy). This solution has the following problems:

 - The cluster has the split-brain problem (!)
 - The client data-less nodes will try to send requests across different
 data centers (is there a solution to this???). I can't find a way to avoid
 this. We don't want this to happen because of a) latency and b) firewalling
 issues.

 So we started to think that this solution is not really viable, and
 thought of having one cluster per data center, which seems more sensible.
 But then we have the problem that we must publish data to all clusters
 and, if one fails, we have no means of rolling back (unless we try to set
 up a complicated version-based rollback system). I find this very
 complicated and hard to maintain, although it is somewhat doable.

 My biggest problem is that we have to keep the data centers in the same
 state at any time, so that if one goes down, we can readily switch to the
 other.

 Any ideas, or can you recommend some support to help us deal with this?
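
To make the one-cluster-per-data-center problem concrete, here is a hedged sketch of the publishing side only. It writes to every cluster and reports which ones failed, so the caller can retry or reconcile them. It does not solve rollback. The function name and the publisher callables are hypothetical stand-ins for real client calls.

```python
def publish_everywhere(doc, publishers):
    """Publish doc to each cluster; return (succeeded, failed) cluster names.

    publishers maps a cluster name to a callable that raises on failure.
    """
    succeeded, failed = [], []
    for name, publish in publishers.items():
        try:
            publish(doc)
            succeeded.append(name)
        except Exception:
            # Remember the failure instead of aborting, so the healthy
            # clusters still receive the document.
            failed.append(name)
    return succeeded, failed


if __name__ == "__main__":
    def broken(doc):
        raise IOError("data center unreachable")

    ok, bad = publish_everywhere({"id": 1}, {"dc1": lambda d: None, "dc2": broken})
    print("ok:", ok, "failed:", bad)
```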



performing range summary on the result of a metric within a bucket

2014-01-31 Thread Michael Sick
I have what will be a long running time series table that I would like to
aggregate by time buckets and analyze:

   1. Query and filter (in this case by dates and a term query on user_guid).
   (Working)
   2. Bucket by time for analysis (working, using the date_aggregate and
   heartRate_stats aggs).
   3. Perform range counts on the values produced by the metric aggregations
   within the buckets. *Not working.* The heartRate_zoneAverageCounts
   aggregator gets zero counts - it seems you can't point from the top down.
   The heartRate_zoneCounts aggregator counts the individual documents
   within the bucket (nice - but not what I'm looking for).

So how would I apply a range aggregate to the outcome of the
heartRate_stats metric and get only one value per date_aggregate bucket? I
can post-process the results to apply a range but would rather have ES do
it for me. Any / all help appreciated. Thanks in advance!
--Mike

*The Results*
https://gist.github.com/mikeasick/8734325

*The Data (per second stream of the information below)*
https://gist.github.com/mikeasick/8734404

*The Search*
https://gist.github.com/mikeasick/8734117


curl -XGET "http://localhost:9200/vitals/vital/_search?pretty=true" -d'
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "and": {
          "filters": [
            {
              "term": {"user_guid": "0ad08904-c1cf-46cf-9a04-e0865c1cced2"}
            },
            {
              "numeric_range": {
                "recorded_time": {
                  "gte": "2013-01-05T04:44:33.396-05:00",
                  "lte": "2014-01-05T04:44:33.396-05:00"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "date_aggregate": {
      "date_histogram": {
        "field": "recorded_time",
        "interval": "5m"
      },
      "aggs": {
        "heartRate_zoneAverageCounts": {
          "range": {
            "field": "heartRate_stats.avg",
            "ranges": [
              { "to": 50 },
              { "from": 50, "to": 100 },
              { "from": 100 }
            ]
          }
        },
        "heartRate_zoneCounts": {
          "range": {
            "field": "heartRate",
            "ranges": [
              { "to": 50 },
              { "from": 50, "to": 100 },
              { "from": 100 }
            ]
          }
        },
        "heartRate_stats": {
          "extended_stats": {
            "field": "heartRate"
          }
        }
      }
    }
  }
}'
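
The client-side post-processing fallback mentioned above could be sketched as follows. It assumes the search response exposes the histogram as aggregations.date_aggregate.buckets, each bucket carrying heartRate_stats.avg. That is the layout of released 1.x versions and may differ on 1.0.0.Beta2. The zone labels are made up. The boundaries mirror the range aggregation in the query, with "from" inclusive and "to" exclusive.

```python
from collections import Counter


def zone_label(value, low=50, high=100):
    """Classify a value into one of three zones: <low, [low, high), >=high."""
    if value < low:
        return "below {}".format(low)
    if value < high:
        return "{} to {}".format(low, high)
    return "{} and above".format(high)


def zone_counts(response):
    """Count time buckets per zone, based on each bucket's average heart rate."""
    buckets = response["aggregations"]["date_aggregate"]["buckets"]
    counts = Counter()
    for bucket in buckets:
        avg = bucket["heartRate_stats"]["avg"]
        if avg is None:  # bucket without documents has no average
            continue
        counts[zone_label(avg)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = {"aggregations": {"date_aggregate": {"buckets": [
        {"heartRate_stats": {"avg": 45.0}},
        {"heartRate_stats": {"avg": 72.5}},
    ]}}}
    print(zone_counts(sample))
```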



Re: performing range summary on the result of a metric within a bucket

2014-01-31 Thread Michael Sick
Forgot to post the mapping (here: https://gist.github.com/mikeasick/8738689)
and to mention that it's using 1.0.0.Beta2 and Java 1.7.0_25 on Windows 7.

Thanks!




