Logging queries run via search templates

2015-01-20 Thread ppearcy
Hi, 
  I am running some template search queries on 1.4.2 via Java APIs like so:

esClient.prepareSearch(index)
  .setTemplateName(templateName)
  .setTemplateParams(templateParams)
  .setTemplateType(ScriptService.ScriptType.INDEXED)

When I log the request, I get an empty JSON object. Is it possible to dump 
the actual evaluated query from the node client? I dug through the code 
and it looks to me like the template is evaluated in SearchService.java, but 
I didn't see a clean way to extract that. 

I am pretty sure I can log this server side, but was hoping to get 
this info from the caller. 
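
One caller-side workaround, as a minimal sketch: if the caller also has the template source at hand, it can render the template locally and log the result. The snippet is Python rather than Java for brevity, handles only plain {{name}} variables (no Mustache sections or escaping), and the template body and parameter names are hypothetical. It only approximates what the server would evaluate; it is not the SearchService code path.

```python
import json
import re


def render_template(template: str, params: dict) -> str:
    """Substitute plain {{name}} variables; unknown names render as empty strings."""
    def replace(match):
        return str(params.get(match.group(1), ""))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, template)


# Hypothetical template source and parameters, for illustration only.
template = '{"query": {"match": {"{{field}}": "{{value}}"}}, "size": {{size}}}'
params = {"field": "title", "value": "elasticsearch", "size": 10}

rendered = render_template(template, params)
query = json.loads(rendered)  # raises if the rendered text is not valid JSON
print(json.dumps(query, indent=2))
```

Parsing the rendered text with json.loads doubles as a cheap check that the parameters produced a well-formed query before it is logged.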

Thanks!
Paul

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/89dabe98-60b6-47b1-a95a-3a4782445766%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


API to get bounded port

2014-10-27 Thread ppearcy
When starting a cluster in a test, I want to get the bound port, since I 
am letting it choose the port and there might be a conflict. 

Here is the ugly and brittle code I came up with to do this:
https://gist.github.com/ppearcy/c5d969326b9e6ace8046

Is there a nicer API than having to regex out the connection string? 
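
For reference, the regex approach can at least be isolated in one small helper. This is a minimal sketch in Python, not the code from the gist. It assumes the address is rendered as text ending in host:port with an optional closing bracket, for example inet[/127.0.0.1:9201]; the sample strings are hypothetical.

```python
import re

# Trailing ":<port>", optionally followed by "]" (assumed address format).
_PORT_RE = re.compile(r":(\d{1,5})\]?\s*$")


def extract_port(address: str) -> int:
    """Return the port at the end of a textual transport address."""
    match = _PORT_RE.search(address)
    if match is None:
        raise ValueError(f"no port found in {address!r}")
    port = int(match.group(1))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range in {address!r}")
    return port


print(extract_port("inet[/127.0.0.1:9201]"))
```

Anchoring the pattern at the end of the string keeps it from picking up digits inside the host part of the address.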

Thanks,
Paul



Aggregations across values returned by term then date histogram

2014-09-11 Thread ppearcy
I haven't been able to figure out how to do this and it may not be 
possible, but figured I'd ask. 

I have a query with multiple aggregations that looks like this:
https://gist.github.com/ppearcy/0c6a86ebf32a0bbcb1fc

This returns a time series of data per user: 
https://gist.github.com/ppearcy/7ceac858da2e647ff341

I want to do a stats aggregation across all the values for each week to 
provide a per-week statistical view of things. 

Currently, I am doing these computations client side and it works pretty 
well, but I have performance concerns around merging lots of time series 
streams. 
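
The client-side merge can be sketched roughly as follows. This is a minimal Python illustration, not the code actually in use. The response shape mirrors a terms aggregation with a nested date_histogram, but the aggregation names ("users", "weekly") and the sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def weekly_stats(aggs: dict) -> dict:
    """Collect each week's value from every user, then compute stats per week."""
    per_week = defaultdict(list)
    for user_bucket in aggs["users"]["buckets"]:
        for week_bucket in user_bucket["weekly"]["buckets"]:
            per_week[week_bucket["key"]].append(week_bucket["doc_count"])
    return {
        week: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "sum": sum(values),
        }
        for week, values in sorted(per_week.items())
    }


# Hypothetical response fragment: two users, two weekly buckets each
# (keys are epoch milliseconds, as date_histogram returns them).
sample = {
    "users": {"buckets": [
        {"key": "alice", "weekly": {"buckets": [
            {"key": 1409529600000, "doc_count": 4},
            {"key": 1410134400000, "doc_count": 6},
        ]}},
        {"key": "bob", "weekly": {"buckets": [
            {"key": 1409529600000, "doc_count": 2},
            {"key": 1410134400000, "doc_count": 0},
        ]}},
    ]}
}

stats = weekly_stats(sample)
for week, values in stats.items():
    print(week, values)
```

The merge is linear in the total number of buckets, so the cost grows with users times weeks, which is where the performance concern comes from.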

Any help or ideas would be much appreciated. 

Thanks!
Paul



Re: Aggregations across values returned by term then date histogram

2014-09-11 Thread ppearcy
Hi,
  I am doing a terms aggregation on user with a date histogram 
sub-aggregation to get a time series per user. I then want to perform a stats 
aggregation across all the values of each date bucket, across users. 

Thanks,
Paul

On Thursday, September 11, 2014 8:32:13 PM UTC-4, vineeth mohan wrote:

 Hello, 

 I didn't get your question completely, but I feel a simple date 
 histogram query should do the trick.

   "aggs": {
     "{{time_interval}}": {
       "date_histogram": {
         "field": "time",
         "interval": "{{time_interval}}",
         "min_doc_count": 0
       }
     }
   }

 Let me know if this doesn't fit your need and, if so, what other data you are 
 looking for.

 Thanks
  Vineeth


 On Thu, Sep 11, 2014 at 11:38 PM, ppearcy ppe...@gmail.com wrote:

 I haven't been able to figure out how to do this and it may not be 
 possible, but figured I'd ask. 

 I have a query with multiple aggregations that looks like this:
 https://gist.github.com/ppearcy/0c6a86ebf32a0bbcb1fc

 This returns a time series of data per user: 
 https://gist.github.com/ppearcy/7ceac858da2e647ff341

 I want to do a stats aggregation across all the values for each week to 
 provide a per-week statistical view of things. 

 Currently, I am doing these computations client side and it works pretty 
 well, but have performance concerns around merging lots of time series 
 streams. 

 Any help or ideas would be much appreciated. 

 Thanks!
 Paul







Re: Snapshot compress not compressing?

2014-09-08 Thread ppearcy
Hehe, good to know. I submitted a PR to clarify the documentation:
https://github.com/elasticsearch/elasticsearch/pull/7654

The "at the moment" leads me to believe this is planned or in the pipeline; 
looking forward to it. 

Best Regards,
Paul



On Monday, September 8, 2014 2:00:30 PM UTC-4, Igor Motov wrote:

 At the moment, compression is applied only to metadata files (index 
 mapping and settings basically). Data files are not compressed. 

 On Monday, September 8, 2014 5:22:09 AM UTC-4, Russell Seymour wrote:

 Good morning,

 I experienced the exact same issue on Friday as well.

 I have an Elasticsearch cluster (1.3.2) running on Windows using Oracle 
 Java 1.7.0_67.  We needed a backup strategy and purposely upgraded to this 
 version to take advantage of the snapshots feature.
 The size of the indexes in the cluster is about 40GB and even with the 
 'compress' option explicitly set to true (as in Paul's post and in the 
 documentation) the snapshot is about 40GB.

 Is there a workaround to get this working or some other fix?

 Thanks, Russell

 On Friday, 5 September 2014 22:27:09 UTC+1, ppearcy wrote:

 I am playing around with snapshot/restore and have a local 1.3.2 cluster 
 running on Mac OS X with 894MB of index data. 

 I have registered a backup repository like so (straight from the docs):

 curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
     "type": "fs",
     "settings": {
         "location": "/tmp/backups/my_backup",
         "compress": true
     }
 }'

 Then run the snapshot (again straight from the docs):

 curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"

 The snapshot runs fine, but the backup directory that is generated is 
 890MB, which tells me that compression isn't kicking in. When I set 
 compress: false, I get the same results. 

 If I tar/gz that directory it gets squashed down to 204MB. I'd expect 
 the compressed snapshot from ES to be somewhere in that ballpark.
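
The tar/gz comparison can be reproduced with a short script. This is a minimal Python sketch under the assumption that the repository is a plain directory on local disk; the helper names are hypothetical and nothing here talks to Elasticsearch.

```python
import os
import tarfile
import tempfile


def directory_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def targz_size(path: str) -> int:
    """Size in bytes of a gzip-compressed tar of path, written to a temp file."""
    with tempfile.TemporaryFile() as sink:
        with tarfile.open(fileobj=sink, mode="w:gz") as archive:
            archive.add(path)
        # The archive is closed, so the file position equals the bytes written.
        return sink.tell()


def report(path: str) -> str:
    """One-line summary comparing raw and tar.gz sizes for a directory."""
    raw, packed = directory_size(path), targz_size(path)
    ratio = packed / raw if raw else 0.0
    return f"raw={raw} bytes, tar.gz={packed} bytes, ratio={ratio:.2f}"
```

Calling report("/tmp/backups/my_backup") once with compress set to true and once with it set to false gives directly comparable numbers for the two snapshots.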

 Am I doing something wrong or is there a bug?
 Thanks and Best Regards,
 Paul





Best practices for dealing with a large number of small activity stream events

2014-07-23 Thread ppearcy
Hi all,
  I'm looking at using elasticsearch for a use case that I'd love some 
feedback on regarding best practices. 

A little background... I've been digging into various approaches to 
allowing interactive drill-down, slicing, and dicing of activity stream 
(actor / verb / target) user data for real-time analytics for end users. 
This is high-dimensional data that has too many potential views to 
effectively precompute rollups. Other systems I have played around with 
that try to tackle a similar problem are Druid, OpenTSDB, Blueflood, and 
InfluxDB. At the end of the day, they all either use an inverted index or 
have, or are planning to have, Elasticsearch integrations, so I figure why 
not stick with ES.

There are three areas I am trying to optimize:
- Minimize the index footprint on disk.
- Minimize the RAM footprint
- Maximize the speed

I believe the key tradeoff I need to make with my dataset is going to be 
doc_values, and whether I try to utilize the heap or the page cache.  

All my fields are straight exact-match, not_analyzed fields and there are 
~15 of them. not_analyzed appears to have all the extras that can cause 
bloat disabled (norms, frequencies, etc). I am not indexing source. Here is 
my index template:
https://gist.github.com/ppearcy/fc5202a1664dbc90cbc2

With some test data, I'm getting pretty solid results. The average message 
is ~360 bytes and I am getting:
- 60 bytes per doc without doc_values 
- 80 bytes per doc with doc_values

On a test index with ~160million docs w/o doc values, I have it at 9.6GB of 
data with the file breakdown like so:
3.8G Jul 23 09:40 _mwf.fdt
3.9G Jul 23 10:32 _mwf_es090_0.tim
1.8G Jul 23 10:32 _mwf_es090_0.doc
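
A quick sanity check of these numbers, as a minimal Python sketch. All inputs are the figures quoted above; the units are treated as decimal, so the totals are approximate (directory listings usually report binary units).

```python
docs = 160_000_000        # ~160 million docs
bytes_per_doc = 60        # measured without doc_values
bytes_per_doc_dv = 80     # measured with doc_values
avg_message_bytes = 360   # average raw message size

# Expected index size without doc_values, and the projected size with them.
index_bytes = docs * bytes_per_doc
with_doc_values_bytes = docs * bytes_per_doc_dv
print(f"without doc_values: {index_bytes / 1e9:.1f} GB")
print(f"with doc_values:    {with_doc_values_bytes / 1e9:.1f} GB")

# Index footprint relative to the raw message size.
print(f"index/raw ratio:    {bytes_per_doc / avg_message_bytes:.2f}")

# Share of each listed file in the listed total.
file_sizes_gb = {
    "_mwf.fdt": 3.8,
    "_mwf_es090_0.tim": 3.9,
    "_mwf_es090_0.doc": 1.8,
}
listed_total = round(sum(file_sizes_gb.values()), 1)
for name, size in file_sizes_gb.items():
    print(f"{name}: {size / listed_total:.0%} of {listed_total} GB")
```

The three listed files account for nearly all of the 9.6GB, with the .fdt and .tim files each taking roughly 40% of the total.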

Anybody know how I can slim things down any further or general advice when 
dealing with large numbers of small documents? 

Thanks!
Paul



Re: Sense on github abandoned?

2014-04-02 Thread ppearcy
Hi, 
  Since Marvel requires a license for production usage, does this mean that 
using the Marvel-bundled Sense against a production instance requires you 
to buy a license? 

I just got out of a meeting where I told a bunch of people to go download 
Sense from the Chrome store. Whoops :) 

Thanks!
Paul

On Tuesday, April 1, 2014 12:14:44 PM UTC-4, kimchy wrote:

 Sense started as a weekend project, and Boaz did not place a license on 
 it. As you mentioned, this license effectively applies: 
 http://choosealicense.com/no-license/. We consulted our lawyers, who 
 specialize in open source, and changing the license to an open source one 
 is complex, expensive, and requires a lot of resources. The reason is that 
 it's not only getting the committers' agreement, but also reaching all 
 possible users and having them agree to it (or at least showing a big 
 investment in trying to do so, plus a rather large time window to allow 
 for people to object).

 When Boaz created Sense, he was not employed by Elasticsearch. Obviously 
 any project started by our employees has a clear license (as you can notice 
 with the many projects we created).

 Regarding Marvel:

 - You are only required to pay for it when used in production.
 - You don’t have to be a support customer of Elasticsearch the company; 
 you can buy a license for Marvel easily on the web. We made it super cheap 
 since we think it's something that a lot of people will benefit from.

 On Apr 1, 2014, at 17:00, Ivan Brusic iv...@brusic.com wrote:

 I personally do not require an open source license for Marvel/Sense, but I 
 would like to see an explicit clarification about the use of Marvel in this 
 scenario. Marvel does require a license to use and that would apply to any 
 of its subsystems. Then again, Sense does not have a license, which means 
 its use is also somewhat restricted.

 Sense is an excellent tool and users' dependency on the tool is quite 
 apparent from this thread. :)

 I haven't packaged a Chrome plugin in about 3 years. Not only has my 
 memory faded, but I would assume the mechanism has changed in our fast 
 changing world of development. It would be a fun exercise to attempt to do 
 it again.

 Cheers,
 Ivan


 On Tue, Apr 1, 2014 at 5:48 AM, Tim S tims...@gmail.com wrote:

 @kimchy the whole reason for me asking these questions is that sometimes 
 a customer is using elasticsearch but they don't (yet) have a support 
 contract, but don't consider themselves in development either, and thus 
 wouldn't allow me to use Marvel. Yes, there are other tools for poking 
 around, but sense is invaluable for constructing complicated queries etc 
 quickly. In this situation they wouldn't let me install a chrome plugin 
 either, but sense works nicely as an elasticsearch plugin too.

 So, if sense (the abandoned version on github) had some kind of 
 permissive licence, I could turn up on customer site and use sense to poke 
 around.
 Ideally, it would have a licence like AL2 which would allow me to modify 
 it if necessary.

 I realise that you don't want updates pushed back to the version of sense 
 on github because those changes are helping you to make money from Marvel, 
 I understand that. But if the abandoned version of sense did have an 
 appropriate licence, it would allow us to use the current version - it's 
 still useful even if it's not kept up to date. I might even be tempted to 
 try and keep it up to date in my spare time. But clearly I can't do this 
 unless it has a licence that allows me to do it.

 Glad to see I'm not the only person thinking along these lines.



 On Tuesday, April 1, 2014 11:15:07 AM UTC+1, Jörg Prante wrote:

 +1 for Sense standalone packaging
 +1 for Sense in Chrome Web Store

 Sense is used here all the time, it's essential.

 I have also forked the code in case Sense goes away, hoping for a FOSS 
 license.

 Not that I'm fluent in writing browser plugins, but if I find time, I am 
 not afraid of the learning curve.

 Jörg





