from:"ppearcy"

Logging queries run via search templates

2015-01-20 Thread ppearcy

Hi, 
  I am running some template search queries on 1.4.2 via Java APIs like so:

esClient.prepareSearch(index)
  .setTemplateName(templateName)
  .setTemplateParams(templateParams)
  .setTemplateType(ScriptService.ScriptType.INDEXED)

When I log the request, I get an empty JSON object. Is it possible to dump 
the actual evaluated query from the node client? I dug through the code 
and it looks to me like the template is evaluated in SearchService.java, but 
I didn't see a clean way to extract it.

I am pretty sure I can log this server side, but was hoping to get 
this info from the caller. 

Thanks!
Paul
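
A possible caller-side workaround, sketched in Python rather than against the
Java client: if the caller also holds a copy of the template source, it can
render the parameters into that copy purely for logging. The
render_for_logging helper and the sample template are hypothetical, and the
regex substitution only handles plain {{name}} placeholders. It is a stand-in
for real Mustache rendering, not what SearchService does internally.

```python
import json
import re

# Matches simple Mustache-style placeholders such as {{user}}; sections,
# conditionals and escaping rules are deliberately not handled.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_for_logging(template_source: str, params: dict) -> str:
    """Substitute params into a template string so the result can be logged."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders visible rather than guessing a value.
        return str(params.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template_source)


# Hypothetical template body (an assumption for illustration); an indexed
# template would first have to be fetched from the cluster or kept alongside
# the calling code.
template = '{"query": {"term": {"user": "{{user}}"}}, "size": {{size}}}'
rendered = render_for_logging(template, {"user": "paul", "size": 10})
print(json.dumps(json.loads(rendered), sort_keys=True))
```

The rendered string is only an approximation of what the server evaluates, so
it is best treated as a debugging aid.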

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/89dabe98-60b6-47b1-a95a-3a4782445766%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

API to get bounded port

2014-10-27 Thread ppearcy

When starting a cluster in tests, I want to get the bound port, since I 
am letting the node choose it and there might be a conflict. 

Here is the ugly and brittle code I came up with to do this:
https://gist.github.com/ppearcy/c5d969326b9e6ace8046

Is there a nicer API than having to regex out the connection string? 

Thanks,
Paul
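
For readers without access to the gist, the regex approach described above
looks roughly like the following Python sketch. It assumes an address string
shaped like inet[/127.0.0.1:9201]; the exact string handled in the gist is not
shown here, and extract_port is a hypothetical helper name. It illustrates the
workaround only and does not answer the question of a nicer API.

```python
import re

# Assumed address shape, e.g. "inet[/127.0.0.1:9201]": the port is taken as
# the digits after the last colon, just before an optional closing bracket.
ADDRESS_PORT = re.compile(r":(\d+)\]?\s*$")


def extract_port(address: str) -> int:
    """Return the trailing port number from an address string."""
    match = ADDRESS_PORT.search(address)
    if match is None:
        raise ValueError(f"no port found in {address!r}")
    return int(match.group(1))


print(extract_port("inet[/127.0.0.1:9201]"))
```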

Aggregations across values returned by term then date histogram

2014-09-11 Thread ppearcy

I haven't been able to figure out how to do this and it may not be 
possible, but figured I'd ask. 

I have a query with multiple aggregations that looks like this:
https://gist.github.com/ppearcy/0c6a86ebf32a0bbcb1fc

This returns a time series of data per user: 
https://gist.github.com/ppearcy/7ceac858da2e647ff341

I want to do a stats aggregation across all the values for each week to 
provide a per-week statistical view of things. 

Currently, I am doing these computations client side and it works pretty 
well, but have performance concerns around merging lots of time series 
streams. 

Any help or ideas would be much appreciated. 

Thanks!
Paul

Re: Aggregations across values returned by term then date histogram

2014-09-11 Thread ppearcy

Hi,
I am doing a terms aggregation on user with a sub date histogram
aggregation to get a time series per user. I then want to perform a stats
aggregation over all the values of each date bucket across users.

Thanks,
Paul
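
The client-side computation described above can be sketched as follows. This
is a minimal Python sketch under stated assumptions: the aggregation names
"users" and "weekly" are hypothetical, the value taken from each date bucket
is its doc_count, and the sample response is made up. It groups values by date
bucket key across all user buckets and reports the same fields a stats
aggregation returns (count, min, max, avg, sum).

```python
from collections import defaultdict


def weekly_stats(response: dict, terms_name: str = "users",
                 histogram_name: str = "weekly") -> dict:
    """Compute count/min/max/avg/sum of doc_count per date bucket across users."""
    values_by_week = defaultdict(list)
    for user_bucket in response["aggregations"][terms_name]["buckets"]:
        for week_bucket in user_bucket[histogram_name]["buckets"]:
            values_by_week[week_bucket["key"]].append(week_bucket["doc_count"])

    stats = {}
    for week, values in sorted(values_by_week.items()):
        total = sum(values)
        stats[week] = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": total / len(values),
            "sum": total,
        }
    return stats


# Made-up response fragment with two users and two weekly buckets.
sample = {"aggregations": {"users": {"buckets": [
    {"key": "alice", "weekly": {"buckets": [
        {"key": 1409529600000, "doc_count": 4},
        {"key": 1410134400000, "doc_count": 6}]}},
    {"key": "bob", "weekly": {"buckets": [
        {"key": 1409529600000, "doc_count": 2},
        {"key": 1410134400000, "doc_count": 0}]}},
]}}}

for week, week_stats in weekly_stats(sample).items():
    print(week, week_stats)
```

Memory grows with the number of users times the number of buckets, which is
the merge cost the post is concerned about.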

On Thursday, September 11, 2014 8:32:13 PM UTC-4, vineeth mohan wrote:

Hello ,

I didn't completely understand your question, but I think a simple date
histogram query should do the trick.

"aggs": {
  "{{time_interval}}": {
    "date_histogram": {
      "field": "time",
      "interval": "{{time_interval}}",
      "min_doc_count": 0
    }
  }
}

Let me know if this doesn't fit your need and, if so, what other data you are
looking for.

Thanks
Vineeth

On Thu, Sep 11, 2014 at 11:38 PM, ppearcy ppe...@gmail.com wrote:

I haven't been able to figure out how to do this and it may not be
possible, but figured I'd ask.

I have a query with multiple aggregations that looks like this:
https://gist.github.com/ppearcy/0c6a86ebf32a0bbcb1fc

This returns a time series of data per user:
https://gist.github.com/ppearcy/7ceac858da2e647ff341

I want to do a stats aggregation across all the values for each week to
provide a per-week statistical view of things.

Currently, I am doing these computations client side and it works pretty
well, but have performance concerns around merging lots of time series
streams.

Any help or ideas would be much appreciated.

Thanks!
Paul

Re: Snapshot compress not compressing?

2014-09-08 Thread ppearcy

Hehe, good to know. I submitted a PR to clarify the documentation:
https://github.com/elasticsearch/elasticsearch/pull/7654

The "at the moment" leads me to believe this is planned or in the pipeline;
looking forward to it.

Best Regards,
Paul

On Monday, September 8, 2014 2:00:30 PM UTC-4, Igor Motov wrote:

At the moment, compression is applied only to metadata files (index
mapping and settings basically). Data files are not compressed.

On Monday, September 8, 2014 5:22:09 AM UTC-4, Russell Seymour wrote:

Good morning,

I experienced the exact same issue on Friday as well.

I have an Elasticsearch cluster (1.3.2) running on Windows using Oracle
Java 1.7.0_67. We needed a backup strategy and purposely upgraded to this
version to take advantage of the snapshots feature.
The size of the indexes in the cluster is about 40GB and even with the
'compress' option explicitly set to true (as in Paul's post and in the
documentation) the snapshot is about 40GB.

Is there a workaround to get this working or some other fix?

Thanks, Russell

On Friday, 5 September 2014 22:27:09 UTC+1, ppearcy wrote:

I am playing around with snapshot/restore and have a local 1.3.2 cluster
running on Mac OS X with 894MB of index data.

I have registered a backup repository like so (straight from the docs):

curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
    "type": "fs",
    "settings": {
        "location": "/tmp/backups/my_backup",
        "compress": true
    }
}'

Then run the snapshot (again straight from the docs):

curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"

The snapshot runs fine, but the backup directory that is generated is
890MB, which tells me that compression isn't kicking in. When I set
compress: false, I get the same results.

If I tar/gz that directory it gets squashed down to 204MB. I'd expect
the compressed snapshot from ES to be somewhere in that ballpark.

Am I doing something wrong or is there a bug?
Thanks and Best Regards,
Paul
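
The size comparison described in this post (raw snapshot directory versus a
tar/gz of the same directory) can be reproduced with a short Python sketch.
The helper names directory_size and tar_gz_size are hypothetical; only the
repository path comes from the curl call above.

```python
import os
import tarfile
import tempfile


def directory_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def tar_gz_size(path: str) -> int:
    """Size in bytes of a temporary gzip-compressed tarball of path."""
    with tempfile.TemporaryDirectory() as scratch:
        archive = os.path.join(scratch, "snapshot.tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(path, arcname=os.path.basename(os.path.normpath(path)))
        return os.path.getsize(archive)


if __name__ == "__main__":
    repo = "/tmp/backups/my_backup"  # repository location from the post
    if os.path.isdir(repo):
        raw, packed = directory_size(repo), tar_gz_size(repo)
        if raw:
            print(f"raw={raw} bytes, tar.gz={packed} bytes, "
                  f"ratio={packed / raw:.2f}")
```

A ratio well below 1.0 on a snapshot taken with compress: true would be
consistent with the data files not being compressed by the snapshot itself.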

Best practices for dealing with a large number of small activity stream events

2014-07-23 Thread ppearcy

Hi all,
I'm looking at using elasticsearch for a use case that I'd love some
feedback on regarding best practices.

A little background... I've been digging into various approaches to
allowing interactive drill-down slicing and dicing of activity stream
(actor / verb / target) user data for realtime analytics for end users.
This is high dimensional data that has too many potential ways to view to
effectively precompute rollups. Other systems out there that try to tackle
this similar problem that I have played around with are Druid, OpenTSDB,
Blueflood, InfluxDB. At the end of the day they either all use an inverted
index or have or are planning to have elasticsearch integrations, so I
figure why not stick with ES.

There are three areas I am trying to optimize:
- Minimize the index footprint on disk.
- Minimize the RAM footprint
- Maximize the speed

I believe the key tradeoff I need to make with my dataset is going to be
doc_values and whether I try to utilize heap or page cache.

All my fields are straight exact-match not_analyzed fields and there are
~15 of them. not_analyzed appears to have all the extras that can cause
bloat disabled (norms, frequencies, etc). I am not indexing source. Here is
my index template:
https://gist.github.com/ppearcy/fc5202a1664dbc90cbc2

With some test data, I'm getting pretty solid results. Average messages are
~360 bytes and I am getting:
- 60 bytes per doc without doc_values
- 80 bytes per doc with doc_values

On a test index with ~160 million docs without doc_values, I have it at 9.6GB
of data with the file breakdown like so:
3.8G Jul 23 09:40 _mwf.fdt
3.9G Jul 23 10:32 _mwf_es090_0.tim
1.8G Jul 23 10:32 _mwf_es090_0.doc
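
The per-document figure quoted above follows from simple division, which the
Python sketch below reproduces from the numbers in this post. It assumes
decimal gigabytes and treats "~160 million" as exactly 160,000,000; the
bytes_per_doc helper is hypothetical. The file-type notes in the comments are
the standard Lucene meanings of those extensions.

```python
GB = 10 ** 9  # assumption: sizes in the post are decimal gigabytes


def bytes_per_doc(size_bytes: float, doc_count: int) -> float:
    """Average on-disk bytes per document."""
    return size_bytes / doc_count


docs = 160_000_000      # "~160 million docs" from the post
index_size = 9.6 * GB   # total index size from the post

print("whole index:", round(bytes_per_doc(index_size, docs)), "bytes/doc")

# Sizes in GB as listed in the directory breakdown.
files = {
    "_mwf.fdt": 3.8,          # stored field data
    "_mwf_es090_0.tim": 3.9,  # term dictionary
    "_mwf_es090_0.doc": 1.8,  # postings
}
for name, size_gb in files.items():
    print(name, round(bytes_per_doc(size_gb * GB, docs), 1), "bytes/doc")
```

Breaking the total down per file shows which structure dominates before
deciding where to trim.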

Anybody know how I can slim things down any further or general advice when
dealing with large numbers of small documents?

Thanks!
Paul

Re: Sense on github abandoned?

2014-04-02 Thread ppearcy

Hi,
Since Marvel requires a license for production usage, does this mean that
using the Marvel-bundled Sense against a production instance requires you
to buy a license?

I just got out of a meeting where I told a bunch of people to go download
Sense off the Chrome store. Whoops :)

Thanks!
Paul

On Tuesday, April 1, 2014 12:14:44 PM UTC-4, kimchy wrote:

Sense started as a weekend project, and Boaz did not place a license on
it. As you mentioned, this license effectively applies:
http://choosealicense.com/no-license/. We consulted our lawyers, who
specialize in open source, and changing the license to an open source one is
complex, expensive, and requires a lot of resources. The reason is that it's
not only getting the committers' agreement, but also reaching all possible
users and having them agree to it (or at least showing big investment in
trying to do so, plus a rather large time window to allow for people to
object).

When Boaz created Sense, he was not employed by Elasticsearch. Obviously
any project started by our employees has a clear license (as you can notice
with the many projects we created).

Regarding Marvel:

- You are only required to pay for it when used in production.
- You don’t have to be a support customer of Elasticsearch the company,
you can buy a license for Marvel easily on the web. We made it super cheap
since we think it's something that a lot of people will find benefit from.

On Apr 1, 2014, at 17:00, Ivan Brusic iv...@brusic.com wrote:

I personally do not require an open source license for Marvel/Sense, but I
would like to see an explicit clarification about the use of Marvel in this
scenario. Marvel does require a license to use and that would apply to any
of its subsystems. Then again, Sense does not have a license, which means
its use is also somewhat restricted.

Sense is an excellent tool and users' dependency on the tool is quite
apparent from this thread. :)

I haven't packaged a Chrome plugin in about 3 years. Not only has my
memory faded, but I would assume the mechanism has changed in our fast
changing world of development. It would be a fun exercise to attempt to do
it again.

Cheers,
Ivan

On Tue, Apr 1, 2014 at 5:48 AM, Tim S tims...@gmail.com wrote:

@kimchy the whole reason for me asking these questions is that sometimes
a customer is using elasticsearch but they don't (yet) have a support
contract, but don't consider themselves in development either, and thus
wouldn't allow me to use Marvel. Yes, there are other tools for poking
around, but sense is invaluable for constructing complicated queries etc
quickly. In this situation they wouldn't let me install a chrome plugin
either, but sense works nicely as an elasticsearch plugin too.

So, if sense (the abandoned version on github) had some kind of
permissive licence, I could turn up on customer site and use sense to poke
around.
Ideally, it would have a licence like AL2 which would allow me to modify
it if necessary.

I realise that you don't want updates pushed back to the version of sense
on github because those changes are helping you to make money from Marvel,
I understand that. But if the abandoned version of sense did have an
appropriate licence, it would allow us to use the current version - it's
still useful even if it's not kept up to date. I might even be tempted to
try and keep it up to date in my spare time. But clearly I can't do this
unless it has a licence that allows me to do it.

Glad to see I'm not the only person thinking along these lines.

On Tuesday, April 1, 2014 11:15:07 AM UTC+1, Jörg Prante wrote:

+1 for Sense standalone packaging
+1 for Sense in Chrome Web Store

Sense is used here all the time, it's essential.

I have also forked the code in case Sense goes away, hoping for a FOSS
license.

Not that I'm fluent in writing browser plugins, but if I find time, I am
not afraid of the learning curve.

Jörg
