Re: Doc Values vs Field Data Questions

2015-05-25 Thread Matt Traynham
@rmuir Interesting, it sounds like my gains may be better than previously expected, given the server is constantly evicting from heap. If I'm able, I'll post some performance metrics back here when I'm done. -- Please update your bookmarks! We have moved to https://discuss.elastic.co/ --- You

Re: Doc Values vs Field Data Questions

2015-05-22 Thread Robert Muir
rformance metrics on using Doc Values vs > FDC? I've seen the "10-25%" slower value thrown around, but I wanted to > know what that was tested with (CPU, mem, spinning vs. SSD, etc...) and > where gains may be had. > > in my debugging the current differences are usua

Re: Doc Values vs Field Data Questions

2015-05-22 Thread Matt Traynham
Thanks for the clarification Adrien. If that's the case, is there such a flag that can enable them by default for all fields (excluding non-analyzed strings; using ~1.4.3 here)? Also, do you guys have more performance metrics on using Doc Values vs FDC? I've seen the "10-25

Re: Doc Values vs Field Data Questions

2015-05-21 Thread Adrien Grand
All fields that you search on and aggregate on should be moved to doc values in my opinion. By the way, Elasticsearch 2.0 will turn doc values on by default except on analyzed string fields. We still need some fielddata memory for something called the "global ordinal map". When you h

Re: Doc Values vs Field Data Questions

2015-05-21 Thread Matt Traynham
Just to correct myself, I misstated; a 1/3 increase in index size, not 3x. -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop r

Doc Values vs Field Data Questions

2015-05-21 Thread Matt Traynham
I'm doing some overall testing on my cluster, debating if I should switch to Doc Values. I have about 15 fields for each document, with 83 million documents spread across 60 indices. All the fields are dynamically mapped, and all of them can migrate to Doc Values. So, I have one copy o

Re: Could use some help with using Doc Values

2015-04-17 Thread Mark Walkom
It's a little difficult to see what is currently in field data to check (you need a heap dump). You could probably keep an eye on existing field data and see if it increases more slowly than before, but that's a little abstract. Really, as long as it doesn't complain about the mapping you're good. On
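A complementary check is to read the mapping back from a freshly created index and confirm the flag is there. The following is a minimal sketch, not an official procedure: it assumes the Elasticsearch 1.x response shape of `GET /<index>/_mapping` and that the server echoes `doc_values` back for fields where it was set. The index name, type name, and the helper `field_uses_doc_values` are hypothetical.

```python
def field_uses_doc_values(mapping_response, index, doc_type, field):
    """Return True if `field` is mapped with doc_values enabled.

    `mapping_response` is assumed to be the parsed JSON body of
    GET /<index>/_mapping in the Elasticsearch 1.x layout:
    {index: {"mappings": {type: {"properties": {...}}}}}
    """
    properties = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get(doc_type, {})
        .get("properties", {})
    )
    return properties.get(field, {}).get("doc_values") is True


# Hypothetical response for a newly created daily index.
sample = {
    "logstash-2015.04.18": {
        "mappings": {
            "logs": {
                "properties": {
                    "@timestamp": {"type": "date", "doc_values": True},
                    "message": {"type": "string"},
                }
            }
        }
    }
}

print(field_uses_doc_values(sample, "logstash-2015.04.18", "logs", "@timestamp"))
```

This only confirms that the mapping was accepted; it says nothing about what is currently loaded into field data.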

Re: Could use some help with using Doc Values

2015-04-17 Thread Scott Chapman
Thanks Mark. Exactly what I was looking for. Once I make the change is there any way I can tell it is being used properly for a specific field? On Friday, April 17, 2015 at 10:23:15 PM UTC-4, Mark Walkom wrote: > > You can add it in and it'll map it correctly. > > "@timestamp" : { > "

Re: Could use some help with using Doc Values

2015-04-17 Thread Mark Walkom
You can add it in and it'll map it correctly. "@timestamp" : { "index" : "not_analyzed", "type" : "date", *"doc_values": true* } On 17 April 2015 at 10:14, Scott Chapman wrote: > Thanks. The field I wanted to map was @timestamp which isn't explicit
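For readers who keep their index template in a script, the change above could be applied roughly as follows. This is a sketch under assumptions: the template layout (`template` pattern plus a `_default_` mapping) follows the Elasticsearch 1.x template format, and the helper name `with_doc_values` and the starting template are made up for illustration.

```python
import json


def with_doc_values(template, doc_type, field, field_mapping):
    """Return a copy of `template` in which `field` is mapped with doc_values on."""
    updated = json.loads(json.dumps(template))  # deep copy of plain JSON data
    properties = (
        updated.setdefault("mappings", {})
        .setdefault(doc_type, {})
        .setdefault("properties", {})
    )
    mapping = dict(field_mapping)
    mapping["doc_values"] = True
    properties[field] = mapping
    return updated


# Hypothetical starting template with no explicit @timestamp mapping.
template = {"template": "logstash-*", "mappings": {"_default_": {"properties": {}}}}

updated = with_doc_values(
    template,
    "_default_",
    "@timestamp",
    {"type": "date", "index": "not_analyzed"},
)
print(json.dumps(updated, indent=2))
```

As noted elsewhere in the thread, the updated template only affects indices created after it is installed; existing indices need reindexing.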

Re: Could use some help with using Doc Values

2015-04-16 Thread Scott Chapman
Thanks. The field I wanted to map was @timestamp which isn't explicitly in the template. What would it look like? Also, once I have made the change to my template, what's the right way to test it (validate that for a new index it is using Doc Values for the specific field)? On Thursday, April 16

Re: Could use some help with using Doc Values

2015-04-16 Thread Mark Walkom
As per the docs just add this; "@version" : { "index" : "not_analyzed", "type" : "string", *"doc_values": true* } On 17 April 2015 at 09:35, Scott Chapman wrote: > Thanks, that's what I thought. > > So, please help me with my template I gave above.

Re: Could use some help with using Doc Values

2015-04-16 Thread Scott Chapman
Thanks, that's what I thought. So, please help me with my template I gave above. I am familiar with how to update it, I am just not real sure on how to change it so that a specific field uses Doc Values. Or if it is easier to make it the default for all fields I suppose that's fine too since it so

Re: Could use some help with using Doc Values

2015-04-15 Thread Mark Walkom
Yes that is correct, you have to update your mappings and wait for new indices to be created from it, it's not something that can be applied retroactively without reindexing. On 16 April 2015 at 09:55, Scott Chapman wrote: > Yea, that's where I started with it. But, if I understand it, that look

Re: Could use some help with using Doc Values

2015-04-15 Thread Scott Chapman
Yea, that's where I started with it. But, if I understand it, that looks like how I can change the mapping for a specific property. But I would think I need to make a similar change to my index template otherwise new indexes that get created will no longer have that mapping. Or am I misunderstand

Re: Could use some help with using Doc Values

2015-04-15 Thread Mark Walkom
Start here and you'll be good to go - http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html On 16 April 2015 at 08:03, Scott Chapman wrote: > Probably. I just need some help figuring out how to do that. Help? > > On Wednesday, April 15, 2015 at 5:42:55 PM UTC-4, Mark Walkom

Re: Could use some help with using Doc Values

2015-04-15 Thread Scott Chapman
Probably. I just need some help figuring out how to do that. Help? On Wednesday, April 15, 2015 at 5:42:55 PM UTC-4, Mark Walkom wrote: > > You should, ideally, be using it for anything that isn't analysed.

Re: Could use some help with using Doc Values

2015-04-15 Thread Mark Walkom
You should, ideally, be using it for anything that isn't analysed. On 16 April 2015 at 05:45, Scott Chapman wrote: > So, I am a bit (ok more than a bit) of a newbie. We are using ELK for > collecting and storing logs, and recently we started seeing these errors: > [FIELDDATA] New used memory 2570680

Could use some help with using Doc Values

2015-04-15 Thread Scott Chapman
So, I am a bit (ok more than a bit) of a newbie. We are using ELK for collecting and storing logs, and recently we started seeing these errors: [FIELDDATA] New used memory 2570680025 [2.3gb] from field [@timestamp] would be larger than configured breaker I am using (more or less) the standard logsta

Re: Support for case insensitive sorts with doc values

2015-02-10 Thread Hugues Malphettes
Hi Angie, You are trying something different from what we discussed so far. In all cases, you need to make ES store your analyzed strings as doc values. One can use a client-side transform, an ES-source-transform or a custom ES mapping to do that. At the moment the strings are stored as not
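The client-side transform option can be sketched as below: before indexing, copy the field into a lowercased companion field that is mapped as a not-analyzed string with doc values, and sort on that companion. This is an illustration only; the `_sort` suffix, the helper `add_sort_field`, and the sample documents are assumptions, and `str.lower()` is a simplification that ignores locale-specific collation.

```python
def add_sort_field(doc, field, suffix="_sort"):
    """Return a copy of `doc` with a lowercased copy of `field` for sorting."""
    out = dict(doc)
    value = doc.get(field)
    if isinstance(value, str):
        out[field + suffix] = value.lower()
    return out


# Hypothetical mapping for the companion field: not analyzed, so doc values apply.
sort_field_mapping = {
    "LastName_sort": {
        "type": "string",
        "index": "not_analyzed",
        "doc_values": True,
    }
}

docs = [{"LastName": "smith"}, {"LastName": "Adams"}, {"LastName": "Zhou"}]
prepared = [add_sort_field(d, "LastName") for d in docs]

# Sorting on the companion field is case insensitive by construction.
ordered = [d["LastName"] for d in sorted(prepared, key=lambda d: d["LastName_sort"])]
print(ordered)
```

The original field is left untouched, so it can still be analyzed for full-text search while the companion field carries the sort.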

Re: Support for case insensitive sorts with doc values

2015-02-10 Thread Geetanjali Paygude
criptFilter", > > "lang": "native", > > "params": { > > "field": "LastName" > > > > } > > }, > > "type": "string

Re: Support for case insensitive sorts with doc values

2015-02-10 Thread Geetanjali Paygude
se let me know if any correction is required in this script. Regards, Angie On Friday, 6 February 2015 11:40:53 UTC+5:30, Hugues Malphettes wrote: > > Hi Angie, > > On Friday, 6 February 2015 12:17:47 UTC+8, Geetanjali Paygude wrote: >> >> Hi Hugues, >> >> So

Re: Doc Values

2015-02-08 Thread David Pilato
Maybe some ideas here? https://github.com/elasticsearch/logstash/blob/v1.4.2/lib/logstash/outputs/elasticsearch/elasticsearch-template.json David > On 8 Feb 2015, at 06:25, Kadaan wrote: > > Great! That does look pretty close. Guessing I could use an index template > with order=int.max,

Re: Doc Values

2015-02-07 Thread Kadaan
Great! That does look pretty close. Guessing I could use an index template with order=int.max, set the template to * and configure the _default_ mapping. Only thing I'm not sure about is how to restrict field data to either off or doc_values for fields whose names I do not know. Dynamic templ

Re: Doc Values

2015-02-07 Thread David Pilato
Have a look at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html#indices-templates This will help IMO. David > On 8 Feb 2015, at 02:44, Joel Baranick wrote: > > Sure. I get that, but I'm talking about a multi-tenant environment where I > do not

Re: Doc Values

2015-02-07 Thread Joel Baranick
Sure. I get that, but I'm talking about a multi-tenant environment where I do not control the index templates or mappings which are installed. In this scenario it would be nice to be able to configure the cluster to only allow field data to be off or doc_values. On Saturday, February 7, 2015

Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
You don't need a plugin to do this when an index is created - use index templates + dynamic templates for this, e.g. http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html#dynamic-templates -- Itamar Syn-Hershko http://code972.com | @synhershko
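A rough idea of what "index templates + dynamic templates" could look like, written as a Python dict that would be sent as the body of an index template. This is a sketch in the Elasticsearch 1.x template format, not a recommended configuration: the rule name `strings_as_doc_values` is made up, and the catch-all `*` pattern is an assumption taken from the discussion in this thread.

```python
import json

# Index template body: applies to every new index and maps any newly seen
# string field as not_analyzed with doc values enabled.
index_template = {
    "template": "*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {
                    "strings_as_doc_values": {  # hypothetical rule name
                        "match_mapping_type": "string",
                        "mapping": {
                            "type": "string",
                            "index": "not_analyzed",
                            "doc_values": True,
                        },
                    }
                }
            ]
        }
    },
}

print(json.dumps(index_template, indent=2))
```

Note the trade-off: mapping every dynamic string as `not_analyzed` gives up full-text search on those fields, so a real template would likely carve out exceptions.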

Re: Doc Values

2015-02-07 Thread Joel Baranick
Thanks. I will look into whether I can create a plugin which will automatically enable doc_values whenever an index is created or updated. This seems like it could be very useful for multitenant clusters.

Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
If the indexes have been already created you will have to be creative to find those fields that need updating - not familiar with a plugin that can do that. A simple client side tool that will grab all mappings from the /_mapping endpoint, change it and send it back should do. For indexes that were
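The "find those fields" part of such a client-side tool might look like the sketch below. It assumes the parsed JSON of `GET /_mapping` in the Elasticsearch 1.x layout and simply reports candidates; it does not send anything back. The function name and the sample data are hypothetical, and multi-fields (`fields`) are ignored for brevity.

```python
def fields_missing_doc_values(all_mappings):
    """Yield (index, type, field path) for fields that could use doc values but do not.

    Analyzed strings are skipped, since doc values are not supported for them
    in the versions discussed in this thread.
    """
    def walk(properties, prefix):
        for name, spec in sorted(properties.items()):
            path = prefix + name
            if "properties" in spec:  # object field: recurse into sub-fields
                yield from walk(spec["properties"], path + ".")
                continue
            analyzed_string = (
                spec.get("type") == "string"
                and spec.get("index") != "not_analyzed"
            )
            if not analyzed_string and not spec.get("doc_values"):
                yield path

    for index, body in sorted(all_mappings.items()):
        for doc_type, type_mapping in sorted(body.get("mappings", {}).items()):
            for path in walk(type_mapping.get("properties", {}), ""):
                yield index, doc_type, path


# Hypothetical /_mapping response for a single index.
sample = {
    "logs-1": {
        "mappings": {
            "event": {
                "properties": {
                    "@timestamp": {"type": "date"},
                    "message": {"type": "string"},
                    "host": {"type": "string", "index": "not_analyzed", "doc_values": True},
                    "geo": {"properties": {"lat": {"type": "double"}}},
                }
            }
        }
    }
}

for hit in fields_missing_doc_values(sample):
    print(hit)
```

A real tool would likely also skip field types that do not support doc values in the version at hand.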

Re: Doc Values

2015-02-07 Thread Joel Baranick
Got it. What I was hoping for would be a way to force doc_values to be the only way for fielddata to be stored for all mappings in the entire cluster without having to update each index. Could this be done with a plugin?

Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
You can update mappings cluster-wide (just post the mapping definition to server:9200/*), but you will need to specify the field names explicitly -- Itamar Syn-Hershko http://code972.com | @synhershko Freelance Developer & Consultant Lucene.NET committer and PMC m

Doc Values Cluster Wide

2015-02-07 Thread Joel Baranick
Is there a way to turn doc_values on cluster wide and override any index specific settings?

Doc Values

2015-02-07 Thread Joel Baranick
Is there a way to turn doc_values on cluster wide and override any index specific settings?

Re: Support for case insensitive sorts with doc values

2015-02-05 Thread Hugues Malphettes
Hi Angie, On Friday, 6 February 2015 12:17:47 UTC+8, Geetanjali Paygude wrote: > > Hi Hugues, > > So you have extended "String" type to add custom analyzer. > > I am referring to this thread > > http://elasticsearch-users.115913.n3.nabble.com/Support-for-case

Re: Support for case insensitive sorts with doc values

2015-02-05 Thread Geetanjali Paygude
Hi Hugues, So you have extended "String" type to add custom analyzer. I am referring to this thread http://elasticsearch-users.115913.n3.nabble.com/Support-for-case-insensitive-sorts-with-doc-values-tt4064487.html Is there any way to use script/transform the source and then apply s

Re: Fielddata cache and doc values

2014-12-23 Thread vineeth mohan
Hi, Yes, definitely doc values would be a better idea. As it is not 100% memory resident, it will give a lot better stability and memory optimization to the system. On the flip side, performance might go down by, say, 10 to 15%. Thanks Vineeth On Tue, Dec 23, 2014 at 8:47 PM, Han

Fielddata cache and doc values

2014-12-23 Thread Han JU
Hi, We are reviewing our ElasticSearch setup & settings, here's a question on fielddata cache and usage of doc values. Currently we set fielddata cache to be 30% of the heap size and we've enabled doc_values for all fields that we want to sort or aggregate (except 2 boolean

Re: doc values

2014-12-11 Thread Adrien Grand
stamp as doc value , > but when I checked the mapping I can see it didn't work > It didn't happen in other fields. > Is there any known issue with timestamp field and doc values? > > Lib

doc values

2014-12-10 Thread libm972
Hi, I have memory problems with aggregation queries. My Elasticsearch version is 1.3.2. I tried to define _timestamp as doc value, but when I checked the mapping I can see it didn't work. It didn't happen in other fields. Is there any known issue with timestamp field and doc values? Li

Re: Support for case insensitive sorts with doc values

2014-11-13 Thread Hugues Malphettes
p://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html >> >> On Tue, Oct 7, 2014 at 9:19 AM, Hugues Malphettes >> wrote: >> >>> Hi everyone, >>> >>> Case insensitive sort is elegantly supported by using a c

Re: Support for case insensitive sorts with doc values

2014-10-07 Thread Hugues Malphettes
earch.org/guide/en/elasticsearch/reference/current/mapping-transform.html > > On Tue, Oct 7, 2014 at 9:19 AM, Hugues Malphettes > wrote: > >> Hi everyone, >> >> Case insensitive sort is elegantly supported by using a custom analyzer >> [1]. >> `doc val

Re: Support for case insensitive sorts with doc values

2014-10-07 Thread Adrien Grand
at 9:19 AM, Hugues Malphettes wrote: > Hi everyone, > > Case insensitive sort is elegantly supported by using a custom analyzer > [1]. > `doc values` are documented as a great fit for sorting [2] to save heap > memory. > > However doc values are not supported for analyz

Support for case insensitive sorts with doc values

2014-10-07 Thread Hugues Malphettes
Hi everyone, Case insensitive sort is elegantly supported by using a custom analyzer [1]. `doc values` are documented as a great fit for sorting [2] to save heap memory. However doc values are not supported for analyzed strings at the moment. Are we planning to support doc values for analyzers

Re: How boolean fields are indexed in ElasticSearch? Doc values on boolean?

2014-10-02 Thread Adrien Grand
Hi, Boolean fields don't support doc values yet, although there is some work in progress: https://github.com/elasticsearch/elasticsearch/pull/7961 Boolean fields are simply indexed as strings: "T" for true and "F" for false and field data would require about 2 bits per

How boolean fields are indexed in ElasticSearch? Doc values on boolean?

2014-10-02 Thread Han JU
Hi, I'm taking some time reviewing our mapping. We've put doc_values on the fields on which aggregations will be calculated. But I'm not sure whether there is any benefit in doing this for boolean fields. This also leads me to wonder how ElasticSearch (or Lucene) indexes boolean fields. Some insights or readings

Re: Doc values for field data

2014-07-16 Thread David Smith
Thank you, Adrien. That answers my questions. On Wednesday, July 16, 2014 5:24:36 AM UTC-4, Adrien Grand wrote: > > On Tue, Jul 15, 2014 at 3:25 PM, David Smith > wrote: > >> Thanks, Adrien. That brings me closer. >> >> So when the documentation says doc values

Re: Doc values for field data

2014-07-16 Thread Adrien Grand
On Tue, Jul 15, 2014 at 3:25 PM, David Smith wrote: > Thanks, Adrien. That brings me closer. > > So when the documentation says doc values do not support filtering, it's > talking about fielddata filtering for what's loaded into memory (and not > filtering as part of a

Re: Doc values for field data

2014-07-15 Thread David Smith
Thanks, Adrien. That brings me closer. So when the documentation says doc values do not support filtering, it's talking about fielddata filtering for what's loaded into memory (and not filtering as part of a query... say term filter). For further clarification - can a field that is no

Re: Doc values for field data

2014-07-15 Thread Adrien Grand
Hi David, Doc values are a way to compute field data at indexing time, and to store it on disk. It can do everything that "uninverted" field data can do: aggregations, sorting, etc. However, it never kicks in automatically: it needs to be configured explicitly, and can only be se

Doc values for field data

2014-07-14 Thread David K Smith
When you map fields to use doc values for field data, does that limit the functionality afforded to those fields to merely sorting and aggregations/faceting? The documentation mentions that filtering is not supported by numeric or string types when stored as doc values. Yikes, I thought that

Re: memory settings when doc values used?

2014-03-31 Thread Adrien Grand
Hi, There are no particular settings to set when using doc values, just make sure to not give too much memory to your JVM so that the operating system's filesystem cache has enough memory to do its job. If your previous setup didn't leverage doc values, it is likely that switching to

memory settings when doc values used?

2014-03-28 Thread Ryan Pedela
Have the recommended memory configuration settings changed when using doc values?

Re: Indexing performance with doc values (particularly with larger number of fields)

2014-03-23 Thread Robert Muir
heap accordingly if you are using doc values. On Sun, Mar 23, 2014 at 10:01 PM, Alex at Ikanow wrote: > This might be more of a Lucene question, but a quick google didn't throw up > anything. > > Has anyone done/seen any benchmarking on indexing performance (overhead) due > to usin

Indexing performance with doc values (particularly with larger number of fields)

2014-03-23 Thread Alex at Ikanow
This might be more of a Lucene question, but a quick google didn't throw up anything. Has anyone done/seen any benchmarking on indexing performance (overhead) due to using doc values? I often index quite large JSON objects, with many fields (eg 50), I'm trying to get a feel for whe