Re: Marvel Index Taking too much disk space

2014-12-14 Thread David Pilato
You can use Curator for that. See https://github.com/elasticsearch/curator -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr | @scrutmydocs > On 15 Dec.
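To illustrate the kind of selection Curator performs, here is a minimal sketch that picks out daily Marvel indices older than a retention window. It assumes the daily indices are named like .marvel-2014.12.14; the function name and the keep_days parameter are hypothetical, and nothing here talks to a cluster.

```python
from datetime import date, datetime, timedelta

MARVEL_PREFIX = ".marvel-"  # assumed naming: .marvel-YYYY.MM.DD, one index per day


def expired_marvel_indices(index_names, today, keep_days):
    """Return the dated Marvel indices older than keep_days, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(MARVEL_PREFIX):
            continue  # leave every non-Marvel index alone
        try:
            day = datetime.strptime(name[len(MARVEL_PREFIX):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, so it is not a daily index
        if day < cutoff:
            expired.append((day, name))
    return [name for _, name in sorted(expired)]
```

The returned names could then be handed to whatever does the deletion, for example a daily cron job; Curator itself is probably the safer choice for production use.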

Marvel Index Taking too much disk space

2014-12-14 Thread Chetan Dev
Hi, I installed the Marvel plugin, but it's creating its own indexes, which are even larger than my original indexed data. Is there a way to delete these indexes on a daily basis, or any other way? Thanks -- You received this message because you are subscribed to the Google Groups "elastics

aggs terms is support *?

2014-12-14 Thread 唐坤
I have a scenario. - Data A { name:"foodA", props:{ color: "red", taste: "sweet", xx: xx, xx: xx } } Data B { name:"foodB", props:{ color: "black", taste: "sweet", xx: xx, xx: xx } } the props field is

Re: analytics on data stored in ES

2014-12-14 Thread Ramchandra Phadake
Yes, in general the fetch can be improved using standalone clients. I am NOT saying that data nodes are a bottleneck as of now. Indexing is not impacting the search. The point I am raising is data locality. Data is spread over a few shards across a few machines. Need to perform processing on this dat

Re: Frequent updates to documents

2014-12-14 Thread Jinal Shah
Thanks for the reply, Nikolas. Users want to search data instantly after the save, so we are unable to use batch updates. It is good to know that even an update to a single field means the whole document is reindexed. Thanks, Jinal On Friday, 12 December 2014 16:23:55 UTC+11, Jinal Shah wrote: > > Hi, > > We ar

Re: Frequent updates to documents

2014-12-14 Thread Jinal Shah
Thanks for the reply, Nikolas. Users want to search data instantly after the save, so we are unable to use batch updates. It is good to know that even an update to a single field means the whole document is reindexed. Thanks, Jinal On Friday, 12 December 2014 16:23:55 UTC+11, Jinal Shah wrote: > > Hi, > >

Re: Is there a way to do exact and full-text searching without creating two different fields?

2014-12-14 Thread am
I think I just figured it out: {"title.raw" : "I like ElasticSearch"} instead of "title": { "raw": "I like ElasticSearch" } On Sunday, December 14, 2014 9:00:52 PM UTC-5, am wrote: > > Ah, thanks. I've set this up (using ES python bindings): > > es.indices.put_mapping(index="myindex", >

Re: Is there a way to do exact and full-text searching without creating two different fields?

2014-12-14 Thread am
Ah, thanks. I've set this up (using ES python bindings): es.indices.put_mapping(index="myindex", doc_type="books", body={ "books": { "prope

Re: Is there a way to do exact and full-text searching without creating two different fields?

2014-12-14 Thread Nikolas Everett
Look at multifields. They let you send the field once and analyze it multiple times. You also might want to use the keyword analyzer and a lowercase filter rather than not_analyzed. Folks are used to case insensitivity. Nik Is there a way to do exact and full text searches without having to create two
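A minimal sketch of what such a multi-field setup might look like, written as plain Python dicts in the style of the ES 1.x Python bindings used elsewhere in this thread. The index, type and field names ("myindex", "books", "title") come from the thread; the analyzer name "lowercase_keyword" and both helper functions are hypothetical, chosen only for illustration.

```python
def index_settings():
    """Custom analyzer: keep the whole value as one token, lowercased."""
    return {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {  # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    }


def books_mapping():
    """Send title once; index it as full text and as title.raw for exact match."""
    return {
        "books": {
            "properties": {
                "title": {
                    "type": "string",  # analyzed with the default analyzer
                    "fields": {
                        "raw": {"type": "string", "analyzer": "lowercase_keyword"}
                    },
                }
            }
        }
    }
```

Under these assumptions, documents still carry a single "title" value; full-text queries target "title" and exact (case-insensitive) queries target "title.raw". The settings would need to be supplied when the index is created, before the mapping is applied.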

Is there a way to do exact and full-text searching without creating two different fields?

2014-12-14 Thread am
Is there a way to do exact and full text searches without having to create two different fields? The documentation (http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_finding_exact_values.html) states fields must have the mapping "not_analyzed" in order to avoid tokenization.

Re: Creating a custom plugin to return hashes of the terms or the terms of an Elasticsearch index

2014-12-14 Thread joergpra...@gmail.com
The termlist plugin can use filters with the 'term' parameter and pagination with the 'size' parameter. So you can get smaller term lists, for terms starting with 'a', 'b', 'c' ..., and you can limit the number of entries returned with, say, size=1000 (or 1, etc.). The 'term' filter should be sufficien

Re: [hadoop] java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/mr/EsOutputFormat

2014-12-14 Thread Costin Leau
Hi, It looks like es-hadoop is not part of your classpath (hence the NCDFE). This might be either due to some misconfiguration of your classpath or due to the way the Configuration object is used. It looks like you are using it correctly though typically I use Job(Configuration) instead of getI

Re: SearchParseException (marvel) - [No mapping found for [@timestamp] in order to sort on

2014-12-14 Thread Eugen Paraschiv
One more detail on this - the Marvel UI also displays the exact query that's failing. Running that query results in a more informative message - probably the root cause of the problem: Caused by: org.elasticsearch.search.facet.FacetPhaseExecutionException: Facet [fs.total.available_in_bytes]: f

Re: Same query, different CPU util when run with Java API versus REST

2014-12-14 Thread joergpra...@gmail.com
Can you post the full query code so the issue is easier to reproduce? Jörg On Fri, Dec 12, 2014 at 6:44 PM, Jeff Potts wrote: > > I should mention that the Elasticsearch node, the Java service, and the > JMeter test client are all on different machines. > > Jeff

Re: querying array of strings (multiword) with AND operator

2014-12-14 Thread joergpra...@gmail.com
You have a typo: POST /dummy/location { "locationArray" : ["United Kindgom", "London"], "location" : "United Kingdom" } and I'm sure you mean POST /dummy/location { "locationArray" : ["United Kingdom", "London"], "location" : "United Kingdom" } Jörg On Sun, Dec 14, 2014 at 12:57 PM, M

Re: Looking for a best practice to get all data according to some filters

2014-12-14 Thread Jonathan Foy
Just to reword what others have said, ES will allocate memory for [size] scores as I understand it (per shard?) regardless of the final result count. If you're getting back 4986 results from a query, it'd be faster to use "size": 4986 than "size": 100. What I've done in similar situations

Testing distributed characteristic of Elasticsearch

2014-12-14 Thread Luke Laird
Hi guys, Don't get me wrong. This is absolutely not another post about benchmarking Elasticsearch. First, I am pretty new to ES. Please be patient if I ask dumb questions. I am doing a test, for academic use only, to show that ES's distributed nature is an improvement over Lucene, which is

SearchParseException (marvel) - [No mapping found for [@timestamp] in order to sort on

2014-12-14 Thread Eugen Paraschiv
Hi, I'm using Elasticsearch 1.4.1 and the latest Marvel (1.2.1). I have Marvel installed on every node of the cluster and generating data into the daily index. When going into Marvel, I get the following exception: Caused by: org.elasticsearch.search.SearchParseException: [.marvel-2014.12.14]

Re: AWS machine for ES master

2014-12-14 Thread Yoav Melamed
Thanks On Sunday, December 14, 2014 11:01:58 AM UTC+2, Yoav Melamed wrote: > > Hello, > > I run an Elasticsearch cluster in AWS based on c3.8xlarge machines. > Can I use a smaller machine for the masters? > What should be enough? >

Re: AWS machine for ES master

2014-12-14 Thread Mark Walkom
Master-only nodes are very light; you can probably get away with 1 or 2GB for heap. Of course this will depend on your cluster topology and a few other things, so it might be best to trial it. On 14 December 2014 at 10:01, Yoav Melamed wrote: > > Hello, > > I run an Elasticsearch cluster in AWS base

Re: Looking for a best practice to get all data according to some filters

2014-12-14 Thread Nikolas Everett
Search consumes O(offset + size) memory and O((offset + size) * ln(offset + size)) CPU. Scan scroll has higher overhead but is O(size) the whole time. I don't know the break-even point. The other thing is that scroll provides a consistent snapshot. That means it consumes resources you shouldn't let end
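As a rough illustration of the memory figures above, the sketch below adds up the per-request cost of walking a full result set page by page. It is only a toy model: it counts one unit per scored hit a request has to hold, ignores the fixed overhead that scroll carries, and both function names are hypothetical.

```python
import math


def paged_search_cost(total_hits, page_size):
    """from/size paging: each request holds (offset + size) entries."""
    cost = 0
    for offset in range(0, total_hits, page_size):
        cost += offset + page_size
    return cost


def scroll_cost(total_hits, page_size):
    """Scroll: each request holds only `size` entries, whatever the depth."""
    pages = math.ceil(total_hits / page_size)
    return pages * page_size
```

For 5000 hits fetched 1000 at a time, this model gives 15000 units for from/size paging against 5000 for scroll. The gap widens with depth, which matches the O(offset + size) versus O(size) comparison, but where the real break-even point lies would have to be measured.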

[hadoop] java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/mr/EsOutputFormat

2014-12-14 Thread CAI Longqi
Hello, I'm using elasticsearch-hadoop-2.0.2.jar and have run into this problem: Exception in thread "main" java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/mr/EsOutputFormat at com.clqb.app.ElasticSearch.run(ElasticSearch.java:46) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

querying array of strings (multiword) with AND operator

2014-12-14 Thread Mathew Bolek
Hi, I really don't know what's wrong, but I seem to be unable to find a way of querying array elements that contain multiple words with "AND" operator. ES version: 1.3.*, only the default standard analyser in use A dummy document POST /dummy/location { "locationArray" : ["United Kindgom", "L

UI/Plugin to visualize output of Scoring Explain flat

2014-12-14 Thread vineeth mohan
Hi, I remember seeing a UI, from a plugin or otherwise, that visualizes the output of the explain API for scoring as a neat d3 collapsible tree - http://bl.ocks.org/mbostock/4339083 If anyone remembers the link, please reply to this mail. Thanks Vineeth

Re: Looking for a best practice to get all data according to some filters

2014-12-14 Thread David Pilato
The implication is the memory that needs to be allocated on each shard. David > On 14 Dec 2014 at 05:46, Ron Sher wrote: > > Again, why not use a very large count size? What are the implications of > using a very large count? > Regarding performance - it seems doing 1 request with a very large cou

AWS machine for ES master

2014-12-14 Thread Yoav Melamed
Hello, I run an Elasticsearch cluster in AWS based on c3.8xlarge machines. Can I use a smaller machine for the masters? What should be enough?

Frequency of significant terms in documents matching a query

2014-12-14 Thread Graeme Pietersz
I understand how to use aggregations to get significant terms with counts of the number of documents in which they occur. I would like to also be able to count the number of times these terms occur across all documents. I can use term vectors to count how often a term occurs in a single docu