Re: Globally disable analysis (i.e. no via per-field mapping)?

2014-12-18 Thread Mark Walkom
It feels like you're almost defeating the whole purpose of using Elasticsearch with this approach! Is it really that much of a problem? On 19 December 2014 at 08:15, Eran Duchan wrote: > > I'd like not to use analysis across my schema to save a bit of CPU (I know > the penalty this inflicts on se

Re: Oracle to Elasticsearch

2014-12-18 Thread Mark Walkom
Can you elaborate on where you are seeing the extra 50k entries, i.e. how did you get this count? There are currently no O/JDBC connector plugins for Logstash, so you need to use a river. You may also want to ask further Logstash questions at https://groups.google.com/forum/?hl=en-GB#!forum/logstas

Re: Elasticsearch taking a long time for garbage collection

2014-12-18 Thread Mark Walkom
You should really have the same heap size on both nodes. What ES and Java versions are you on? On 18 December 2014 at 19:54, shriyansh jain wrote: > > Hi All, > > I am seeing some warning messages in the Elasticsearch log files about > garbage collection taking a pretty long time. > > [2014-12-17 10:

Globally disable analysis (i.e. no via per-field mapping)?

2014-12-18 Thread Eran Duchan
I'd like not to use analysis across my schema to save a bit of CPU (I know the penalty this inflicts on searching). Right now I set "index": "not_analyzed" per field but this is cumbersome. I know I can choose between default analyzers
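One approach often used on the 1.x line for this (not necessarily what this thread settled on) is a dynamic template that maps every dynamically detected string field as not_analyzed. The sketch below only builds the request body; the helper name and the template name "strings_not_analyzed" are made up for illustration, and the template affects only fields that are not mapped explicitly.

```python
import json


def not_analyzed_strings_mapping(type_name):
    """Build an index-creation body whose dynamic template maps every
    dynamically detected string field with "index": "not_analyzed"."""
    return {
        "mappings": {
            type_name: {
                "dynamic_templates": [
                    {
                        # The template name is arbitrary (hypothetical here).
                        "strings_not_analyzed": {
                            "match": "*",
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        }
    }


# "_default_" applies the template to every type in the index (ES 1.x).
body = not_analyzed_strings_mapping("_default_")
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body when creating the index, so the per-field "index": "not_analyzed" setting no longer has to be repeated for each string field.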

Wrong routing of TransportClient with sniffing enabled

2014-12-18 Thread Jae
Hi, I am using ES 1.1.0 with TransportClient. We observed wrong routing from TransportClient when we scale up the cluster. For example, suppose that we have two ES clusters, es0 and es1, and es_sink_0 is the TransportClient talking to es0, while es_sink_1 is the one talking to es1. If we scale up es1, it hap

Re: More Like This Query Returning 0 Similar Documents

2014-12-18 Thread sekharreddy mandapati
The issue is solved by adding a *_routing* value in every *docs* object. If we use routing while indexing documents, we have to use the *_routing* value in every *docs* object instead of specifying it in the URL [POST /espoc/_search?*_routing=*]. On Monday, 1 December 2014 12:59:20 UTC+5:30, sekharr

Re: Performance difference between REST and Java API

2014-12-18 Thread Marie Jacob
Thanks Jörg. I tried that, and it seems to make no difference. All I could gather are these graphs (attached) from the bigdesk plugin that show performance for the query and fetch phases (the first is during ES Java client use, the second is using the REST calls). It's really weird that

how to return the count of unique documents by using elasticsearch aggregation

2014-12-18 Thread Xiaoting Ye
Hi, Is there a way to return the count of the unique documents by using aggregation? My use case is pretty simple: In my data model I have an array of locations : { ..., "locations" : [ { "city" : "new york", "state" : "ny" }, { "city" : "woodbury", "s

Re: Problems with Conference Registration (Inelastic Customer Service)

2014-12-18 Thread Ben Rigby
The team at ES put a quick fix into place. Much appreciated. Thanks. -ben -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroup

Problems with Conference Registration (Inelastic Customer Service)

2014-12-18 Thread Ben Rigby
Hi. I was trying to register two of my team members and clicked "purchase" before seeing the small print that said that the ticket couldn't be used by any person other than the name on the billing information (and that there is a $100 change fee). I realized the mistake within seconds and emaile

Re: Performance difference between REST and Java API

2014-12-18 Thread joergpra...@gmail.com
How about this? https://gist.github.com/anonymous/509b3db873a30d8961ed#comment-1359074 Jörg On Thu, Dec 18, 2014 at 7:48 PM, Marie Jacob wrote: > > > I'm benchmarking results for an ES cluster, using both the REST api and > native Java client. We're getting very different response times, betwee

Re: Newbie question about Spark and Elasticsearch

2014-12-18 Thread Costin Leau
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/2.1.Beta/spark.html#spark-native On 12/18/14 8:27 PM, chris wrote: Hi, You recommend the native integration instead of MR and I see on the official documentation that MR is recommended to read/write data to ES using spark. Spark suppor

Re: $ES_HEAP_SIZE

2014-12-18 Thread Yifan Wang
What about the memory settings on Client Node where there is no index? Still 30GB? That would be either causing waste of RAM or limiting the ability of aggregations, knowing that we can only use field_data cache for aggregations, which may be 70% of the heap. Can we set HEAP larger on Client No

Elasticsearch taking a long time for garbage collection

2014-12-18 Thread shriyansh jain
Hi All, I am seeing some warning messages in the Elasticsearch log files about garbage collection taking a pretty long time. [2014-12-17 10:12:23,789][INFO ][monitor.jvm ] [Node1] [gc][young][145219][4796] duration [901ms], collections [1]/[1.5s], total [901ms]/[7.1m], memory [7.2

Performance difference between REST and Java API

2014-12-18 Thread Marie Jacob
I'm benchmarking results for an ES cluster, using both the REST API and the native Java client. We're getting very different response times between the two (the REST API is doing approximately 50% better in the 95th percentile). I was wondering what the cause of this is, since the search re

Re: Only partial results returned for aggregation + ElasticsearchIllegalStateException when trying scroll

2014-12-18 Thread AlexR
I second that. Many of us need accurate results at the expense of performance. So an optional two-step execution for results correction (for buckets not present in all shard responses) would be very helpful! A great first step would be to do so on a single node (if not already done) when aggrega

Re: Newbie question about Spark and Elasticsearch

2014-12-18 Thread chris
Hi, You recommend the native integration instead of MR and I see on the official documentation that MR is recommended to read/write data to ES using spark. Spark support Doc what would be the basic piece of code t

Re: Is ElasticSearch truly scalable for analytics?

2014-12-18 Thread AlexR
Nick, I am not an expert in this area either, but with multi-core processors (24, 32, 48) it is not uncommon to have a fairly large number of shards on a node, so 30 shards is not out of the question. I assumed that ES aggregates shard results on a node prior to shipping them to a master, but I do not kno

Oracle to Elasticsearch

2014-12-18 Thread Marian Valero
I want to migrate data from Oracle to Elasticsearch in order to analyze it. I am using Logstash to read a CSV file, but when I import the data I end up with more log entries than I inserted: for example, I have 10 lines and 15 lines of logs get inserted. How can I fix that? This is my logstash.co

Re: Only partial results returned for aggregation + ElasticsearchIllegalStateException when trying scroll

2014-12-18 Thread Eran Duchan
Thanks Adrien *> Did you try to add pagination on a request of type COUNT?* Yes, I run the aggregation with search_type=count. The thing is I need accurate results, not super fast execution. Scoring is something we don't use or need, so I would like _all_ relevant results (i.e. results which p

Interpretation of Stacktrace, What does this error mean?

2014-12-18 Thread Roland Kofler
After I insert a document into ES via Spring Data, I believe an indexing phase called 'Dfs' is started (correct?), in which a "query_binary", a sort of hash, is retrieved. This fails: Failed to parse source [{"from":0,"size":10,"query_binary":"eyAnZm9vZHN0dWZmc0lkJyA6IDFmODMwZDk1LTQ0OTctNGFmZC1hND

Re: Is ElasticSearch truly scalable for analytics?

2014-12-18 Thread Nikolas Everett
I think aggregating 32 shards on one node is a bit degenerate. I imagine it's more typical to aggregate across one or two shards per node. Don't get me wrong, you can totally have nodes store and query ~100 shards each without much trouble. If aggregating across a bunch of shards per node were a

Re: Is ElasticSearch truly scalable for analytics?

2014-12-18 Thread Yifan Wang
Correction "meaning reduce from 32 buckets to only one bucket, then the client node only has to process 50 buckets." should read "reduce from 32 of 50K buckets to only 1 being sent to the client node, then the client node only has to process 50 of 50K buckets". On Thursday, December 18, 2014
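The numbers in this correction can be worked through as simple arithmetic. The sketch below assumes 32 shards per data node and shard_size=50,000 as in the thread, and reads the "50" as 50 data nodes, which is an assumption. It says nothing about what Elasticsearch actually does; it only compares the two scenarios the poster describes, as upper bounds.

```python
SHARDS_PER_NODE = 32   # from the thread
SHARD_SIZE = 50_000    # shard_size from the thread
DATA_NODES = 50        # assumption: "50" read as the number of data nodes


def bucket_entries(lists_per_node, nodes, shard_size):
    """Upper bound on bucket entries reaching the client (coordinating) node."""
    return lists_per_node * nodes * shard_size


# Every shard sends its own top-N list straight to the client node.
per_shard = bucket_entries(SHARDS_PER_NODE, DATA_NODES, SHARD_SIZE)

# Hypothetical per-node reduction: each data node merges its 32 lists into one.
per_node = bucket_entries(1, DATA_NODES, SHARD_SIZE)

print("per-shard lists:", per_shard)          # 80,000,000 entries at most
print("per-node lists:", per_node)            # 2,500,000 entries at most
print("reduction factor:", per_shard // per_node)
```

Under these assumptions the reduction factor equals the number of shards per node, which is the saving the poster is asking about.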

Re: Is ElasticSearch truly scalable for analytics?

2014-12-18 Thread Yifan Wang
Sorry, if I did not make it clear. For sure I know aggregation is done on the node for each shard, but here is the challenge. Say we set shard_size=50,000. ES will aggregate on each shard and create buckets for the matching documents, and then send top 50,000 buckets to the client node for Redu

Re: Elasticsearch upgrade from 1.1.1 to 1.3.7

2014-12-18 Thread Mark Walkom
You can try dropping the replicas, waiting for things to go green, then re-adding them. This has worked for a few people on IRC with similar issues. On 18 December 2014 at 15:13, wrote: > > Hi All, > > I upgraded Elasticsearch from version 1.1.1 to 1.3.7. Everything is > working fine, but when I chec
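The suggestion above can be sketched as a sequence of REST calls. The snippet only builds the method, path and body for each step and sends nothing; the index name "my_index", the original replica count of 1 and the helper names are assumptions for illustration.

```python
import json


def replica_settings(count):
    """Body for the update-settings API that sets the replica count."""
    return {"index": {"number_of_replicas": count}}


def drop_and_restore_steps(index, original_replicas):
    """Return (method, path, body) tuples: drop replicas, wait for green, re-add."""
    settings_path = "/%s/_settings" % index
    health_path = "/_cluster/health/%s?wait_for_status=green" % index
    return [
        ("PUT", settings_path, replica_settings(0)),
        ("GET", health_path, None),
        ("PUT", settings_path, replica_settings(original_replicas)),
    ]


# "my_index" and the replica count of 1 are placeholders, not from the thread.
steps = drop_and_restore_steps("my_index", 1)
for method, path, body in steps:
    print(method, path, json.dumps(body) if body is not None else "")
```

Each tuple can be issued with any HTTP client. The health call in the middle blocks until the index reports green or the request times out, so the replicas are only re-added once the primaries are settled.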

Pre-registered template in server

2014-12-18 Thread Vahith
Hi, I am facing a problem with templates. Could you suggest how to store a pre-registered template on the server, and tell me the default path (absolute path on Linux), with a basic example of storing and fetching a template? I hope you can help me with this issue. Thanks in advance.

Re: unwind aggregation

2014-12-18 Thread Adrien Grand
I don't believe we could have such an aggregation. However I think similar results can be achieved by modeling the data differently, typically by having nested documents for each size (to take the same example as the mongo documentation) and then using the nested aggregation in order to collect ind

Re: @uboness how to improve the accuracy of terms aggregation

2014-12-18 Thread Adrien Grand
For the record, the bottleneck would not be on the master node (the node that manages the cluster state) but on the node that coordinates the execution of the search request, which is the node that your client contacts. So if you are doing costly terms aggregations with high shard sizes, it would h

Re: Is ElasticSearch truly scalable for analytics?

2014-12-18 Thread Adrien Grand
+1 to what AlexR said. I think there is indeed a bad assumption that shards just forward data to the coordinating node; this is not the case. On Thu, Dec 18, 2014 at 1:09 AM, AlexR wrote: > > if you take a terms aggregation, the heavy lifting of the aggregation is > done on each node then aggrega

Re: Only partial results returned for aggregation + ElasticsearchIllegalStateException when trying scroll

2014-12-18 Thread Adrien Grand
On Thu, Dec 18, 2014 at 11:02 AM, Eran Duchan wrote: > >1. I seem to have received only some of the response. The response >"hits.total" was 174054, yet when I summed the geohash_grid (first >aggregation) doc_count, I got about ~13K. I tried perhaps passing a large >"size" but thi

Elasticsearch upgrade from 1.1.1 to 1.3.7

2014-12-18 Thread phani . nadiminti
Hi All, I upgraded Elasticsearch from version 1.1.1 to 1.3.7. Everything is working fine, but when I checked the status of my cluster it showed "yellow", as follows. I waited for a long time, but the shards are not relocating. { "cluster_name": "elasticsearch", "status": "yellow", "timed_

Percolator generates lots of Get requests

2014-12-18 Thread Leon Portman
Hi, We have around 100 percolator queries, and while indexing 4,000 documents we are seeing around 100K Get requests in ES. We are percolating documents directly, not existing documents, so there should be no Get lookups. Best Regards, Leon

Re: Creating search template with client

2014-12-18 Thread Benedikt Weiner
Found it myself: http://www.elasticsearch.org/guide/en/elasticsearch/client/javascript-api/current/api-reference.html#api-puttemplate On Thursday, December 18, 2014 1:26:01 PM UTC+1, Benedikt Weiner wrote: > > Hello, > > is it possible to create a search template with the JS client? > I use t

Re: stopword syntax with custom analyzer

2014-12-18 Thread Alix Martin
Gist link is https://gist.github.com/Alix-Martin/7186e38459e88a474e13

stopword syntax with custom analyzer

2014-12-18 Thread Alix Martin
Hi ! I tried to define a stopword list for my custom analyzer like this : "analysis" : { "tokenizer" : { "host_tokenizer" : { "type": "pattern", "pattern": "[a-zA-Z0-9]+", "group": 0 } }, "analyzer" : {

Re: ElasticSearch hadoop - .EsHadoopSerializationException

2014-12-18 Thread Kamil Dziublinski
Hi Costin, I didn't find any Jira or anything for issues on that help page. Anyway, I created two small Java classes: one for the MR job and one IT test to run that job and get the exception (so as not to bother with the command line). Plus some dummy input file just to get inside the mapper. I attached them. If you

Creating search template with client

2014-12-18 Thread Benedikt Weiner
Hello, is it possible to create a search template with the JS client? I use the official JS client: https://github.com/elasticsearch/elasticsearch-js and want to create a search template as described here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.h

Using wildcards to specify multiple indexes in Kibana, is it still not supported?

2014-12-18 Thread Vagif Abilov
We have groups of multiple indexes whose names differ in a few characters, and we want to query all indexes belonging to the same group. I've found some old threads where it's explained that Kibana doesn't support wildcards in index names (except for the date specifier) and the only way to specify

Re: ClusterBlockException after closing an index

2014-12-18 Thread Bruno Cruz
Hey Aaron, I've since figured out a way to stop that by giving an alias to certain indexes and only letting Kibana query those. Still, thanks for the help and the response, hope it will help other people that encounter this error! Merry Christmas, cheers! Bruno Cruz 2014-12-17 21:28 GMT+00:00 A

Only partial results returned for aggregation + ElasticsearchIllegalStateException when trying scroll

2014-12-18 Thread Eran Duchan
Over a sample ~2.5M document dataset, where each record holds a geopoint and some other data, I wanted ElasticSearch 1.4.1 to provide the following data: For all results in a given geo_bounding box: Group results by: (geohash of length 8, a term, day) For each group provide: 2 sums

Re: Custom _source compression / compaction to reduce disk usage

2014-12-18 Thread Eran Duchan
After a bit of testing, it was found that _source takes up 37% of the total disk space. Given everything said here (cross document compression, future support for better compression), we decided to leave this as is.

Re: Where to find timelines for estimated release dates?

2014-12-18 Thread Mark Walkom
Sorry but there isn't anything public on this, however 1.5 is looking like it will land sometime in January. On 18 December 2014 at 04:41, Peter Portante wrote: > > Looking for a possible timeline for the release of 1.5 (anticipating > inner_hits support). Is there a place in the community where