Multiple documents with the same _id

2015-06-26 Thread David Smith
We're using Elasticsearch 1.4.0 and we index documents in bulk from a node client. We seem to be getting quite a few duplicates, as Elasticsearch doesn't seem to recognize that documents with the same _id already exist. GET /searchables/_search { from: 0, size: 100,

Re: scroll query concurrently

2015-02-01 Thread David Smith
FYI, the answer is no. I did a simple test using pmap to make multiple scroll queries in parallel. Out of 500 results, there were only 221 distinct values, so more than half were duplicates :) On Wednesday, January 28, 2015 at 12:56:21 PM UTC+1, David Smith wrote: Can I share a scroll id
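A minimal sketch of how the duplicate count in a test like this could be tallied. The helper name `duplicate_report` and the sample ids are hypothetical; the sample list is synthetic and only shaped to reproduce the 500 results / 221 distinct values reported above, it is not the original data.

```python
from collections import Counter


def duplicate_report(doc_ids):
    """Summarize how many of the collected ids are repeats.

    `doc_ids` is assumed to be the flat list of _id values gathered
    from every parallel scroll call.
    """
    counts = Counter(doc_ids)
    total = len(doc_ids)
    distinct = len(counts)
    duplicates = total - distinct
    return {
        "total": total,
        "distinct": distinct,
        "duplicates": duplicates,
        "duplicate_ratio": duplicates / total if total else 0.0,
    }


# Synthetic ids (assumption): 500 results drawn from 221 distinct values.
sample_ids = [i % 221 for i in range(500)]
report = duplicate_report(sample_ids)
print(report)
```

With these figures, 279 of the 500 results are repeats, which is consistent with "more than half were duplicates".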

scroll query concurrently

2015-01-28 Thread David Smith
Can I share a scroll id between multiple clients? If two clients ask for the next batch of the scroll at the same time, will they get different results, or is there a danger they will get duplicates? If there is no possibility of duplicates, I could share a scroll id across machines and process

doc_as_update and ttl

2015-01-22 Thread David Smith
Hi, I want to upsert a document and at the same time reset its TTL. That is, to do the upsert I use _update with doc_as_update true, and to update the TTL I use _update with script and ctx._ttl. It seems I can't do both of these at the same time though. Does this mean I have to do one of: 1.

Deleted indices keep coming back w/ 1.4.0

2014-11-20 Thread David Smith
Hi, Since we upgraded to 1.4.0, deleted indices in our time-series index set keep coming back right after deletion. So whenever we drop an expired index (usually as midnight rolls), it gets deleted and removed from the alias it was under. But about half the time it comes back as an empty

Re: upgrading from 0.90.7 to 1.4. Gotchas?

2014-11-20 Thread David Smith
I can't remember what 0.90.x was like, as that was long ago for us, but we recently upgraded from 1.1.0 to 1.4.0. Look at http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/breaking-changes.html Additionally, pay attention to: - scripting: - replacement of mvel w/

Re: upgrading from 0.90.7 to 1.4. Gotchas?

2014-11-20 Thread David Smith
Also, forgot to mention... if you have native scripts, they will mysteriously throw an Unsupported Operation exception whenever invoked. Looks like they made a mistake in 1.4.0 (now reverted on master) that requires you to override setScorer in native scripts. It's ok, I just wish

High memory usage on dedicated master nodes

2014-07-16 Thread David Smith
We have a cluster with 22 data nodes and 3 dedicated master nodes running ES 1.1.0. Data nodes have 16 GB of memory (half given to JVM heap) and the dedicated masters have 4 GB (half for heap). Data nodes have consistent memory usage (about 50-60%) but we're observing that the master node is

Re: Delete oldest X documents from index

2014-07-16 Thread David Smith
You could add a datetime field to your documents, sort by it in ascending order, and get the first X document IDs. Then delete the documents by ID. While that's the solution to the question you asked, it might be better to re-think your problem so that you don't have to delete the
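The two steps described above (find the oldest X ids, then delete by id) can be sketched as request bodies. This is only an illustration under stated assumptions: the field name `created_at`, the index `myindex`, the type `mytype`, and both helper functions are hypothetical, and the sketch builds the payloads without sending them to a cluster.

```python
import json


def oldest_docs_query(timestamp_field, count):
    """Search body that returns the `count` oldest documents.

    Sorting ascending on the datetime field puts the oldest hits first.
    """
    return {
        "size": count,
        "sort": [{timestamp_field: {"order": "asc"}}],
    }


def bulk_delete_body(index, doc_type, doc_ids):
    """Newline-delimited bulk payload with one delete action per id."""
    actions = (
        {"delete": {"_index": index, "_type": doc_type, "_id": doc_id}}
        for doc_id in doc_ids
    )
    # Every action line, including the last, must end with a newline.
    return "".join(json.dumps(action) + "\n" for action in actions)


# Hypothetical names and ids, for illustration only.
query = oldest_docs_query("created_at", 3)
payload = bulk_delete_body("myindex", "mytype", ["a1", "b2", "c3"])
print(json.dumps(query))
print(payload, end="")
```

The search body would go to the index's `_search` endpoint, the ids would be read from the returned hits, and the payload would go to `_bulk`.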

Re: Doc values for field data

2014-07-15 Thread David Smith
Thanks, Adrien. That brings me closer. So when the documentation says doc values do not support filtering, it's talking about fielddata filtering for what's loaded into memory (and not filtering as part of a query, say a term filter). For further clarification - can a field that is not analyzed

Optimizing a query that matches a large number of documents

2014-07-13 Thread David Smith
Hi, I'm trying to optimize this query, which takes 5-10s to run. This query is run repeatedly for different (pretty much all) users via an offline process/script daily. The index it is run against has about 4 billion documents, each query matches approximately 500k documents in that index but I

Re: Switching back to ConcurrentMergeScheduler

2014-04-19 Thread David Smith
Thanks, Jörg. Is it possible to set these via API instead of changing the yaml?

Re: Switching back to ConcurrentMergeScheduler

2014-04-19 Thread David Smith
Ahh, got it. Thanks. On Saturday, April 19, 2014 10:05:39 AM UTC-4, Jörg Prante wrote: No, you can not change the merge scheduler settings via API. Threadpool settings updating works. Jörg On Sat, Apr 19, 2014 at 3:22 PM, David Smith davidk...@gmail.com wrote: Thanks, Jörg

Switching back to ConcurrentMergeScheduler

2014-04-18 Thread David Smith
I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 because the merge scheduler used in 1.1.0 was affecting indexing performance. https://github.com/elasticsearch/elasticsearch/issues/5817 We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a way to switch it back using the API? I tried the

Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith
Yes, function score query works with native scripts. We use it with them. I'm not sure whether native scripts are automatically cached. On Saturday, April 12, 2014 1:49:32 PM UTC-4, Eric T wrote: Hi, The function score documentation doesn't mention any support for native scripts, does it

Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith
You can use a function score query with a native script in this manner. { function_score : { query : { match_all : { } }, functions : [ { filter : { terms : { myfield : [ 103, 104, 134, 180 ], _cache : true } },

Re: Filter first then search

2014-04-18 Thread David Smith
I'm also curious to know if there is a way to do the opposite of FilteredQuery... basically QueriedFilter. Filter first and then run a query on the filtered results.

Re: discovery.zen.minimum_master_nodes and gateway.recover_after_nodes does not work after upgrading to ES 1.0.1 ?

2014-03-27 Thread David Smith
Anybody? Help? On Wednesday, March 26, 2014 4:00:14 PM UTC-4, David Smith wrote: We have a 16-node cluster on 0.90.5. We built a new cluster for 1.0.1 (yes, we will upgrade to 1.1.0 soon), but we are experiencing a problem that I would like help with. In our 0.90.5 cluster, we had it configured

discovery.zen.minimum_master_nodes and gateway.recover_after_nodes does not work after upgrading to ES 1.0.1 ?

2014-03-26 Thread David Smith
We have a 16-node cluster on 0.90.5. We built a new cluster for 1.0.1 (yes, we will upgrade to 1.1.0 soon), but we are experiencing a problem that I would like help with. In our 0.90.5 cluster, we had it configured as: discovery.zen.minimum_master_nodes: 9 gateway.expected_nodes: 16
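For context on the value 9 quoted above: it matches the usual majority rule for discovery.zen.minimum_master_nodes, (master-eligible nodes / 2) + 1, if one assumes all 16 nodes are master-eligible (the post does not say so). A small sketch of that arithmetic, with a hypothetical helper name:

```python
def master_quorum(master_eligible_nodes):
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Assumption: all 16 nodes in the cluster are master-eligible.
quorum = master_quorum(16)
print(quorum)
```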