We're using elasticsearch 1.4.0 and we index documents using bulk indexing
from a node client.
We seem to be getting quite a few duplicates as elasticsearch doesn't seem
to recognize that there are already documents with the same _id.
GET /searchables/_search
{
from: 0,
size: 100,
FYI, the answer is no.
I did a simple test using pmap to make multiple scroll queries in
parallel. Out of 500 results, there were only 221 distinct values, so more
than half were duplicates :)
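
Since concurrent readers of one scroll id produced duplicates in that test, one alternative is to let a single thread advance the scroll and hand each batch to a pool of workers. The sketch below is only an illustration of that pattern: `fetch_next_batch` and `process` are hypothetical callables (the first would wrap whatever scroll request your client makes and return an empty list once the scroll is exhausted), not part of any Elasticsearch client.

```python
import queue
import threading


def drain_scroll(fetch_next_batch, process, num_workers=4):
    """Advance a scroll from ONE thread and process batches in parallel.

    fetch_next_batch: hypothetical callable returning the next list of hits,
                      or an empty list when the scroll is exhausted.
    process:          hypothetical callable applied to each hit.
    """
    batches = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            batch = batches.get()
            if batch is None:  # sentinel: no more batches
                break
            processed = [process(hit) for hit in batch]
            with results_lock:
                results.extend(processed)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Only this thread ever asks for the next batch, so no two
    # consumers can race on the same scroll id.
    while True:
        batch = fetch_next_batch()
        if not batch:
            break
        batches.put(batch)

    for _ in threads:
        batches.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The order of `results` is not guaranteed; only the absence of duplicates introduced by concurrent scroll requests is.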
On Wednesday, January 28, 2015 at 12:56:21 PM UTC+1, David Smith wrote:
Can I share a scroll id
Can I share a scroll id between multiple clients? If two clients ask the
next batch of the scroll at the same time, will they get different results
or is there a danger they will get duplicates? If there is no possibility
of duplicates, I could share a scroll id across machines and process
Hi,
I want to upsert a document and at the same time reset its TTL. I.e., to
do the upsert I use _update with doc_as_upsert true and to update the ttl I
use _update with script and ctx._ttl. It seems I can't do both of these at
the same time though. Does this mean I have to do one of:
1.
Hi,
Since we upgraded to 1.4.0, deleted indices in our time-series index set
keep coming back right after deletion. So whenever we drop an expired index
(usually as midnight rolls), it gets deleted and removed from the alias it
was under. But about half the time it comes back as an empty
I can't remember what 0.90.x was like, as that was long ago for us, but we
recently upgraded from 1.1.0 to 1.4.0.
Look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/breaking-changes.html
additionally pay attention to:
- scripting:
- replacement of mvel w/
Also, forgot to mention... if you have native scripts, they will
mysteriously throw an UnsupportedOperationException whenever invoked. Looks
like they made a mistake in 1.4.0 (that is now reverted on master) that
requires you to override setScorer in native scripts. It's ok, I just
wish
We have a cluster with 22 data nodes and 3 dedicated master nodes running ES
1.1.0. Data nodes have 16 GB of memory (half given to JVM heap) and the
dedicated masters have 4 GB (half for heap). Data nodes have consistent
memory usage (about 50-60%) but we're observing that the master node is
You could add a datetime field to your document, sort by that in
ascending order, and get the first X document IDs. Then delete the
documents by ID.
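
A minimal sketch of the two request bodies this approach needs, assuming a hypothetical timestamp field, index, and type name (none of these come from the original question). It only builds the JSON; sending it to `_search` and `_bulk` is left to whichever client you use.

```python
import json


def oldest_ids_query(timestamp_field, count):
    """Search body: the `count` oldest documents, ids only (no _source)."""
    return {
        "size": count,
        "_source": False,
        "sort": [{timestamp_field: {"order": "asc"}}],
    }


def bulk_delete_body(index, doc_type, ids):
    """Newline-delimited bulk body with one delete action per id."""
    if not ids:
        return ""
    lines = [
        json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": doc_id}})
        for doc_id in ids
    ]
    # The bulk API expects every action line, including the last,
    # to be terminated by a newline.
    return "\n".join(lines) + "\n"
```

For example, `oldest_ids_query("created_at", 100)` would be the search body, and the `_id` values from its hits would feed `bulk_delete_body`.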
While that's the solution to the question you asked, it might be better to
re-think your problem so that you don't have to delete the
Thanks, Adrien. That brings me closer.
So when the documentation says doc values do not support filtering, it's
talking about fielddata filtering for what's loaded into memory (and not
filtering as part of a query... say term filter). For further clarification
- can a field that is not analyzed
Hi,
I'm trying to optimize this query which takes 5-10s to run. This query is
run repeatedly for different (pretty much all) users via an offline
process/script daily. The index it is run against has about 4 billion
documents, each query matches approximately 500k documents in that index
but I
Thanks, Jörg. Is it possible to set these via API instead of changing the
yaml?
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email
to
Ahh, got it. Thanks.
On Saturday, April 19, 2014 10:05:39 AM UTC-4, Jörg Prante wrote:
No, you can not change the merge scheduler settings via API. Threadpool
settings updating works.
Jörg
On Sat, Apr 19, 2014 at 3:22 PM, David Smith davidk...@gmail.com
wrote:
Thanks, Jörg
I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 because the
scheduler change in 1.1.0 hurt indexing performance.
https://github.com/elasticsearch/elasticsearch/issues/5817
We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a
way to switch it back using the API? I tried the
Yes, function score query works with native scripts. We use it with them.
I'm not sure whether native scripts are automatically cached.
On Saturday, April 12, 2014 1:49:32 PM UTC-4, Eric T wrote:
Hi,
The function score documentation doesn't mention any support for native
scripts, does it
You can use a function score query with a native script in this manner.
{
  function_score : {
    query : {
      match_all : { }
    },
    functions : [ {
      filter : {
        terms : {
          myfield : [ 103, 104, 134, 180 ],
          _cache : true
        }
      },
I'm also curious to know if there is a way to do the opposite of
FilteredQuery... basically QueriedFilter. Filter first and then run a query
on the filtered results.
Anybody? Help?
On Wednesday, March 26, 2014 4:00:14 PM UTC-4, David Smith wrote:
We have a 16-node cluster on 0.90.5. We built a new cluster for 1.0.1 (yes,
we will upgrade to 1.1.0 soon) but we experience this problem that I would
like help with.
In our 0.90.5 cluster, we had it configured
We have a 16-node cluster on 0.90.5. We built a new cluster for 1.0.1 (yes,
we will upgrade to 1.1.0 soon) but we experience this problem that I would
like help with.
In our 0.90.5 cluster, we had it configured as:
discovery.zen.minimum_master_nodes: 9
gateway.expected_nodes: 16
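
The value 9 above matches the commonly recommended quorum rule for `discovery.zen.minimum_master_nodes`, a majority of master-eligible nodes, if all 16 nodes are master-eligible (an assumption; the post does not say). A small sketch of that arithmetic, with a hypothetical helper name:

```python
def minimum_master_nodes(master_eligible_nodes):
    """Majority quorum: floor(n / 2) + 1 master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# 16 master-eligible nodes need a quorum of 9;
# 3 dedicated masters need a quorum of 2.
quorum_16 = minimum_master_nodes(16)
quorum_3 = minimum_master_nodes(3)
```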
19 matches