Hi! My first question!
When bulk indexing with 0 replicas, the disk read and write rates are about the same.
But after the batch finishes... disk reads are much higher than writes.
disk read = 10 x disk write...
So CPU load is high, and batch indexing is very slow.. T.T
--
You received this message because you are subscribed to the Google
I am trying to do phrase matching to find similar phrases.
E.g. the Name field has the following entries, and all 3 should evaluate to the same:
1. USA Tech Company
2. USA Tech Company Alabama
3. USA Tech Company California
Can you suggest Java code that uses a phrase matcher or
I think there are two potential causes:
- refreshes
- id lookups
Refreshes run periodically in order to make data fast to search,
http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/
gives recommendations to improve indexing speed by increasing the refresh
Merges probably also play a part here.
On 5 November 2014 19:24, Adrien Grand adrien.gr...@elasticsearch.com
wrote:
I think there are two potential causes:
- refreshes
- id lookups
Refreshes run periodically in order to make data fast to search,
Ok .. so it is Java
1. You are not doing this right.
2. You should use BulkRequest or, better, the BulkProcessor class.
3. Do NOT call setRefresh! That way you are forcing ES to do the real
indexing, which will load the cluster a LOT.
4. Set the refresh interval of your index to something like 30s or 60s.
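For illustration, here is a rough sketch of what a bulk request boils down to on the wire. It is written in Python rather than Java and does not use BulkRequest or BulkProcessor; it only builds the newline-delimited body that the _bulk HTTP endpoint accepts. The index name, type name and documents are made up.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Serialize (doc_id, source) pairs into the newline-delimited _bulk format."""
    lines = []
    for doc_id, source in docs:
        # Action line: says which index/type/id the following line belongs to.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk format requires a newline after the last line as well.
    return "\n".join(lines) + "\n"

# Hypothetical index, type and documents, for illustration only.
bulk_body = build_bulk_body("myindex", "mytype", [("1", {"name": "a"}), ("2", {"name": "b"})])
```

The resulting string would go out in a single POST to /_bulk instead of one request per document.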
In order to do this, you would need to make a SCAN request in order to get
all documents from a given type, and for each page to build a multi
termvectors request to get term vectors.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-multi-termvectors.html
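A minimal sketch of the second half of that loop, in Python: given one page of hits from the scan, build the body of a multi termvectors request. It assumes the index and type are given in the request URL (for example /myindex/mytype/_mtermvectors, names made up) and that the field of interest is called text, which is also an assumption.

```python
def mtermvectors_body(hits, fields):
    """Build a multi termvectors body for the documents in one scroll page."""
    return {
        # IDs of the documents returned in this page.
        "ids": [hit["_id"] for hit in hits],
        # Parameters applied to every document in the request.
        "parameters": {"fields": list(fields), "term_statistics": True},
    }

# Hypothetical page of hits; real hits carry more keys than _id.
page = [{"_id": "1"}, {"_id": "2"}]
mtv_body = mtermvectors_body(page, ["text"])
```

With term_statistics enabled, the response should include document frequencies alongside the per-document term frequencies, which is the raw material for a TF-IDF style score.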
On Tue, Nov 4,
Thanks for the answer!
With 0 replicas, indexing is fast;
with 1 replica, indexing is very slow...
The refresh interval is the same in both cases: 120s.
The id lookup is a good point,
but we need our own ids and can't use auto-generated ids.. T.T
And the id lookup always happens, with 0 replicas as well as with 1.
On Wednesday, November 5, 2014 at 5:14:54 PM UTC+9, 이윤동 wrote:
hi!
Something that could happen is that with 0 replicas all the data fit into
your filesystem cache (so everything is done in memory) while with 1
replica, some filesystem operations are translated to actual disk seeks.
Another difference between 0 and 1 replicas is that in the latter case,
Our index data is over 10T, so it does not fit in memory (10 machines, memory
max 24g).
CPU is now at 20 ~ 30%, with CPU wait at 20 ~ 25%.
Disk read 60m, write 6m.
CPU load 20.
The problem...
* Disk read is very high (with no searches running) - CPU load is high - indexing is slow...
Our goal is to decrease disk reads.
Additional question!
our
Hi,
we are trying to run some aggregations over around 5 million documents, with
field cardinality on the order of 1000. The aggregation is a
filter aggregation which wraps an underlying terms aggregation. Right now
it's taking around 1.2 secs on average to compute it; the time
Hi,
I doubt the issue is that I'm not using bulk requests. My requests come in
one at a time, not in bulk. If you can explain why bulk is required that
would help.
I can believe that the refresh is causing the issue. I would prefer to
test that one by itself. How do I configure the
Hi,
I developed a simple Node.js project using ES as our search engine and
ElasticSearchClient for Node.
Node exposes an API to the user (using expressJS for that).
I have a few search categories (search by username, search by firstname,
search by lastname, etc...)
This is the function (using
Hi all,
I'm encountering a strange behavior when executing a search-scroll on a
single node of ES-1.3.4 with Java client.
The scenario is as follows:
1. Start a single node of version 1.3.4
2. Add snapshot repository pointing to version 1.1.1 snapshots
3. Restore snapshots version
Hello.
As the title says, I would like to obtain the TF-IDF score for each word in one doc.
For example,
if there is a DOC-A as follows,
Hello world. World is best.
I would like to obtain an output as following.
WORD SCORE
world 0.9
hello 0.87
best 0.7
Do you have any idea?
Thanks
Here is how to set refresh interval:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html
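As a small sketch (Python, only building the JSON; the index name myindex is made up), the body sent with PUT /myindex/_settings could look like this. A value of -1 disables the periodic refresh, which is sometimes done for a large load and reverted afterwards.

```python
import json

def refresh_settings(interval):
    """Body for the update-settings API; interval is e.g. "30s", "60s" or "-1"."""
    return json.dumps({"index": {"refresh_interval": interval}})

during_load = refresh_settings("-1")   # no periodic refresh while loading
after_load = refresh_settings("30s")   # back to a periodic refresh afterwards
```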
When you force a refresh after every document, you are putting unnecessary
load on ES.
Indexing a single document in a single call is completely fine, but is also
Hello Min,
Use the explain flag -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html#search-request-explain
Thanks
Vineeth
On Wed, Nov 5, 2014 at 6:32 PM, Min Cha minslo...@gmail.com wrote:
Hello.
As a title, I would like to obtain
According to
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/relevance-is-broken.html,
the relevance is broken until we have enough data distributed uniformly
across shards.
My question is: If I initially use the ?search_type=dfs_query_then_fetch
parameter because I few
Hello, David:
I have an issue trying to retrieve all document IDs (or a single field
value of all documents) in an index.
I have several million documents, but all I need is a list of
document IDs (sorted if possible), nothing else.
It is taking 5 minutes now for me to get the results.
I have two single-node Elasticsearch clusters: one is running on Ubuntu
and the other is running in a Windows 8 environment.
I am able to snapshot and restore indices from one Elasticsearch server
to another server running on the same OS. However, when I am trying to
snapshot an index from a
Answering myself:
According to the ES blog
http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/
there is a performance hit. It would be nice to have a feature that
automatically triggers DFS based on some kind of threshold...
On Wednesday, November 5, 2014 2:44:14
Can you please show the json of the request that you send to elasticsearch?
On Wed, Nov 5, 2014 at 10:52 AM, Ankur Goel ankrug...@gmail.com wrote:
Hi,
we are trying to run some aggregations over around 5 million documents, with
field cardinality on the order of 1000. The aggregation
So that's not the same story.
You want to do scan and scroll.
See
http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/search.html
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 5 Nov 2014 at 13:48, Ted Smith tedsmithgr...@gmail.com wrote:
Aggregations are inherently slower than facets due to the increased
flexibility. Aggregations are composable and the fact that you can feed any
sub-aggregation with the documents that match a particular bucket makes the
life of the JVM a bit harder. Facets can actually almost be considered as
I'm having an issue creating a XDCR to Elasticsearch.
I'm getting the following error on Couchbase:
xdcr_errors.log
[xdcr:error,2014-11-04T13:23:14.796,ns_1@couchbase002:0.8335.2272:xdc_vbucket_rep:terminate:489]Replication
(CAPI mode) 434c41fc737b38b9a374a08085553abf/adepteo/adepteo (
You need to get the scroll ID from each response and use that one in the
subsequent scan search. You cannot simply reuse the same scroll ID.
Brian
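To make the point about scroll IDs concrete, here is a minimal Python sketch of the loop. It does not talk to Elasticsearch: fetch_next is a stand-in for whatever call sends a scroll request, and the fake pages and IDs below are invented for the example.

```python
def scroll_all(initial_response, fetch_next):
    """Collect all hits, always passing on the scroll ID of the latest response."""
    scroll_id = initial_response["_scroll_id"]
    collected = []
    while True:
        page = fetch_next(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        collected.extend(hits)
        # Take the ID from THIS response for the next request.
        scroll_id = page["_scroll_id"]
    return collected

# Fake responses keyed by the scroll ID that was sent (illustration only).
_pages = {
    "s0": {"_scroll_id": "s1", "hits": {"hits": [{"_id": "a"}, {"_id": "b"}]}},
    "s1": {"_scroll_id": "s2", "hits": {"hits": [{"_id": "c"}]}},
    "s2": {"_scroll_id": "s3", "hits": {"hits": []}},
}

def fake_fetch(scroll_id):
    return _pages[scroll_id]

all_ids = [hit["_id"] for hit in scroll_all({"_scroll_id": "s0"}, fake_fetch)]
```

Reusing "s0" for every request in this fake setup would return the first page forever, which is the kind of surprise the advice above is meant to avoid.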
Why have the ES logs gone missing? During this period there was some issue,
but no details were logged. There is nothing in the OS logs either.
Last logged timestamp in the previous day's log:
[*2014-11-04 20:27:04*,186][DEBUG][action.search.type ] [es-orn-d-01]
All shards failed for phase: [query_fetch]
First log
You can not run a plugin for ES 1.3 on ES 1.0
I use a versioning scheme where you can immediately see if a plugin is
compatible with an ES version or not. The first three numbers denote the ES
version under which the plugin was developed. The last number is the plugin
version number on this ES
I'll try that and report
Thanks,
Yarden
On Wednesday, November 5, 2014 2:48:46 PM UTC+2, Yarden Bar wrote:
Hi all,
I'm encountering a strange behavior when executing a search-scroll on a
single node of ES-1.3.4 with Java client.
The scenario is as follows:
1. Start a single node
Use index aliases: one physical index, 4000 aliases.
Jörg
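A rough sketch of what that could look like, in Python, building only the body for the _aliases endpoint. It assumes each alias is meant to narrow the shared index down to one logical subset via a filter on a field; the field name tenant_id, the alias naming scheme and the index name are all hypothetical.

```python
def alias_actions(index, tenants, field="tenant_id"):
    """Build an _aliases body with one filtered alias per tenant."""
    actions = []
    for tenant in tenants:
        actions.append({
            "add": {
                "index": index,
                "alias": "tenant_%s" % tenant,
                # Searches through the alias only see this tenant's documents.
                "filter": {"term": {field: tenant}},
                # Routing keeps a tenant's documents together on one shard.
                "routing": str(tenant),
            }
        })
    return {"actions": actions}

# One physical index, 4000 aliases (names made up).
aliases_body = alias_actions("shared_index", range(4000))
```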
On Tue, Nov 4, 2014 at 3:42 PM, John D. Ament john.d.am...@gmail.com
wrote:
Hi,
So I have what you might want to consider a large set of data.
We have about 25k records in our index, and the disk space is taking up
around 2.5 gb,
Hi,
the query_string query
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
has been extended by a heuristic to analyze wildcarded terms some time ago.
https://github.com/elasticsearch/elasticsearch/issues/787
I would like to use
Hi Guys,
I am new to Elasticsearch and am using mostly default settings.
We had a cluster failure and after
rebalancing the speed was very slow, often resulting in timeouts. When the
cluster returned,
the speed went back to normal. Is this just an inevitable consequence of
losing a cluster,
or
Hey Guys,
I have a question to put to you guys again.. :)
In the query below I have a nested query element with a function score...
BUT... it doesn't appear to be filtering based on that function like I had
hoped
any thoughts?
{
query: {
filtered: {
query: {
Is there a way to score documents so that the relevance score has a fixed
range, like from 0 to 1.0 ? The default scoring can return arbitrarily high
scores, depending on how many times the matching term appears in the
document.
It's tempting to want to normalize the score by the top-matching
What version of ES are you on, is it the same for both platforms?
On 6 November 2014 00:50, Vijay Tiwary vijaykr.tiw...@gmail.com wrote:
I have two single-node Elasticsearch clusters: one is running on Ubuntu
and the other is running in a Windows 8 environment.
I am able to snap shot and restore
Can we modify the name of an index using the Java API? If so, please post the lines
of code to test.
You cannot rename an index at all.
You can add an alias though.
On 6 November 2014 10:18, Subbarao Kondragunta subbu2perso...@gmail.com
wrote:
Can we modify the name of an index using the Java API? If so, please post the
lines of code to test.
Hi guys,
Originally I posted this in SO but found this place which seems more
suitable to ask :)
I have an elasticsearch index where my default analyzer is the snowball
analyzer so I can get the stemming and now I need the ability to have
synonyms on some of the fields as well as the
An internal error happens when I do a suggest query. I get TokenStream
contract violation: close() call missing
This only happens when I add `pre_filter` to the suggest body.
{
'text': 'my term',
'suggest' : {
'phrase' : {
I'm new to Elasticsearch and I have two questions. I've done quite a bit of
Google searching and looked at the Elasticsearch tests for guidance, but
I'm not getting the desired result.
I'm querying an Elasticsearch server I don't control, but the index seems
straight forward enough. When I
Hi. When I use elasticsearch-hadoop, I encounter this error:
Could not write all entries [1576/10485504] (maybe ES was overloaded?).
Bailing out...
My task execution schedule is as follows:
14/11/05 16:06:06 INFO
Thanks Ivan,
We finally opted for building our queries (through a UI query builder) in a
nested fashion as described above; it seems to serve our needs.
Cheers for the info though.
On Tuesday, November 4, 2014 11:55:27 AM UTC-5, Ivan Brusic wrote:
Should clauses at the same time as must
What do you mean by "I have also required complete aggregate data"? Your
result is based on the type of aggregation you use.
Maybe you can elaborate more.
Ramin
On Monday, November 3, 2014 11:33:55 PM UTC-5, Rajit Garg wrote:
I am querying to get aggregate data based on date_range,
I have 10 fields inserted into Elasticsearch, out of which 5 fields are
indexed.
Is it possible to search on a non-indexed field?
Thanks in advance.
Regards
Rama Krishna P
Thanks. I wish there were a simple Java API method that gives a handle to get
the list in a single call.
Would it be possible to add this feature, as it is often needed (and
supposed to be a simple process)?
On the same topic, even with scan and scroll, how can I limit the result
returned only
Hi kazoompa,
Suppose I have the data below in index=cars and type=transactions:
[
{
price: 2,
color: red,
make: honda,
sold: 2014-11-05
},
{
price: 12000,
color: green,
make: toyota,
sold: 2014-08-19
},
{
price: 8,
color: red,
make:
Hi
We have input documents with special characters like % and _ as values.
When it gets stored in elasticsearch these special characters are replaced
with hex code equivalent.
eg.
X3dPVA9%252bZZjFLd864e7U1udCbHZhJ77amNcaGtV7Zp6dJwl3LM%252fd1cD8j8fh8spX_14978fa269e
is stored as
addFields(_id) should work, I think, though all metadata will be sent except
_source.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 6 Nov 2014 at 05:23, Ted Smith tedsmithgr...@gmail.com wrote:
Thanks. Wish there is simple Java API method to that gives a handle to get