why disk read is high than disk write?

2014-11-05 Thread 이윤동
Hi! This is my first question. When we bulk index with 0 replicas, the disk read and write rates are about the same. But after the batch finishes, disk read is much higher than disk write (disk read = 10 x disk write), so CPU load is high and batch indexing becomes very slow. T.T -- You received this message because you are subscribed to the Google

Phrase Matching using Java

2014-11-05 Thread Ap
I am trying to do phrase matching to find similar phrases. E.g., the Name field has the following entries, and all 3 should evaluate to the same: 1. USA Tech Company 2. USA Tech Company Alabama 3. USA Tech Company California. Can you suggest Java code that uses a phrase matcher or

Re: why disk read is high than disk write?

2014-11-05 Thread Adrien Grand
I think there are two potential causes: - refreshes - id lookups. Refreshes run periodically in order to make data fast to search; http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/ gives recommendations to improve indexing speed by increasing the refresh

Re: why disk read is high than disk write?

2014-11-05 Thread Mark Walkom
Merges probably also play a part here. On 5 November 2014 19:24, Adrien Grand adrien.gr...@elasticsearch.com wrote: I think there are two potential causes: - refreshes - id lookups Refreshes run periodically in order to make data fast to search,

Re: Performance problems with large data volumes

2014-11-05 Thread Georgi Ivanov
OK, so it is Java. 1. You are not doing this right. 2. You should use BulkRequest or, better, the BulkProcessor class. 3. Do NOT call setRefresh! This way you are forcing ES to do the real indexing, which will load the cluster a LOT. 4. Set the refresh interval of your index to something like 30s or 60s
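
To illustrate the bulk advice above, here is a minimal sketch of the newline-delimited body that the bulk API accepts. It is not the Java BulkProcessor the post recommends; it only shows the wire format a bulk request carries. The index name `records`, type `record`, and the helper name `build_bulk_body` are hypothetical.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Serialize (doc_id, source) pairs into the newline-delimited bulk format.

    The index and type names passed in are illustrative assumptions.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where the following source line goes.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "records", "record",
    [("1", {"name": "first"}), ("2", {"name": "second"})],
)
```

Sending many documents in one such body means one round trip and one batch of work per request instead of one per document, which is presumably why bulk requests are recommended here.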

Re: How to get term vectors of all documents for a given type !!!

2014-11-05 Thread Adrien Grand
To do this, you would need to make a SCAN request to get all documents of a given type and, for each page, build a multi termvectors request to get the term vectors. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-multi-termvectors.html On Tue, Nov 4,
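
The per-page step described above could be sketched as follows: take the ids from one page of scan/scroll hits and build the body of a multi termvectors request for them. This is only a sketch of the request body, assuming the `ids` plus `parameters` form from the linked documentation; the field name `text` and the helper names are hypothetical.

```python
import json


def ids_from_page(page):
    """Collect the document ids from one page of a search/scroll response."""
    return [hit["_id"] for hit in page["hits"]["hits"]]


def mtermvectors_body(ids, fields):
    """Body for a request to /<index>/<type>/_mtermvectors covering `ids`."""
    return json.dumps({
        "ids": list(ids),
        "parameters": {
            "fields": list(fields),     # which fields to return vectors for
            "term_statistics": True,    # include per-term statistics as well
        },
    })


# A hand-written stand-in for one page of scroll results.
page = {"hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}}
request_body = mtermvectors_body(ids_from_page(page), ["text"])
```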

Re: why disk read is high than disk write?

2014-11-05 Thread 이윤동
Thanks for the answer! With 0 replicas, indexing is fast; with 1 replica, indexing is very slow... The refresh interval is the same in both cases: 120s. The id lookup is a good point, but we need our own ids and can't use auto-generated ids.. T.T And the id lookup always happens, with both 0 and 1 replicas. On Wednesday, November 5, 2014 at 5:14:54 PM UTC+9, 이윤동 wrote: hi!

Re: why disk read is high than disk write?

2014-11-05 Thread Adrien Grand
Something that could happen is that with 0 replicas all the data fits into your filesystem cache (so everything is done in memory), while with 1 replica, some filesystem operations are translated to actual disk seeks. Another difference between 0 and 1 replicas is that in the latter case,

Re: why disk read is high than disk write?

2014-11-05 Thread 이윤동
Our index data is over 10T, so it does not fit in memory (10 machines, max 24g memory). CPU is now 20 ~ 30%, wait CPU 20 ~ 25%, disk read 60m, write 6m, CPU load 20. The problem... * disk read very high (no search) - CPU load high - indexing slow... Our goal is to decrease disk read. One more question! our

Elasticsearch Aggregation time

2014-11-05 Thread Ankur Goel
Hi, we are trying to run some aggregations over around 5 million documents, with field cardinality on the order of 1000. The aggregation is a filter aggregation that wraps an underlying terms aggregation. Right now it takes around 1.2 secs on average to compute it, the time

Re: Performance problems with large data volumes

2014-11-05 Thread John D. Ament
Hi, I doubt the issue is that I'm not using bulk requests. My requests come in one at a time, not in bulk. If you can explain why bulk is required that would help. I can believe that the refresh is causing the issue. I would prefer to test that one by itself. How do I configure the

Best practice creating ES API using NodeJS

2014-11-05 Thread Idan
Hi, I developed a simple Node.js project using ES as our search engine and ElasticSearchClient for Node. Node exposes an API to the user (using expressJS for that). I have a few search categories (search by username, search by firstname, search by lastname, etc...). This is the function (using

ES 1.3.4 scrolling never ends

2014-11-05 Thread Yarden Bar
Hi all, I'm encountering a strange behavior when executing a search-scroll on a single node of ES-1.3.4 with Java client. The scenario is as follows: 1. Start a single node of version 1.3.4 2. Add snapshot repository pointing to version 1.1.1 snapshots 3. Restore snapshots version

Obtaining TF-IDF score per each word in one doc

2014-11-05 Thread Min Cha
Hello. As the title says, I would like to obtain the TF-IDF score for each word in one doc. For example, if there is a DOC-A as follows: "Hello world. World is best." I would like to obtain output like the following:
WORD   SCORE
world  0.9
hello  0.87
best   0.7
Do you have any idea? Thanks

Re: Performance problems with large data volumes

2014-11-05 Thread Georgi Ivanov
Here is how to set the refresh interval: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html When you force a refresh after every document, you are putting unnecessary load on ES. Indexing a single document in a single call is completely fine, but is also
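
As a small illustration of the linked update-settings call, the sketch below builds the JSON body one would send with PUT to `/<index>/_settings`. The helper name `refresh_settings` is hypothetical, and the interval values are examples rather than recommendations from the thread.

```python
import json


def refresh_settings(interval):
    """Body for PUT /<index>/_settings that changes the refresh interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


slow_refresh = refresh_settings("30s")  # refresh less often while indexing heavily
no_refresh = refresh_settings("-1")     # "-1" turns periodic refresh off
```

Whatever value is used during a heavy indexing run, the same call can set it back afterwards so that new documents become searchable promptly again.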

Re: Obtaining TF-IDF score per each word in one doc

2014-11-05 Thread vineeth mohan
Hello Min , Use the explain flag - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html#search-request-explain Thanks Vineeth On Wed, Nov 5, 2014 at 6:32 PM, Min Cha minslo...@gmail.com wrote: Hello. As a title, I would like to obtain
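
For readers unfamiliar with the explain flag, this is a minimal sketch of a search body with it switched on. The field name `text` and the term `world` are assumptions taken from the example document in the question, and the explanation Elasticsearch returns reflects its own scoring formula rather than a textbook TF-IDF value.

```python
import json


def explain_search_body(field, term):
    """Search body asking Elasticsearch to explain how each hit was scored."""
    return json.dumps({
        "explain": True,                   # attach a score breakdown to every hit
        "query": {"term": {field: term}},  # one term at a time, e.g. "world"
    })


explain_body = explain_search_body("text", "world")
```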

Distributed Frequency Search

2014-11-05 Thread Sofiane Cherchalli
According to http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/relevance-is-broken.html, the relevance is broken until we have enough data distributed uniformly across shards. My question is: If I initially use the ?search_type=dfs_query_then_fetch parameter because I few

Re: Using Java, How to retrieve one field from all documents inside an Index.

2014-11-05 Thread Ted Smith
Hello, David: I have an issue trying to retrieve all document ids (or a single field value of all documents) in an index. I have several million documents, but all I need is a list of document ids (sorted if possible), nothing else. It is taking 5 minutes now for me to get the results.

elastic search - inter OS snapshot and restore not working

2014-11-05 Thread Vijay Tiwary
I have two single-node Elasticsearch clusters: one is running on Ubuntu and the other on Windows 8. I am able to snapshot and restore indices from one Elasticsearch server to another server running on the same OS. However, when I am trying to snapshot an index from a

Re: Distributed Frequency Search

2014-11-05 Thread Sofiane Cherchalli
Answering myself: According to the ES blog http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/ there is a performance hit. It would be nice to have a feature that automatically triggers DFS based on some kind of threshold... On Wednesday, November 5, 2014 2:44:14

Re: Elasticsearch Aggregation time

2014-11-05 Thread Adrien Grand
Can you please show the JSON of the request that you send to Elasticsearch? On Wed, Nov 5, 2014 at 10:52 AM, Ankur Goel ankrug...@gmail.com wrote: Hi, we are trying to run some aggregations over around 5 million documents, with field cardinality on the order of 1000. The aggregation

Re: Using Java, How to retrieve one field from all documents inside an Index.

2014-11-05 Thread David Pilato
So that's not the same story. You want to do scan and scroll. See http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/search.html -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 5 Nov 2014, at 13:48, Ted Smith tedsmithgr...@gmail.com wrote:

Re: performance of aggregations as opposed to facets

2014-11-05 Thread Adrien Grand
Aggregations are inherently slower than facets due to the increased flexibility. Aggregations are composable and the fact that you can feed any sub-aggregation with the documents that match a particular bucket makes the life of the JVM a bit harder. Facets can actually almost be considered as

Elasticsearch Couchbase XDCR errors

2014-11-05 Thread Mark Adepteo
I'm having an issue creating an XDCR to Elasticsearch. I'm getting the following error on Couchbase: xdcr_errors.log [xdcr:error,2014-11-04T13:23:14.796,ns_1@couchbase002:0.8335.2272:xdc_vbucket_rep:terminate:489]Replication (CAPI mode) 434c41fc737b38b9a374a08085553abf/adepteo/adepteo (

Re: ES 1.3.4 scrolling never ends

2014-11-05 Thread Brian
You need to get the scroll ID from each response and use that one in the subsequent scan search. You cannot simply reuse the same scroll ID. Brian
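
The point about scroll IDs can be sketched as a loop that always carries forward the ID from the most recent response. This is a client-agnostic illustration, not the Java client code from the thread: `start_scan` and `next_page` are hypothetical callables standing in for the initial scan search and the follow-up scroll request.

```python
def scroll_all(start_scan, next_page):
    """Yield every hit, passing on the scroll id from the latest response.

    start_scan() and next_page(scroll_id) are hypothetical stand-ins for the
    real client calls; both return a parsed search response (a dict).
    """
    response = start_scan()
    scroll_id = response["_scroll_id"]
    while True:
        response = next_page(scroll_id)
        # Take the id from THIS response; reusing the first one is the bug.
        scroll_id = response["_scroll_id"]
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        for hit in hits:
            yield hit
```

The loop ends when a page comes back with no hits, which is the usual stop condition for a scan/scroll.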

Elasticsearch logs missing

2014-11-05 Thread ajay . bh111
Why have the ES logs gone missing? During this period there was some issue, but no details were logged. Nothing in the OS logs either. Last logged timestamp in the previous day's log: [*2014-11-04 20:27:04*,186][DEBUG][action.search.type ] [es-orn-d-01] All shards failed for phase: [query_fetch] First log

Re: Index Termlist Plugin installation

2014-11-05 Thread joergpra...@gmail.com
You cannot run a plugin for ES 1.3 on ES 1.0. I use a versioning scheme where you can immediately see if a plugin is compatible with an ES version or not. The first three numbers denote the ES version under which the plugin was developed. The last number is the plugin version number on this ES

Re: ES 1.3.4 scrolling never ends

2014-11-05 Thread Yarden Bar
I'll try that and report Thanks, Yarden On Wednesday, November 5, 2014 2:48:46 PM UTC+2, Yarden Bar wrote: Hi all, I'm encountering a strange behavior when executing a search-scroll on a single node of ES-1.3.4 with Java client. The scenario is as follows: 1. Start a single node

Re: Performance problems with large data volumes

2014-11-05 Thread joergpra...@gmail.com
Use index aliases: one physical index, 4000 aliases. Jörg On Tue, Nov 4, 2014 at 3:42 PM, John D. Ament john.d.am...@gmail.com wrote: Hi, So I have what you might want to consider a large set of data. We have about 25k records in our index, and the disk space is taking up around 2.5 gb,

simple_query_string has no analyze_wildcard

2014-11-05 Thread joergpra...@gmail.com
Hi, the query_string query http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html has been extended by a heuristic to analyze wildcarded terms some time ago. https://github.com/elasticsearch/elasticsearch/issues/787 I would like to use

Indexing is very slow after a node is down in a cluster

2014-11-05 Thread Praveen Kumar
Hi Guys, I am new to elastic search and using mostly default settings. We had a cluster failure and after rebalancing the speed was very slow, often resulting in timeouts. When the cluster returned, the speed went back to normal. Is this just an inevitable consequence of losing a cluster, or

Filtering on a Function Score Element (Not filtering as expected)

2014-11-05 Thread Darren McDaniel
Hey Guys, I have a question to put to you guys again.. :) In the query below I have a nested query element with a function score... BUT... it doesn't appear to be filtering based on that function like I had hoped. Any thoughts? { query: { filtered: { query: {

relevance in the range 0.0 to 1.0 ?

2014-11-05 Thread Dustin Boswell
Is there a way to score documents so that the relevance score has a fixed range, like from 0 to 1.0 ? The default scoring can return arbitrarily high scores, depending on how many times the matching term appears in the document. It's tempting to want to normalize the score by the top-matching

Re: elastic search - inter OS snapshot and restore not working

2014-11-05 Thread Mark Walkom
What version of ES are you on, is it the same for both platforms? On 6 November 2014 00:50, Vijay Tiwary vijaykr.tiw...@gmail.com wrote: I have two single-node Elasticsearch clusters: one is running on Ubuntu and the other on Windows 8. I am able to snapshot and restore

modify elastic search index name using java api

2014-11-05 Thread Subbarao Kondragunta
Can we modify name of the index using java api? If so please post the lines of code to test.

Re: modify elastic search index name using java api

2014-11-05 Thread Mark Walkom
You cannot rename an index at all. You can add an alias though. On 6 November 2014 10:18, Subbarao Kondragunta subbu2perso...@gmail.com wrote: Can we modify name of the index using java api? If so please post the lines of code to test.
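
As a sketch of the alias suggestion, the snippet below builds the body one would POST to `/_aliases` to add an alias to an existing index. It only shows the request body, not the Java API call that was asked about; the index name `old_name` and alias `new_name` are hypothetical.

```python
import json


def add_alias_body(index, alias):
    """Body for POST /_aliases that adds `alias` as a second name for `index`."""
    return json.dumps({
        "actions": [
            {"add": {"index": index, "alias": alias}},
        ]
    })


alias_body = add_alias_body("old_name", "new_name")
```

After this, requests against `new_name` reach the same index while the physical index keeps its original name.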

ElasticSearch enable Snowball Analyzer and Synonym on Fields

2014-11-05 Thread Iqbal Ahmed
Hi guys, Originally I posted this in SO but found this place which seems more suitable to ask :) I have an elasticsearch index where my default analyzer is the snowball analyzer so I can get the stemming and now I need the ability to have synonyms on some of the fields as well as the

TokenStream contract violation: close() call missing

2014-11-05 Thread Richard Tier
An internal error happens when I do a suggest query. I get TokenStream contract violation: close() call missing This only happens when I add `pre_filter` to the suggest body. { 'text': 'my term', 'suggest' : { 'phrase' : {

Java API Highlighting

2014-11-05 Thread William Bowen
I'm new to Elasticsearch and I have two questions. I've done quite a bit of Google searching and looked at the Elasticsearch tests for guidance, but I'm not getting the desired result. I'm querying an Elasticsearch server I don't control, but the index seems straightforward enough. When I

An error:Could not write all entries [1576/10485504]. Bailing out...

2014-11-05 Thread ggchnu
Hi. When I use elasticsearch-hadoop, I encounter this error: Could not write all entries [1576/10485504] (maybe ES was overloaded?). Bailing out... My task execution schedule is as follows: 14/11/05 16:06:06 INFO

Re: Bool Queries and MUST/SHOULD combinations

2014-11-05 Thread kazoompa
Thanks Ivan, We finally opted for building our queries (through a UI query builder) in a nested fashion as described above; it seems to serve our need. Cheers for the info though. On Tuesday, November 4, 2014 11:55:27 AM UTC-5, Ivan Brusic wrote: Should clauses at the same time as must

Re: elastic search date range aggregation not giving complete data

2014-11-05 Thread kazoompa
What do you mean by "I have also required complete aggregate data"? Your result is based on the type of aggregation you use. Maybe you can elaborate more. Ramin On Monday, November 3, 2014 11:33:55 PM UTC-5, Rajit Garg wrote: **I am Querying for getting aggregate data based on date_range,

how to search non indexed field in elasticsearch

2014-11-05 Thread ramakrishna panguluri
I have 10 fields inserted into elasticsearch out of which 5 fields are indexed. Is it possible to search on non indexed field? Thanks in advance. Regards Rama Krishna P

Re: Using Java, How to retrieve one field from all documents inside an Index.

2014-11-05 Thread Ted Smith
Thanks. I wish there were a simple Java API method that gives a handle to get the list in a single call. Would it be possible to add this feature, as it is often needed (and is supposed to be a simple process)? On the same topic, even with scan and scroll, how can I limit the result returned only

Re: elastic search date range aggregation not giving complete data

2014-11-05 Thread Rajit Garg
Hi kazoompa, Suppose I have the below data in index=cars and type=transactions: [ { price: 2, color: red, make: honda, sold: 2014-11-05 }, { price: 12000, color: green, make: toyota, sold: 2014-08-19 }, { price: 8, color: red, make:

How elasticsearch encodes strings with special characters before storing

2014-11-05 Thread Shobana Neelakantan
Hi We have input documents with special characters like % and _ as values. When it gets stored in elasticsearch these special characters are replaced with hex code equivalent. eg. X3dPVA9%252bZZjFLd864e7U1udCbHZhJ77amNcaGtV7Zp6dJwl3LM%252fd1cD8j8fh8spX_14978fa269e is stored as

Re: Using Java, How to retrieve one field from all documents inside an Index.

2014-11-05 Thread David Pilato
addFields(_id) should work, I think, though all metadata except _source will be sent. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 6 Nov 2014, at 05:23, Ted Smith tedsmithgr...@gmail.com wrote: Thanks. Wish there is simple Java API method to that gives a handle to get