Re: Combining several documents in a terms filter

2015-04-24 Thread Bruno Miranda
Is there an upper limit of bool filters? On Friday, April 17, 2015 at 10:18:38 PM UTC-7, vineeth mohan wrote: Hello Daniel , Feel free to use should clause in bool filter http://www.elastic.co/guide/en/elasticsearch/reference/1.4/query-dsl-bool-filter.html . Here you can give multiple
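
A minimal sketch of the `should` approach mentioned above, assuming the 1.x `filtered` query with a `bool` filter from the linked 1.4 docs. The field names `author_id` and `tag` and the helper `terms_any_of` are hypothetical, chosen only for illustration.

```python
import json


def terms_any_of(field_to_values):
    """Build a filtered query whose bool filter ORs one terms filter per field.

    Hypothetical helper: each entry becomes a separate `terms` filter inside
    `should`, so a document matching any one of them passes the filter.
    """
    should = [
        {"terms": {field: list(values)}}
        for field, values in field_to_values.items()
    ]
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": {"should": should}},
            }
        }
    }


# Illustrative field names, not taken from the thread.
body = terms_any_of({"author_id": [1, 2], "tag": ["es", "lucene"]})
print(json.dumps(body, indent=2))
```

The body would be sent to a `_search` endpoint. Whether a practical upper limit on clauses applies depends on the cluster and is not answered by this sketch.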

Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
Hello, I've created an index I use for logging. This means there are mostly writes, and some searches once in a while. In the phase of the first loading, I'm using several clients to concurrently index documents using the bulk API. At first, indexing takes 200 ms for a bulk of 5000 documents.

Indexing new document and check version

2015-04-24 Thread Tomáš Jurák
Hi, when I create a new document and call prepareIndex, it overwrites an existing one (created just a nanosecond earlier by another thread). Is it possible to use ES versions to control this behavior, as when doing updates? The problem is that when the document exists I can receive its
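
One option that may fit this case is the REST `op_type=create` parameter, which makes an index request fail instead of overwriting when a document with the same ID already exists. The sketch below only builds the request path; the index, type and ID values and the helper name `create_only_path` are hypothetical, and the thread does not say whether this is what the poster ended up using.

```python
from urllib.parse import quote, urlencode


def create_only_path(index, doc_type, doc_id):
    """Return the path for a PUT that should fail if the document exists.

    Hypothetical helper. `op_type=create` asks Elasticsearch to reject the
    request rather than replace an existing document with the same ID.
    """
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return "/" + "/".join(parts) + "?" + urlencode({"op_type": "create"})


# Illustrative names only.
path = create_only_path("crm", "person", "42")
print("PUT " + path)
```

The losing thread would then receive a conflict response and can decide whether to read the existing document or retry.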

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
Hey David, I suspect it indeed might be the cause, but I'm kind of a newbie here. What metric do I need to monitor, what would be a problematic value, and basically, how can I play with merge settings to test if I can improve this? Some rules of thumb for a newbie would be appreciated. I

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
It is an issue, as I am hitting 7000 read operations per second (the limit of my volume's IOPS). As the index grows larger the problem worsens: I was once able to update with 10 clients concurrently, and now I can barely use one client. Also, I used an _optimize endpoint to have all

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread David Pilato
That’s normal. I was just answering that even if you think you are only writing data while indexing, you are also reading data behind the scenes to merge Lucene segments. You can potentially try to play with index.translog.flush_threshold_size
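
A rough sketch of what adjusting the setting named above could look like as an index settings update. The index name `logs`, the value `1gb` and the helper `translog_settings` are assumptions for illustration; the thread does not recommend a specific value.

```python
import json


def translog_settings(flush_threshold_size):
    """Body for an index settings update that changes the translog flush threshold.

    Hypothetical helper; the size string (e.g. "1gb") is an illustrative value,
    not a recommendation from the thread.
    """
    return {"index": {"translog": {"flush_threshold_size": flush_threshold_size}}}


settings_body = translog_settings("1gb")
# Assumed index name "logs"; the body would be sent with PUT to this path.
print("PUT /logs/_settings")
print(json.dumps(settings_body))
```

Any change is best tested against the read IOPS metric before and after, since the effect depends on the workload.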

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
You can find mapper attachment plugin 2.43, which is for elasticsearch version 1.43, at https://github.com/elastic/elasticseach-mapper-attachments Clicking the 2.43 link will take you to the page where you can download version 2.43. On Apr 24, 2015 12:44 AM, David Pilato da...@pilato.fr wrote: Ok but this mapper

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread David Pilato
Merging segments could be the cause here? David On 24 Apr 2015 at 09:54, Eran era...@gmail.com wrote: Forgot some stats: I have 10 shards, no replicas, all on the same machine. ATM, there are some 1.5 billion records in the index. On Friday, April 24, 2015 at 10:18:27 AM UTC+3,

Apply word_delimiter token filter on words having 5 chars or more.

2015-04-24 Thread Nassim
Hi, Is it possible to apply the word_delimiter token filter only on words having 5 chars or more? Thank you. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email

Re: Snapshot is stuck in IN_PROGRESS

2015-04-24 Thread Pradeep Reddy
Joining new nodes with 1.5.x version then terminating old nodes solved the issue.

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread christian . dahlqvist
Hi Eran, Which version of Elasticsearch are you using? Are you assigning your own document IDs or letting Elasticsearch assign them automatically? Best regards, Christian On Friday, April 24, 2015 at 7:49:56 AM UTC+1, Eran wrote: Hello, I've created an index I use for logging. This

Re: Geo Mapping from Twitter

2015-04-24 Thread Sree
It worked. On Friday, 24 April 2015 09:04:10 UTC+5:30, Sree wrote: Hi all, "coordinates" : { "type" : "Point", "coordinates" : [ 100.41404641, 5.37384675 ] }, These are the geo coordinates from Twitter. I tried it with coordinates:
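
The thread does not show which mapping ended up working. As a sketch under that caveat: Twitter's `coordinates` object is GeoJSON, which orders a point as [longitude, latitude], so one way to index it is to convert it into the explicit `lat`/`lon` object form that a `geo_point` field accepts. The function name below is hypothetical.

```python
def geojson_point_to_geo_point(geo):
    """Convert a GeoJSON Point ([lon, lat]) into a {"lat": ..., "lon": ...} object.

    Hypothetical helper; raises ValueError for anything that is not a Point.
    """
    if geo.get("type") != "Point":
        raise ValueError("expected a GeoJSON Point, got %r" % geo.get("type"))
    lon, lat = geo["coordinates"]
    return {"lat": lat, "lon": lon}


# Values taken from the message above.
tweet_coordinates = {"type": "Point", "coordinates": [100.41404641, 5.37384675]}
location = geojson_point_to_geo_point(tweet_coordinates)
print(location)
```

Using the object form avoids any ambiguity about which number is the latitude.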

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread David Pilato
No. 2.43 does not exist. -- David Pilato - Developer | Evangelist elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 24 Apr 2015 at 12:53, Pccom Frank pccom.fr...@gmail.com wrote:

BulkProcessor best practices

2015-04-24 Thread mzrth_7810
Hey guys, I'm using the BulkProcessor to index documents in elasticsearch. It's definitely made my indexing throughput greater than it was before. Anyway, I was wondering if there were some best practices around exception handling with the bulk processor. For example it would be good to

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
I successfully installed the head plugin in FreeBSD from its port, /usr/ports/textproc/elasticsearch-plugin-head. In FreeBSD, it is installed with the command make install clean, not bin/plugin. The problem is that there is no such port for mapper attachments. This is my original problem. FreeBSD

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
I'm using the newest version, 1.5.1. I'm assigning my own ID using path: "_id": { "path": "msg_id" }. msg_id is a self-generated, hashed identifier (it's actually somewhat like a cookie ID). On Friday, April 24, 2015 at 1:47:39 PM UTC+3, christian...@elasticsearch.com wrote: Hi Eran, Which

Shuffle results by a property

2015-04-24 Thread Cassiano Tartari
Hello! I have a marketplace and I would like to sort the search results so that products from different advertisers are mixed. Can someone help me? I'm making a filtered query like this: "query": { "filtered": { "query": { "bool": { "must": [

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
https://github.com/elastic/elasticsearch-mapper-attachments/tree/v2.4.3#version-243-for-elasticsearch-14

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
This is the FreeBSD elasticsearch-plugin command: /usr/local/bin/elasticsearch-plugin
Usage:
  -u, --url [plugin location] : Set exact URL to download the plugin from
  -i, --install [plugin name] : Downloads and installs listed plugins [*]
  -t, --timeout [duration] :

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
Wow, awesome. I'll try that, thanks! On Friday, April 24, 2015 at 2:17:45 PM UTC+3, christian...@elasticsearch.com wrote: Hi Eran, If you are assigning your own ID, Elasticsearch needs to search and check if the document already exists before writing it. This could explain why the bulk

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Jason Wee
The merging graph you shared looks normal to me. We had ES with 10 shards too, and I monitor the segments using SegmentSpy; the segment graph in your attachment looks pretty much the same as ours. Jason On Fri, Apr 24, 2015 at 4:45 PM, Eran era...@gmail.com wrote: Hey David, I suspect it indeed might

Re: BulkProcessor best practices

2015-04-24 Thread David Pilato
If you make your bulk final, I think this could work: private final BulkProcessor bulk; CrmApp() { Client esClient = new TransportClient( ImmutableSettings.builder().put("cluster.name", "devoxx") ).addTransportAddress( new InetSocketTransportAddress("127.0.0.1", 9300)

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread christian . dahlqvist
Hi Eran, If you are assigning your own ID, Elasticsearch needs to search and check if the document already exists before writing it. This could explain why the bulk insert performance goes down as the size of the index grows. If you are not going to update the documents, I would therefore
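
To make the difference concrete, here is a small sketch that builds a bulk request body either with caller-supplied IDs or without any `_id`, in which case Elasticsearch generates one and the existence lookup described above is not needed. The index and type names and the helper `bulk_body` are hypothetical; only `msg_id` comes from the thread.

```python
import json


def bulk_body(index, doc_type, docs, id_field=None):
    """Build a newline-delimited bulk body of `index` actions.

    Hypothetical helper. When `id_field` is None the action metadata carries
    no `_id`, leaving ID generation to Elasticsearch.
    """
    lines = []
    for doc in docs:
        meta = {"_index": index, "_type": doc_type}
        if id_field is not None:
            meta["_id"] = doc[id_field]
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    # The bulk API expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


docs = [{"msg_id": "a1", "text": "first"}, {"msg_id": "b2", "text": "second"}]
auto_id_payload = bulk_body("logs", "event", docs)
own_id_payload = bulk_body("logs", "event", docs, id_field="msg_id")
print(auto_id_payload)
```

With auto-generated IDs, `msg_id` can still be kept as an ordinary field for searching; what is lost is deduplication by ID.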

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
You can find all versions here: https://github.com/elastic/elasticsearch-mapper-attachments#mapper-attachments-type-for-elasticsearch On Fri, Apr 24, 2015 at 7:16 AM, Pccom Frank pccom.fr...@gmail.com wrote:

Re: WordCloud in Elasticsearch

2015-04-24 Thread Jeff Fogarty
Hi Alfredo, My goal is to use the features in ES to create a wordcloud as easily as possible. The termvector or significant terms query seems to be the most useful. A visualization of the 'significant' words is all I'm after. On Thursday, April 23, 2015 at 10:26:14 AM UTC-5, Alfredo Serafini

Web page feature request

2015-04-24 Thread Attila Nagy
Hi, I use google to find elasticsearch related answers which tends to find the relevant pages from elastic.co's docs page, for example this one: http://www.elastic.co/guide/en/elasticsearch/reference/0.90/query-dsl-nested-filter.html Note that this is for 0.90, which is quite annoying, but

Re: WordCloud in Elasticsearch

2015-04-24 Thread mark
"A visualization of the 'significant' words is all I'm after." The main question then is "significant compared to what?" Straight popularity counts (e.g. terms agg) will just tell you the term "the" is very popular. To use significant_terms you need to provide a foreground set and a background set
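
A minimal sketch of the foreground/background idea, assuming the `significant_terms` aggregation: the search query selects the foreground set, and when no background is specified the index as a whole serves as the background. The field name `body`, the aggregation name `cloud_words` and the query used here are hypothetical.

```python
import json


def significant_terms_request(foreground_query, field, size=20):
    """Build a search body that returns only the significant_terms buckets.

    Hypothetical helper: `foreground_query` defines the foreground set;
    the aggregation compares term frequencies in it against the background.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": foreground_query,
        "aggs": {
            "cloud_words": {
                "significant_terms": {"field": field, "size": size}
            }
        },
    }


# Illustrative foreground: documents mentioning "elasticsearch".
request = significant_terms_request({"match": {"body": "elasticsearch"}}, "body")
print(json.dumps(request, indent=2))
```

Each returned bucket carries a term and a score, which could then be mapped to font size in the word cloud.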

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread David Pilato
Yes. I told you. The plugin version is 2.4.3 not 2.43 /usr/local/bin/elasticsearch-plugin install elasticsearch/elasticsearch-mapper-attachments/2.4.3

Re: Elasticsearch crashed after start

2015-04-24 Thread Ann Yablunovskaya
[2015-04-24 03:38:04,501][INFO ][node ] [server] version[1.5.1], pid[29957], build[5e38401/2015-04-09T13:41:35Z]
[2015-04-24 03:38:04,502][INFO ][node ] [server] initializing ...
[2015-04-24 03:38:04,536][INFO ][plugins ] [server] loaded

Get a fixed random sample from all documents

2015-04-24 Thread Sebastian Rickelt
Hi, I want to fetch a fixed large number of documents randomly from Elasticsearch to compute some statistics (100,000 out of 10 M documents). The randomness has to be predictable so that I get the same documents with every request. My problem is that scan and scroll is fast but as I
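
One possible direction, offered as a sketch and not as the answer given in the thread, is a `function_score` query with a seeded `random_score`, so that the same seed yields the same ordering for as long as the index itself does not change. The helper name and the seed value are hypothetical, and fetching 100,000 hits in a single request may be expensive.

```python
import json


def seeded_sample_request(seed, size):
    """Build a search body that orders all documents by a seeded random score.

    Hypothetical helper. Repeating the request with the same seed is intended
    to return the same documents, assuming the index is unchanged in between.
    """
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "random_score": {"seed": seed},
            }
        },
    }


sample_request = seeded_sample_request(seed=12345, size=100000)
print(json.dumps(sample_request))
```

Storing the seed alongside the computed statistics would let the same sample be requested again later.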

Re: DynamoDB river plugin

2015-04-24 Thread Nilay Khandhar
Hi, I somehow managed to get the plugin running in Elasticsearch, but now the issue is that it is not fetching the values from my DynamoDB table, i.e. Recipe. I tried this: curl -XPUT 'localhost:9200/_river/dynamodb/_meta' -d '{ "type" : "dynamodb", "dynamodb" : { "access_key" : "XX",

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-24 Thread Pccom Frank
/usr/local/bin/elasticsearch-plugin --list Installed plugins: - mapper-attachments - head On Fri, Apr 24, 2015 at 8:47 AM, Pccom Frank pccom.fr...@gmail.com wrote: Thank you! Success! elasticsearch-plugin --install elasticsearch/elasticsearch-mapper-attachments/2.4.3

Re: Apply word_delimiter token filter on words having 5 chars or more.

2015-04-24 Thread Ivan Brusic
Your best option would be to write your own filter. It should be easy since you have access to the source of the delimiter and length filters. Look at the existing filter plugins for examples on how to deploy. Ivan On Apr 24, 2015 10:39 AM, Nassim nassim.ka...@gmail.com wrote: Hi, Is it
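
The logic such a custom filter would implement can be sketched outside Lucene. The Python below is only an illustration of the length gate, not plugin code: `split_on_delimiters` is a simplified stand-in for word_delimiter that splits on non-alphanumeric characters only, and both function names are hypothetical.

```python
import re


def split_on_delimiters(token):
    """Simplified stand-in for word_delimiter: split on non-alphanumeric chars."""
    return [part for part in re.split(r"[^0-9A-Za-z]+", token) if part]


def length_gated_delimiter(tokens, min_length=5):
    """Split only tokens with at least `min_length` characters; pass others through."""
    out = []
    for token in tokens:
        if len(token) >= min_length:
            out.extend(split_on_delimiters(token))
        else:
            out.append(token)
    return out


print(length_gated_delimiter(["wi-fi", "a-b", "power-shot", "x_y"]))
# → ['wi', 'fi', 'a-b', 'power', 'shot', 'x_y']
```

A real implementation would wrap the same condition around the existing delimiter filter's token stream, as the reply suggests.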

issue with multi_match queries for nested documents

2015-04-24 Thread Anatoly Petkevich
I need multilanguage text search for documents and decided to use types and multi_match queries for that as described in http://www.elastic.co/guide/en/elasticsearch/guide/current/mapping.html It works smoothly for flat documents, but I cannot figure out whether it is applicable to nested

Marvel showing unresponsive nodes but active data

2015-04-24 Thread Tristan Hammond
Hi all. So we have an Elasticsearch cluster up and running. 5 nodes (1 master, 1 client and 3 data). At some point the Marvel dash began stating on all our nodes and indices that no report has been received for more than 2m. I'm not sure when this seems to have started, but the cluster status

Relationship search

2015-04-24 Thread ming ming
Hi! I am a newbie to this forum and I've recently run into a problem with relationship model search (Elasticsearch + Ruby on Rails). I have a document model that has_many geolocations, through georeferences. It looks like this: *Document* has_many :geolocations, through: :georeferences has_many

Re: Issue when MatchPhasePrefix and Sort

2015-04-24 Thread TB
The field is not 594 MB in size. Could this be because the JVM does not have enough memory allocated? I have set mlock_all: true in the config. I have not changed the ulimit or the ES_MIN and ES_MAX variables; could this be related to that? On Thursday, April 23, 2015 at 8:16:33 PM

Re: Bulk indexing creates a lot of disk read OPS

2015-04-24 Thread Eran
Forgot some stats: I have 10 shards, no replicas, all on the same machine. ATM, there are some 1.5 billion records in the index. On Friday, April 24, 2015 at 10:18:27 AM UTC+3, Eran wrote: attachments hereby On Friday, April 24, 2015 at 9:49:56 AM UTC+3, Eran wrote: Hello, I've created

Re: Issue when MatchPhasePrefix and Sort

2015-04-24 Thread Mark Walkom
See http://www.elastic.co/guide/en/elasticsearch/guide/master/_limiting_memory_usage.html#circuit-breaker Basically ES is protecting you against potential OOM killers. On 25 April 2015 at 01:27, TB txind...@gmail.com wrote: The field is not of 594 MB, could this be related to where the JVM
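
For reference, the fielddata circuit breaker described on the linked page is controlled by a cluster setting. The sketch below builds a cluster settings update body, assuming the 1.4+ setting name `indices.breaker.fielddata.limit`; the value `40%` and the helper name are illustrative only, and moving the limit changes the threshold without adding any heap.

```python
import json


def fielddata_breaker_settings(limit, persistent=False):
    """Body for a cluster settings update that changes the fielddata breaker limit.

    Hypothetical helper; `limit` is a heap percentage string such as "40%".
    """
    scope = "persistent" if persistent else "transient"
    return {scope: {"indices.breaker.fielddata.limit": limit}}


breaker_body = fielddata_breaker_settings("40%")
print("PUT /_cluster/settings")
print(json.dumps(breaker_body))
```

If the breaker keeps tripping, giving the JVM more heap or reducing fielddata usage on the sorted field is likely to help more than raising the limit.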