Re: Elastic Search as cache

2014-11-22 Thread Jorge Luis Betancourt González
Actually, the _source field is a regular field that is stored by default. Do you need to fetch all the fields you're sending to ES? By default, if no other field is marked as stored, ES parses the _source field and returns the value from it when you request a particular field. So I don't thin

Re: Elastic Search as cache

2014-11-22 Thread Jingzhao Ou
> > I did some experiments following your advice. One issue is that: when > "_source" is disabled, when I get a document by its id, the original JSON > is not there. How am I supposed to get my data back with "_source" set to > false? > I figured out a way to get data back when "_source" is d

Re: Elastic Search as cache

2014-11-22 Thread Jingzhao Ou
Hi, Jörg, I did some experiments following your advice. One issue is that: when "_source" is disabled, when I get a document by its id, the original JSON is not there. How am I supposed to get my data back with "_source" set to false? For example, I have { "mappings":{ "_default_":

Re: Treatment of special characters in elasticsearch

2014-11-22 Thread prachicsa
I am using the Java Transport client. Where do we specify how characters are handled there? On Sunday, November 23, 2014 2:15:52 AM UTC+5:30, Jörg Prante wrote: > > Check your client. If you use curl, and shell, it's the shell or curl that > is handling characters, for example, URI percent

Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-22 Thread Ivan Brusic
Great work everyone. Feel better about upgrading now. On Nov 22, 2014 4:42 PM, "Boaz Leskes" wrote: > Hi Christian, Daniel, > > I believe I found the issue - it has to do with the cloud plugins (both > AWS and GCE) and the way they create the node list for the unicast based > discovery. Effectiv

Re: Elastic Search as cache

2014-11-22 Thread Jingzhao Ou
Hi, Jörg, Thanks a lot for your reply and the useful info. I have more reads than writes in my case. So, the Elastic Search solution looks promising. I will go ahead with my tests just as you suggested. Best regards, Jingzhao On Saturday, November 22, 2014 2:57:40 PM UTC-8, Jörg Prante wrote:

Re: Elastic Search as cache

2014-11-22 Thread joergpra...@gmail.com
If you set "index: no" to all fields and disabled _all and _source, you have low overhead because Lucene does not need to index or merge anything. But your concerns are correct. Elasticsearch does not have in-place updates, there is a tradeoff between time and space for that operation. If you have
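The low-overhead setup described above can be sketched as an index-creation body. This is a minimal sketch in Elasticsearch 1.x mapping syntax, not the configuration used in the thread: the field name `payload` and the helper `cache_index_body` are hypothetical, and marking each field as `"store": true` is an assumption about one way to read values back once `_source` is disabled.

```python
import json


def cache_index_body(field_names):
    """Build an index-creation body for a cache-style index (ES 1.x syntax).

    Every field is left unindexed ("index": "no") so Lucene has nothing to
    index for it. "store": True is an assumption: with _source disabled,
    stored fields are one way to get the original values back.
    """
    properties = {
        name: {"type": "string", "index": "no", "store": True}
        for name in field_names
    }
    return {
        "mappings": {
            "_default_": {
                "_source": {"enabled": False},  # do not keep the original JSON
                "_all": {"enabled": False},     # do not build the catch-all field
                "properties": properties,
            }
        }
    }


# "payload" is a hypothetical field name used only for illustration.
body = cache_index_body(["payload"])
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of an index-creation request; whether stored fields or `_source` is the better trade-off depends on how many fields are read back per request.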

Re: terms filter with values to match in uppercase is not possible?

2014-11-22 Thread Emilio García-Pumarino Álvarez
Thanks! On Saturday, November 22, 2014 22:45:27 UTC, Jörg Prante wrote: > > You are using the standard analyzer which is using a lowercase filter. > > Jörg > > On Sat, Nov 22, 2014 at 11:30 PM, Emilio García-Pumarino Álvarez < > emil...@gmail.com > wrote: > >> Hi!, >> I have a document lik

Re: terms filter with values to match in uppercase is not possible?

2014-11-22 Thread joergpra...@gmail.com
You are using the standard analyzer which is using a lowercase filter. Jörg On Sat, Nov 22, 2014 at 11:30 PM, Emilio García-Pumarino Álvarez < emili@gmail.com> wrote: > Hi!, > I have a document like this: > { >"type": "film", >"countries": ["US", "ES"] > } > > And i insert it in elas
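The effect Jörg describes can be illustrated without a running cluster. The sketch below is a toy model, not Elasticsearch code: `standard_like_analyze` is a rough stand-in for the standard analyzer (split on non-word characters, then lowercase), and `terms_lookup` mimics a terms filter by comparing the query terms against the indexed tokens without analyzing them. All function names are hypothetical.

```python
import re


def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize, then lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_index(docs, field):
    """Build a toy inverted index (token -> set of doc ids) for one field."""
    index = {}
    for doc_id, doc in docs.items():
        values = doc[field] if isinstance(doc[field], list) else [doc[field]]
        for value in values:
            for token in standard_like_analyze(value):
                index.setdefault(token, set()).add(doc_id)
    return index


def terms_lookup(index, terms):
    """Mimic a terms filter: the query terms are NOT analyzed before lookup."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


docs = {1: {"type": "film", "countries": ["US", "ES"]}}
index = build_index(docs, "countries")

print(terms_lookup(index, ["US", "ES"]))  # set(): the index only holds "us" and "es"
print(terms_lookup(index, ["us", "es"]))  # {1}
```

In this model the uppercase terms miss because only lowercased tokens were indexed. Two common remedies are to lowercase the filter values, or to map the field as `"index": "not_analyzed"` so the original case is kept.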

Re: how to migrate lucene index into elasticsearch

2014-11-22 Thread joergpra...@gmail.com
I can not tell if it will work, but if you could translate your xml mapping into an Elasticsearch mapping it would be great. The next steps would be to create an empty index with the mapping, using 1 shard and no replica, _source and _all disabled. Then you could index one test doc over the ES API

terms filter with values to match in uppercase is not possible?

2014-11-22 Thread Emilio García-Pumarino Álvarez
Hi! I have a document like this: { "type": "film", "countries": ["US", "ES"] } I insert it into Elasticsearch and then run the following search: GET _search?search_type=dfs_query_and_fetch { "query": { "filtered": { "query": { "term": {"type": "film"} },

Re: Is there any way of tracking request id?

2014-11-22 Thread joergpra...@gmail.com
One possibility is to use search templates http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.html The template names could be used as request ids. But, if I understand your scenario, the real question is about how to identify clients (e.g. by the node and sock

Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-22 Thread Boaz Leskes
Hi Christian, Daniel, I believe I found the issue - it has to do with the cloud plugins (both AWS and GCE) and the way they create the node list for the unicast based discovery. Effectively they mislead it to think that all nodes in the cluster are version 1.4.0, which is not correct. I o

Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-22 Thread Boaz Leskes
Hi All, I believe I found the source of the problem and it has to do with the AWS plugin. I opened an issue for it, which should be pretty easy to fix: https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/143 . Cheers, Boaz On Friday, November 21, 2014 5:39:32 PM UTC+2, Ivan Brusic

Re: Registering listener to an index through elasticsearch plugin

2014-11-22 Thread joergpra...@gmail.com
You can add a listener to all (primary) shards of an index using ShardIndexingService and capture indexing operations with preIndex() or postIndex() method. Jörg On Sat, Nov 22, 2014 at 9:06 PM, Milad Alshomary wrote: > Is there any way to write an elastic plugin that can listen to an index > a

Re: Treatment of special characters in elasticsearch

2014-11-22 Thread joergpra...@gmail.com
Check your client. If you use curl, and shell, it's the shell or curl that is handling characters, for example, URI percent encoding. Elasticsearch, when it has received data, does not do any extra conversion, it expects UTF-8. Jörg On Sat, Nov 22, 2014 at 7:05 PM, wrote: > I use the following
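To make the client-side handling concrete, here is a small sketch of URI percent encoding using Python's standard library. The sample strings are made up for illustration; the point is only that reserved characters are rewritten by the client before the request leaves, and that the bytes behind the escapes are UTF-8.

```python
from urllib.parse import quote, unquote

# Hypothetical value containing characters that are special in a URI.
raw = "c++ & c#"

# safe="" forces every reserved character to be percent-encoded.
encoded = quote(raw, safe="")
print(encoded)           # c%2B%2B%20%26%20c%23
print(unquote(encoded))  # c++ & c#

# Non-ASCII characters are encoded as their UTF-8 bytes.
print(quote("Jörg"))           # J%C3%B6rg
print("Jörg".encode("utf-8"))  # b'J\xc3\xb6rg'
```

If a value is placed in a URL without this step, characters such as `+`, `&`, and `#` can be interpreted by the shell or the HTTP client before Elasticsearch ever sees them.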

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Nikolas Everett
Tiny shards have more overhead and aren't going to score results as accurately. On Nov 22, 2014 2:04 PM, "Yves Dorfsman" wrote: > On 2014-11-22 09:35, Otis Gospodnetic wrote: > > Hi Konstantin, > > > > Check out http://gibrown.com/2014/11/19/elasticsearch-the-broken-bits/ > > > > Good writing! T

Registering listener to an index through elasticsearch plugin

2014-11-22 Thread Milad Alshomary
Is there any way to write an elastic plugin that can listen to an index and get notified with the document that is being inserted to this index ? -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving

Strange shard counts

2014-11-22 Thread Jingzhao Ou
Hi, I checked the index stats by visiting http://localhost:9200/raw/_stats?pretty { "_shards" : { "total" : 10, "successful" : 5, "failed" : 0 }, There are 10 shards in total, 5 successful, 0 failed. But where are the missing 5 shards? Any insights? Thanks a lot! Jingzhao --
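One possible explanation, sketched as arithmetic: the `total` figure counts primary shards plus replica copies, and a replica is never allocated on the same node as its primary. The sketch assumes the Elasticsearch 1.x defaults of 5 primaries and 1 replica and a single data node; neither is confirmed in the message, and `shard_totals` is a hypothetical helper.

```python
def shard_totals(primaries, replicas, data_nodes):
    """Return (total, assignable, unassigned) shard copies for one index.

    Assumes copies of the same shard are never placed on the same node,
    so each shard can have at most `data_nodes` copies assigned.
    """
    total = primaries * (1 + replicas)
    copies_per_shard = min(1 + replicas, data_nodes)
    assignable = primaries * copies_per_shard
    return total, assignable, total - assignable


# Assumed setup: default 5 primaries, 1 replica, a single node.
print(shard_totals(5, 1, 1))  # (10, 5, 5)

# With a second data node every replica can be placed.
print(shard_totals(5, 1, 2))  # (10, 10, 0)
```

Under these assumptions the 5 "missing" shards are unassigned replicas rather than failures, which would match `"failed" : 0`.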

Elastic Search as cache

2014-11-22 Thread Jingzhao Ou
Hi, all, I am interested in using Elastic Search to replace memcached. That way, I have one less piece of software to maintain. First of all, I learned that I can retrieve a doc by id right after writing it into an index. There is no need to wait for indexing to finish. Second, I set "index" to "no" i

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Yves Dorfsman
On 2014-11-22 09:35, Otis Gospodnetic wrote: > Hi Konstantin, > > Check out http://gibrown.com/2014/11/19/elasticsearch-the-broken-bits/ > Good writing! Thanks. I wonder if there's any drawback to cutting indices into smaller (tiny?) shards? My thinking is this: We don't really change data in

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Konstantin Erman
Yes, I have noticed that article right away, simply because I keep googling ES related questions every day :-) Unfortunately the only practical advice I could learn from that article is to use doc_values instead of field data and it does not really help with "full node rebuild after short down

Treatment of special characters in elasticsearch

2014-11-22 Thread prachicsa
I use the following analyzer: curl -XPUT 'http://localhost:9200/sample/' -d ' { "settings" : { "index": { "analysis": { "analyzer": { "default": { "type": "custom", "tokenizer": "keyword", "filter": ["trim", "lowercase"]} } } } } }'
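What this analyzer does to an input string can be modeled in a few lines. This is a toy model of the three configured steps, not Elasticsearch code: the keyword tokenizer emits the whole input as a single token, `trim` strips surrounding whitespace, and `lowercase` lowercases the token. The function name and sample input are hypothetical.

```python
def analyze_default(text):
    """Toy model of the custom analyzer: keyword tokenizer, trim, lowercase."""
    tokens = [text]                            # keyword tokenizer: one token, input unchanged
    tokens = [t.strip() for t in tokens]       # trim filter: drop surrounding whitespace
    tokens = [t.lower() for t in tokens]       # lowercase filter
    return tokens


print(analyze_default("  C++ & Java "))  # ['c++ & java']
```

Special characters such as `+` and `&` pass through these steps untouched, so in this model any change to them has to come from somewhere else in the request path.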

Re: cost of automatics refresh

2014-11-22 Thread Nikolas Everett
The cost of automatic refresh if you haven't written anything is pretty close to 0. I believe elasticsearch keeps a list of all indexes that have been written to rather than checking each one. On Nov 19, 2014 3:04 PM, "Jinyuan Zhou" wrote: > I am curious about how much cost for both cpu and memo

Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-22 Thread joergpra...@gmail.com
As said, the change is due to unicast action, which was split in 1.4.0 to an old and a new action, see this commit: https://github.com/elasticsearch/elasticsearch/commit/e5de47d928582694c7729d199390086983779e6e

Re: Is there any way of tracking request id?

2014-11-22 Thread Otis Gospodnetic
Hi Krysztof, I'm not 100% sure if this will help, but we are adding tracing and error capture to SPM. I *think* that may do what you are after. In the meantime you could try setting up something like Zipkin. Otis -- Monitoring * Alerting * Anomaly Detection * Centr

Re: Tribe Nodes

2014-11-22 Thread Otis Gospodnetic
I believe so, yes. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, November 19, 2014 11:10:38 AM UTC-5, Hari Kosaraju wrote: > > Hi, > > Is it possible to have multiple tribe nodes connect to the sa

Re: cost of automatics refresh

2014-11-22 Thread Otis Gospodnetic
See http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/ Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, November 19, 2014 3:04:29 PM UTC-5, Jinyuan Zhou wrot

Re: problem with heap space overusage

2014-11-22 Thread Otis Gospodnetic
It could be a number of things. Check your various ES caches. Full? Correlated with GC activity increase and eventual OOM. Then check your queries - are they big? Expensive aggregations? (the other day I saw one of our clients using agg queries 10K lines in size) I could keep asking questi

Re: how to migrate lucene index into elasticsearch

2014-11-22 Thread Otis Gospodnetic
You didn't say why you can't just reindex data from original source, but that would be the cleanest way and likely the fastest in terms of human time (and $) you'll *likely* spend if you try using a "shortcut". Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr &

Smart way to aggregate multiple HTTP routes by detecting variable parameters?

2014-11-22 Thread Alexandre Strzelewicz
Hello, I'm switching to Elasticsearch and I'm amazed by all its capabilities and performance. God, happy to know about it now. I'm currently facing a problem and wanted to know if ES could help me with it. I have a lot of routes stored in ES, and I'm doing aggregations by route, then

Re: Marvel / ES query document count major discrepancy

2014-11-22 Thread Otis Gospodnetic
Maybe it's counting replicas? Or its own docs? See SPM for an alternative. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, November 20, 2014 2:36:42 PM UTC-5, Mike Seid wr

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Otis Gospodnetic
Hi Konstantin, Check out http://gibrown.com/2014/11/19/elasticsearch-the-broken-bits/ Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, November 20, 2014 9:48:56 PM UTC-5, Konstantin Erman wrote: > >

Re: Index load distribution

2014-11-22 Thread Otis Gospodnetic
To add to Mark's comment - you'll obviously want to make sure your cluster is more or less balanced (in terms of shards, their sizes, etc.). Should happen automatically, but we've seen a number of situations where things were not working well because shards were not quite balanced, so you may

Re: Node spikes to 1000 threads and hangs, once or twice a day. Help?

2014-11-22 Thread Otis Gospodnetic
Hi, Look at query rates and see if they correlate. I'm guessing they jumped, too. SPM will help with that. Once you confirm you can trace the source of queries further upstream. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & El

Re: Using ASM 5.x

2014-11-22 Thread Eugen Paraschiv
Makes sense - thanks for the response. Cheers, Eugen. On Wed, Nov 19, 2014 at 7:47 PM, joergpra...@gmail.com < joergpra...@gmail.com> wrote: > If you do not use Lucene expressions, you can ignore/remove Lucene > expression and asm jars. > > Maybe your question should be forwarded to the Lucene te

Re: Node spikes to 1000 threads and hangs, once or twice a day. Help?

2014-11-22 Thread joergpra...@gmail.com
Check the query logs. Maybe your site is crawled and you do not use robots.txt. There is no default limit of 1000 threads, what are you talking about? Jörg On Sat, Nov 22, 2014 at 6:51 AM, Christopher Ambler < const.dogbe...@gmail.com> wrote: > Odd behavior - our 5-node cluster hums along happil