Re: What tools to use for indexing ES in production?

2015-02-19 Thread Kevin Liu
Well, the sales messages are coming through Kafka. We need to extract some 
info from the database. We can do anything, really. I'm just not sure what 
the common practices are here. There seem to be so many options. What kind of 
questions am I not asking here? 

On Monday, February 16, 2015 at 2:30:45 AM UTC-8, Mark Walkom wrote:

 This depends on how the app is made and what options you have to extract 
 data from it.

 On 16 February 2015 at 20:28, Kevin Liu ke...@ticketfly.com wrote:

 We want to index our sales records as they come in from our apps. We are 
 using a Quartz job right now, which is really slow and not really real-time. 

 We will be implementing a message bus soon for firing sales events. 

 The process is (a rough sketch of the last step follows the list): 
 read from a message queue
 grab some extra data from MySQL
 do some ETL to construct the document
 index it to ES. 
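
 Whatever consumer ends up doing that last step, the usual approach is to 
 batch documents into Elasticsearch's bulk API rather than indexing one at 
 a time. A minimal sketch (the index/type names and fields here are made up):

 curl -XPOST 'localhost:9200/_bulk' -d '
 { "index": { "_index": "sales", "_type": "sale", "_id": "1001" } }
 { "order_id": 1001, "amount": 42.50 }
 { "index": { "_index": "sales", "_type": "sale", "_id": "1002" } }
 { "order_id": 1002, "amount": 19.99 }
 '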

 I've been reading about Apache Spark, ES rivers, and Logstash. 

 My question is: what kind of tools are right for the job here? 
 Is Apache Spark overkill? 
 Is DIY a better option here? 
 What are you all using? 

 Please advise, and point me to the right things to read. 



Re: Is it safe to change node names in an existing ElasticSearch cluster

2015-02-19 Thread David Pilato
Yes. 

David

 On 19 Feb 2015 at 08:56, Jan-Erik Westlund je.westl...@gmail.com wrote:
 
 Correct, in that case it will not be a rolling upgrade ;-) The service will 
 be down for a few minutes.
 Can I then change all the node names, and then start the services on all the 
 nodes with the new names, without messing things up?
 
 2015-02-19 7:58 GMT+01:00 David Pilato da...@pilato.fr:
 You should define this in that case: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after
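
 For a three-node cluster, the gist of that page is a few lines in each 
 node's elasticsearch.yml; the values here are only illustrative:

 gateway.recover_after_nodes: 3
 gateway.expected_nodes: 3
 gateway.recover_after_time: 5m

 With those set, recovery waits until the expected number of nodes have 
 joined (or the timeout passes) instead of starting to move shards around 
 as soon as the first node comes up.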
 
 But then it's not a rolling upgrade anymore, right? Your service will be down 
 for some seconds or minutes, I guess.
 
 
 David
 
 On 19 Feb 2015 at 07:52, Jan-Erik Westlund je.westl...@gmail.com wrote:
 
 I understand that, but is it safe to change all the node names and restart 
 all the nodes at the same time?
 
 Sent from my iPhone 6.
 
 On 19 Feb 2015 at 07:47, David Pilato da...@pilato.fr wrote:
 
 You can safely change the elasticsearch.yml file while Elasticsearch is 
 running.
 The file is only loaded when Elasticsearch starts.
 
 David
 
 On 19 Feb 2015 at 07:33, Jan-Erik Westlund je.westl...@gmail.com wrote:
 
 Hi again!
 
 Thanks for the rolling restart info, that was really helpful.
 But since the elasticsearch.yml file is managed by Puppet, all the 
 node names will change at pretty much the same time!
 So in my case it would be best to shut down the ES daemon on all nodes 
 first, apply the Puppet changes, and then start the ES cluster again...
 Is it safe to do so?
 
 //Jan-Erik
 
 On Wednesday, 18 February 2015 at 16:44:35 UTC+1, David Pilato wrote:
 Have a look at 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-shutdown.html#_rolling_restart_of_nodes_full_cluster_restart
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr | @scrutmydocs
 
 
 
 On 18 Feb 2015 at 16:37, Jan-Erik Westlund je.we...@gmail.com wrote:
 
 Thanks David!
 
 All my recovery throttling settings are at their defaults in the 
 elasticsearch.yml file.
 How do I disable allocation in a running production environment? 
 Do I need to disable allocation first, restart each node/daemon, and 
 then rename the nodes?
 
 Or maybe it would be better to take the ES cluster down (all 3 nodes) during 
 a maintenance window, change all the names, and then restart the ES 
 cluster nodes again?
 
 //Jan-Erik
 
 On Wednesday, 18 February 2015 at 16:18:42 UTC+1, David Pilato wrote:
 Yes. It’s safe.
 You can do it one at a time.
 
 If you already have data around and don’t want your shards moving 
 during this, you should disable allocation.
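
 In 1.x, allocation can be toggled at runtime via the cluster settings API; 
 a minimal sketch:

 curl -XPUT 'localhost:9200/_cluster/settings' -d '{
   "transient": { "cluster.routing.allocation.enable": "none" }
 }'

 # ... restart/rename the node and wait for it to rejoin, then re-enable:
 curl -XPUT 'localhost:9200/_cluster/settings' -d '{
   "transient": { "cluster.routing.allocation.enable": "all" }
 }'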
 
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr | @scrutmydocs
 
 
 
 On 18 Feb 2015 at 16:14, Jan-Erik Westlund je.we...@gmail.com wrote:
 
 Hi!
 
 Is it safe to change the node names of my 3 nodes in an existing 
 Elasticsearch 1.4.0 cluster?
 
 The reason is to get rid of the random names like: Elizabeth Betsy 
 Braddock, Franz Kafka, etc...
 
 Is it just a matter of setting node.name: server name in elasticsearch.yml 
 and then restarting the daemon? 
 Do I do it one node at a time, or do I need to take down the cluster, 
 change all the node names, and then bring the cluster up again?
 
 
 //Jan-Erik
 

Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread David Pilato
I don't know without a concrete example.
I'd say that if your mapping has type number and you send "123", it could work. 

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs



 On 19 Feb 2015 at 09:30, Anil Karaka anilkar...@gmail.com wrote:
 
 It was my mistake: the field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings..
 
 Why didn't ES throw an error when I indexed docs with strings instead of 
 doubles..?
 
 On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:
 Did you apply your analyzer to your mapping?
 
 David
 
 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:
 
 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear
 
 Posted on Stack Overflow as well..
 
 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:
 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..
 
 I have my analyzer as below, 
 
 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }
 
 But when I try to do a terms aggregation over a field with strings 
 analyzed as above, I'm getting this error..
 
 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }
 
 Are there additional settings that I have to update in my custom analyzer 
 for my terms aggregation to work..?
 
 
 The better question is: I want a custom analyzer that does everything 
 not_analyzed does but is case-insensitive.. How do I achieve that?
 
 
 


Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread Anil Karaka
It was my mistake: the field I was trying to aggregate on was mapped as 
double; I assumed it was a string after seeing some sample documents with 
strings..

Why didn't ES throw an error when I indexed docs with strings instead of 
doubles..?

On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:

 Did you apply your analyzer to your mapping?

 David

 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:


 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear

 Posted on Stack Overflow as well..

 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:

 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..

 I have my analyzer as below, 

 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }

 But when I try to do a terms aggregation over a field with strings 
 analyzed as above, I'm getting this error..

 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }

 Are there additional settings that I have to update in my custom analyzer 
 for my terms aggregation to work..?


 The better question is: I want a custom analyzer that does everything 
 not_analyzed does but is case-insensitive.. How do I achieve that?
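
 For reference, one way to get that behaviour, sketched in the same style as 
 the examples later in this thread (the index, type, and field names here are 
 made up): declare the custom analyzer in the index settings and reference it 
 from each string field's mapping. A terms aggregation on such a field then 
 buckets on the lowercased whole value.

 PUT myindex
 {
   "settings": {
     "analysis": {
       "analyzer": {
         "case_insensitive_keyword_analyzer": {
           "type": "custom",
           "tokenizer": "keyword",
           "filter": ["lowercase"]
         }
       }
     }
   },
   "mappings": {
     "doc": {
       "properties": {
         "os": { "type": "string", "analyzer": "case_insensitive_keyword_analyzer" }
       }
     }
   }
 }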





Re: ElasticSearch search performance question

2015-02-19 Thread Mark Harwood
Good stuff. You're seeing the benefits of not caching lots of single-use 
BitSets.
Now if you swap queries for your filters then you'll also see the benefits 
of not allocating multi-megabyte BitSets to hold what is typically a single 
bit for each query that you run.
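
Concretely, the swap is from the cached-filter form to the plain query form - 
a sketch with a made-up field (in 1.x a term filter is cached as a bitset by 
default, whereas a term query allocates no bitset at all):

# filter form - allocates and caches a bitset per term:
{ "query": { "filtered": { "filter": { "term": { "user_id": "12345" } } } } }

# query form - no bitset:
{ "query": { "term": { "user_id": "12345" } } }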

On Thursday, February 19, 2015 at 6:23:08 AM UTC, Jay Danielian wrote:

 Just to update the thread.

 I added code to disable caching on all the term filters we were using, and 
 it made a huge performance improvement. Now we are able to service the 
 queries with an average response time under two seconds, which is excellent 
 (we are bundling several searches using _msearch, so < 2 seconds total 
 response is good). The search requests/sec metric is still peaking at 
 around 600/sec, but our CPU only spikes to about 65% now - so I 
 think we can add more search threads to our config, as we are no longer 
 maxing out the CPU. I also see a bit of disk read activity now, which, against 
 our non-RAID EBS drive, means we may be able to squeeze out more if we change 
 the disk setup.

 It seems like having these filters add cache entries was wasting CPU on 
 cache eviction and cache lookups (cache misses, really) for each query - 
 which only shows up when you try to push some load through.
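
 For anyone landing here later: per-filter caching in 1.x can also be 
 switched off inline, which is presumably what the code change above does - 
 a sketch with a made-up field:

 { "query": { "filtered": { "filter": { "term": { "user_id": "12345", "_cache": false } } } } }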

 Thanks for everyone's suggestions!!

 J

 On Friday, February 13, 2015 at 11:55:52 AM UTC-5, Jay Danielian wrote:

 Thanks to all for these great suggestions. I haven't had a chance to 
 change the syntax yet, as that is a risky thing for me to quickly change 
 against our production setup. My plan is to try it this weekend (so I can 
 properly test that the new syntax returns the same results). However, is 
 there a way to turn filter caching off globally, via config or elsewhere?

 Thanks!

 J

 On Friday, February 13, 2015 at 11:25:20 AM UTC-5, Mark Harwood wrote:

 So I can see in the hot threads dump the initialization requests for 
 those FixedBitSets I was talking about.
 Looking at the number of docs in your index, I estimate each term filter to be 
 allocating 140mb of memory in total for all these bitsets across all shards, 
 given the 1bn docs in your index. Remember that you are probably setting 
 only a single bit in each of these large structures. 
 Another stat (if I read it correctly) shows 5m evictions of these 
 cached filters, given their low reusability. It's fair to say you have some 
 cache churn going on :)
 Did you try my earlier suggestion of queries, not filters?




 On Friday, February 13, 2015 at 2:29:42 PM UTC, Jay Danielian wrote:

 As requested here is a dump of the hot threads output. 

 Thanks!

 J

 On Thursday, February 12, 2015 at 6:45:23 PM UTC-5, Nikolas Everett 
 wrote:

 You might want to try hitting hot threads while putting your load on 
 it and seeing what you see.  Or posting it.

 Nik

 On Thu, Feb 12, 2015 at 4:44 PM, Jay Danielian 
 jay.da...@circleback.com wrote:

 Mark,

 Thanks for the initial reply. Yes, your assumption about these things 
 being very specific, and thus not likely to see any re-use with regards to 
 caching, is correct. I have attached some screenshots from the BigDesk 
 plugin, which showed a decent snapshot of what the server looked like while 
 my tests were running. You can see the spikes in CPU that essentially 
 covered the duration when the JMeter tests were running. 
 
 At a high level, the only thing that seems to be really stressed on 
 the server is CPU. But that makes me think that there is something in my 
 setup, query syntax, or perhaps the cache eviction rate, etc. that is 
 causing it to spike so high. I also have concerns about the non-RAID 0 EBS 
 volumes, as I know that having one large volume does not maximize 
 throughput - however, just looking at the stats, it doesn't seem like IO is 
 really a bottleneck.

 Here is a sample query structure: 
 https://gist.github.com/jaydanielian/c2be885987f344031cfc
 
 Also, this is one query - in reality we use _msearch to pipeline 
 several of these queries in one batch. The queries also include a custom 
 routing key to make sure we only hit one shard.

 Thanks!

 J


 On Thursday, February 12, 2015 at 4:22:29 PM UTC-5, Mark Walkom wrote:

 It'd help if you could gist/pastebin/etc a query example.

 Also, your current ES and Java need updating; there are known issues 
 with Java 1.7u55, and you will always see performance boosts running the 
 latest version of ES.

 That aside, what is your current resource utilisation like?  Are you 
 seeing lots of cache evictions, high heap use, high CPU, IO delays?

 On 13 February 2015 at 07:32, Jay Danielian 
 jay.da...@circleback.com wrote:

 I know this is difficult to answer; the real answer is always "It 
 Depends" :) But I am going to go ahead and hope I get some feedback 
 here.
 
 We are mainly using ES to issue terms searches against fields that 
 are non-analyzed. We are using ES like a key-value store, where once 
 the match is found we parse the _source JSON and return our model. We

Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread David Pilato
If you can provide a full working example, as I did, we can try it and see what 
is wrong.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs



 On 19 Feb 2015 at 10:01, Anil Karaka anilkar...@gmail.com wrote:
 I'm getting this error as well when using your PUT requests..
 
 It feels like I'm doing something wrong.. But I don't know what exactly..
 
 I'm using this index template.. 
 https://gist.github.com/syllogismos/c2dde4f097fea149e1a0
 
 I didn't specify a particular mapping for my index but reindexed from a 
 previous index.. and ended up with the mapping and documents that look like 
 the above.. Am I overlooking an obvious mistake? So lost right now..
 
 On Thursday, February 19, 2015 at 2:23:10 PM UTC+5:30, David Pilato wrote:
 I think you are doing something wrong.
 
 DELETE index
 PUT index
 {
   "mappings": {
     "doc": {
       "properties": {
         "foo": {
           "type": "double"
         }
       }
     }
   }
 }
 PUT index/doc/1
 {
   "foo": "bar"
 }
 
 gives:
 
 {
    "error": "MapperParsingException[failed to parse [foo]]; nested: NumberFormatException[For input string: \"bar\"]; ",
    "status": 400
 }
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com 
 http://elasticsearch.com/
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs
 
 
 
 On 19 Feb 2015 at 09:39, Anil Karaka anilk...@gmail.com wrote:
 
 "_source": {
     "Sort": "",
     "gt": "2015-02-18T15:07:10",
     "uid": "54867dc55b482b04da7f23d8",
     "usId": "54867dc55b482b04da7f23d7",
     "ut": "2015-02-18T20:37:10",
     "act": "productlisting",
     "st": "2015-02-18T15:07:46",
     "Filter": "",
     "av": "3.0.0.0",
     "ViewType": "SmallSingleList",
     "os": "Windows",
     "categoryid": "home-kitchen-curtains-blinds"
 }
 
 "properties": {
     "uid": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "ViewType": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "usId": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "os": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Sort": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Filter": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "categoryid": { "type": "double" },
     "gt": { "format": "dateOptionalTime", "type": "date" },
     "ut": { "format": "dateOptionalTime", "type": "date" },
     "st": { "format": "dateOptionalTime", "type": "date" },
     "act": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "av": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" }
 }
 
 
 A sample document and the index mappings above..
 
 
 On Thursday, February 19, 2015 at 2:03:11 PM UTC+5:30, David Pilato wrote:
 I don't know without a concrete example.
 I'd say that if your mapping has type number and you send "123", it could work. 
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com 
 http://elasticsearch.com/
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs
 
 
 
 On 19 Feb 2015 at 09:30, Anil Karaka anilk...@gmail.com wrote:
 
 It was my mistake: the field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings..
 
 Why didn't ES throw an error when I indexed docs with strings instead of 
 doubles..?
 
 On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:
 Did you apply your analyzer to your mapping?
 
 David
 
 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:
 
 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear
 
 Posted on Stack Overflow as well..
 
 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:
 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..
 
 I have my analyzer as below, 
 
 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }
 
 But when I try to do a terms aggregation over a field with strings analyzed 
 as above, I'm getting this error..
 
 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }

Re: Is it safe to change node names in an existing ElasticSearch cluster

2015-02-19 Thread Jan-Erik Westlund
Ok, thanks again.

2015-02-19 9:06 GMT+01:00 David Pilato da...@pilato.fr:

 Yes.

 David

 On 19 Feb 2015 at 08:56, Jan-Erik Westlund je.westl...@gmail.com wrote:

 Correct, in that case it will not be a rolling upgrade ;-) The service
 will be down for a few minutes.
 Can I then change all the node names, and then start the services on all the
 nodes with the new names, without messing things up?

 2015-02-19 7:58 GMT+01:00 David Pilato da...@pilato.fr:

 You should define this in that case:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after

 But then it's not a rolling upgrade anymore, right? Your service will be down
 for some seconds or minutes, I guess.


 David

 On 19 Feb 2015 at 07:52, Jan-Erik Westlund je.westl...@gmail.com wrote:

 I understand that, but is it safe to change all the node names and restart
 all the nodes at the same time?

 Sent from my iPhone 6.

 On 19 Feb 2015 at 07:47, David Pilato da...@pilato.fr wrote:

 You can safely change the elasticsearch.yml file while Elasticsearch is
 running.
 The file is only loaded when Elasticsearch starts.

 David

 On 19 Feb 2015 at 07:33, Jan-Erik Westlund je.westl...@gmail.com wrote:

 Hi again!

 Thanks for the rolling restart info, that was really helpful.
 But since the elasticsearch.yml file is managed by Puppet, all the
 node names will change at pretty much the same time!
 So in my case it would be best to shut down the ES daemon on all nodes
 first, apply the Puppet changes, and then start the ES cluster again...
 Is it safe to do so?

 //Jan-Erik

 On Wednesday, 18 February 2015 at 16:44:35 UTC+1, David Pilato wrote:

 Have a look at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-shutdown.html#_rolling_restart_of_nodes_full_cluster_restart

 --
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr
 https://twitter.com/elasticsearchfr | @scrutmydocs
 https://twitter.com/scrutmydocs



 On 18 Feb 2015 at 16:37, Jan-Erik Westlund je.we...@gmail.com wrote:

 Thanks David!

 All my recovery throttling settings are at their defaults in the
 elasticsearch.yml file.
 How do I disable allocation in a running production environment?
 Do I need to disable allocation first, restart each node/daemon, and
 then rename the nodes?

 Or maybe it would be better to take the ES cluster down (all 3 nodes) during
 a maintenance window, change all the names, and then restart the ES cluster
 nodes again?

 //Jan-Erik

 On Wednesday, 18 February 2015 at 16:18:42 UTC+1, David Pilato wrote:

 Yes. It’s safe.
 You can do it one at a time.

 If you already have data around and don’t want your shards moving
 during this, you should disable allocation.


 --
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com
 http://elasticsearch.com/*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr
 https://twitter.com/elasticsearchfr | @scrutmydocs
 https://twitter.com/scrutmydocs



 On 18 Feb 2015 at 16:14, Jan-Erik Westlund je.we...@gmail.com wrote:

 Hi!

 Is it safe to change the node names of my 3 nodes in an existing
 Elasticsearch 1.4.0 cluster?

 The reason is to get rid of the random names like: Elizabeth Betsy
 Braddock, Franz Kafka, etc...

 Is it just a matter of setting node.name: server name in elasticsearch.yml
 and then restarting the daemon?
 Do I do it one node at a time, or do I need to take down the cluster,
 change all the node names, and then bring the cluster up again?


 //Jan-Erik


Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread Anil Karaka
"_source": {
    "Sort": "",
    "gt": "2015-02-18T15:07:10",
    "uid": "54867dc55b482b04da7f23d8",
    "usId": "54867dc55b482b04da7f23d7",
    "ut": "2015-02-18T20:37:10",
    "act": "productlisting",
    "st": "2015-02-18T15:07:46",
    "Filter": "",
    "av": "3.0.0.0",
    "ViewType": "SmallSingleList",
    "os": "Windows",
    "categoryid": "home-kitchen-curtains-blinds"
}

"properties": {
    "uid": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "ViewType": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "usId": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "os": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "Sort": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "Filter": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "categoryid": { "type": "double" },
    "gt": { "format": "dateOptionalTime", "type": "date" },
    "ut": { "format": "dateOptionalTime", "type": "date" },
    "st": { "format": "dateOptionalTime", "type": "date" },
    "act": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
    "av": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" }
}


A sample document and the index mappings above..


On Thursday, February 19, 2015 at 2:03:11 PM UTC+5:30, David Pilato wrote:

 I don't know without a concrete example.
 I'd say that if your mapping has type number and you send "123", it could 
 work. 

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 09:30, Anil Karaka anilk...@gmail.com wrote:

 It was my mistake: the field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings..

 Why didn't ES throw an error when I indexed docs with strings instead 
 of doubles..?

 On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:

 Did you apply your analyzer to your mapping?

 David

 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:


 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear

 Posted on Stack Overflow as well..

 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:

 I wanted a custom analyzer that behaves exactly like not_analyzed, 
 except that fields are case insensitive..

 I have my analyzer as below, 

 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }

 But when I try to do a terms aggregation over a field with strings 
 analyzed as above, I'm getting this error..

 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }

 Are there additional settings that I have to update in my custom analyzer 
 for my terms aggregation to work..?


 The better question is: I want a custom analyzer that does everything 
 not_analyzed does but is case-insensitive.. How do I achieve that?





Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread Anil Karaka
I'm getting this error as well when using your PUT requests..

It feels like I'm doing something wrong.. But I don't know what exactly..

I'm using this index template.. 
https://gist.github.com/syllogismos/c2dde4f097fea149e1a0

I didn't specify a particular mapping for my index but reindexed from a 
previous index.. and ended up with the mapping and documents that look 
like the above.. Am I overlooking an obvious mistake? So lost right now..

On Thursday, February 19, 2015 at 2:23:10 PM UTC+5:30, David Pilato wrote:

 I think you are doing something wrong.

 DELETE index
 PUT index
 {
   "mappings": {
     "doc": {
       "properties": {
         "foo": {
           "type": "double"
         }
       }
     }
   }
 }
 PUT index/doc/1
 {
   "foo": "bar"
 }

 gives:

 {
    "error": "MapperParsingException[failed to parse [foo]]; nested: NumberFormatException[For input string: \"bar\"]; ",
    "status": 400
 }

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 09:39, Anil Karaka anilk...@gmail.com wrote:

 "_source": {
     "Sort": "",
     "gt": "2015-02-18T15:07:10",
     "uid": "54867dc55b482b04da7f23d8",
     "usId": "54867dc55b482b04da7f23d7",
     "ut": "2015-02-18T20:37:10",
     "act": "productlisting",
     "st": "2015-02-18T15:07:46",
     "Filter": "",
     "av": "3.0.0.0",
     "ViewType": "SmallSingleList",
     "os": "Windows",
     "categoryid": "home-kitchen-curtains-blinds"
 }

 "properties": {
     "uid": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "ViewType": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "usId": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "os": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Sort": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Filter": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "categoryid": { "type": "double" },
     "gt": { "format": "dateOptionalTime", "type": "date" },
     "ut": { "format": "dateOptionalTime", "type": "date" },
     "st": { "format": "dateOptionalTime", "type": "date" },
     "act": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "av": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" }
 }


 A sample document and the index mappings above..


 On Thursday, February 19, 2015 at 2:03:11 PM UTC+5:30, David Pilato wrote:

 I don't know without a concrete example.
 I'd say that if your mapping has type number and you send "123", it could 
 work. 

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://elasticsearch.com/*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 09:30, Anil Karaka anilk...@gmail.com wrote:

 It was my mistake: the field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings..

 Why didn't ES throw an error when I indexed docs with strings instead 
 of doubles..?

 On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:

 Did you apply your analyzer to your mapping?

 David

 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:


 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear

 Posted on Stack Overflow as well..

 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:

 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..

 I have my analyzer as below, 

 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }

 But when I try to do a terms aggregation over a field with strings analyzed 
 as above, I'm getting this error..

 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }

 Are there additional settings that I have to update in my custom analyzer for 
 my terms aggregation to work..?


 The better question is: I want a custom analyzer that does everything 
 not_analyzed does but is case-insensitive.. How do I achieve that?





Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread Anil Karaka
I understand what you are saying.. I was able to recreate the same error 
you showed myself..

I was not able to insert a string into your index whose mapping is "double", but I 
am able to insert a string into my older index whose mapping is "double".. 
Very weird..
But I don't know how you could recreate my case..

I'm using this index 
template, https://gist.github.com/syllogismos/c2dde4f097fea149e1a0, and then 
reindexed from an older index.. and it took the mapping as "double", and has 
strings in the indexed documents later..

Thanks for your help..

On Thursday, February 19, 2015 at 2:34:14 PM UTC+5:30, David Pilato wrote:

 If you can provide a full working example, as I did, we can try it and see 
 what is wrong.

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 10:01, Anil Karaka anilk...@gmail.com wrote:

 I'm getting this error as well when using your PUT requests..

 It feels like I'm doing something wrong.. But I don't know what exactly..

 I'm using this index template.. 
 https://gist.github.com/syllogismos/c2dde4f097fea149e1a0

 I didn't specify a particular mapping for my index but reindexed from a 
 previous index.. and ended up with the mapping and documents that look 
 like the above.. Am I overlooking an obvious mistake? So lost right now..

 On Thursday, February 19, 2015 at 2:23:10 PM UTC+5:30, David Pilato wrote:

 I think you are doing something wrong.

 DELETE index
 PUT index
 {
   "mappings": {
     "doc": {
       "properties": {
         "foo": {
           "type": "double"
         }
       }
     }
   }
 }
 PUT index/doc/1
 {
   "foo": "bar"
 }

 gives:

 {
    "error": "MapperParsingException[failed to parse [foo]]; nested: NumberFormatException[For input string: \"bar\"]; ",
    "status": 400
 }

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://elasticsearch.com/*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 09:39, Anil Karaka anilk...@gmail.com wrote:

 "_source": {
     "Sort": "",
     "gt": "2015-02-18T15:07:10",
     "uid": "54867dc55b482b04da7f23d8",
     "usId": "54867dc55b482b04da7f23d7",
     "ut": "2015-02-18T20:37:10",
     "act": "productlisting",
     "st": "2015-02-18T15:07:46",
     "Filter": "",
     "av": "3.0.0.0",
     "ViewType": "SmallSingleList",
     "os": "Windows",
     "categoryid": "home-kitchen-curtains-blinds"
 }

 "properties": {
     "uid": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "ViewType": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "usId": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "os": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Sort": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Filter": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "categoryid": { "type": "double" },
     "gt": { "format": "dateOptionalTime", "type": "date" },
     "ut": { "format": "dateOptionalTime", "type": "date" },
     "st": { "format": "dateOptionalTime", "type": "date" },
     "act": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "av": { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" }
 }


 A sample document and the index mappings above..


 On Thursday, February 19, 2015 at 2:03:11 PM UTC+5:30, David Pilato wrote:

 I don't know without a concrete example.
 I'd say that if your mapping has type number and you send "123", it could 
 work. 

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://elasticsearch.com/*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 19 Feb 2015 at 09:30, Anil Karaka anilk...@gmail.com wrote:

 It was my mistake: the field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings..

 Why didn't ES throw an error when I indexed docs with strings instead 
 of doubles..?

 On Thursday, February 19, 2015 at 1:35:08 PM UTC+5:30, David Pilato wrote:

 Did you apply your analyzer to your mapping?

 David

 On 19 Feb 2015 at 08:53, Anil Karaka anilk...@gmail.com wrote:


 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear

 Posted on Stack Overflow as well..

 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:

 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..

 I have my analyzer as below, 

 "index": {
     "analysis": {
         "analyzer": { // custom analyzer with keyword tokenizer and lowercase filter, same as not_analyzed but case-insensitive
 

Re: ES OOMing and not triggering cache circuit breakers, using LocalManualCache

2015-02-19 Thread Wilfred Hughes
After some experimentation, I believe _cluster/stats shows the total field 
data across the whole cluster. I managed to push my test cluster to 198MiB 
field data cache usage.

As a result, based on Zachary's feedback, I've set the following values in 
my elasticsearch.yml:

indices.fielddata.cache.size: 15gb
indices.fielddata.cache.expire: 7d
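
For the per-node view, the node stats API breaks field data down by node 
instead of summing across the cluster; a sketch:

curl "http://localhost:9200/_nodes/stats/indices/fielddata?human&pretty"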

On Thursday, 12 February 2015 15:15:32 UTC, Wilfred Hughes wrote:

 Oh, is field data per-node or total across the cluster? I grabbed a test 
 cluster with two data nodes, and I deliberately set fielddata really low:

 indices.fielddata.cache.size: 100mb

 However, after a few queries, I'm seeing more than 100MiB in use:

 $ curl "http://localhost:9200/_cluster/stats?human&pretty"
 ...
   "fielddata": {
     "memory_size": "119.7mb",
     "memory_size_in_bytes": 125543995,
     "evictions": 0
   },

 Is this expected?

 On Wednesday, 11 February 2015 18:57:28 UTC, Zachary Tong wrote:

 LocalManualCache is a component of Guava's LRU cache 
 (https://code.google.com/p/guava-libraries/source/browse/guava-gwt/src-super/com/google/common/cache/super/com/google/common/cache/CacheBuilder.java), 
 which is used by Elasticsearch for both the filter and field data cache. 
 Based on your node stats, I'd agree it is the field data usage which is 
 causing your OOMs.  CircuitBreaker helps prevent OOM, but it works on a 
 per-request basis.  It's possible for individual requests to pass the CB 
 because they use small subsets of fields, but over time the set of fields 
 loaded into field data continues to grow and you'll OOM anyway.

 I would prefer to set a field data limit, rather than an expiration.  A 
 hard limit prevents OOM because you don't allow the cache to grow anymore. 
  An expiration does not guarantee that, since you could get a burst of 
 activity that still fills up the heap and OOMs before the expiration can 
 work.

 -Z

 On Wednesday, February 11, 2015 at 12:50:45 PM UTC-5, Wilfred Hughes 
 wrote:

 After examining some other nodes that were using a lot of their heap, I 
 think this is actually field data cache:


 $ curl "http://localhost:9200/_cluster/stats?human&pretty"
 ...
 "fielddata": {
   "memory_size": "21.3gb",
   "memory_size_in_bytes": 22888612852,
   "evictions": 0
 },
 "filter_cache": {
   "memory_size": "6.1gb",
   "memory_size_in_bytes": 6650700423,
   "evictions": 12214551
 },

 Since this is storing logstash data, I'm going to add the following 
 lines to my elasticsearch.yml and see if I observe a difference once 
 deployed to production.

 # Don't hold field data caches for more than a day, since data is
 # grouped by day and we quickly lose interest in historical data.
 indices.fielddata.cache.expire: 1d


 On Wednesday, 11 February 2015 16:29:22 UTC, Wilfred Hughes wrote:

 Hi all

 I have an ES 1.2.4 cluster which is occasionally running out of heap. I 
 have ES_HEAP_SIZE=31G and, according to the heap dump generated, my biggest 
 memory users were:

 org.elasticsearch.common.cache.LocalCache$LocalManualCache 55%
 org.elasticsearch.indices.cache.filter.IndicesFilterCache 11%

 and nothing else used more than 1%.

 It's not clear to me what this cache is. I can't find any references to 
 ManualCache in the elasticsearch source code, and the docs 
 (http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/index-modules-fielddata.html) 
 suggest to me that the circuit breakers should stop requests or reduce 
 cache usage rather than OOMing.

 At the moment my cache was filled up, the node was actually trying to 
 index some data:

 [2015-02-11 08:14:29,775][WARN ][index.translog   ] [data-node-2] [logstash-2015.02.11][0] failed to flush shard on translog threshold
 org.elasticsearch.index.engine.FlushFailedEngineException: [logstash-2015.02.11][0] Flush failed
     at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
     at org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:604)
     at org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:202)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
     at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4416)
     at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2989)
     at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3096)
     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3063)
     at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:797)
     ... 5 more
 [2015-02-11 08:14:29,812][DEBUG][action.bulk 

elasticsearch-http-basic with ES 1.4.2

2015-02-19 Thread Eric
Has anyone had any luck using http-basic with ES 1.4.2? I just want to 
put some basic security on my ES instance from outside of the cluster, and 
this appears to be the easiest way, just whitelisting my other nodes. 
When I install and configure it, it shows requests going to the http-basic 
plugin, but it always accepts the username/password from localhost even if I 
put the wrong info in there. It also never prompts for a username/password 
from other IPs connecting to it. 

Locally it shows this:

[root@elasticsearch1 http-basic]# curl -v --user bob:wrongpassword localhost:9200
* About to connect() to localhost port 9200 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 9200 (#0)
* Server auth using Basic with user 'bob'
> GET / HTTP/1.1
> Authorization: Basic Ym9iOnBhc3N3b3JkMTIzNTU1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: localhost:9200
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain; charset=UTF-8
< Content-Length: 9
<
* Connection #0 to host localhost left intact
* Closing connection #0

From external sources it shows this in the logs.

[2015-02-19 14:56:29,816][INFO 
][com.asquera.elasticsearch.plugins.http.HttpBasicServer] [elasticsearch1] 
Authorization:null, Host:192.168.1.4:9200, Path:/, :null, 
Request-IP:192.168.1.4, Client-IP:null, X-Client-IPnull
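
If it helps anyone else: by default the plugin whitelists localhost, and 
whitelisted addresses bypass authentication entirely, which would explain 
both symptoms above. A sketch of the relevant elasticsearch.yml settings, 
going by my reading of the plugin's README (setting names are from its docs, 
so double-check them against your version):

http.basic.enabled: true
http.basic.user: admin
http.basic.password: secret_password
http.basic.ipwhitelist: ["10.0.0.5", "10.0.0.6"]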



Re: elastic search on t2.micro (Amazon WS)

2015-02-19 Thread Mark Walkom
Depends on your dataset and use.
I don't go below 2GB heaps when I'm just testing things.
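
Also worth noting: ES 1.x defaults to a 1 GB maximum heap, which on a 1 GB 
t2.micro leaves nothing for the OS or page cache. If you must experiment on 
one, capping the heap before starting ES is the first thing to try (the 
value here is illustrative):

export ES_HEAP_SIZE=512m
./bin/elasticsearch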

On 20 February 2015 at 05:52, Seung Chan Lim djs...@gmail.com wrote:

 What's the minimum RAM requirement?

 slim

 On Wednesday, February 18, 2015 at 5:18:58 PM UTC-5, Mark Walkom wrote:

 Your only real option here is to get a machine with more RAM.
 Try spinning up a VM locally, on your desk/laptop.

 On 19 February 2015 at 00:52, Seung Chan Lim djs...@gmail.com wrote:

 I'm trying to see if I can get Elasticsearch (ES) 1.3.8 working with
 Couchbase (CB) 3.0.2 on a t2.micro (Amazon WS).

 A t2.micro has 1 gig of RAM, which isn't a lot, but I'm only doing test
 development on this with not a lot of documents (1000).

 I just installed ES and followed the CB instructions to install the
 plugin and set up the XDCR to get the replication going from CB to ES.

 I also configured ES with 0 replicas and 1 shard (hoping this
 would help minimize RAM usage). But I'm still seeing behavior from ES
 where it locks up the server, making it unresponsive, then eventually
 complaining of lack of memory.

 Is there something else I can do to get this working on a t2.micro?

 I'm a complete newbie to ES, and any help would be great,

 thank you

 slim



[Spark] Unable to index JSON from HDFS using SchemaRDD.saveToES()

2015-02-19 Thread m shirley
This is my first real attempt at Spark/Scala, so be gentle.

I have a file called test.json on HDFS that I'm trying to read and index 
using Spark.  I'm able to read the file via SQLContext.jsonFile(), but when 
I try to use SchemaRDD.saveToEs() I get an invalid JSON fragment received 
error.  I'm thinking that the saveToEs() function isn't actually formatting 
the output as JSON and instead is just sending the value field of the RDD.

What am I doing wrong?

Spark 1.2.0
Elasticsearch-hadoop 2.1.0.BUILD-20150217

test.json:
{"key":"value"}

spark-shell:
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._

val input = sqlContext.jsonFile("hdfs://nameservice1/user/mshirley/test.json")

input.saveToEs("mshirley_spark_test/test")

error:
snip
org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable 
error [Bad Request(400) - Invalid JSON fragment 
received[[value]][MapperParsingException[failed to parse]; nested: 
ElasticsearchParseException[Failed to derive xcontent from (offset=13, 
length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 
91, 34, 118, 97, 108, 117, 101, 34, 93, 10]]; ]]; Bailing out..
snip

input:
res2: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[6] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
PhysicalRDD [key#0], MappedRDD[5] at map at JsonRDD.scala:47

input.printSchema():
root
 |-- key: string (nullable = true)
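
For what it's worth, that reading looks right: with the core 
org.elasticsearch.spark._ import, the rows are serialized as bare values 
rather than as JSON documents. es-hadoop 2.1 also ships a Spark SQL 
integration that understands SchemaRDDs; a sketch of the import swap (treat 
this as an assumption to verify against your build):

// the Spark SQL integration maps each Row to a JSON document using the schema
import org.elasticsearch.spark.sql._

val input = sqlContext.jsonFile("hdfs://nameservice1/user/mshirley/test.json")
input.saveToEs("mshirley_spark_test/test")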



Elasticsearch index name question

2015-02-19 Thread Silvana Vezzoli


I use a Monitoring Framework designed as a solution to monitor 
heterogeneous networks and systems in terms of services (platforms, 
applications for TELCO systems).

This framework collects in synchronous way the required data from several 
devices, it stores them in a Mongo Data Base and then transfers all stored 
collections from MongoDB to Elasticsearch via river-mongodb plugin.

We can have a huge amount of data stored in a single index of 
Elasticsearch, for example, about 5.2 millions of documents can be 
collected in a single MongoDB collection for only 8 hours of monitoring and 
so the number of documents in a single index grows rapidly.

At present, I have installed on Centos 6.5 server an Elasticsearch Cluster 
configuration with one node and five indices but only one index for all 
synchronous data.

My problem is to be able to create different indices in Elasticsearch 
across which I can split the synchronous data, and so I would like to know 
if it is possible to create an index name with a timestamp appended to it, 
the same way Logstash uses the timestamp of an event to derive the related 
Elasticsearch index name.

Some idea, suggestion, help?
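
One piece of the puzzle, for whoever picks this up: the index name itself is 
chosen by whatever does the writing (the river, in this case), but an index 
template makes sure every timestamped index that appears gets the right 
settings and mappings. A sketch, with "monitoring-" as an assumed name 
prefix:

curl -XPUT 'localhost:9200/_template/monitoring' -d '{
  "template": "monitoring-*",
  "settings": { "number_of_shards": 5 },
  "mappings": { ... }
}'

With that in place, the writer only has to index into e.g. 
monitoring-2015.02.19 and the index is created on demand.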



Re: elastic search cluster behind azure public load balancer

2015-02-19 Thread Subodh Patil
Ok Thanks Mark.
So you are saying I should set node.master: true as well as node.data: 
true for all 3 nodes in the cluster.
And will this hold true for a 4-node or even larger cluster?

As of now I am making the data redundant on all 3 nodes, but please advise 
on the performance impact.
The Azure load balancer's chances of failure are much lower than a VM's.


The redundancy is required from a backup point of view. So I will have a 
4-node cluster with data replicated on all 4 nodes:
3 nodes in the same region (say East US) behind the load balancer, and a 
4th node in a different region (say West US), NOT behind the load balancer, 
just holding all the data.
In case the East US data center has a problem, I can redirect all traffic 
to the West US data center, where the single node will always have all 
up-to-date data, or I can even take backups from the West US data center.

Thanks,
Subodh
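
For reference, the all-nodes-are-master-and-data setup Mark describes is 
just two lines of elasticsearch.yml per node, plus a quorum setting so a 
split 3-node cluster cannot elect two masters (a sketch; 2 is the quorum 
for 3 master-eligible nodes):

node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 2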



On Thursday, February 19, 2015 at 7:10:52 AM UTC+5:30, Mark Walkom wrote:

 Yes the master can serve requests.

 You don't really want 2 masters and 1 data node though, make all 3 
 master+data to start with. 
 And sure the client can be a SPOF, but then isn't a single load balancer a 
 SPOF as well? So the question remains, where are you happy dealing with 
 these points, because at some point you cannot make *everything* redundant 
 without being excessive.

 On 18 February 2015 at 21:36, Subodh Patil subod...@gmail.com wrote:


 i am trying to setup ES cluster behind azure load balancer /cloud 
 service. the cluster is 3 node with no specific data/client/master node 
 settings. By default 2 nodes are elected as master and 1 as data node.

 As the request (create/update/search) from application comes to azure 
 load balancer on 9200 port which load balanced for all 3 vms the request 
 can go to any vm.

 Will master node be able to serve the requests ?

 Many articles say that you don't need a load balancer for an ES cluster, 
 just use a client node, but then it becomes a single point of failure, as an 
 Azure VM can go down at any point in time. So load balancing is required 
 mainly for high availability from an infrastructure point of view.
 Please suggest a cluster setup and which nodes (data or client) to put 
 behind the load balancer.


 ES version 1.4.1 on windows server 2012 r2 vm
  







Script based transform during index

2015-02-19 Thread Demetris Lambrou
Hi Group
First, apologies if this is not the right way to ask the question below, 
but this is my first time.

I have some documents with source IP and destination IP addresses. I want 
to enrich these documents with geo info via a script transform when they 
arrive. So I created a script in Python that resolves the geo info for 
every destination IP and stores it in _source (from what I understand).

The template for the index is below. Everything works fine; however, I 
have two issues.

1. The field (which does not exist beforehand and which I create, namely 
"location") is not shown in a search unless explicitly asked for.
2. Kibana 3 does not show this field, or shows it as empty.

The "location" field is there if I explicitly ask for it. Can you please 
let me know how I can have these fields, added prior to indexing, available 
as normal fields?


Thanks in advance !

P.S. Inside the python script I update the below:
ctx['location'] = ip2geo(dest_ip)
ctx['_source']['location'] = ip2geo(dest_ip)

POST /geotest/gdoc/_search
{
  "query": {
    "match_all": {}
  },
  "fields": [
    "src_ip",
    "dst_ip",
    "location"   -- This is the new field which I add via ctx['_source']['location'] = ip2geo(dest_ip)
  ]
}

My template

PUT /_template/geo
{
  "template": "geo*",
  "mappings": {
    "gdoc": {
      "transform": {
        "lang": "python",
        "script": "python_ip2geo"
      },
      "_source": {
        "enabled": true
      },
      "properties": {
        "src_ip": {
          "type": "ip",
          "index": "not_analyzed"
        },
        "dst_ip": {
          "type": "ip",
          "index": "not_analyzed"
        },
        "location": {
          "type": "geo_point",
          "index": "analyzed",   -- does not need to be analyzed really
          "store": true,
          "doc_values": true,
          "null_value": ""
        }
      }
    }
  }
}


Can someone explain a bit more about how transformed fields are stored and 
how they can be indexed?

Thanks in advance



Date Histogram Bucket Count ?

2015-02-19 Thread neil . varnas
Is it possible to apply a filter to date histogram buckets?
For example, return only buckets that are below a certain value. I was 
looking to do it with scripts but don't know how to access buckets in a 
script, so if anyone knows anything about that, it would be very helpful.
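
For context, and hedged since versions differ: ES 1.x has no server-side 
filter over the buckets themselves. The closest built-in knob is 
min_doc_count, which keeps only buckets with at least N documents, so 
"below a value" has to be filtered on the client after the response comes 
back. A sketch, with index and field names assumed:

POST /myindex/_search
{
  "aggs": {
    "per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "day",
        "min_doc_count": 5
      }
    }
  }
}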



Re: Elasticsearch performance tuning

2015-02-19 Thread Deva Raj
Hi Mark Walkom,

I have given the logstash conf file below.

 
  Logstash conf

input {
  file {

  }
}

filter {
  mutate {
    gsub => ["message", "\n", " "]
  }
  mutate {
    gsub => ["message", "\t", " "]
  }
  multiline {
    pattern => "^ "
    what => "previous"
  }

  grok {
    match => ["message", "%{TIME:log_time}\|%{WORD:Message_type}\|%{GREEDYDATA:Component}\|%{NUMBER:line_number}\| %{GREEDYDATA:log_message}"]
    match => ["path", "%{GREEDYDATA}/%{GREEDYDATA:loccode}/%{GREEDYDATA:_machine}\:%{DATE:logdate}.log"]
    break_on_match => false
  }

  # To check whether the location is S or L
  if [loccode] == "S" or [loccode] == "L" {
    ruby {
      code => "temp = event['_machine'].split('_')
               if !temp.nil? || !temp.empty?
                 event['_machine'] = temp[0]
               end"
    }
  }

  mutate {
    add_field => ["event_timestamp", "%{@timestamp}"]
    replace => ["log_time", "%{logdate} %{log_time}"]
    # Remove the 'logdate' field since we don't need it anymore.
    lowercase => ["loccode"]
    remove => ["logdate"]
  }

  # to get all site details (site name, city and co-ordinates)
  sitelocator { sitename => "loccode" datafile => "vendor/sitelocator/SiteDetails.csv" }

  date {
    locale => "en"
    match => ["log_time", "yyyy-MM-dd HH:mm:ss", "MM-dd-yyyy HH:mm:ss.SSS", "ISO8601"]
  }
}

output {
  elasticsearch {
  }
}



I checked step by step to find the bottleneck filter. The date filter below 
took the most time. Can you guide me on how to tune it to be faster?

date { locale => "en" match => ["log_time", "yyyy-MM-dd HH:mm:ss", "MM-dd-yyyy HH:mm:ss.SSS", "ISO8601"] }
http://serverfault.com/questions/669534/elasticsearch-performance-tuning#comment818613_669558


Thanks
Devaraj



Elasticsearch Script merge the results of two aggregations

2015-02-19 Thread ali balci


I use aggregations on Elasticsearch version 1.3.8. I have been using 
aggregation scripts for a while; today they stopped working. Please help, 
I can't find any solution:

This is the mapping:

"mappings": {
  "product": {
    "properties": {
      "brandId": {
        "type": "integer"
      },
      "brandIsActive": {
        "type": "boolean"
      },
      "brandLink": {
        "type": "string",
        "index": "not_analyzed"
      },
      "brandName": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}


This is my query:

POST alias-test/product/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "Brand": {
      "terms": {
        "script": "doc['brandName'].value",
        "size": 0
      }
    }
  }
}

This is the error:

{
   error: SearchPhaseExecutionException[Failed to execute phase 
[query_fetch], all shards failed; shardFailures {[g][mizu-20150219142655][0]: 
RemoteTransportException[[Mammomax][inet[/172.31.37.148:9300]][search/phase/query+fetch]];
 nested: SearchParseException[[mizu-20150219142655][0]: 
query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to parse 
source 
[{\query\:{\match_all\:{}},\aggs\:{\Brand\:{\terms\:{\script\:\doc['brandName'].value\]]];
 nested: ExpressionScriptCompilationException[Field [brandName] used in 
expression must be numeric]; }],
   status: 400
}


The other query:

POST test/product/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "Brand": {
      "terms": {
        "script": "doc['brandName'].value+'|'+doc['brandLink'].value",
        "size": 0
      }
    }
  }
}


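
A note on the symptom, hedged since the thread does not confirm it: 
ExpressionScriptCompilationException means the script is being compiled as 
a Lucene expression, and expressions only handle numeric fields. 1.3.8 is 
one of the releases that shipped with dynamic Groovy scripting disabled 
(CVE-2015-1427), which would explain scripts that worked "for a while" 
suddenly failing. Since both fields here are not_analyzed strings, the same 
data can be had without any script at all, e.g. a terms aggregation with a 
sub-aggregation instead of the 'name|link' concatenation (a sketch):

POST test/product/_search
{
  "query": { "match_all": {} },
  "aggs": {
    "Brand": {
      "terms": { "field": "brandName", "size": 0 },
      "aggs": {
        "Link": {
          "terms": { "field": "brandLink", "size": 0 }
        }
      }
    }
  }
}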



How to index data from multiple database queries into a single index in Elasticsearch

2015-02-19 Thread Gurunath pai
Hi,

I want to index data from multiple database queries into Elasticsearch. I 
have done the same in Solr using DataImportHandler and DeltaImportHandler. 
Is there any way to achieve the same in Elasticsearch? I also want to index 
the output of all these queries at once. I have observed that we can't 
index multiple queries here in Elasticsearch.

Thanks
Guru Pai.
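
For anyone answering: the closest equivalent at the time of writing is a 
third-party plugin such as the JDBC river (jprante/elasticsearch-river-jdbc), 
which pulls rows with a SQL statement on a schedule. A sketch, with the 
connection details and SQL invented for illustration:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:mysql://localhost:3306/test",
    "user": "user",
    "password": "pass",
    "sql": "select id as _id, name, price from products"
  }
}'

Several rivers, one per query, can feed the same index.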



[ES-1.4.0] Snapshot Queue

2015-02-19 Thread Yarden Bar
Hi All,

Is there a way to queue snapshot invocations?

e.g.: snapshot_1 -> snapshot_2 -> ... -> snapshot_N


Thanks,
Yarden
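
For reference: ES 1.4 allows only one snapshot to run at a time and has no 
built-in queue, but each invocation can be made blocking with 
wait_for_completion, so a shell script effectively queues them (a sketch, 
repository name assumed):

curl -XPUT 'localhost:9200/_snapshot/my_repo/snapshot_1?wait_for_completion=true'
curl -XPUT 'localhost:9200/_snapshot/my_repo/snapshot_2?wait_for_completion=true'
# ... snapshot_N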



Re: Elasticsearch Script merge the results of two aggregations

2015-02-19 Thread ali balci
The second error:

{
   error: SearchPhaseExecutionException[Failed to execute phase
[query_fetch], all shards failed; shardFailures {[g][test][0]:
RemoteTransportException[[Mammomax][inet[/192.168.1.8:9300]][search/phase/query+fetch]];
nested: SearchParseException[[test][0]:
query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to
parse source 
[{\query\:{\match_all\:{}},\aggs\:{\Brand\:{\terms\:{\script\:\doc['brandName'].value+'|'+doc['brandLink'].value\,\size\:0]]];
nested: ExpressionScriptCompilationException[Failed to parse
expression: doc['brandName'].value+'|'+doc['brandLink'].value];
nested: ParseException[ unexpected character ''' at position (23).];
nested: NoViableAltException; }],
   status: 400
}






-- 
Best Regards

ALİ BALCI
Bilgisayar Mühendisligi
Tel:0543 699 59 88
FACEBOOK http://www.facebook.com/alibalci.mail
BLOG http://balciali.wordpress.com/



I want to implement QueryElevationComponent feature in ELasticSearch

2015-02-19 Thread Gurunath pai
HI All,

I want to implement the QueryElevationComponent feature in Elasticsearch. 
Can anyone suggest how I can go ahead with this feature in Elasticsearch?

Thanks
Guru Pai.
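
For anyone picking this up: Elasticsearch has no direct counterpart to 
Solr's QueryElevationComponent. One common substitute is a function_score 
query that boosts hand-picked document ids above the organic results. A 
sketch (index, field, and ids are invented; the weight function needs ES 
1.4+, older versions use boost_factor):

POST /products/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "title": "batman" } },
      "functions": [
        { "filter": { "ids": { "values": ["42", "7"] } }, "weight": 10 }
      ],
      "boost_mode": "sum"
    }
  }
}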



Re: problem with using uax_url_email

2015-02-19 Thread Marria
Hi,

for people having the same problem as me, here is an answer I received from 
Pablo in the PT group:

About your problem, I believe this is a constraint of Apache Tika [1], 
which is used by the mapper-attachment plugin.
I believe that a search for Tika PDF limitations or a question on their 
list will help you more than we can.
Anyway, maybe you want to ask in the Elasticsearch main list [2], which is 
bigger than ours and has the Elasticsearch engineers.

I am sorry for not being able to help you that much.

Cheers,
Pablo

[1] http://tika.apache.org/
[2] elasti...@googlegroups.com
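
As a side note for debugging this kind of thing: the _analyze API shows 
exactly which tokens the uax_url_email tokenizer emits for a given piece of 
text, which separates Tika's PDF text extraction problems from analysis 
problems. A sketch against the test index defined in the quoted message 
below (the sample text is invented):

curl 'localhost:9200/test/_analyze?analyzer=default&pretty' -d 'see http://example.com/page or mail user@example.com'

If the URL comes out as a single token here, the damage is happening 
earlier, in the PDF extraction.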


On Wednesday 18 February 2015 at 15:37:33 UTC+1, Marria wrote:

 Hi everybody,

 I want to perform URL extraction from my PDF files. I use the 
 mapper-attachment plugin to index my PDF files.

 In order to be able to perform some regex queries and extract all the URLs 
 present in a PDF file, I used uax_url_email:

 curl -X PUT localhost:9200/test -d '{
   "settings": {
     "index": {
       "analysis": {
         "analyzer": {
           "default": {
             "type": "custom",
             "tokenizer": "uax_url_email",
             "filter": ["standard", "lowercase", "stop"]
           }
         }
       }
     }
   }
 }'



 and the map:

 curl -X PUT localhost:9200/test/attachment/_mapping -d '{
   "attachment": {
     "properties": {
       "file": {
         "type": "attachment",
         "fields": {
           "title": { "store": "yes" },
           "file": { "term_vector": "with_positions_offsets", "store": "yes" }
         }
       }
     }
   }
 }'


 I indexed some PDF files. The problem is that for one file, I get this 
 (while the URLs in this file start with http://):

 https://lh3.googleusercontent.com/-6uzhp-v0qFs/VOSfMU95byI/AUc/H4c6xvb54kg/s1600/Capture%2Bd’écran%2B2015-02-18%2Bà%2B15.17.19.png

 For another file, I got this (it leaves out the http://):

 https://lh3.googleusercontent.com/-1rYIYWJJEbU/VOSfweFpgbI/AUk/bWzfst_uZUE/s1600/Capture%2Bd’écran%2B2015-02-18%2Bà%2B15.19.43.png

 But the problem is the URLs are not recognized completely, look at this:

 https://lh3.googleusercontent.com/-vsKUj5I9MiA/VOSgtyS3yWI/AUw/64lgO4gYSdI/s1600/Capture%2Bd’écran%2B2015-02-18%2Bà%2B15.22.32.png

 Is it caused by the double-column layout of the PDF file?

 https://lh4.googleusercontent.com/-c7n5-oMygRM/VOShm4hwnWI/AU4/CQNjTTctMnY/s1600/Capture%2Bd’écran%2B2015-02-18%2Bà%2B15.26.46.png

 So, what did I do wrong? How can I fix this and use regexp queries 
 successfully to extract all the URLs?

 Thank you


Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread David Pilato
Did you apply your analyzer to your mapping?

David

 On 19 Feb 2015 at 08:53, Anil Karaka anilkar...@gmail.com wrote:
 
 http://stackoverflow.com/questions/28601082/terms-aggregation-failing-on-string-fields-with-a-custom-analyzer-in-elasticsear
 
 Posted in stack over flow as well..
 
 On Thursday, February 19, 2015 at 1:01:40 PM UTC+5:30, Anil Karaka wrote:
 I wanted a custom analyzer that behaves exactly like not_analyzed, except 
 that fields are case insensitive..
 
 I have my analyzer as below, 
 
 "index": {
     "analysis": {
         "analyzer": {  // custom analyzer with keyword tokenizer and lowercase
                        // filter, same as not_analyzed but case insensitive
             "case_insensitive_keyword_analyzer": {
                 "tokenizer": "keyword",
                 "filter": "lowercase"
             }
         }
     }
 }
 
 But when I'm trying to do term aggregation over a field with strings 
 analyzed as above, I'm getting this error..
 
 {
     "error": "ClassCastException[org.elasticsearch.search.aggregations.bucket.terms.DoubleTerms$Bucket cannot be cast to org.elasticsearch.search.aggregations.bucket.terms.StringTerms$Bucket]",
     "status": 500
 }
 
 Are there additional settings I have to update in my custom analyzer for 
 the terms aggregation to work?

 The better question is: I want a custom analyzer that does everything like 
 not_analyzed but is case insensitive. How do I achieve that?
 



Re: Aggregations failing on fields with custom analyzer..

2015-02-19 Thread David Pilato
I think you are doing something wrong.

DELETE index
PUT index
{
  "mappings": {
    "doc": {
      "properties": {
        "foo": {
          "type": "double"
        }
      }
    }
  }
}
PUT index/doc/1
{
  "foo": "bar"
}

gives:

{
  "error": "MapperParsingException[failed to parse [foo]]; nested: NumberFormatException[For input string: \"bar\"];",
  "status": 400
}
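
The flip side, and presumably why the bad documents slipped in without an 
error: a string that parses as a number is coerced into a numeric field by 
default, so only non-numeric strings fail. A sketch against the same index:

PUT index/doc/2
{
  "foo": "123"
}

succeeds, with "123" indexed as the double 123.0.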

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs



 On 19 Feb 2015 at 09:39, Anil Karaka anilkar...@gmail.com wrote:
 
 "_source": {
     "Sort": "",
     "gt": "2015-02-18T15:07:10",
     "uid": "54867dc55b482b04da7f23d8",
     "usId": "54867dc55b482b04da7f23d7",
     "ut": "2015-02-18T20:37:10",
     "act": "productlisting",
     "st": "2015-02-18T15:07:46",
     "Filter": "",
     "av": "3.0.0.0",
     "ViewType": "SmallSingleList",
     "os": "Windows",
     "categoryid": "home-kitchen-curtains-blinds"
 }
 
 "properties": {
     "uid":        { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "ViewType":   { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "usId":       { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "os":         { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Sort":       { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "Filter":     { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "categoryid": { "type": "double" },
     "gt":         { "format": "dateOptionalTime", "type": "date" },
     "ut":         { "format": "dateOptionalTime", "type": "date" },
     "st":         { "format": "dateOptionalTime", "type": "date" },
     "act":        { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" },
     "av":         { "analyzer": "case_insensitive_keyword_analyzer", "type": "string" }
 }
 
 
 A sample document and the index mappings above..
 
 
 On Thursday, February 19, 2015 at 2:03:11 PM UTC+5:30, David Pilato wrote:
 I don't know without a concrete example.
 I'd say that if your mapping has a number type and you send "123" it could 
 work.
 On 19 Feb 2015 at 09:30, Anil Karaka anilk...@gmail.com wrote:
 
 It was my mistake. The field I was trying to aggregate on was mapped as 
 double; I assumed it was a string after seeing some sample documents with 
 strings.

 Why didn't ES throw an error when I indexed docs with strings instead of 
 doubles?
 

disappearing log records: what can I do?

2015-02-19 Thread r . grosmann
hello group,

I have the following stack:
fluentd (td-agent 2.1.3), elasticsearch (1.4.4), kibana (3.1.2)
to manage the logging of an in-company application.

At first glance this seems to work OK, but it appears that from time to 
time the record counts reported in Kibana don't match the line counts of 
the logfiles.

Diving into this, it appears that when very large logfiles are put in the 
fluentd logdirectory, not all records show up in Elasticsearch. Nothing 
shows up in the logging of either fluentd or Elasticsearch, so at first 
glance everything seems fine. I started by looking at fluentd and managed 
to get extra information, which seems to indicate that all of the log lines 
are processed.

When comparing the wc -l of the logfile and the contents of ES, the 
difference becomes visible:
ES: 645551   wc: 647506   (groot.log)

Looking at the thread pool statistics with the REST API, ES reports 60 
bulk rejections.

Right now, I have a very simple configuration.
cluster.name: cwc-dev
index.number_of_replicas: 0
index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
index.indexing.slowlog.threshold.index.debug: 2s
path.data: /data/elasticsearch

I hope you can help me tackle this, since I am quite new to ES and don't 
know which ways are available to get extra information on this.

thanks in advance, Ruud
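
A starting point, hedged since it depends on the ingest pattern: 60 bulk 
rejections lines up with the roughly 2000 missing records if each rejected 
bulk request carried a few dozen documents. Rejections happen when the bulk 
thread pool's queue (50 requests by default in this version) overflows 
during bursts, and the client is expected to retry. If fluentd doesn't 
retry, raising the queue in elasticsearch.yml buys headroom:

threadpool.bulk.queue_size: 200

The real fix is making the sender back off and retry on rejections rather 
than an ever-larger queue.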




Using 'elapsed' plugin with GrayLog2

2015-02-19 Thread rlautman
I am working with a team who are creating a proof of concept monitoring and 
analytics system using a combination of LogStash, MongoDB, Elasticsearch 
and GrayLog2. 

I was wondering if anyone has any experience using the elapsed plugin with 
this set up and can tell me what (if any) issues they encountered?



waited for 30s and no initial state was set by the discovery

2015-02-19 Thread Diana Tuck
I know other people have posted this, but I've tried everything that the 
other threads have said to try.  We keep getting this at startup of our 
java node client:

INFO  [2015-02-19 17:57:45,206] org.elasticsearch.node: [localhost] 
version[1.4.2], pid[11103], build[927caff/2014-12-16T14:11:12Z]
INFO  [2015-02-19 17:57:45,207] org.elasticsearch.node: [localhost] 
initializing ...
INFO  [2015-02-19 17:57:45,217] org.elasticsearch.plugins: [localhost] 
loaded [cloud-aws], sites []
INFO  [2015-02-19 17:57:47,625] org.elasticsearch.node: [localhost] 
initialized
INFO  [2015-02-19 17:57:47,625] org.elasticsearch.node: [localhost] 
starting ...
INFO  [2015-02-19 17:57:47,716] org.elasticsearch.transport: [localhost] 
bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address 
{inet[/10.99.157.0:9300]}
INFO  [2015-02-19 17:57:49,747] org.elasticsearch.discovery: [localhost] 
elasticsearch-dev/EqKAxZm9SCutQGd-0_SonA
WARN  [2015-02-19 17:58:19,749] org.elasticsearch.discovery: [localhost] 
waited for 30s and no initial state was set by the discovery
INFO  [2015-02-19 17:58:19,761] org.elasticsearch.http: [localhost] 
bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address 
{inet[/10.99.157.0:9200]}
INFO  [2015-02-19 17:58:19,761] org.elasticsearch.node: [localhost] started

This is the elasticsearch.yml on our master (and only data node right now):

plugin.mandatory: cloud-aws 
cloud: 
  aws: 
region: us-west-2 
access_key: ACCESS_KEY 
secret_key: SECRET_KEY

discovery: 
  type: ec2 
  ec2: 
groups: DevAll


And this is our java node client:

Settings settings = ImmutableSettings.settingsBuilder()
        .put("node.name", nodeName)
        .put("cloud.aws.access_key", awsAccessKey)
        .put("cloud.aws.secret_key", awsSecretKey)
        .put("cloud.node.auto_attributes", true)
        .put("discovery.type", "ec2")
        .build();
this.node = nodeBuilder()
        .clusterName(clusterName)
        .settings(settings)
        .client(true)
        .node();
this.client = node.client();

Any help would be greatly appreciated!!!



Re: Cluster hanging on node failure

2015-02-19 Thread Max Charas
I posted here too:
http://stackoverflow.com/questions/28601885/cluster-hanging-on-node-failure

Would love to get some help with this.

Best,
Max
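
One thing worth checking while waiting for answers (an assumption on my 
part, not something confirmed in the thread): how fast the cluster even 
notices a dead node is governed by the zen fault-detection settings, whose 
defaults (1s interval, 30s timeout, 3 retries) mean a hard-killed node can 
take on the order of a minute to be dropped. Tightening them in 
elasticsearch.yml shortens the unresponsive window, at the cost of more 
false positives on a flaky network:

discovery.zen.fd.ping_interval: 1s
discovery.zen.fd.ping_timeout: 10s
discovery.zen.fd.ping_retries: 3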

On Wednesday 18 February 2015 at 20:30:46 UTC+1, Max Charas wrote:

 Hello all of you bright people,

 We're currently running a smallish 300 GB cluster in production on 5 nodes 
 with around 30 mil docs. Everything works flawlessly except when a node 
 really goes down (I mean like network / HW failure / kill -9).

 When we lose a node the cluster becomes more or less completely 
 unresponsive for a few minutes. Both regarding indexing and querying. This 
 is of course, less than ideal as we have load 24/7.

 I would really appreciate some help with understanding best practice 
 settings to have a robust cluster. 

 First goal for us is for the cluster not to become unresponsive in the 
 event of a node crash. After reading everything I could find on the web, I 
 can't really understand if ES is designed to be unresponsive for 
 ping_retries*ping_timeout seconds, or if the cluster will continue to serve 
 query requests even during this time. Could anyone help me shed light on 
 this?

 Secondly, in the event of an even worse failure where the cluster goes into 
 red state, would it be possible to allow the cluster to still serve 
 read/query requests?

 I would be ever so grateful for anyone willing to help me understand how 
 this works or what we would need to change to make our ES installation more 
 robust.

 I've included our config here:

 cluster.name: clustername
 node.name: nodename
 path.data: /index
 node.master: true
 node.data: true
 discovery.zen.minimum_master_nodes: 3
 discovery.zen.ping.multicast.enabled: false
 discovery.zen.ping.multicast.ping.enabled: false
 discovery.zen.ping.unicast.enabled: true
 discovery.zen.ping.unicast.hosts: ["host1", "host2", "host3"]
 bootstrap.mlockall: true
 index.number_of_shards: 10
 action.disable_delete_all_indices: true
 marvel.agent.exporter.es.hosts: ["marvel:9200"]

  





Re: waited for 30s and no initial state was set by the discovery

2015-02-19 Thread Diana Tuck
Additional information - we can telnet into our ES server from our 
application server on 9300.





Re: ClassNotFoundException: org.elasticsearch.discovery.ec2.Ec2DiscoveryModule

2015-02-19 Thread Diana Tuck
Thanks, David. I eventually ended up finding the pom on the GitHub repo. 
Thanks for adding the documentation!!

On Wednesday, February 18, 2015 at 10:39:19 PM UTC-8, David Pilato wrote:

 Yes.

 This should be added to the doc: 
 https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/176

 You need to add this dependency if you are using a NodeClient:

 <dependency>
   <groupId>org.elasticsearch</groupId>
   <artifactId>elasticsearch-cloud-aws</artifactId>
   <version>2.4.1</version>
 </dependency>

 HTH
 David

 On 19 Feb 2015 at 01:15, Diana Tuck dtu...@gmail.com wrote:

 New to ES - Trying to use the elasticsearch-cloud-aws plugin, but when 
 starting my java client node, I'm getting ClassNotFoundException 
 on org.elasticsearch.discovery.ec2.Ec2DiscoveryModule.   Do I need to 
 install this plugin on java client nodes, and if so, how does one do that? 
  Or, rather, is there a maven dependency that can be referenced to load 
 these required classes?

 For reference, the elasticsearch.yaml is:

 plugin.mandatory: cloud-aws
 cloud: 
   aws: 
 access_key: ** 
 secret_key: * 
 discovery: 
   type: ec2

 and my java client code is:

 Settings settings = ImmutableSettings.settingsBuilder()
         .put("node.name", nodeName)
         .put("cloud.aws.access_key", awsAccessKey)
         .put("cloud.aws.secret_key", awsSecretKey)
         .put("cloud.node.auto_attributes", true)
         .put("discovery.type", "ec2")
         .build();
 this.node = nodeBuilder()
         .clusterName(clusterName)
         .settings(settings)
         .client(true)
         .node();
 this.client = node.client();






Re: elastic search on t2.micro (Amazon WS)

2015-02-19 Thread Seung Chan Lim
What's the minimum RAM requirement?

slim

On Wednesday, February 18, 2015 at 5:18:58 PM UTC-5, Mark Walkom wrote:

 Your only real option here is to get a machine with more RAM.
 Try spinning up a VM locally, on your desk/laptop.

 On 19 February 2015 at 00:52, Seung Chan Lim djs...@gmail.com wrote:

 I'm trying to see if I can get elastic search (ES) 1.3.8 working with 
 couchbase (CB) 3.0.2 on a t2.micro (Amazon WS)

 t2.micro has 1 gig of RAM, which isn't a lot, but I'm only doing test 
 development on this with not a lot of documents (1000).

 I just installed ES and followed the CB instructions to install the 
 plugin and set the XDCR to get the replication going from CB to ES.

  I also configured ES to have 0 replicas and 1 shard (hoping this would 
  help minimize RAM usage). But I'm still seeing behaviors from ES where it 
  locks up the server, making it unresponsive, then eventually complaining 
  of lack of memory.

 Is there something else I can do to get this working on a t2.micro? 

 I'm a complete newbie to ES, and any help would be great,

 thank you

 slim







Multi-level nested search using the NEST API

2015-02-19 Thread Xinli Shang
I am having trouble making a multi-level nested query using the NEST API.
Here is my mapping:

{
  "log": {
    "mappings": {
      "LogEvent": {
        "properties": {
          "@timestamp": { "type": "date", "store": true, "format": "yyyy-MM-dd'T'HH:mm:ss" },
          "records": {
            "type": "nested",
            "properties": {
              "eventtype": { "type": "string", "store": true },
              "detail": { "type": "string", "store": true },
              "others": {
                "type": "nested",
                "properties": {
                  "ScrubbedContent": { "type": "string", "store": true },
                  "RawContent": { "type": "string", "store": true }
                }
              }
            }
          }
        }
      }
    }
  }
}

And here is the query that works

{
  "from": 0,
  "size": 1,
  "query": {
    "filtered": {
      "filter": {
        "and": {
          "filters": [
            {
              "range": {
                "@timestamp": {
                  "gte": "2015-02-12T02:37:32",
                  "lte": "2015-02-19T02:37:32"
                }
              }
            },
            {
              "nested": {
                "filter": {
                  "terms": {
                    "records.eventtype": ["myeventtype"]
                  }
                },
                "path": "records"
              }
            }
          ]
        }
      }
    }
  }
}

But if I change "path": "records" to "path": "records.others", no results
are returned. I am pretty sure I should have results for it.

Any thought why?
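
For whoever answers, one pattern that may be relevant (a sketch, with a 
made-up search term): a filter with "path": "records.others" can only 
reference fields under records.others, via their full path, and multi-level 
nesting is usually expressed by nesting one nested clause inside another:

{
  "query": {
    "nested": {
      "path": "records",
      "query": {
        "nested": {
          "path": "records.others",
          "query": {
            "term": { "records.others.ScrubbedContent": "something" }
          }
        }
      }
    }
  }
}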



Re: formula or guidelines to calculate/estimate the index size

2015-02-19 Thread Mark Walkom
There is nothing official.

Just create an index, put in 1 of your documents and then extrapolate.
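
In practice that looks something like: index a representative sample, then 
read the on-disk size back (index name assumed):

curl -XPUT 'localhost:9200/sizing-test/doc/1' -d '{ "field": "a representative document" }'
curl 'localhost:9200/_cat/indices/sizing-test?v&h=index,docs.count,store.size'

and scale store.size/docs.count up to the expected document count, leaving 
margin for replicas and merges.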

On 20 February 2015 at 04:26, Gaurav gupta gupta.gaurav0...@gmail.com
wrote:

 Could anyone help me find a formula or guidelines to calculate/estimate the
 index size created by Elasticsearch? I found that there is a formula (Excel
 sheet) for it for LUCENE.

 Thanks!
 Gaurav





how to avoid MapperParsingException?

2015-02-19 Thread sebastian
Hi,

I'm indexing documents with the following structure:

{ "name": "peter", "email": "p...@p.com", "location": "MIA" }
{ "name": "mary", "email": "m...@m.com", "device": "ipad" }
{ "name": "mary", "email": "m...@m.com", "metadata": { ... } }

As you can see, I only know the types of the "name" and "email" fields. The 
"location", "device", "metadata", or whatever other fields are dynamic.

So, in order to avoid a MapperParsingException, I want to persist all of 
the document fields, but ONLY mark the "name" and "email" fields as 
searchable.

Can I do that using mappings?
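
Something along these lines might work (a sketch; the index and type names 
are invented): dynamic templates can map every unknown string field with 
"index": "no", so it stays in _source but is not searchable, and a 
free-form object such as "metadata" can be declared with "enabled": false 
so ES stores it without trying to parse or index its contents:

PUT /people
{
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "stored_only_strings": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": { "type": "string", "index": "no" }
          }
        }
      ],
      "properties": {
        "name": { "type": "string" },
        "email": { "type": "string" },
        "metadata": { "type": "object", "enabled": false }
      }
    }
  }
}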



Re: elastic search on t2.micro (Amazon WS)

2015-02-19 Thread jrgns
What's your heap size (JVM setting) set to?

It needs to be at most one half the machine's RAM, so set it to 500MB.
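
On a 1 GB box that is typically done with the ES_HEAP_SIZE environment 
variable before starting the node, e.g.:

export ES_HEAP_SIZE=512m
bin/elasticsearch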

J





Re: Elasticsearch index name question

2015-02-19 Thread Mark Walkom
This is possible, but not automatically within ES.
LS knows it needs to switch to a new index at UTC midnight; you need to find
a way to get the river or some other code to do the same.





Move Data from One index to Another Cluster Index

2015-02-19 Thread Amay Patil
How can one move data from an index on one server to an index on another 
server?

I know reindexing moves data into another index within the same cluster.

How can it be done in Python?

Ap
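
A sketch with the official elasticsearch-py client (host and index names 
are invented): the scan helper streams documents out of the source cluster 
with scroll, and the bulk helper writes them into the target cluster:

from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan, bulk

src = Elasticsearch(["http://source-host:9200"])
dst = Elasticsearch(["http://target-host:9200"])

# stream every document from the source index
docs = (
    {
        "_index": "target_index",
        "_type": hit["_type"],
        "_id": hit["_id"],
        "_source": hit["_source"],
    }
    for hit in scan(src, index="source_index", query={"query": {"match_all": {}}})
)

# bulk-index the stream into the target cluster
bulk(dst, docs)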



Re: Elasticsearch performance tuning

2015-02-19 Thread Mark Walkom
Don't change cache and buffer sizes unless you know what is happening, the
defaults are going to be fine.
How much heap did you give ES?

I'm not sure you can do much about the date filter though, maybe someone
else has pointers.





Re: Help with 4 node cluster

2015-02-19 Thread Mark Walkom
You can have 4 master-eligible nodes if you want, just set
discovery.zen.minimum_master_nodes to 3 to ensure a quorum.

But ideally, as Christian mentioned, it's best to have an uneven number of
master-eligible nodes.
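
In elasticsearch.yml terms that is (a sketch, the same on all four nodes):

node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 3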

On 19 February 2015 at 16:51, christian.dahlqv...@elasticsearch.com wrote:

 Hi,

 You always want an odd number of master nodes (often 3), so I would
 therefore recommend setting three of the four nodes to be master eligible
 and leave the fourth as a pure data node. This will prevent the cluster
 getting partitioned into two with equal number of master nodes on both
 sides of the partition.

 Best regards,

 Christian

 On Wednesday, February 18, 2015 at 11:39:54 AM UTC, sysads wrote:

 Hi

  I am in need of help setting up a 4-node Elasticsearch cluster. I
  have installed and configured ES on all 4 nodes but I am lost as to what
  the configuration in elasticsearch.yml should be:

  - if I want to have all 4 nodes be both master and data
  - make node A hold a primary shard while node B holds its replica, then
  node C hold a primary shard while node D holds its replica.

 Thanks





Disk awareness on indexing

2015-02-19 Thread Prasanth R
Hi All,

   A few of our nodes have disks filled up to 90%. Is there any option to 
avoid indexing to those disks instead of relocating shards? Each shard 
contains roughly 600G of data, so relocating is costly, as ours is a very 
busy cluster.

Thanks
Prasath Rajan
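
A hedged pointer, since indexing itself cannot be steered away from a node 
that still hosts the shards: the disk allocation decider is the closest 
control. The low watermark stops new shards from being allocated to a node 
over the threshold without moving anything, while the high watermark is 
what triggers relocation, so keeping it above current usage avoids the 
expensive moves. A sketch:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disk.threshold_enabled": true,
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}'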



indexing- pls help

2015-02-19 Thread aiswarya lakshmi
Hi,

I am a beginner here. I just installed Elasticsearch on Windows and added the
Sense extension for Google Chrome.

I need to index some files that are present in a directory on my local system.

I did not install any extension other than Sense.

Please tell me what more I should install, and how I can add and index these
files.

Thanks in advance
Aiswaryalakshmi K



Re: Combining Multiple Queries with 'OR' or 'AND'

2015-02-19 Thread Masaru Hasegawa
Hi,

bool query should work:
-
{
  "query": {
    "bool": {
      "should": [
        {
          "filtered": {   // Query1
            "query": {...},
            "filter": {...}
          }
        },
        {
          "filtered": {   // Query2
            "query": {...},
            "filter": {...}
          }
        }
      ]
    }
  }
}
-


Masaru

On Fri, Feb 20, 2015 at 8:29 AM, Debashish Paul shima...@gmail.com wrote:

 Hi,
 Question

 I am trying to combine 2 user search queries using AND and OR as
 operation. I am aware of combining queries where we can merge filters, but
 I want to merge entire queries like { {BIG Elastic Query1} AND {BIG Elastic
 Query2} }.
 Details

 For instance say a user performs a search for batman in movies type with
 filters of Christian and Bale and another query Dark Knight in type
 tvshows with filters of Christopher Nolan. I want to combine both queries
 so I can look for both batman movies and Dark Knight tvshows, but not Dark
 knight movies or batman tvshows.

 For example, for the given queries
 I just want to run Query1 OR Query2 in Elasticsearch.
 [Query 1 and Query 2 snipped]



Re: indexing- pls help

2015-02-19 Thread David Pilato
You could take a look at the FSRiver plugin. It might help you to start.

https://github.com/dadoonet/fsriver
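
(A sketch of registering the river, going from memory of the plugin README -
check the repo for the exact option names; the river name and path here are
assumptions:)

curl -XPUT 'localhost:9200/_river/mydocs/_meta' -d '{
  "type": "fs",
  "fs": {
    "url": "/path/to/your/files"
  }
}'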

David

 Le 20 févr. 2015 à 06:42, aiswarya lakshmi ais.k...@gmail.com a écrit :
 
 Hi.. 
 
 I am a beginner here.. I just installed elasticsearch on windows and i added 
 the sense extension for google chrome.. 
 
 I need to index some files that are present in my local system directory.. 
 
 I did not install any extension other than sense.. 
 
 Please tell me what more should i install and how do i add and index these 
 files,, 
 
 Thanks in advance
 Aiswaryalakshmi K


Re: Elasticsearch performance tuning

2015-02-19 Thread Deva Raj

I listed the instances and their heap sizes below.

Medium instance: 3.75 GB RAM, 1 core, storage: 4 GB SSD, 64-bit
Java heap size: 2 GB

R3 Large: 15.25 GB RAM, 2 cores, storage: 32 GB SSD
Java heap size: 7 GB

R3 High-Memory Extra Large (r3.xlarge): 30.5 GB RAM, 4 cores
Java heap size: 15 GB


Thanks
Devaraj

On Friday, February 20, 2015 at 4:15:12 AM UTC+5:30, Mark Walkom wrote:

 Don't change cache and buffer sizes unless you know what is happening, the 
 defaults are going to be fine.
 How much heap did you give ES?

 I'm not sure you can do much about the date filter though, maybe someone 
 else has pointers.

 On 19 February 2015 at 21:12, Deva Raj devara...@gmail.com javascript: 
 wrote:

 Hi Mark Walkom,

 I have given the Logstash conf file below.

  
   Logstash conf

 input {
   file {
   }
 }

 filter {
   mutate {
     gsub => ["message", "\n", " "]
   }
   mutate {
     gsub => ["message", "\t", " "]
   }
   multiline {
     pattern => "^ "
     what => "previous"
   }

   grok {
     match => [ "message", "%{TIME:log_time}\|%{WORD:Message_type}\|%{GREEDYDATA:Component}\|%{NUMBER:line_number}\| %{GREEDYDATA:log_message}" ]
     match => [ "path", "%{GREEDYDATA}/%{GREEDYDATA:loccode}/%{GREEDYDATA:_machine}\:%{DATE:logdate}.log" ]
     break_on_match => false
   }

   # To check whether the location is S or L
   if [loccode] == "S" or [loccode] == "L" {
     ruby {
       code => "temp = event['_machine'].split('_')
                if !temp.nil? || !temp.empty?
                  event['_machine'] = temp[0]
                end"
     }
   }
   mutate {
     add_field => [ "event_timestamp", "%{@timestamp}" ]
     replace => [ "log_time", "%{logdate} %{log_time}" ]
     lowercase => [ "loccode" ]
     # Remove the 'logdate' field since we don't need it anymore.
     remove => [ "logdate" ]
   }
   # to get all site details (site name, city and co-ordinates)
   sitelocator { sitename => "loccode" datafile => "vendor/sitelocator/SiteDetails.csv" }
   date {
     locale => "en"
     match => [ "log_time", "yyyy-MM-dd HH:mm:ss", "MM-dd-yyyy HH:mm:ss.SSS", "ISO8601" ]
   }
 }

 output {
   elasticsearch {
   }
 }



 I have checked step by step to find the bottleneck filter. The filter below
 took the most time. Can you guide me on how to tune it to run faster?

 date { locale => "en"
        match  => [ "log_time", "yyyy-MM-dd HH:mm:ss", "MM-dd-yyyy HH:mm:ss.SSS", "ISO8601" ] }
 http://serverfault.com/questions/669534/elasticsearch-performance-tuning#comment818613_669558
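
 (An aside, not from the thread: a grok pattern with several GREEDYDATA
 captures can backtrack heavily on lines that do not match. Anchoring the
 pattern and narrowing the captures often speeds it up; a sketch, assuming
 Component never contains spaces:)

 grok {
   match => [ "message", "^%{TIME:log_time}\|%{WORD:Message_type}\|%{NOTSPACE:Component}\|%{NUMBER:line_number}\| %{GREEDYDATA:log_message}" ]
 }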


 Thanks
 Devaraj



Re: Elasticsearch Script merge the results of two aggregations

2015-02-19 Thread Masaru Hasegawa
Hi,

Looks like you are using a lucene expression [1]. See the link for the
limitations of lucene expressions; today they only support numeric values.
Since your terms agg doesn't specify a "lang" property, you probably have
"script.default_lang" set to "expression" in elasticsearch.yml?

FYI, if you put "lang": "groovy" (and if your configuration allows running
dynamic groovy scripts), your query should work. But make sure you read the
release notes [2] before turning on dynamic groovy scripting. (You can use
groovy scripts without turning on dynamic scripting [3].)
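
(A sketch of the second query with the language made explicit - assuming
dynamic groovy scripting has been enabled as discussed above:)

{
  "query": { "match_all": {} },
  "aggs": {
    "Brand": {
      "terms": {
        "script": "doc['brandName'].value + '|' + doc['brandLink'].value",
        "lang": "groovy",
        "size": 0
      }
    }
  }
}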


Masaru

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_lucene_expressions_scripts
[2] http://www.elasticsearch.org/blog/elasticsearch-1-4-3-and-1-3-8-released/
[3] 
http://www.elasticsearch.org/blog/running-groovy-scripts-without-dynamic-scripting/


On February 19, 2015 at 22:21:41, ali balci (balci.a...@gmail.com) wrote:
 The second error:

 {
   "error": "SearchPhaseExecutionException[Failed to execute phase
     [query_fetch], all shards failed; shardFailures {[g][test][0]:
     RemoteTransportException[[Mammomax][inet[/192.168.1.8:9300]][search/phase/query+fetch]];
     nested: SearchParseException[[test][0]:
     query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to
     parse source [{\"query\":{\"match_all\":{}},\"aggs\":{\"Brand\":{\"terms\":{\"script\":\"doc['brandName'].value+'|'+doc['brandLink'].value\",\"size\":0}}}}]]];
     nested: ExpressionScriptCompilationException[Failed to parse
     expression: doc['brandName'].value+'|'+doc['brandLink'].value];
     nested: ParseException[ unexpected character ''' at position (23).];
     nested: NoViableAltException; }]",
   "status": 400
 }

 2015-02-19 14:49 GMT+02:00 ali balci:

  I have been using aggregations on Elasticsearch version 1.3.8, and the
  aggregation script had worked for a while; today it didn't work. Please
  help, I can't find any solution.

  This is the mapping:

  "mappings": {
    "product": {
      "properties": {
        "brandId": {
          "type": "integer"
        },
        "brandIsActive": {
          "type": "boolean"
        },
        "brandLink": {
          "type": "string",
          "index": "not_analyzed"
        },
        "brandName": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }

  This is my query:

  POST alias-test/product/_search
  {
    "query": {
      "match_all": {}
    },
    "aggs": {
      "Brand": {
        "terms": {
          "script": "doc['brandName'].value",
          "size": 0
        }
      }
    }
  }

  This is the error:

  {
    "error": "SearchPhaseExecutionException[Failed to execute phase
      [query_fetch], all shards failed; shardFailures {[g][mizu-20150219142655][0]:
      RemoteTransportException[[Mammomax][inet[/172.31.37.148:9300]][search/phase/query+fetch]];
      nested: SearchParseException[[mizu-20150219142655][0]:
      query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to parse source
      [{\"query\":{\"match_all\":{}},\"aggs\":{\"Brand\":{\"terms\":{\"script\":\"doc['brandName'].value\"}}}}]]];
      nested: ExpressionScriptCompilationException[Field [brandName] used in
      expression must be numeric]; }]",
    "status": 400
  }

  The other query:

  POST test/product/_search
  {
    "query": {
      "match_all": {}
    },
    "aggs": {
      "Brand": {
        "terms": {
          "script": "doc['brandName'].value+'|'+doc['brandLink'].value",
          "size": 0
        }
      }
    }
  }

  The error:

  POST test/product/_search
  {
    "query": {
      "match_all": {}
    },
    "aggs": {
      "Brand": {
        "terms": {
          "script": "doc['brandName'].value+'|'+doc['brandLink'].value",
          "size": 0
        }
      }
    }
  }
 
 --
 Best Regards
 ALİ BALCI
  


Re: Move Data from One index to Another Cluster Index

2015-02-19 Thread Honza Král
The python client has a reindex helper that can do just that - just
supply a client instance for the source and destination clusters.

http://elasticsearch-py.readthedocs.org/en/latest/helpers.html#elasticsearch.helpers.reindex
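
A minimal sketch (host names and index names here are placeholders):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import reindex

source = Elasticsearch(["http://source-host:9200"])
target = Elasticsearch(["http://target-host:9200"])

# Scans "source-index" on one cluster and bulk-indexes the documents
# into "target-index" on the other.
reindex(source, "source-index", "target-index", target_client=target)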

Hope this helps,
Honza

On Thu, Feb 19, 2015 at 11:44 PM, Amay Patil amaypati...@gmail.com wrote:
 How can one move data from an index on one server to an index on another
 server?

 I know reindexing moves data into another index in the same cluster.

 How can it be done in Python?

 Ap



Re: What tools to use for indexing ES in production?

2015-02-19 Thread Mark Walkom
Kafka is fine, you could also use Logstash to read this data and send to ES
-
http://www.elasticsearch.org/guide/en/logstash/current/plugins-inputs-kafka.html
NB this is in the 1.5 BETA release, so use with caution.

Otherwise take a look at the other Logstash inputs and see if there is
something suitable you can leverage; there are also a number of official
clients if you want to roll your own. There isn't one official method, but
those last two are pretty common.
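
A sketch of the Logstash side (option names per the kafka input shipped with
the 1.5 beta; the ZooKeeper address and topic are assumptions):

input {
  kafka {
    zk_connect => "localhost:2181"
    topic_id   => "sales"
  }
}
output {
  elasticsearch {
    host => "localhost"
  }
}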

On 19 February 2015 at 20:07, Kevin Liu ke...@ticketfly.com wrote:

 [quoted message snipped]




Re: I want to implement QueryElevationComponent feature in Elasticsearch

2015-02-19 Thread joergpra...@gmail.com
I have implemented query elevation with a function-score-based conditional
boosting plugin; see

https://github.com/jprante/elasticsearch-functionscore-conditionalboost

Jörg

On Thu, Feb 19, 2015 at 2:40 PM, Gurunath pai pai.gurun...@gmail.com
wrote:

 HI All,

 I want to implement the QueryElevationComponent feature in Elasticsearch. Can
 anyone suggest how I can go about adding this feature in Elasticsearch?

 Thanks
 Guru Pai.



formula or guidelines to calculate/estimate the index size

2015-02-19 Thread Gaurav gupta
Could anyone help me find a formula or guidelines to calculate/estimate the
size of an index created by Elasticsearch? I found that there is a formula
(an Excel sheet) for this for Lucene.

Thanks!
Gaurav
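
(An aside, not from the thread: a common practical alternative is to index a
representative sample of documents, measure the result, and extrapolate. The
index name here is an assumption:)

curl 'localhost:9200/_cat/indices/sample-index?v'
# the store.size column reports the on-disk size of the index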



Re: [Hadoop][Spark] Exclude metadata fields from _source

2015-02-19 Thread Itai Yaffe
Thanks for the response Costin!
As you mentioned, option 1, i.e. es.mapping.exclude, is more appropriate
when working with JSON.
Since it doesn't seem to work, I've followed your advice and raised a new
issue (https://github.com/elasticsearch/elasticsearch-hadoop/issues/381),
including a small test application to reproduce it.
I'd be happy to hear what you think of it.

Thanks again,
   Itai

On Wednesday, February 18, 2015 at 7:42:36 PM UTC+2, Costin Leau wrote:

 Hi Itai,

 Sorry I missed your email. It's not clear from your post what your documents
 look like - can you post a gist somewhere with the JSON input that you are
 sending to Elasticsearch?
 Typically the metadata appears in the _source if it is declared that
 way. You should be able to get around this by using:
 1. es.mapping.exclude - if it doesn't seem to be working
 2. in case of Spark, by specifying the metadata through the `saveWithMeta`
 methods, which allows it to stay decoupled from the object itself (see the
 sketch after this list).
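
 (A sketch of that second route - the method name saveToEsWithMeta and the
 pair-RDD shape are from the es-hadoop Spark docs of this era; treat the
 details as approximate:)

 import org.elasticsearch.spark._

 // (metadata, document) pairs; a scalar key is used as the document _id
 val docs = sc.makeRDD(Seq(
   ("1", Map("user" -> "kimchy")),
   ("2", Map("user" -> "costin"))
 ))
 docs.saveToEsWithMeta("test/user")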

 Since you are using JSON, option 1 is likely your best shot. If it doesn't
 work for you, can you please raise an issue with a quick/small sample to be
 able to reproduce it?

 Thanks,


 On Wed, Feb 18, 2015 at 10:27 AM, Itai Yaffe it...@exelate.com 
 javascript: wrote:

 Hey,
 Has anyone experienced with such an issue?
 Perhaps Costin can help here?

 Thanks!

 On Thursday, February 12, 2015 at 8:27:14 AM UTC+2, Itai Yaffe wrote:

 Hey,
 I've recently started using Elasticsearch for Spark (Scala application).
 I've added elasticsearch-spark_2.10 version 2.1.0.BUILD-SNAPSHOT to my 
 Spark application pom file, and used 
 org.apache.spark.rdd.RDD[String].saveJsonToEs() 
 to send documents to Elasticsearch.
 When the documents are loaded to Elasticsearch, my metadata fields (e.g 
 id, index, etc.) are being loaded as part of the _source field.
 Is there a way to exclude them from the _source?
 I've tried using the new es.mapping.exclude configuration property 
 (added in this commit 
 https://github.com/elasticsearch/elasticsearch-hadoop/commit/aae4f0460a23bac9567ea2ad335c74245a1ba069
  
 - that's why I needed to take the latest build rather than using version 
 2.1.0.Beta3), but it doesn't seem to have any affect (although I'm not sure 
 it's even possible to exclude fields I'm using for mapping, e.g 
 es.mapping.id).

 A code snippet (I'm using a single-node Elasticsearch cluster for 
 testing purposes and running the Spark app from my desktop) :
  val conf = new SparkConf()...
  conf.set("es.index.auto.create", "false")
  conf.set("es.nodes.discovery", "false")
  conf.set("es.nodes", "XXX:9200")
  conf.set("es.update.script", "XXX")
  conf.set("es.update.script.params", "param1:events")
  conf.set("es.update.retry.on.conflict", "2")
  conf.set("es.write.operation", "upsert")
  conf.set("es.input.json", "true")
  val documentsRdd = ...
  documentsRdd.saveJsonToEs("test/user", scala.collection.Map("es.mapping.id" -> "_id", "es.mapping.exclude" -> "_id"))

  The JSON looks like this:

  {
    "_id": "",
    "_type": "user",
    "_index": "test",
    "params": {
      "events": [
        {
          ...
        }
      ]
    }
  }

  Thanks!



Re: elastic search cluster behind azure public load balncer

2015-02-19 Thread Mark Walkom
Do not run ES across regions, it's a bad idea. Use snapshot and restore or
some other method to replicate data.

On 19 February 2015 at 20:51, Subodh Patil subodh00...@gmail.com wrote:

 OK, thanks Mark.
 So you are saying I should set node.master: true as well as node.data: true
 for all 3 nodes in the cluster.
 And will this hold true for a 4-node or even larger cluster?

 As of now I am making the data redundant on all 3 nodes, but please advise
 on the performance implications...
 The Azure load balancer's chances of failure are much lower than a VM's.

 The redundancy is required from a backup point of view. So I will have a
 4-node cluster with data replicated on all 4 nodes:
 3 nodes in the same region (say East US) behind the load balancer, and the
 4th node in a different region (say West US), not behind the load balancer,
 just holding all the data.
 In case the East US data center has a problem, I can redirect all traffic to
 the West US data center, where the single node will always have all the
 up-to-date data; I can even take backups from the West US data center.

 Thanks,
 Subodh



 On Thursday, February 19, 2015 at 7:10:52 AM UTC+5:30, Mark Walkom wrote:

 Yes the master can serve requests.

 You don't really want 2 masters and 1 data node though, make all 3
 master+data to start with.
 And sure the client can be a SPOF, but then isn't a single load balancer
 a SPOF as well? So the question remains, where are you happy dealing with
 these points, because at some point you cannot make *everything* redundant
 without being excessive.

 On 18 February 2015 at 21:36, Subodh Patil subod...@gmail.com wrote:


 I am trying to set up an ES cluster behind an Azure load balancer / cloud
 service. The cluster is 3 nodes with no specific data/client/master node
 settings. By default 2 nodes are elected as master and 1 as a data node.

 As requests (create/update/search) from the application come to the Azure
 load balancer on port 9200, which is load-balanced across all 3 VMs, a
 request can go to any VM.

 Will a master node be able to serve the requests?

 Many articles say that you don't need a load balancer for an ES cluster,
 just use a client node, but then that becomes a single point of failure, as
 an Azure VM can go down at any point in time. So load balancing is required
 mainly for high availability from an infrastructure point of view.
 Please suggest a cluster setup, and which nodes (data or client) to put
 behind the load balancer.


 ES version 1.4.1 on windows server 2012 r2 vm




Elasticsearch Broken?

2015-02-19 Thread Cong Hui
Hello, 

I recently started using Elasticsearch to store some data.
In the learning process, I was doing OK until I attempted to use the bulk
API.
I used something along the lines of:

conn = ES(url, timeout, bulksize) 

for each (tuple) 
   data = something(tuple) 
   conn.index(data, index name, count, bulk=true) 
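
(For reference, a runnable shape of that loop with the official Python
client - the client setup and the index/type names are assumptions:)

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://localhost:9200"])

# one action per tuple; helpers.bulk batches them into _bulk requests
actions = ({
    "_index": "myindex",
    "_type": "mytype",
    "_id": i,
    "_source": something(t),
} for i, t in enumerate(tuples))

helpers.bulk(es, actions)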

I imagined this would add a large number of items at localhost:9200/(index
name)/(type name)/(count), but it ended up adding a large amount of garbage
at localhost:9200/(index name).
Not sure how to proceed, but knowing that I needed a reset, I went ahead
and did a DELETE on localhost:9200/*, which deleted everything as expected.
I thus attempted to begin again by adding an index with a mapping.

curl -XPUT http://localhost:9200/(index name) -d mapping.json 

mapping.json: 
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer"
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 3,
          "token_chars": [
            "letter",
            "digit",
            "symbol",
            "punctuation",
            "whitespace"
          ]
        }
      }
    }
  },
  "mappings": {
    "access": {
      "properties": {
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd",
          "analyzer": "ngram_analyzer"
        },
        "time": {
          "type": "date",
          "format": "HH:mm:ss",
          "analyzer": "ngram_analyzer"
        },
        "protocol": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "source ip": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "source port": {
          "type": "integer"
        },
        "country": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "organization": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "dest ip": {
          "type": "string",
          "analyzer": "ngram_analyzer"
        },
        "dest port": {
          "type": "integer"
        }
      }
    }
  }
}

This is where the trouble began. While the above command created the index at
(index name), it did not pick up my settings:

{
  "darknet" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1424425712525",
        "mapping" : {
          "json" : ""
        },
        "uuid" : "cgTEPkqnQJKejLPyHqVNYA",
        "number_of_replicas" : "1",
        "number_of_shards" : "5",
        "version" : {
          "created" : "1040399"
        }
      }
    },
    "warmers" : { }
  }
}
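
(A hedged observation: curl -d mapping.json sends the literal string
mapping.json as the request body, not the file contents - which would explain
the "mapping" : { "json" : "" } setting recorded above. Reading from a file
needs the @ prefix:)

curl -XPUT http://localhost:9200/(index name) -d @mapping.json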

I attempted to update the index with curl -XPUT http://localhost:9200/(index
name)/_setting -d mapping.json after closing it, but that also left the index
blank.
I deleted the index, created another index, and so on, but to no avail.

In the end, I managed to make a change to the index, though not a good one.
I made a call to the update-mapping API, which turned the index into a
horrifying monstrosity with repeating tags, like so:

{
  "darknet" : {
    "aliases" : { },
    "mappings" : {
      "1" : {
        "properties" : {
          "mappings" : {
            "properties" : {
              "access" : {
                "properties" : {
                  "properties" : {
                    "properties" : {
                      "country" : {
                        "properties" : {
                          "analyzer" : {
                            "type" : "string"
                          },
                          "type" : {
                            "type" : "string"
                          }
                        }
                      },
                      "date" : {
                        "properties" : {
                          "analyzer" : {
                            "type" : "string"
                          },
                          "format" : {
                            "type" : "string"
                          },
                          "type" : {
                            "type" : "string"
                          }
                        }
                      },
                      "dest ip" : {
                        "properties" : {
                          "analyzer" : {
                            "type" : "string"
                          },
                          "type" : {
                            "type" : "string"
                          }
                        }
                      },
                      "dest port" : {
                        "properties" : {
                          "type" : {
                            "type" : "string"
                          }
                        }
                      },
                      "organization" : {
                        "properties" : {
                          "analyzer" : {

Incorrect results from geo_shape filter

2015-02-19 Thread Alexei Peters
Hey Everyone,

I have a nested document that includes a spatial component that I'm using 
in a spatial query via the geo_shape filter.
I've noticed that the filter consistently returns a result that includes a 
geometry that clearly falls outside the spatial filter.
I've tested the spatial intersection in PostGIS and that correctly returns 
back no result from the same query.

Can anyone else verify this?  We have lots of other MultiPolygon data that 
seems to work just fine.

I'm using the REST api against ES v1.4.1

*Here's the document I'm testing against:*

{
  "dates": [],
  "geometries": [{
    "value": {
      "type": "MultiPolygon",
      "coordinates": [
        [
          [
            [-118.32608, 34.07035],
            [-118.32657, 34.07035],
            [-118.32657, 34.07054],
            [-118.32608, 34.07054],
            [-118.32608, 34.07035]
          ]
        ],
        [
          [
            [-118.32608, 34.07021],
            [-118.32608, 34.07004],
            [-118.32657, 34.07004],
            [-118.32657, 34.07021],
            [-118.32608, 34.07021]
          ]
        ]
      ]
    },
    "label": "MULTIPOLYGON (((-118.32608 34.07035,-118.32657 34.07035,-118.32657 34.07054,-118.32608 34.07054,-118.32608 34.07035)),((-118.32608 34.07021,-118.32608 34.07004,-118.32657 34.07004,-118.32657 34.07021,-118.32608 34.07021)))",
    "child_entities": [],
    "entitytypeid": "SPATIAL_COORDINATES_GEOMETRY.E47",
    "parentid": "80446382-0e96-4db1-8a37-95aca166b785",
    "entityid": "81b06621-2989-4825-98f3-88ee2ac937b6",
    "property": "P87",
    "businesstablename": "geometries"
  }],
  "child_entities": [{
    "value": "Windsor Square District Non-Contributor",
    "label": "Windsor Square District Non-Contributor",
    "child_entities": [],
    "entitytypeid": "NAME.E41",
    "parentid": "5fbe787f-8fc3-4986-b77a-38828699b68c",
    "entityid": "62a82362-b7c3-4ad2-ba58-29bab7dc5dbb",
    "property": "P1",
    "businesstablename": "strings"
  }, {
    "value": "",
    "label": "",
    "child_entities": [],
    "entitytypeid": "PLACE.E53",
    "parentid": "5fbe787f-8fc3-4986-b77a-38828699b68c",
    "entityid": "80446382-0e96-4db1-8a37-95aca166b785",
    "property": "P53",
    "businesstablename": ""
  }],
  "label": "",
  "date_groups": [],
  "primaryname": "Windsor Square District Non-Contributor",
  "value": "",
  "entitytypeid": "HERITAGE_RESOURCE.E18",
  "domains": [{
    "conceptid": "a5675b84-fed4-4839-9afa-434be64c3899",
    "child_entities": [],
    "label": "Primary",
    "value": "b171b37b-4c78-4b51-91a9-503541511be4",
    "entitytypeid": "NAME_TYPE.E55",
    "parentid": "62a82362-b7c3-4ad2-ba58-29bab7dc5dbb",
    "entityid": "4bd86b01-a047-4e7a-953a-00d2243e39db",
    "property": "P2",
    "businesstablename": "domains"
  }],
  "entityid": "5fbe787f-8fc3-4986-b77a-38828699b68c",
  "property": "",
  "businesstablename": ""
}


*Here's the DSL:*

{
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "bool": {
              "should": [],
              "must_not": [],
              "must": [
                {
                  "nested": {
                    "path": "geometries",
                    "query": {
                      "geo_shape": {
                        "geometries.value": {
                          "shape": {
                            "type": "Point",
                            "coordinates": [
                              -118.34194465228431,
                              34.06402964781424
                            ]
                          }
                        }
                      }
                    }
                  }
                }
              ]
            }
          }
        ]
      },
      "query": {
        "match_all": {}
      }
    }
  },
  "from": 0,
  "size": 5
}



*Here's the mapping for the index:*

  "properties": {
    "businesstablename": {
      "index": "not_analyzed",
      "type": "string"
    },
    "child_entities": {
      "type": "nested",
      "properties": {
        "businesstablename": {
          "index": "not_analyzed",
          "type": "string"
        },
        "entitytypeid": {
          "index": "not_analyzed",
          "type": "string"
        },
        "property": {
          "index": "not_analyzed",
          "type": "string"
        },
        "entityid": {
          "index": "not_analyzed",
          "type": "string"
        },
        "label": {
          "index": "not_analyzed",
          "type": "string"
        },
        "value": {
          "type": "string",
          "fields": {
            "raw": {
              "index": "not_analyzed",
              "type": "string"
            },
            "folded": {
              "analyzer": "folding",
              "type": "string"
            }
          }
        },
        "parentid": {
          "index": "not_analyzed",
          "type": "string"
        }
      }
    },
    "entitytypeid": {
      "index": "not_analyzed",
      "type": "string"
    },
    "domains": {
      "type": "nested",
      "properties": {
        "businesstablename": {
          "index": "not_analyzed",
          "type": "string"
        },
        "entitytypeid": {
          "index": "not_analyzed",
          "type": "string"
        },
        "property": {
          "index": "not_analyzed",
          "type": "string"
Re: Incorrect results from geo_shape filter

2015-02-19 Thread Alexei Peters
FYI, you can also try a search using these coordinates (which clearly fall
outside the indexed geometry) in your filter, and it still returns a result:

 "coordinates": [ -119, 35 ]



On Thursday, February 19, 2015 at 3:04:29 PM UTC-8, Alexei Peters wrote:

 [quoted message snipped; see the original post above]

Java node client failing to send join request to master

2015-02-19 Thread Diana Tuck
Anyone have any idea where to start with this one? We're running our single
master/data node in a Docker container and trying to connect via a Java node
client on different boxes. Please help!


INFO  [2015-02-20 02:55:11,411] org.elasticsearch.discovery.ec2: [localhost]
failed to send join request to master
[[esNode][jU4Y42dtQfCPJn1-5jFfYw][esHost][inet[/255.255.255.255]]{master=true}],
reason [RemoteTransportException[[esNode][inet[/255.255.255.255:9300]][internal:discovery/zen/join]];
nested: NotSerializableTransportException[[org.elasticsearch.transport.ConnectTransportException]
[localhost][inet[/255.255.255.255:9300]] connect_timeout[30s]; connection
timed out: /255.255.255.255:9300; ]; ]

elasticsearch.yml:

cloud:
  aws:
    access_key: awsAccessKey
    secret_key: awsSecretKey
    region: us-west-2

discovery:
  type: ec2
  ec2:
    groups: elasticsearch
    availability_zones: us-west-2a
    tag:
      Elasticsearch: tag

network.public_host: 255.255.255.255
network.publish_host: 255.255.255.255

discovery.zen.ping.multicast.enabled: false


Java node client:

Settings settings = ImmutableSettings.settingsBuilder()
        .put("node.name", nodeName)
        .put("cloud.aws.access_key", awsAccessKey)
        .put("cloud.aws.secret_key", awsSecretKey)
        .put("cloud.aws.region", "us-west-2")
        .put("cloud.node.auto_attributes", true)
        .put("discovery.type", "ec2")
        .put("discovery.ec2.groups", "elasticsearch")
        .put("discovery.ec2.availability_zones", "us-west-2a")
        .put("discovery.ec2.tag.Elasticsearch", "devvpc")
        .put("discovery.zen.ping.multicast.enabled", false)
        .build();
this.node = nodeBuilder()
        .clusterName(clusterName)
        .settings(settings)
        .client(true)
        .node();
this.client = node.client();



Combining Multiple Queries with 'OR' or 'AND'

2015-02-19 Thread Debashish Paul


Hi,
Question

I am trying to combine 2 user search queries using AND and OR as operations.
I am aware of combining queries where we can merge filters, but I want to
merge entire queries, like { {BIG Elastic Query1} AND {BIG Elastic Query2} }.
Details

For instance, say a user performs a search for "batman" in the movies type
with filters of "Christian" and "Bale", and another query for "Dark Knight"
in the tvshows type with a filter of "Christopher Nolan". I want to combine
both queries so I can look for both Batman movies and Dark Knight tvshows,
but not Dark Knight movies or Batman tvshows.

For example, for the given queries
I just want to run Query1 OR Query2 in Elasticsearch.
Query 1:

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "Batman",
          "default_operator": "AND",
          "fields": [
            "Movies._all"
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "filtered": {
                  "filter": {
                    "and": [
                      {
                        "term": {
                          "cast.firstName": "Christian"
                        }
                      },
                      {
                        "term": {
                          "cast.lastName": "Bale"
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}

Query2:

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "Dark Knight",
          "default_operator": "AND",
          "fields": [
            "tvshows._all"
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "filtered": {
                  "filter": {
                    "and": [
                      {
                        "term": {
                          "director.firstName": "Christopher"
                        }
                      },
                      {
                        "term": {
                          "director.lastName": "Nolan"
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}



Search multiple indices in Kibana 4?

2015-02-19 Thread Chris Neal
Hi all,

Trying out Kibana 4's new release today, and was wondering if this is still
possible. In Kibana 3, you could simply comma-delimit all the index
patterns you wanted to expose to your searches, but that doesn't seem
possible in Kibana 4.

I have indexes named like:

company-customer-MMDD

I'd like to be able to search across company and customer if possible.
It appears that the Configure an Index Pattern page allows for a single
pattern, not multiples separated by commas.

If I wanted to search these two indexes:

companyA-customerA-MMDD
companyA-customerB-MMDD

but not:

companyA-customerC-MMDD

Is that possible?
Thanks!
Chris
