java.lang.UnsupportedClassVersionError: org/elasticsearch/ElasticsearchException : Unsupported major.minor version 51.0

2015-03-26 Thread Vijayakumari B N
Hi,

I have jdk1.7.0_75 on Linux and Elasticsearch server 1.4.1, but while 
starting my JBoss server I am getting the exception below. I am able to run 
the Elasticsearch server using the same JDK. Does anybody know which Java 
version is compatible with Elasticsearch 1.4.1?

java.lang.UnsupportedClassVersionError: 
org/elasticsearch/ElasticsearchException : Unsupported major.minor version 
51.0

I do not have version 51 anywhere on my system. Please advise.

Thanks,
Vijaya
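
For reference, "major.minor version 51.0" is the Java class-file format 
version, not an Elasticsearch version: 51.0 corresponds to Java 7. The error 
therefore means the JVM that starts JBoss is older than Java 7, even if a 
JDK 7 is installed elsewhere on the machine. A quick check (paths are 
illustrative):

java -version                  # JVM on the default PATH
echo $JAVA_HOME                # JBoss typically starts with this JVM, if set
$JAVA_HOME/bin/java -version   # should report 1.7.x for ES 1.4.1 classes to load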



Re: n edge gram analyzer does not behave as expected

2015-03-26 Thread Narinder Kaur
Thanks for the reply. I will try it.

On Friday, 27 March 2015 05:07:34 UTC+5:30, Masaru Hasegawa wrote:
>
> Hi,
>
> You'd need to specify token_chars when you configure the edge ngram tokenizer (
> http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html).
> Otherwise all characters are kept, which means words are not split on white 
> space.
> You can see how the analyzer works with the _analyze API (
> http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html
> )
>
> You need to fix the analyzer and re-index all documents.
>
>
> Masaru
>  
> On March 25, 2015 at 17:49:24, Narinder Kaur (narind...@izap.in 
> ) wrote:
>
>  Hi All,
>
> I have a custom analyzer based on the edge n-gram tokenizer. The expectation 
> was that it would analyze the text based on edges. So, as per my understanding, 
> the analysis of a multi-word term like (Narinder Kaur) will give 
> N
> Na
> Nar
> Nari
> Narin
> Narind
> Narinde
> Narinder
> K
> Ka
> Kau
> Kaur
>
> So now, if I search for narinder or kaur using the following query:
>
>  {
>   "query": {
> "constant_score": {
>   "query": {
> "match_phrase_prefix": {
>   "primary_search_new": {
> "query": "narinder",
> "analyzer": "ys_search_analyzer_long"
>   }
> }
>   }
> }
>   }
> }
>  
> OR
>
>  {
>   "query": {
> "constant_score": {
>   "query": {
> "match_phrase_prefix": {
>   "primary_search_new": {
> "query": "kaur",
> "analyzer": "ys_search_analyzer_long"
>   }
> }
>   }
> }
>   }
> }
>
> both should have matched the documents containing "Narinder Kaur". 
> But currently I cannot search for kaur; it works only for a first-term 
> match. The analyzers used are as follows:
>
>
> "analysis": {
>     "analyzer": {
>         "ys_search_analyzer": {
>             "type": "custom",
>             "filter": ["ys_word_delimiter", "trim", "lowercase"],
>             "tokenizer": "ys_edge_ngram_tokenizer"
>         },
>         "ys_search_analyzer_long": {
>             "type": "custom",
>             "filter": ["ys_word_delimiter", "trim", "lowercase"],
>             "tokenizer": "ys_edge_ngram_tokenizer_long"
>         }
>     },
>     "filter": {
>         "ys_word_delimiter": {
>             "type": "word_delimiter",
>             "stem_english_possessive": false
>         }
>     },
>     "tokenizer": {
>         "ys_edge_ngram_tokenizer_long": {
>             "type": "edgeNGram",
>             "min_gram": 1,
>             "max_gram": 60
>         },
>         "ys_edge_ngram_tokenizer": {
>             "type": "edgeNGram",
>             "min_gram": 1,
>             "max_gram": 20
>         }
>     }
> }
>
>
> Please explain why it's not working as expected, and what I should do to 
> make this work without re-indexing the data.
>
> All help is appreciated.
> thanks



Re: n edge gram analyzer does not behave as expected

2015-03-26 Thread Narinder Kaur
Thanks for your reply. It is much clearer now.



percolator query problem.

2015-03-26 Thread 박재혁
Sorry, I can't write English well.

Please look at my source:
curl -XPUT 'localhost:9200/filters/.percolator/6' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "price": {
              "gte": "100",
              "lte": "2"
            }
          }
        },
        {
          "match": {
            "prod_name": {
              "type": "phrase",
              "operator": "and",
              "query": "iphone"
            }
          }
        }
      ]
    }
  }
}'
{"_index":"filters","_type":".percolator","_id":"6","_version":1,"created":true}


Percolator source:

curl -XGET 'localhost:9200/filters/.percoaltor/_percolate' -d '{
    "doc" : {
        "price" : "2011",
        "prod_name" : "iphone"
    }
}'  <-- it's not working...

I don't know why.

Is it a mapping problem?

Mapping:

".percolator": {
  "_id": {
    "index": "not_analyzed"
  },
  "properties": {
    "create_dt": { "type": "long" },
    "price": { "type": "long" },
    "query": { "enabled": false, "type": "object" },
    "parent_category": { "type": "string" },
    "detail_category": { "type": "string" },
    "site_name": { "type": "string" },
    "reg_dt": { "type": "long" },
    "filter": {
      "properties": {
        "range": {
          "properties": {
            "price": {
              "properties": {
                "to": { "type": "long" },
                "from": { "type": "long" }
              }
            }
          }
        }
      }
    },
    "prod_name": { "type": "string" }
  }
}



My first corrupted index

2015-03-26 Thread mdd
I had a disk problem on my development laptop and my ES index is only 
barely working. Before restoring an [old] snapshot and just moving forward, 
I'd like to learn whether this can be recovered. 

My first sign of trouble was query errors like this:

{
   "error": "NoShardAvailableActionException[[twitter_profiles][0] null]",
   "status": 503
}

The disk checks out OK now, but the logs below show what happens when the 
node starts up.

What's the best way to proceed? 

Thanks in advance!

--

The details are here, but the high points are:

[2015-03-26 20:36:55,065][INFO ][node ] [Glob] version[
*1.4.2*], pid[10652], build[927caff/2014-12-16T14:11:12Z]

[2015-03-26 20:37:03,101][WARN ][indices.cluster  ] [Glob] 
[twitter_profiles][0] *failed to start shard*
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: 
[twitter_profiles][0] failed to fetch index version after copying it over

Caused by: org.apache.lucene.index.CorruptIndexException: 
[twitter_profiles][0] *Preexisting corrupted index* 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[Invalid 
fieldsStream maxPointer (file truncated?): maxPointer=86078308, 
length=4294967295]
org.apache.lucene.index.CorruptIndexException: Invalid fieldsStream 
maxPointer (file truncated?): maxPointer=86078308, length=4294967295

[2015-03-26 20:37:03,261][WARN ][cluster.action.shard ] [Glob] 
[twitter_profiles][0] *sending failed shard* for [twitter_profiles][0], 
node[msa32a5JQHW5aojBitwEqQ], [P], s[INITIALIZING], indexUUID 
[1tNwpxI0Rl6M2yldd_l5lw], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[twitter_profiles][0] failed to fetch 
index version after copying it over]; nested: 
CorruptIndexException[[twitter_profiles][0] Preexisting corrupted index 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[Invalid 
fieldsStream maxPointer (file truncated?): maxPointer=86078308, 
length=4294967295]
org.apache.lucene.index.CorruptIndexException: Invalid fieldsStream 
maxPointer (file truncated?): maxPointer=86078308, length=4294967295

[2015-03-26 20:37:03,262][WARN ][cluster.action.shard ] [Glob] 
[twitter_profiles][0] *received shard failed* for [twitter_profiles][0], 
node[msa32a5JQHW5aojBitwEqQ], [P], s[INITIALIZING], indexUUID 
[1tNwpxI0Rl6M2yldd_l5lw], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[twitter_profiles][0] failed to fetch 
index version after copying it over]; nested: 
CorruptIndexException[[twitter_profiles][0] Preexisting corrupted index 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[Invalid 
fieldsStream maxPointer (file truncated?): maxPointer=86078308, 
length=4294967295]
org.apache.lucene.index.CorruptIndexException: Invalid fieldsStream 
maxPointer (file truncated?): maxPointer=86078308, length=4294967295

[2015-03-26 20:37:03,262][INFO ][gateway  ] [Glob] *recovered 
[2] indices into cluster_state*
[2015-03-26 20:37:17,612][WARN ][indices.cluster  ] [Glob] 
[twitter_profiles][0] *failed to start shard*
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: 
[twitter_profiles][0] *failed to fetch index version after copying it over*

Caused by: org.apache.lucene.index.CorruptIndexException: 
[twitter_profiles][0] *Preexisting corrupted index* 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[Invalid 
fieldsStream maxPointer (file truncated?): maxPointer=86078308, 
length=4294967295]
org.apache.lucene.index.CorruptIndexException: Invalid fieldsStream 
maxPointer (file truncated?): maxPointer=86078308, length=4294967295

[2015-03-26 20:37:17,613][WARN ][cluster.action.shard ] [Glob] 
[twitter_profiles][0] *sending failed shard for* [twitter_profiles][0], 
node[msa32a5JQHW5aojBitwEqQ], [P], s[INITIALIZING], indexUUID 
[1tNwpxI0Rl6M2yldd_l5lw], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[twitter_profiles][0] failed to fetch 
index version after copying it over]; nested: 
CorruptIndexException[[twitter_profiles][0] Preexisting corrupted index 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[Invalid 
fieldsStream maxPointer (file truncated?): maxPointer=86078308, 
length=4294967295]
org.apache.lucene.index.CorruptIndexException: Invalid fieldsStream 
maxPointer (file truncated?): maxPointer=86078308, length=4294967295

[2015-03-26 20:37:17,613][WARN ][cluster.action.shard ] [Glob] 
[twitter_profiles][0] *received shard failed* for [twitter_profiles][0], 
node[msa32a5JQHW5aojBitwEqQ], [P], s[INITIALIZING], indexUUID 
[1tNwpxI0Rl6M2yldd_l5lw], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[twitter_profiles][0] failed to fetch 
index version after copying it over]; nested: 
CorruptIndexException[[twitter_profiles][0] Preexisting corrupted index 
[corrupted_dd6VM9dCQOeajI_AV9COnA] caused by: CorruptIndexException[I

Re: How does Elasticsearch calculate the field-length norm?

2015-03-26 Thread Masaru Hasegawa
Hi,

I believe it's because the field norm is encoded in a single byte.
See 
http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/search/similarities/DefaultSimilarity.html
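
A rough worked illustration of that lossy round trip, using the formula from 
the question (the stored byte keeps only about one decimal digit of precision, 
rounding down to the nearest value it can represent):

doc 1: 3 terms -> 1/sqrt(3) = 0.577... -> stored byte decodes to 0.5
doc 2: 2 terms -> 1/sqrt(2) = 0.707... -> stored byte decodes to 0.625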


Masaru

On March 26, 2015 at 14:36:45, Xudong You (xudong@gmail.com) wrote:


Per the post "theory behind relevance scoring" 
http://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html

Elasticsearch calculates the field-length norm as follows:
norm(d) = 1 / √numTerms


But per my testing, it seems the actual calculated value does not match the 
formula above.


Here are my indexed docs:


1.
{
"title" : "quick brown fox"
}

2.
{
"title" : "quick fox"
}

Then I query "fox" with following query:
POST /vsmtest/test/_search?explain
{
  "query" : {
    "match" : {"title":"fox"}
  }
}

The resulting norm values are as follows:

doc 1:
                        {
                           "value": 0.5,
                           "description": "fieldNorm(doc=0)"
                        }
doc 2:
                        {
                           "value": 0.625,
                           "description": "fieldNorm(doc=0)"
                        }

Can anyone help me understand how 0.5 and 0.625 are calculated per the formula?
 norm(d) = 1 / √numTerms





Re: Elasticsearch shield and Kibana 3.x

2015-03-26 Thread Mark Walkom
Have you made config changes to Kibana to deal with the HTTPS, as per
http://www.elastic.co/guide/en/shield/current/_shield_with_kibana_3.html

On 27 March 2015 at 00:31, Marcello A  wrote:

> Hi All,
> we have installed Shield on Elasticsearch 1.5 in a test environment. We
> have enabled SSL on the HTTP module and authentication, but we noticed
> issues with Kibana 3.1.2: it's not working.
>
> Did anyone manage to configure this plugin and Kibana 3 to work together?
>
> Thanks,
> Marcello
>



Re: is there a way to define mapping in java with a simple string?

2015-03-26 Thread Masaru Hasegawa
You can use a string. See PutMappingRequest#source(String).
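
A minimal sketch with the 1.x Java API (index, type, and field names are 
illustrative):

String mapping = "{\"my_type\":{\"properties\":{\"name\":{\"type\":\"string\"}}}}";
client.admin().indices()
      .preparePutMapping("my_index")
      .setType("my_type")
      .setSource(mapping)   // a plain JSON string; no XContentBuilder needed
      .execute().actionGet();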

On March 26, 2015 at 05:30:58, Sai Asuka (asuka.s...@gmail.com) wrote:

Is there a way to simply pass mapping information as a JSON-formatted string 
"{... }" without having to create an object and do a bunch of .put calls on it 
within Java?



Re: Unable to preserve special characters in search results of ElasticSearch.

2015-03-26 Thread Anusha
Hi, thanks!

I resolved this issue.

In order to preserve the special characters and to search for the query term in
multiple fields with an exact match, it is better to change the settings as
shown below.
Settings I updated:

PUT /my_index/_settings?pretty=true
{
  "settings": {
    "analysis": {
      "analyzer": {
        "wordAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "word_delimiter_for_phone", "nGram_filter"
          ]
        }
      },
      "filter": {
        "word_delimiter_for_phone": {
          "type": "word_delimiter",
          "catenate_all": true,
          "generate_number_parts": false,
          "split_on_case_change": false,
          "generate_word_parts": false,
          "split_on_numerics": false,
          "preserve_original": true
        },
        "nGram_filter": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": [
            "letter"
          ]
        }
      }
    }
  }
}


Mapping settings:
{
  "mappings": {
    "face": {
      "properties": {
        "{field-1}id": {
          "type": "string",
          "index_name": "change",
          "analyzer": "wordAnalyzer"
        },
        "{field-2}name": {
          "type": "string",
          "index_name": "change",
          "analyzer": "wordAnalyzer"
        },
        "{field-3}Year": {
          "type": "string",
          "index_name": "change",
          "analyzer": "wordAnalyzer"
        },
        "{field-4}Make": {
          "type": "string",
          "index_name": "change",
          "analyzer": "wordAnalyzer"
        }
      }
    }
  }
}



And the query we can use:

GET my_index/face/_search
{
  "query": {
    "match": {
      "change": {
        "query": "A/T o",
        "type": "phrase_prefix"
      }
    }
  }
}


This way we can search for the term in all the fields. To search in only a
single field, we can give that field name in place of "change" in the match
query.

As for changing the mappings: I was able to update the analyzers, but not the
index_name. To add the index_name I deleted the index and applied the mapping
again as above.








Re: Japanese Search Results with Kuromoji plugin

2015-03-26 Thread Masaru Hasegawa
Hi,

The input text is already tokenized. As you can see, unlike normal Japanese 
text, the terms are separated by white space. I guess the input text was 
already preprocessed with a Japanese analyzer.
That's why you get hits even if you don't use a Japanese tokenizer (the 
standard tokenizer splits tokens on white space).
The reason you get results in the "second case" is that the input text 
contains the tokenized version of the query string, "憲法 記念 日 にあたって".

Masaru

On March 25, 2015 at 18:24:38, Mangesh Ralegankar 
(mangesh.ralegan...@gmail.com) wrote:

We are using Elasticsearch 1.1 and the Kuromoji plugin 2.1.0.

We are trying to create the index with the approaches below in the settings 
for a test index.
localhost:9200/test/ PUT
1.
{
  "index":{
    "analysis":{
  "tokenizer" : {
    "kuromoji" : {
  "type":"kuromoji_tokenizer",
  "mode":"search"
    }
  },
  "analyzer" : {
    "kuromoji_analyzer" : {
  "type" : "custom",
  "tokenizer" : "kuromoji_tokenizer"
    }
  }
    }
  }
}

2.
 {
 "index": {
 "analysis": {
 "tokenizer": {
 "kuromoji": {
 "type":"kuromoji_tokenizer"
 }
 },
 "analyzer": {
 "default" : {
 "type": "custom",
 "tokenizer": "kuromoji"
 }
 }
 }
 }
 }
After this create document on this index with below text for example
localhost:9200/test/p/1 PUT
{
  "id": 8,
  "url": "http://someurl.html",
    "language": "en",
    "creationDate": "2015-03-23T11:59:23.021Z",
    "text": " 憲法 記念 日 にあたって   小沢 一郎 代表   談話 ( その他 政界 と 政治 活動 ) - 無心 - Yahoo ! 
ブログ 記事 検索 ランダム こんにちは 、 ゲスト さん ログイン Yahoo ! ブログ を 始める Yahoo ! JAPAN 無心 平凡 な 毎日 に 
○ あげよ う ! お気に入り の 人 に 登録 / 削除 全体 表示 [ リスト ] < 武田 邦彦 『 ダーウィン の 番犬 ( 第 2 回 )   
メディア の プロ 意識 ・ ・ 報道 の 自由 の 乱用 を 恥じる 』 武田 邦彦 『 「 健康 と 長寿 」 の 根源 を 語る 0 5   
コレステロール は どう し たら よい の か 』 > 憲法 記念 日 にあたって   小沢 一郎 代表   談話 2014 / 5 / 6 ( 火 ) 
午前 8 : 47 小沢 一郎 その他 政界 と 政治 活動 ツイート 憲法 記念 日 にあたって の 小沢 代表 の 談話 を 、 以下 、 阿修羅 様 
より 。     憲法 記念 日 にあたって   小沢 一郎 代表   談話 http :// www . asyura 2 . com / 14 / 
senkyo 164 / msg / 906 . html 投稿 者 赤 かぶ 日時 2014 年 5 月 03 日 22 : 13 : 33 : 
igsppGRN / E 9 PQ 憲法 記念 日 にあたって http :// www . seikatsu 1 . jp / activity / 
declaration / 20140503 ozawa - danwa . html 平成 26 年 5 月 3 日   生活 の 党 小沢 一郎 代表   
談話 ( 2014 年 5 月 3 日 ) 平成 26 年 5 月 3 日 生活 の 党 代表   小沢 一郎 本日 、 日本 国 憲法 は 施行 から 67 
年 を 迎え まし た 。 生活 の 党 は 、 憲法 と は 、 国家 以前 の 普遍 的 理念 で ある 「 基本 的 人権 の 尊重 」 を 貫徹 する 
ため に 統治 権 を 制約 する 、 いわゆる 国家 権力 を 縛る もの で ある という 立憲 主義 の 考え方 を 基本 に し て い ます 。 
また 、 憲法 は 、 国家 の あり方 や 国法 秩序 の 基本 を 定める 最高 法規 として 安定 性 が 求め られる 性質 の もの で あり ます 
。 したがって 、 国民 主権 、 基本 的 人権 の 尊重 、 平和 主義 、 国際 協調 という 憲法 の 四 大 原則 は 引き続き 堅持 す べき で 
あり ます 。 しかし 安倍 政権 は 、 戦後 一貫 し た 集団 的 自衛 権 に関する 憲法 解釈 を 、 いとも 簡単 に 一 内閣 の 権限 のみ 
で 変更 しよ う と し て い ます 。 憲法 9 条 の 解釈 は 、 戦後 から 現在 まで の 長年 にわたる 国会 審議 において 、 国会 と 
政府 の 共同 作業 によって 練り上げ られ て き た もの で あり 、 国会 審議 を 経る こと も なく 、 一 内閣 が 行う 閣議 決定 
によって 軽々 に 変更 が 許さ れる もの で は あり ませ ん 。 生活 の 党 は 、 憲法 9 条 が 容認 し て いる 自衛 権 の 行使 は 
、 我が国 が 直接 攻撃 を 受け た 場合 及び 周辺 事態 法 に いう 日本 の 安全 が 脅かさ れる 場合 において 、 同盟 国 で ある 米国 
と 共同 で 攻撃 に 対処 する よう な 場合 に 限 ら れる もの と 考え ます 。 これ 以外 の 、 日本 に 直接 関係 の ない 紛争 の 
ため に 、 自衛隊 が 同盟 国 の 軍事 行動 に 参加 する こと は 、 歯止め なき 自衛 権 の 拡大 に つながり かね ない もの で あっ 
て 、 現行 憲法 9 条 は 全く これ を 許し て い ない と 考え ます 。 一方 で 、 憲法 は 、 国民 の 生命 や 財産 、 人権 を 
守る ため に 定め られ 、 平和 な 暮らし を 実現 する ため の 共同 体 の ルール として 国民 が 定め た もの な ので 、 四 大 原則 
を 守り つつ も 、 時代 や 環境 の 変化 に 応じ て 必要 が あれ ば 改正 す べき 点 は 改正 す べき です 。 生活 の 党 は 、 
国民 が より 幸せ に 、 より 安全 に 生活 でき 、 日本 が 世界 平和 に 貢献 する ため の ルール 作り を めざし 、 国民 とともに 
積極 的 に 議論 し て 参り ます 。 この ブログ を お気に入り に 登録 この 記事 に ツイート > 政治 > 政界 と 政治 活動 > その他 
政界 と 政治 活動 「 小沢 一郎 」 書庫 の 記事 一覧 戦後 70 年 の 節目 の 今年 こそ 、 国民 一人ひとり が 本当に 民主 主義 を 身 
に つける べき で は ない か と 思っ て い ます     小沢 一郎 代表 2015 / 2 / 25 ( 水 ) 午前 6 : 43 彼ら が 
最初 「 生活 の 党 と 山本 太郎 」 を 攻撃 し た とき ( 植草 一秀 の 『 知 ら れ ざる 真実 』 ) 2015 / 2 / 17 ( 火 
) 午前 6 : 09 NHK が 一郎 ・ 太郎 外し   人質 事件 で 安倍 首相 に 配慮   小沢 代表 「 NHK に 乗り込ん で バシッ と 
言っ て こい 」 ( 田中 龍作 ジャーナル ) 2015 / 2 / 4 ( 水 ) 午前 6 : 19 『 シリア における 邦人 殺害 事件 について 
』 生活 の 党 と 山本 太郎 と な かま たち 2015 / 2 / 3 ( 火 ) 午前 8 : 50 < 全文 > 小沢 一郎 氏 、 山本 太郎 
氏 が 会見 「 市民 の 希望 の 星 に なれる よう に 頑張り たい 」 〜 党名 候補 に は 「 一郎 太郎 」 も 2015 / 1 / 28 
( 水 ) 午前 7 : 03 もっと 見る コメント ( 0 ) ※ 投稿 さ れ た コメント は ブログ 開設 者 の 承認 後 に 公開 さ れ ます 
。 コメント 投稿 名前 パスワードブログ 絵文字 × SoftBank 1 SoftBank 2 SoftBank 3 SoftBank 4 docomo 
1 docomo 2 au 1 au 2 au 3 au 4 トラック バック ( 0 ) ※ トラック バック は ブログ 開設 者 の 承認 後 に 公開 
さ れ ます 。 トラック バック さ れ た 記事 この 記事 に トラック バック する URL を クリップ ボード に コピー http :// 
blogs . yahoo . co . jp / satomikimuraoffice / trackback / 983091 / 34530133 
トラック バック さ れ て いる 記事 が あり ませ ん 。 トラック バック 先 の 記事 トラック バック 先 の 記事 が あり ませ ん 。 この 
記事 の URL : http :// blogs . yahoo . co . jp / satomikimuraoffice / 34530133 . 
html < 武田 邦彦 『 ダーウィン の 番犬 ( 第 2 回 )   メディア の プロ 意識 ・ ・ 報道 の 自由 の 乱用 を 恥じる 』 武田 
邦彦 『 「 健康 と 長寿 」 の 根

Re: n edge gram analyzer does not behave as expected

2015-03-26 Thread Masaru Hasegawa
Hi,

You'd need to specify token_chars when you configure the edge ngram tokenizer 
(http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html).
Otherwise all characters are kept, which means words are not split on white 
space.
You can see how the analyzer works with the _analyze API 
(http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html)

You need to fix the analyzer and re-index all documents.
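
A minimal sketch of the change, reusing the tokenizer name from the message 
below; the token_chars list here is an illustrative choice:

curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "tokenizer": {
        "ys_edge_ngram_tokenizer_long": {
          "type": "edgeNGram",
          "min_gram": 1,
          "max_gram": 60,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}'

curl -XGET 'localhost:9200/my_index/_analyze?tokenizer=ys_edge_ngram_tokenizer_long&pretty' -d 'Narinder Kaur'

With token_chars set, "Narinder Kaur" produces edge n-grams of each word 
(n, na, ..., narinder, k, ka, kau, kaur) rather than prefixes of the whole 
string, so a match on kaur becomes possible after re-indexing.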


Masaru

On March 25, 2015 at 17:49:24, Narinder Kaur (narinder.k...@izap.in) wrote:

Hi All,

I have a custom analyzer based on the edge n-gram tokenizer. The expectation was 
that it would analyze the text based on edges. So, as per my understanding, the 
analysis of a multi-word term like (Narinder Kaur) will give 
N
Na
Nar
Nari
Narin
Narind
Narinde
Narinder
K
Ka
Kau
Kaur

So now, if I search for narinder or kaur using the following query:

{
  "query": {
    "constant_score": {
      "query": {
        "match_phrase_prefix": {
          "primary_search_new": {
            "query": "narinder",
            "analyzer": "ys_search_analyzer_long"
          }
        }
      }
    }
  }
}

OR

{
  "query": {
    "constant_score": {
      "query": {
        "match_phrase_prefix": {
          "primary_search_new": {
            "query": "kaur",
            "analyzer": "ys_search_analyzer_long"
          }
        }
      }
    }
  }
}

both should have matched the documents containing "Narinder Kaur". But 
currently I cannot search for kaur; it works only for a first-term match. The 
analyzers used are as follows:


"analysis": {
    "analyzer": {
        "ys_search_analyzer": {
            "type": "custom",
            "filter": ["ys_word_delimiter", "trim", "lowercase"],
            "tokenizer": "ys_edge_ngram_tokenizer"
        },
        "ys_search_analyzer_long": {
            "type": "custom",
            "filter": ["ys_word_delimiter", "trim", "lowercase"],
            "tokenizer": "ys_edge_ngram_tokenizer_long"
        }
    },
    "filter": {
        "ys_word_delimiter": {
            "type": "word_delimiter",
            "stem_english_possessive": false
        }
    },
    "tokenizer": {
        "ys_edge_ngram_tokenizer_long": {
            "type": "edgeNGram",
            "min_gram": 1,
            "max_gram": 60
        },
        "ys_edge_ngram_tokenizer": {
            "type": "edgeNGram",
            "min_gram": 1,
            "max_gram": 20
        }
    }
}


Please explain why it's not working as expected, and what I should do to make 
this work without re-indexing the data.

All help is appreciated.
thanks



Elasticsearch with influxdb

2015-03-26 Thread Sandhya Varanasi
Hi 

I am working on a project and would like to know whether InfluxDB can use 
Elasticsearch as a tool to search its data.

Thanks
Sandhya



Re: Custom cluster action

2015-03-26 Thread Paweł Róg
Hi Ali,

I used ClusterAction for my case because I had to execute the action on all 
nodes. I used something like this:
public class CustomAction extends
        ClusterAction<CustomRequest, CustomResponse, CustomRequestBuilder> {
    // ... implement all required methods (override newRequestBuilder and
    // newResponse) ...
}

public class CustomNodeResponse extends NodeOperationResponse {
}

public class CustomRequest extends NodesOperationRequest<CustomRequest> {
    // ... override writeTo and readFrom methods ...
    // ... add custom data fields in this class ... (for example I used "id",
    // which was given in the constructor and read/written in the overridden
    // methods) ...
}

public class CustomRequestBuilder extends
        NodesOperationRequestBuilder<CustomRequest, CustomResponse, CustomRequestBuilder> {
    // ... override the doExecute method. I used something like this ...
    @Override
    protected void doExecute(ActionListener<CustomResponse> listener) {
        client.execute(CustomAction.INSTANCE, request, listener);
    }
}

public class CustomResponse extends NodesOperationResponse<NodeCustomResponse> {
    // ... implement writeTo and readFrom methods ...
}

public class NodeCustomResponse extends NodeOperationResponse {
    // ... implement writeTo and readFrom methods ...
}

And finally the most important part:

public class TransportCustomAction extends
        TransportNodesOperationAction {
    // ... the main part is in the nodeOperation method; the implementation of
    // your action goes there ...
}


--
Paweł Róg

On Thursday, March 26, 2015 at 9:26:41 PM UTC+1, Ali Lotfdar wrote:
>
> Hi Pawel,
>
> I am new to creating action plugins, and I found you have already worked on 
> it. Could you please let me know if there is any sample or explanation 
> that can help me get started?
> I already reviewed the ES cookbook, but I could not understand its explanation!
>
> Thank you. 
>
>
> Regards,
> Ali
>
> On Thursday, November 13, 2014 at 1:06:00 AM UTC-5, Paweł Róg wrote:
>>
>> Hi,
>> Thank you very much :-)
>>
>> Paweł
>>
>> On Wed, Nov 12, 2014 at 10:23 PM, Ivan Brusic  wrote:
>>
>>> There is also an ActionModule
>>>
>>> public void onModule(ActionModule module) {
>>> module.registerAction(MyAction.INSTANCE, TransportMyAction.class);
>>> }
>>>
>>> It is always easier to follow existing plugins.
>>>
>>> Cheers,
>>>
>>> Ivan
>>>
>>> On Wed, Nov 12, 2014 at 3:50 PM, Pawel  wrote:
>>>
 Hi,
 I'm thinking about building custom ClusterAction. I see that I can 
 build custom classes for Request, NodeResponse and NodesResponse, but it is 
 not clear to me how I can register my custom action.

 In case of Rest action it was quite easy because in plugin i simply use

 public void onModule(RestModule module) {
 module.addRestAction(RestCustomAction.class);
 }

 but I cannot find any examples how I can do this in case of custom 
 ClusterAction.

 Does anybody have any example how I can achieve this? Thanks for your 
 help.

 --
 Paweł Róg


>>>
>>>
>>
>>



EC2 client node cluster discovery

2015-03-26 Thread Jerry Pattenaude
I am running elasticsearch 1.4.4 and the aws plugin 2.4.1

I am trying to run multiple clusters on the same machine with the same ES 
install. It's a small dataset, and I'm running QE, demo, and staging 
clusters.

My Java API client won't connect to a running ES server after a Tomcat 
restart, but when I restart the ES cluster it discovers the running client.
I have opened ports 9200-9400 for internal communication using a security 
group that is also used for EC2 discovery.

When I restart the staging Tomcat instance it rejoins Elasticsearch. When 
I restart QE it does not rejoin, with the warning:
"org.elasticsearch.discovery:  91 - [Peter Noble] waited for 30s and no 
initial state was set by the discovery"

If I restart the QE ES cluster it picks up Peter Noble right away.

Earlier today this situation was reversed, and the staging cluster would 
have to be restarted. I think the change happened after I brought down both 
clusters and brought staging up first.

*Additional config details:*
The web servers are all separate AWS servers in the same subnet, and the 3 ES 
clusters are all running on the same box.

I'm passing the cluster name and node name in when starting Elasticsearch:

elasticsearch --cluster.name=QE --node.name=es-qe-1

My es master has the config:

discovery.zen.ping.multicast.enabled: false
discovery.type: ec2
discovery.ec2.groups: elasticsearch
cloud.aws.protocol: http
cloud.aws.region: us-east-1
cloud.aws.access_key: **
cloud.aws.secret_key: **
discovery.ec2.ping_timeout: 10s

I'm using the Java API from a node client:

NodeBuilder nodeBuilder = nodeBuilder().clusterName(clusterName).client(true);
if (isAws) {
    ImmutableSettings.Builder settings =
        ImmutableSettings.settingsBuilder()
            .put("cloud.aws.access_key", "***")
            .put("cloud.aws.secret_key", "***")
            .put("cloud.aws.region", "us-east-1")
            .put("cloud.aws.protocol", "http")
            .put("discovery.type", "ec2")
            .put("discovery.ec2.groups", "elasticsearch")
            .put("discovery.zen.ping.multicast.enabled", "false");
    nodeBuilder = nodeBuilder.settings(settings);
}
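
The snippet stops before the node is built; a plausible tail, following the 
standard 1.x node-client pattern (these lines are an assumption, not from the 
original):

// build and start the client node, then obtain a Client for requests
Node node = nodeBuilder.node();
Client client = node.client();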


Any ideas would be appreciated!


Jerry




Re: How to setup a connection between VMs and ELK?

2015-03-26 Thread Mark Walkom
https://www.google.com.au/?gws_rd=ssl#q=elk+tutorial

There's a whole bunch of links there to get you started.

On 27 March 2015 at 00:10, kelnrluierhfeulne  wrote:

> Would you happen to know what the process is, or know of any links that go
> over how to do this? For example, is there a certain
> file to edit once you download logstash-forwarder?
> Thanks again
>
> On Wednesday, March 25, 2015 at 5:30:53 PM UTC-4, Mark Walkom wrote:
>>
>> There is no single command, it's a concept.
>>
>> You can use rsyslog, logstash, logstash-forwarder, logstash-courier or
>> many other pieces of software to do this.
>>
>> On 26 March 2015 at 07:47, kelnrluierhfeulne  wrote:
>>
>>> Hey thanks for the reply! I tried looking up how to do that but am still
>>> lost... Would you happen to know what commands you would use to ship the
>>> VM's logs to elasticsearch?
>>>
>>>
>>>
>>> On Wednesday, March 25, 2015 at 4:22:11 PM UTC-4, Mark Walkom wrote:

 You need to ship the logs from the VMs to ES.

 Take a look at Logstash and feel free to ask questions on
 https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users

 On 26 March 2015 at 04:32, kelnrluierhfeulne  wrote:

> This is a beginner question but how would you get virtual machines to
> connect to ELK so you can see the logs of those VMs on Kibana? Is there a
> place to input the IP of the VMs so it is displayed in Kibana?
>
>

>>>
>>
>



Re: Custom cluster action

2015-03-26 Thread Ali Lotfdar
Hi Pawel,

I am new to creating action plugins, and I found you have already worked on 
it. Could you please let me know if there is any sample or explanation 
that can help me get started?
I already reviewed the ES cookbook, but I could not understand its explanation!

Thank you. 


Regards,
Ali

On Thursday, November 13, 2014 at 1:06:00 AM UTC-5, Paweł Róg wrote:
>
> Hi,
> Thank you very much :-)
>
> Paweł
>
> On Wed, Nov 12, 2014 at 10:23 PM, Ivan Brusic  > wrote:
>
>> There is also an ActionModule
>>
>> public void onModule(ActionModule module) {
>> module.registerAction(MyAction.INSTANCE, TransportMyAction.class);
>> }
>>
>> It is always easier to follow existing plugins.
>>
>> Cheers,
>>
>> Ivan
>>
>> On Wed, Nov 12, 2014 at 3:50 PM, Pawel > 
>> wrote:
>>
>>> Hi,
>>> I'm thinking about building custom ClusterAction. I see that I can build 
>>> custom classes for Request, NodeResponse and NodesResponse, but it is not 
>>> clear to me how I can register my custom action.
>>>
>>> In case of Rest action it was quite easy because in plugin i simply use
>>>
>>> public void onModule(RestModule module) {
>>> module.addRestAction(RestCustomAction.class);
>>> }
>>>
>>> but I cannot find any examples how I can do this in case of custom 
>>> ClusterAction.
>>>
>>> Does anybody have any example how I can achieve this? Thanks for your 
>>> help.
>>>
>>> --
>>> Paweł Róg
>>>
>>>
>>
>>
>
>



How to treat sorting for string fields as not_analyzed

2015-03-26 Thread pulkitsinghal
Only when performing a sort operation, I would like to treat a string field 
like 
{name: "first last"}

as not_analyzed ... Is this possible?
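
One common approach is a 1.x multi-field: index the field twice and sort on 
the untouched copy. A minimal sketch (the "raw" sub-field name is illustrative; 
existing documents must be re-indexed before it is populated):

"name": {
  "type": "string",
  "fields": {
    "raw": { "type": "string", "index": "not_analyzed" }
  }
}

and then sort on the raw version while searching the analyzed one:

{ "sort": [ { "name.raw": "asc" } ] }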



ActionRequest support for BulkProcessor in the Java API

2015-03-26 Thread Andre Encarnacao
Hi,

I'm wondering why BulkProcessor (and BulkRequest) only supports IndexRequest, 
UpdateRequest, and DeleteRequest, and not SearchRequest. I'd like to be able 
to use BulkProcessor for processing multiple queries instead of having to use 
the MultiSearch API. Is this not possible? And if so, why not?

Thanks!
Andre




How do you ship logs from Virtual Machines to Elasticsearch?

2015-03-26 Thread kelnrluierhfeulne
Beginner here, and I am trying to ship logs from virtual machines to 
Elasticsearch. How would I do that so they can be displayed in Kibana 
(right now Kibana is blank)?



Re: exists filter broken on 1.5.0 with restored index?

2015-03-26 Thread Mads Martin Jørgensen
Thanks for fixing!

On Thursday, March 26, 2015 at 3:29:17 PM UTC+1, Igor Motov wrote:
>
> Thanks for checking. It's a bug, which should be fixed in 1.5.1 
> https://github.com/elastic/elasticsearch/pull/10268
>
> On Wednesday, 25 March 2015 13:43:28 UTC-4, Mads Martin Jørgensen wrote:
>>
>> They're similar. The 1.5.0 cluster has "created" : "1000199", and the 
>> 1.4.1 cluster also has "created" : "1000199"
>>
>> On Wednesday, March 25, 2015 at 4:45:30 PM UTC+1, Igor Motov wrote:
>>>
>>> Hi Mads Martin,
>>>
>>> Could you check the version that is returned when you run curl 
>>> "localhost:9200/my_own_index/_settings?pretty". The version will be in 
>>>
>>> "version" : {
>>>   "created" : "XXX"
>>> }
>>>
>>> Could you compare it to the version that is returned by the same index 
>>> in the pre-1.5.0 cluster?
>>>
>>> Igor
>>>
>>> On Wednesday, 25 March 2015 09:27:03 UTC-4, Mads Martin Jørgensen wrote:

 Hello all,

 Just installed es-1.5.0 with cloud-aws-2.5.0 on a machine. I did a 
 restore of a snapshot made with es-1.4.1. All documents are there, but the 
 exists filter seems broken. The query that used to return all matching 
 documents now returns 0 documents, even though the field exists when reading 
 the documents.

 curl -XGET "http://localhost:9200/my_own_index/document/_search" -d'
 {
    "query": {
       "constant_score": {
          "filter": {
             "exists": {
                "field": "history"
             }
          }
       }
    }
 }'


 If we populate new documents, then the exists filter works just fine.


 Regards,

 Mads Martin

>>>



What does it mean when refresh rate is high?

2015-03-26 Thread John Smith
Using 1.4.3

So I have a nice "beefy" cluster: 4 nodes, each with 32 cores, 128GB RAM, and 
5TB RAID 0 (using SanDisk 960GB Extreme Pro drives).

I am indexing about 4000 documents per second at an average of about 800 
bytes per document. At the same time as indexing, I'm running queries.

Looking at the Elastic HQ numbers (one column per node):

Indexing - Index:    0.28ms    0.32ms    0.3ms     0.32ms
Indexing - Delete:   0ms       0ms       0ms       0ms
Search - Query:      29.23ms   29.36ms   24.46ms   36.63ms
Search - Fetch:      0.25ms    0.24ms    0.25ms    0.21ms
Get - Total:         0.67ms    0.46ms    0ms       0.48ms
Get - Exists:        1.19ms    0.65ms    0ms       0.48ms
Get - Missing:       0ms       0.03ms    0ms       0ms
Refresh:             25.32ms   24.86ms   24.5ms    24.81ms
Flush:               104.14ms  90.45ms   111.14ms  84.63ms

No matter what test I've done or machine configuration, the refresh rate has 
always been red... What does it mean, and does it matter?





ES Count Query Help

2015-03-26 Thread Sriram Kannan
Friends, 

I am trying to query ES *every minute* for count. I have the following 
code. I am having tough time adding the *time filter.* Looks like lot has 
changed since 1.1.0 java elasticsearch api. Can you point me to the right 
directions? 


Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "test").build();
Client esClient = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("myhostname", port));

//FieldQueryBuilder hotelQuery = QueryBuilders.fieldQuery("prodName", "Air");

CountResponse cr = esClient.prepareCount("test").setIndices("myindex")
    .setQuery(termQuery("_type", "mytype")).execute().actionGet();

System.out.println(cr.getCount());
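
One way to add the time filter with the 1.x Java API is a filtered query 
wrapping a range filter. A minimal sketch, assuming a date field named 
"timestamp" (the field name is illustrative, not from the original code):

import static org.elasticsearch.index.query.QueryBuilders.*;
import static org.elasticsearch.index.query.FilterBuilders.rangeFilter;

// count documents of "mytype" indexed in the last minute
CountResponse cr = esClient.prepareCount("myindex")
        .setQuery(filteredQuery(
                termQuery("_type", "mytype"),
                rangeFilter("timestamp").gte("now-1m").lt("now")))
        .execute().actionGet();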


 



Difference between _source and fields projections

2015-03-26 Thread Udi Ben Amitai
Hi,
My question is about the performance (or other) impact of the difference 
between the _source and fields projections.
These 2 queries return (at least from my perspective) an equivalent result:
{"_source":{"include":["name"]}} 
==> {... "_source":{"name":"blah"}}
and:
{"fields":["name"]}
==>{... "fields":{"name":"blah"}}


In cases where the _source is stored, is there a preference for one of the 
approaches, performance-, memory-, or functionality-wise?

Thanks,
Udi



Re: Disable re balancing if one node is down.

2015-03-26 Thread mjdude5
You can get some reallocation control through 
cluster.routing.allocation.enable, but I don't think the exact config you 
mention is possible. You could write a cluster-watching script that does 
what you describe using these settings, though. One problem you will have 
is that when node2 comes back it needs to re-sync its replicas from the 
primaries, but you can stop rebalancing after one node goes down via 
cluster.routing.allocation.enable.
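
For example, a sketch of toggling allocation through the cluster settings API 
("none" stops allocation, "all" re-enables it):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.enable" : "none"
  }
}'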

On Wednesday, March 25, 2015 at 3:47:38 AM UTC-4, Cyril Cherian wrote:
>
> Imagine a case where I have
>
>- 3 (AWS) nodes
>- 1 index (let's call it friends) with 3 shards and 1 replica.
>
> Naming conventions:
> S1 (Index friends primary shard 1)
> S2 (Index friends primary shard 2)
> S3 (Index friends primary shard 3)
> R1 (Replica of Shard 1)
> R2 (Replica of Shard 2)
> R3 (Replica of Shard 3)
>
> Lets say that Node1 has (S1 R2) and is the master
> Node2 has (S2 R3)
> Node3 has (S3 R1)
>
> Now, if due to a connection failure Node 2 goes down,
> load balancing will happen and 
> Node 1 will promote the replica (R2) to primary, and a new replica for (R2) 
> will be created on Node3. 
> Finally, after load balancing, it will be like: 
> Node1 has (S1 S2, R3) 
> Node3 has (S3 R1, R2)
>
> During this rebalancing, heavy IO operations happen and the Elasticsearch 
> health will go red -> yellow -> green.
>
> My requirement is that if Node 2 is down, the nodes must not rebalance.
> I am OK if the query results come from only shards S1 and S3.
> And when Node 2 is back again, no rebalancing should happen.
>
> Is this possible? If yes, how? 
> Thanks in advance.
>



Re: ES&Lucene 32GB heap myth or fact?

2015-03-26 Thread joergpra...@gmail.com
I will not doubt your numbers.

The difference may depend on the application workload and how many heap
objects are created. ES is optimized to use very large heap objects to
decrease GC overhead. So I agree the difference for ES may be closer to
0.5 GB / 1 GB and not 8 GB.
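
Incidentally, a quick way to check whether compressed OOPs are active for a
given heap size (the exact cut-off varies by JVM version and settings):

java -Xmx30g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # typically true
java -Xmx34g -XX:+PrintFlagsFinal -version | grep UseCompressedOops   # typically false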

Jörg

On Thu, Mar 26, 2015 at 4:44 PM, Paweł Róg  wrote:

> Hi,
> Thanks for your response, Jörg. Maybe I was not precise enough in my last
> e-mail. What I wanted to point out is that IMHO with ES you can get something
> different than ~30G (OOPs) == ~40G (no OOPs). As I wrote, in my analysis of
> 16G of reachable objects (with Xmx 30G), my calculations put the overhead of
> disabled OOPs vs. enabled OOPs at only 0.5G, and for 100% heap usage (30G
> of Xmx 30G) it would be 1G. This means that a 30G heap will always be less
> than e.g. a 32G or 33G heap in the case of ES (at least for my query
> characteristics with lots of aggregations).
>
> So I ask again: what are your thoughts about this? Did I make any mistake
> during my estimations?
>
> --
> Paweł Róg
>
> On Thursday, March 26, 2015 at 4:21:10 PM UTC+1, Jörg Prante wrote:
>>
>> There is no "trouble" at all, only a surprise effect to those who do not
>> understand the effect of compressed OOPs.
>>
>> Compressed OOPs solve a memory space efficiency problem but work
>> silently. The challenge is, large object pointers waste some of the CPU
>> memory bandwith when JVM must access objects on a 64bit addressable heap.
>> There is a price to pay for encoding/decoding pointers, and that is
>> performance. Most people prefer memory efficiency over speed, so current
>> Oracle JVM is now enabling compressed OOPs by default. And this feature
>> works only on heaps less than ~30GB. If you configure a larger heap (for
>> whatever reason) you lose the compressed OOP feature silently. Then you get
>> better performance, but with less heap object capacity. At a heap size of
>> ~40G, you can again store as many heap objects as with ~30GB.
>>
>> Jörg
>>
>>
>> On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg  wrote:
>>
>>> Hi everyone,
>>> Every time we touch the size of the JVM heap for Elasticsearch we
>>> meet the indisputable statement "don't let the heap be bigger than 32GB -
>>> this is a magical line". Of course making the heap bigger than 32G means
>>> that we lose OOPs. There are tons of blog posts and articles which show
>>> how switching OOPs off influences application heap usage (eg.
>>> https://blog.codecentric.de/en/2014/02/35gb-heap-less-
>>> 32gb-java-jvm-memory-oddities/). Let's ask ourselves whether this
>>> is a very big problem for ES&Lucene too.
>>>
>>> I analyzed a few heap dumps from ES. The maximum size of the heap was
>>> set below the magical boundary (Xmx was 30GB). In all cases I can see a
>>> similar pattern, but let's discuss it based on a single example. One heap
>>> dump I took had around 16GB (slightly more) of reachable objects in it.
>>> There were about 70M objects. Of course I cannot just take 70M to see how
>>> much of the heap I can save by having OOPs enabled, so I also tried to
>>> analyze the number of references to objects (because some objects are
>>> referenced multiple times from multiple places). This gave me a number
>>> around 110M inbound references, so OOPs let us save about 0.5GB of memory
>>> (110M references x 4 bytes saved per pointer is roughly 0.44GB);
>>> extrapolating, this would mean around *1GB* when the whole heap is in use
>>> (as I wrote earlier, only 16GB of reachable objects were in the heap) -
>>> for the analyzed case. Moreover I can observe this:
>>>
>>> 2M objects of type long[] which take 6G of heap
>>> 280K objects of type double[] which take 4.5G of heap
>>> 10M objects of type byte[] which take 2.5G of heap
>>> 4.5M objects of type char[] which take 500M of heap
>>>
>>> When we sum all of the sizes we see 13.5GB of primitive arrays pointed
>>> to by less than 20M references. As we can see, ES&Lucene use a lot of
>>> arrays of primitives.
>>>
>>> Elasticsearch is very "memory-hungry", especially when using
>>> aggregations, multi-dimensional aggregations and parent-child queries. I
>>> think sometimes it is reasonable to have a bigger heap if we have enough
>>> free resources.
>>>
>>> Of course we have to remember that a bigger heap means more work for
>>> the GC (and the collectors currently used in the JVM, CMS and G1, are not
>>> very efficient for large heaps), but ... Is there really a magical line
>>> (32GB) after crossing which we get into "JVM troubles", or can we find a
>>> lot of cases where crossing the magical boundary makes sense?
>>>
>>> I'm curious what your thoughts are in this area.
>>>
>>> --
>>> Paweł Róg
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/CAHngsdhSiXbdzYxss25f-JMpe5E5J545zLrW8tnK1e74K%
>>> 3D4tqg%40mail.gmail.com
>>> 

Removing entry from the _suggest

2015-03-26 Thread Nikolay Chankov
Hi,

I am experiencing problems removing an entry from Elasticsearch.

Here is the scenario

I have venues. Each venue can have the state "open" or "closed" (in MySQL). 
When the state is "open" the entry should be in Elasticsearch, so I 
can search for it and display it on the site.
But when it's marked as "closed" the venue should be removed from the ES 
index, so no one can find it.

The functionality for adding and removing the entry from the index is 
working seamlessly, and if I search for the venue name it doesn't show up in 
the search results.

The problem is that I have an autocomplete where I am using _suggest, and 
when I start typing the venue name, e.g. "churchills", the entry is still in 
the autocomplete.

When I search the index by venue id the entry is missing, and it appears 
again when I "open" it, but _suggest seems not to be affected. I am 
quite sure that I am searching the correct index, so it's not the case 
that I am removing it from one index but looking into another.

Are there any specifics on how to remove an entry from _suggest?

My instance is ES 1.1.1
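
(A hedged note: in the 1.x completion suggester, deleted documents are known
to linger in the suggester's in-memory structures until the underlying
segments are merged away. One workaround worth verifying on 1.1.1 is to
expunge deletes explicitly - the index name below is a placeholder:

curl -XPOST 'localhost:9200/my_venues_index/_optimize?only_expunge_deletes=true'
)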

Thank you in advance.

Nik

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0d69185c-dfe1-482b-a727-93aab57c4f94%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES&Lucene 32GB heap myth or fact?

2015-03-26 Thread Paweł Róg
Hi,
Thanks for your response Jörg. Maybe I was not precise enough in my last 
e-mail. What I wanted to point out is that IMHO in ES I can get something 
different than ~30G (OOPs) == ~40G (no OOPs). As I wrote in my analysis, for 
16G of reachable objects (with Xmx 30G) my calculations put the overhead of 
disabled OOPs vs enabled OOPs at only 0.5G, and for 100% heap usage (30G 
from Xmx 30G) it would be 1G. This means that a 30G heap will always be less 
than e.g. a 32G or 33G heap in the case of ES (at least for my query 
characteristics with lots of aggregations).

So I ask again: what are your thoughts about this? Did I make any mistake 
in my estimations?

--
Paweł Róg

On Thursday, March 26, 2015 at 4:21:10 PM UTC+1, Jörg Prante wrote:
>
> There is no "trouble" at all, only a surprise effect to those who do not 
> understand the effect of compressed OOPs.
>
> Compressed OOPs solve a memory space efficiency problem but work silently. 
> The challenge is that large object pointers waste some of the CPU memory 
> bandwidth when the JVM must access objects on a 64-bit addressable heap. There is 
> a price to pay for encoding/decoding pointers, and that is performance. 
> Most people prefer memory efficiency over speed, so current Oracle JVM is 
> now enabling compressed OOPs by default. And this feature works only on 
> heaps less than ~30GB. If you configure a larger heap (for whatever reason) 
> you lose the compressed OOP feature silently. Then you get better performance, 
> but with less heap object capacity. At a heap size of ~40G, you can again 
> store as many heap objects as with ~30GB.
>
> Jörg
>
>
> On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg 
> > wrote:
>
>> Hi everyone,
>> Every time we touch the size of the JVM heap for Elasticsearch we 
>> meet the indisputable statement "don't let the heap be bigger than 32GB - 
>> this is a magical line". Of course making the heap bigger than 32G means that 
>> we lose OOPs. There are tons of blog posts and articles which show how 
>> switching OOPs off influences application heap usage (eg. 
>> https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/).
>> Let's ask ourselves whether this is a very big problem for ES&Lucene 
>> too.
>>
>> I analyzed a few heap dumps from ES. The maximum size of the heap was set 
>> below the magical boundary (Xmx was 30GB). In all cases I can see a similar 
>> pattern, but let's discuss it based on a single example. One heap dump I 
>> took had around 16GB (slightly more) of reachable objects in it. There were 
>> about 70M objects. Of course I cannot just take 70M to see how much of the 
>> heap I can save by having OOPs enabled, so I also tried to analyze the 
>> number of references to objects (because some objects are referenced 
>> multiple times from multiple places). This gave me a number around 110M 
>> inbound references, so OOPs let us save about 0.5GB of memory (110M 
>> references x 4 bytes saved per pointer is roughly 0.44GB); extrapolating, 
>> this would mean around *1GB* when the whole heap is in use (as I wrote 
>> earlier, only 16GB of reachable objects were in the heap) - for the 
>> analyzed case. Moreover I can observe this:
>>
>> 2M objects of type long[] which take 6G of heap
>> 280K objects of type double[] which take 4.5G of heap
>> 10M objects of type byte[] which take 2.5G of heap
>> 4.5M objects of type char[] which take 500M of heap
>>
>> When we sum all of the sizes we see 13.5GB of primitive arrays pointed to 
>> by less than 20M references. As we can see, ES&Lucene use a lot of arrays 
>> of primitives.
>>
>> Elasticsearch is very "memory-hungry", especially when using aggregations, 
>> multi-dimensional aggregations and parent-child queries. I think sometimes 
>> it is reasonable to have a bigger heap if we have enough free resources.
>>
>> Of course we have to remember that a bigger heap means more work for the GC 
>> (and the collectors currently used in the JVM, CMS and G1, are not very 
>> efficient for large heaps), but ... Is there really a magical line (32GB) 
>> after crossing which we get into "JVM troubles", or can we find a lot of 
>> cases where crossing the magical boundary makes sense?
>>
>> I'm curious what your thoughts are in this area.
>>
>> --
>> Paweł Róg
>>
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/CAHngsdhSiXbdzYxss25f-JMpe5E5J545zLrW8tnK1e74K%3D4tqg%40mail.gmail.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.

Re: ES&Lucene 32GB heap myth or fact?

2015-03-26 Thread joergpra...@gmail.com
There is no "trouble" at all, only a surprise effect for those who do not
understand the effect of compressed OOPs.

Compressed OOPs solve a memory space efficiency problem but work silently.
The challenge is that large object pointers waste some of the CPU memory
bandwidth when the JVM must access objects on a 64-bit addressable heap. There is
a price to pay for encoding/decoding pointers, and that is performance.
Most people prefer memory efficiency over speed, so the current Oracle JVM
enables compressed OOPs by default. And this feature works only on
heaps smaller than ~30GB. If you configure a larger heap (for whatever reason)
you lose the compressed OOP feature silently. Then you get better performance,
but with less heap object capacity. At a heap size of ~40G, you can again
store as many heap objects as with ~30GB.
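
A quick way to see which side of the boundary a given heap size falls on (a
hedged sketch; the exact cutoff varies with JVM version and options) is to ask
the JVM itself:

java -Xmx30g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
java -Xmx40g -XX:+PrintFlagsFinal -version | grep UseCompressedOops

On most 64-bit HotSpot builds the first command reports true and the second
false.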

Jörg


On Thu, Mar 26, 2015 at 2:28 PM, Paweł Róg  wrote:

> Hi everyone,
> Every time we touch the size of the JVM heap for Elasticsearch we
> meet the indisputable statement "don't let the heap be bigger than 32GB -
> this is a magical line". Of course making the heap bigger than 32G means that
> we lose OOPs. There are tons of blog posts and articles which show how
> switching OOPs off influences application heap usage (eg.
> https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/).
> Let's ask ourselves whether this is a very big problem for ES&Lucene
> too.
>
> I analyzed a few heap dumps from ES. The maximum size of the heap was set
> below the magical boundary (Xmx was 30GB). In all cases I can see a similar
> pattern, but let's discuss it based on a single example. One heap dump I
> took had around 16GB (slightly more) of reachable objects in it. There were
> about 70M objects. Of course I cannot just take 70M to see how much of the
> heap I can save by having OOPs enabled, so I also tried to analyze the
> number of references to objects (because some objects are referenced
> multiple times from multiple places). This gave me a number around 110M
> inbound references, so OOPs let us save about 0.5GB of memory (110M
> references x 4 bytes saved per pointer is roughly 0.44GB); extrapolating,
> this would mean around *1GB* when the whole heap is in use (as I wrote
> earlier, only 16GB of reachable objects were in the heap) - for the
> analyzed case. Moreover I can observe this:
>
> 2M objects of type long[] which take 6G of heap
> 280K objects of type double[] which take 4.5G of heap
> 10M objects of type byte[] which take 2.5G of heap
> 4.5M objects of type char[] which take 500M of heap
>
> When we sum all of the sizes we see 13.5GB of primitive arrays pointed to
> by less than 20M references. As we can see, ES&Lucene use a lot of arrays
> of primitives.
>
> Elasticsearch is very "memory-hungry", especially when using aggregations,
> multi-dimensional aggregations and parent-child queries. I think sometimes
> it is reasonable to have a bigger heap if we have enough free resources.
>
> Of course we have to remember that a bigger heap means more work for the GC
> (and the collectors currently used in the JVM, CMS and G1, are not very
> efficient for large heaps), but ... Is there really a magical line (32GB)
> after crossing which we get into "JVM troubles", or can we find a lot of
> cases where crossing the magical boundary makes sense?
>
> I'm curious what your thoughts are in this area.
>
> --
> Paweł Róg
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAHngsdhSiXbdzYxss25f-JMpe5E5J545zLrW8tnK1e74K%3D4tqg%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFdzGQt1oTmyAYTsu7%3DcDK%3DXvUoey71DqPhbdot1hg%2Bsw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Which kind of query style is recommended to use, JSON style or Query_string style? Does performance differ?

2015-03-26 Thread Lincoln Xiong
Query_string is more straightforward, because most of the time I use Kibana to 
test my query. But for the DSL, it's kind of hard to fully understand which 
query to use. And testing my query is also difficult because of too many 
brackets...
There is a high level Python API for DSL. I'm learning it.
Thanks for your advice.
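
(For illustration, here is one and the same hypothetical search written both
ways - the field names are made up:

curl -XGET 'localhost:9200/logstash-*/_search' -d '{
  "query" : { "query_string" : { "query" : "status:500 AND path:\"/api\"" } }
}'

curl -XGET 'localhost:9200/logstash-*/_search' -d '{
  "query" : {
    "bool" : {
      "must" : [
        { "term" : { "status" : 500 } },
        { "match_phrase" : { "path" : "/api" } }
      ]
    }
  }
}'

Both should match the same documents; the structured form bypasses the
query_string parser and its escaping pitfalls.)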

On Wednesday, March 25, 2015 at 6:14:26 PM UTC-4, Nikolas Everett wrote:
>
> query_string is a bit of a trap - if you write an invalid query it just 
> crashes.  So you find yourself working around it with tons of escaping.  
> It's also really, really powerful and shouldn't be exposed directly to end 
> users unless you want them to be sneaky.
>
> For the most part I'd suggest using the JSON DSL and/or some DSL wrapper 
> around it.
>
> On Wed, Mar 25, 2015 at 4:28 PM, Lincoln Xiong  > wrote:
>
>> I am new to using Elasticsearch + Logstash + Kibana for analyzing some 
>> logs. I am about to write some scripts to automate something in 
>> searching/aggregation. Right now I only have 10gb of data, so the performance 
>> doesn't vary that much when I do searching or visualization. I spent a lot of 
>> time learning ES's query DSL but still seem not to get the key idea, wondering 
>> whether query_string will just do the search as in Kibana, while I can also 
>> use the verbose query DSL in my script.
>>
>> My question is, with regard to performance or anything else, is there a 
>> difference between using query_string (search-box-like searching) and the DSL, 
>> verbose with curly brackets? Or are people encouraged to use the DSL because it 
>> performs better than query_string, or something?
>>
>> For now, I don't see any difference b/w query_string and the query DSL. Maybe 
>> someone can give me an example where the DSL can do something that query_string 
>> cannot. 
>>
>> Any comments may help me increase my understanding of ES. Thank you!
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/d475236c-a5ac-4abf-82e3-5990530ed391%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/af886b1a-cf9b-4023-bb7a-7816ed4b6979%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Control fuzziness in a bool query.

2015-03-26 Thread Kirill
Hi,
I'm combining match query(boolean) and match query(prefix) into bool query.
{
  "bool" : {
"must" : [ {
  "match" : {
"field" : {
  "query" : "query_1",
  "type" : "phrase_prefix"
}
  }
}, {
  "match" : {
"field" : {
  "query" : "query_2",
  "type" : "boolean"
}
  }
} ]
  }
}

I want to add fuzziness, but the problem is that I want the sum of the 
Levenshtein distances to be equal to 1 across both queries. In other words, to 
allow one mistake only in the first or only in the second query. Is it possible?
Thank you in advance. 
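
(A hedged sketch of one possible approach, untested: express "one edit in
either clause, but not both" as two alternative bool branches combined with
dis_max. Note that match fuzziness is only honoured for boolean-type match
queries, not phrase_prefix, so the first branch assumes the prefix clause can
tolerate being rewritten as a boolean match:

{
  "dis_max" : {
    "queries" : [ {
      "bool" : { "must" : [ {
        "match" : { "field" : { "query" : "query_1", "type" : "boolean", "fuzziness" : 1 } }
      }, {
        "match" : { "field" : { "query" : "query_2", "type" : "boolean" } }
      } ] }
    }, {
      "bool" : { "must" : [ {
        "match" : { "field" : { "query" : "query_1", "type" : "phrase_prefix" } }
      }, {
        "match" : { "field" : { "query" : "query_2", "type" : "boolean", "fuzziness" : 1 } }
      } ] }
    } ]
  }
}
)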

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ab39d942-0ca9-4236-8264-85b5e33b7a72%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Problem reading from Elasticsearch using Spark SQL

2015-03-26 Thread Dmitriy Fingerman
I found that to solve this problem I needed to use the BUILD-SNAPSHOT version 
of elasticsearch-hadoop.

After adding the entries below to pom.xml, it started to work.

   ...
  <repository>
    <id>sonatype-oss</id>
    <url>http://oss.sonatype.org/content/repositories/snapshots</url>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
   ...
  <dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>2.1.0.BUILD-SNAPSHOT</version>
  </dependency>

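(For completeness, a hedged sketch of how the snapshot jar might then reach
the cluster at submit time - file and class names here are placeholders:

spark-submit --class MyTest \
  --jars elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar \
  my-application.jar
)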

On Thursday, March 26, 2015 at 10:12:25 AM UTC-4, Dmitriy Fingerman wrote:
>
> Hi,
>
> I am trying to read from Elasticsearch using Spark SQL and getting the 
> exception below.
> My environment is CDH 5.3 with Spark 1.2.0 and Elasticsearch 1.4.4.
> Since Spark SQL is not officially supported on CDH 5.3, I added the Hive 
> Jars to Spark classpath in compute-classpath.sh.
> I also added elasticsearch-hadoop-2.1.0.Beta3.jar to the Spark classpath 
> in compute-classpath.sh.
> Also, I tried adding the Hive, elasticsearch-hadoop and elasticseach-spark 
> Jars to SPARK_CLASSPATH environment variable prior to running spark-submit, 
> but got the same exception.
>
> Exception in thread "main" java.lang.RuntimeException: Failed to load 
> class for data source: org.elasticsearch.spark.sql
> at scala.sys.package$.error(package.scala:27)
> at org.apache.spark.sql.sources.CreateTableUsing.run(ddl.scala:99)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:67)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:67)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:75)
> at 
> org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
> at 
> org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
> at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
> at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
> at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:303)
> at com.informatica.sats.datamgtsrv.Percolator$.main(Percolator.scala:29)
> at com.informatica.sats.datamgtsrv.Percolator.main(Percolator.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> The code which I am trying to run:
>
> import org.apache.spark._
> import org.apache.spark.sql._
> import org.apache.spark.SparkContext._
> import org.elasticsearch.spark._
> import org.elasticsearch.spark.sql._
>
> object MyTest
> {
>   def main(args: Array[String]) 
>   {
> val sparkConf = new SparkConf().setAppName("MyTest")
> val sc =  new SparkContext(sparkConf)
> val sqlContext = new SQLContext(sc)
> 
> sqlContext.sql("CREATE TEMPORARY TABLE INTERVALS" +
>"USING org.elasticsearch.spark.sql " +
>"OPTIONS (resource 'events/intervals') " )
> 
> val allRDD = sqlContext.sql("SELECT * FROM INTERVALS")
>
> allRDD.foreach(rdd => {rdd.foreach(elem => print(elem + "\n\n"));})
>   }
> }
>
> I checked in the Spark source code (resource 
> org\apache\spark\sql\sources\ddl.scala) and saw that the run method in 
> the CreateTableUsing class expects a "DefaultSource.class" file for the data 
> source that needs to be loaded.
> However, there is no such class in org.elasticsearch.spark.sql package in 
> the official Elasticsearch builds.
> I checked in following jars:
>
> elasticsearch-spark_2.10-2.1.0.Beta3.jar
> elasticsearch-spark_2.10-2.1.0.Beta2.jar
> elasticsearch-spark_2.10-2.1.0.Beta1.jar
> elasticsearch-hadoop-2.1.0.Beta3.jar
>
> Can you please advise why this problem happens and how to resolve it?
>
> Thanks,
> Dmitriy Fingerman
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c6dc8980-a8b6-495d-88ad-477662b7b66e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: exists filter broken on 1.5.0 with restored index?

2015-03-26 Thread Igor Motov
Thanks for checking. It's a bug, which should be fixed in 
1.5.1 https://github.com/elastic/elasticsearch/pull/10268

On Wednesday, 25 March 2015 13:43:28 UTC-4, Mads Martin Jørgensen wrote:
>
> They're similar. The 1.5.0 cluster has "created" : "1000199", and the 
> 1.4.1 cluster also has "created" : "1000199"
>
> On Wednesday, March 25, 2015 at 4:45:30 PM UTC+1, Igor Motov wrote:
>>
>> Hi Mads Martin,
>>
>> Could you check the version that is returned when you run curl 
>> "localhost:9200/my_own_index/_settings?pretty". The version will be in 
>>
>> "version" : {
>>   "created" : "XXX"
>> }
>>
>> Could you compare it to the version that is returned by the same index in 
>> the pre-1.5.0 cluster?
>>
>> Igor
>>
>> On Wednesday, 25 March 2015 09:27:03 UTC-4, Mads Martin Jørgensen wrote:
>>>
>>> Hello all,
>>>
>>> Just installed es-1.5.0 with cloud-aws-2.5.0 on a machine. Did a restore 
>>> of a snapshot made with es-1.4.1. All documents are there, but the exists 
>>> filter seems broken. The query that used to return all matching documents 
>>> now returns 0 documents, even though the field exists when reading the 
>>> documents.
>>>
>>> curl -XGET "http://localhost:9200/my_own_index/document/_search 
>>> "
>>>  
>>> -d'
>>>
>>> {
>>>
>>>"query": {
>>>
>>>   "constant_score": {
>>>
>>>  "filter": {
>>>
>>> "exists": {
>>>
>>>"field": "history"
>>>
>>> }
>>>
>>>  }
>>>
>>>   }
>>>
>>>}
>>>
>>> }'
>>>
>>>
>>> If we populate new documents, then the exists filter works just fine.
>>>
>>>
>>> Regards,
>>>
>>> Mads Martin
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/531c6e3c-5697-4fc0-b037-d8fbc438c5dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Problem reading from Elasticsearch using Spark SQL

2015-03-26 Thread Dmitriy Fingerman
Hi,

I am trying to read from Elasticsearch using Spark SQL and getting the 
exception below.
My environment is CDH 5.3 with Spark 1.2.0 and Elasticsearch 1.4.4.
Since Spark SQL is not officially supported on CDH 5.3, I added the Hive 
Jars to Spark classpath in compute-classpath.sh.
I also added elasticsearch-hadoop-2.1.0.Beta3.jar to the Spark classpath in 
compute-classpath.sh.
Also, I tried adding the Hive, elasticsearch-hadoop and elasticseach-spark 
Jars to SPARK_CLASSPATH environment variable prior to running spark-submit, 
but got the same exception.

Exception in thread "main" java.lang.RuntimeException: Failed to load class 
for data source: org.elasticsearch.spark.sql
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.CreateTableUsing.run(ddl.scala:99)
at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:67)
at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:67)
at org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:75)
at 
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at 
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:303)
at com.informatica.sats.datamgtsrv.Percolator$.main(Percolator.scala:29)
at com.informatica.sats.datamgtsrv.Percolator.main(Percolator.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The code which I am trying to run:

import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._
import org.elasticsearch.spark.sql._

object MyTest
{
  def main(args: Array[String]) 
  {
val sparkConf = new SparkConf().setAppName("MyTest")
val sc =  new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)

sqlContext.sql("CREATE TEMPORARY TABLE INTERVALS" +
   "USING org.elasticsearch.spark.sql " +
   "OPTIONS (resource 'events/intervals') " )

val allRDD = sqlContext.sql("SELECT * FROM INTERVALS")

allRDD.foreach(rdd => {rdd.foreach(elem => print(elem + "\n\n"));})
  }
}

I checked in the Spark source code (resource 
org\apache\spark\sql\sources\ddl.scala) and saw that the run method in the 
CreateTableUsing class expects a "DefaultSource.class" file for the data 
source that needs to be loaded.
However, there is no such class in org.elasticsearch.spark.sql package in 
the official Elasticsearch builds.
I checked in following jars:

elasticsearch-spark_2.10-2.1.0.Beta3.jar
elasticsearch-spark_2.10-2.1.0.Beta2.jar
elasticsearch-spark_2.10-2.1.0.Beta1.jar
elasticsearch-hadoop-2.1.0.Beta3.jar

Can you please advise why this problem happens and how to resolve it?

Thanks,
Dmitriy Fingerman

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9162e283-e458-4d42-ab53-5ca50fa08172%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: fielddata_breaker - too high tripped value

2015-03-26 Thread Vladi Feigin
Can someone shed some light on this? 
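
(One way to compare breaker state across nodes - a sketch, assuming ES >= 1.4
where the per-breaker stats endpoint exists:

curl 'localhost:9200/_nodes/stats/breaker?pretty'

The per-node limits and tripped counters should show whether the odd node is
carrying a disproportionate fielddata load.)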

On Wednesday, March 25, 2015 at 8:29:52 PM UTC+2, Vladi Feigin wrote:
>
> Hi ,
>
> We're observing on one of our servers a fielddata_breaker.tripped 
> value much higher than on the others. 
> On this specific server we have tripped = ~9K, but on all the other servers we 
> have tripped = 0.
> What's going wrong with this server? Is it a kind of hot spot in terms of 
> data distribution between servers? 
> We think it affects query performance.
> How do we fix this ?
> Thank you,
> Vladi 
>
> This message may contain confidential and/or privileged information. 
> If you are not the addressee or authorized to receive this on behalf of 
> the addressee you must not use, copy, disclose or take action based on this 
> message or any information herein. 
> If you have received this message in error, please advise the sender 
> immediately by reply email and delete this message. Thank you.
>

-- 
This message may contain confidential and/or privileged information. 
If you are not the addressee or authorized to receive this on behalf of the 
addressee you must not use, copy, disclose or take action based on this 
message or any information herein. 
If you have received this message in error, please advise the sender 
immediately by reply email and delete this message. Thank you.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b1a6222a-0a06-4de0-9e73-2eb07b942d73%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Problem reading from Elasticsearch using Spark SQL

2015-03-26 Thread Dmitriy Fingerman
Hi,

I am trying to read from Elasticsearch using Spark SQL and getting the 
exception below.
My environment is CDH 5.3 with Spark 1.2.0 and Elasticsearch 1.4.4.
Since Spark SQL is not officially supported on CDH 5.3, I added the Hive 
Jars to Spark classpath in compute-classpath.sh.
I also added elasticsearch-hadoop-2.1.0.Beta3.jar to the Spark classpath in 
compute-classpath.sh.
Also, I tried adding the Hive, elasticsearch-hadoop and elasticseach-spark 
Jars to SPARK_CLASSPATH environment variable prior to running spark-submit, 
but got the same exception.

Exception in thread "main" java.lang.RuntimeException: Failed to load class 
for data source: org.elasticsearch.spark.sql
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.CreateTableUsing.run(ddl.scala:99)
at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:67)
at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:67)
at org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:75)
at 
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at 
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:303)
at com.informatica.sats.datamgtsrv.Percolator$.main(Percolator.scala:29)
at com.informatica.sats.datamgtsrv.Percolator.main(Percolator.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The code which I am trying to run:

import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.SparkContext._
import org.elasticsearch.spark._
import org.elasticsearch.spark.sql._

object MyTest
{
  def main(args: Array[String]) 
  {
val sparkConf = new SparkConf().setAppName("MyTest")
val sc =  new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)

sqlContext.sql("CREATE TEMPORARY TABLE INTERVALS" +
   "USING org.elasticsearch.spark.sql " +
   "OPTIONS (resource 'events/intervals') " )

val allRDD = sqlContext.sql("SELECT * FROM INTERVALS")

allRDD.foreach(rdd => {rdd.foreach(elem => print(elem + "\n\n"));})
  }
}

I checked in the Spark source code (resource 
org\apache\spark\sql\sources\ddl.scala) and saw that the run method 
in the CreateTableUsing class expects a "DefaultSource.class" file for the data 
source that needs to be loaded.
However, there is no such class in org.elasticsearch.spark.sql package in 
the official Elasticsearch builds.
I checked in following jars:

elasticsearch-spark_2.10-2.1.0.Beta3.jar
elasticsearch-spark_2.10-2.1.0.Beta2.jar
elasticsearch-spark_2.10-2.1.0.Beta1.jar
elasticsearch-hadoop-2.1.0.Beta3.jar

Can you please advise why this problem happens and how to resolve it?

Thanks,
Dmitriy Fingerman

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/12e69eee-6fb1-401e-95f6-5c69341c1796%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch shield and Kibana 3.x

2015-03-26 Thread Marcello A
Hi All,
we have installed Shield on Elasticsearch 1.5 in a test environment. We 
have enabled SSL on the http module along with authentication, but we noticed 
an issue with Kibana 3.1.2 - it's not working.

Did someone configure this plugin and Kibana 3 to work together?

Thanks,
Marcello
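
(A hedged pointer: when Kibana 3 queries a Shield-protected cluster directly
from the browser, CORS usually has to be enabled with credentials allowed on
the Elasticsearch side. A sketch of elasticsearch.yml settings often cited for
this setup - verify against the Shield docs for your versions; the origin URL
is a placeholder:

http.cors.enabled: true
http.cors.allow-origin: "https://kibana.example.org"
http.cors.allow-credentials: true
)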

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6fe05fda-e941-46c5-9a21-c76e6b188116%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


ES&Lucene 32GB heap myth or fact?

2015-03-26 Thread Paweł Róg
Hi everyone,
Every time we touch the size of the JVM heap for Elasticsearch we
meet the indisputable statement "don't let the heap be bigger than 32GB -
this is a magical line". Of course making the heap bigger than 32G means that
we lose OOPs. There are tons of blog posts and articles which show how
switching OOPs off influences application heap usage (eg.
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/).
Let's ask ourselves whether this is a very big problem for ES&Lucene
too.

I analyzed a few heap dumps from ES. The maximum size of the heap was set
below the magical boundary (Xmx was 30GB). In all cases I can see a similar
pattern, but let's discuss it based on a single example. One heap dump I
took had around 16GB (slightly more) of reachable objects in it. There were
about 70M objects. Of course I cannot just take 70M to see how much of the
heap I can save by having OOPs enabled, so I also tried to analyze the
number of references to objects (because some objects are referenced
multiple times from multiple places). This gave me a number around 110M
inbound references, so OOPs let us save about 0.5GB of memory (110M
references x 4 bytes saved per pointer is roughly 0.44GB); extrapolating,
this would mean around *1GB* when the whole heap is in use (as I wrote
earlier, only 16GB of reachable objects were in the heap) - for the
analyzed case. Moreover I can observe this:

2M objects of type long[] which take 6G of heap
280K objects of type double[] which take 4.5G of heap
10M objects of type byte[] which take 2.5G of heap
4.5M objects of type char[] which take 500M of heap

When we sum all of the sizes we see 13.5GB of primitive arrays pointed to
by less than 20M references. As we can see, ES&Lucene use a lot of arrays of
primitives.

Elasticsearch is very "memory-hungry", especially when using aggregations,
multi-dimensional aggregations and parent-child queries. I think sometimes
it is reasonable to have a bigger heap if we have enough free resources.

Of course we have to remember that a bigger heap means more work for the GC
(and the collectors currently used in the JVM, CMS and G1, are not very
efficient for large heaps), but ... Is there really a magical line (32GB)
after crossing which we get into "JVM troubles", or can we find a lot of
cases where crossing the magical boundary makes sense?

I'm curious what your thoughts are in this area.

--
Paweł Róg

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHngsdhSiXbdzYxss25f-JMpe5E5J545zLrW8tnK1e74K%3D4tqg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Question about _score metrics

2015-03-26 Thread João Lima
Hi,

I have a query like this:

$params['body'] = array(
'query' => array(
'filtered' => array(
'query' => array(
'function_score' => array(
'script_score' => array(
'script' => '_score + (0.001 * 
doc["compraram"].value) + (0.001 * doc["clicks"].value)',
'lang' => 'groovy'
),
'boost_mode' => 'replace',
'query' => array(
'multi_match' => array(
'query' => 
substr($where['busca'],0,432),
'type' => 'best_fields',
'fields' => 
array("nome^1.3","sku^1.1","nome.ng^1.1","nome.ed^1.05","EcommerceBusca.keywords^1.005","keywords^1.0025"),
"minimum_should_match" => "99%",
'operator' => 'or',
)
)
)
),
'filter' => array(
'bool' => array(
'must' => array(
array('term' => array('ativo' => 1)),
array('term' => array('oculto' => 0)),
array('term' => array('id_instancia' => 
$this->instancia)),
),
),
),
)
),
);

But I only want to multiply the score of the field "nome" by the fields 
doc["compraram"].value and doc["clicks"].value, not the entire score.
How can I do it?

Thanks in advance.
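
(A hedged sketch of one way to do this, untested: split the multi_match so
that only the "nome" clause is wrapped in function_score, and let the
remaining fields contribute their plain score through a surrounding
bool/should - the query text is a placeholder:

{
  "query" : {
    "bool" : {
      "should" : [ {
        "function_score" : {
          "query" : { "match" : { "nome" : { "query" : "search terms", "boost" : 1.3 } } },
          "script_score" : {
            "script" : "_score + (0.001 * doc['compraram'].value) + (0.001 * doc['clicks'].value)",
            "lang" : "groovy"
          },
          "boost_mode" : "replace"
        }
      }, {
        "multi_match" : {
          "query" : "search terms",
          "type" : "best_fields",
          "fields" : [ "sku^1.1", "nome.ng^1.1", "nome.ed^1.05", "EcommerceBusca.keywords^1.005", "keywords^1.0025" ]
        }
      } ]
    }
  }
}
)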

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2ba35496-0c62-4cf5-892b-03f604a6e484%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


limit maven to put all index in a plugin

2015-03-26 Thread Ali Lotfdar
Hello All,

I use Maven to create a REST plugin, but as far as I can tell, all the indices 
inside the cluster are being packaged.
How can I limit the packaging to the index(es) relevant to my plugin?

Thank you.

Regards,
Ali

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/21972748-230e-43d3-bcdb-8a97effc0dbd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: How to setup a connection between VMs and ELK?

2015-03-26 Thread kelnrluierhfeulne
Would you happen to know what the process is, or know of any links that go 
over the concept? For example, is there a certain 
file to edit once you download logstash-forwarder?
Thanks again
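
(For what it's worth, logstash-forwarder is driven by a single JSON config
file - a minimal sketch, with server name, certificate path and log paths as
placeholders:

{
  "network": {
    "servers": [ "logstash.example.org:5043" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt"
  },
  "files": [
    { "paths": [ "/var/log/syslog", "/var/log/*.log" ] }
  ]
}

It is then started with something like logstash-forwarder -config
./forwarder.conf, paired with a lumberjack input on the Logstash side.)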

On Wednesday, March 25, 2015 at 5:30:53 PM UTC-4, Mark Walkom wrote:
>
> There is no single command, it's a concept.
>
> You can use rsyslog, logstash, logstash-forwarder, logstash-courier or 
> many other pieces of software to do this.
>
> On 26 March 2015 at 07:47, kelnrluierhfeulne  > wrote:
>
>> Hey thanks for the reply! I tried looking up how to do that but am still 
>> lost... Would you happen to know what commands you would use to ship the 
>> VM's logs to elasticsearch?
>>
>>
>>
>> On Wednesday, March 25, 2015 at 4:22:11 PM UTC-4, Mark Walkom wrote:
>>>
>>> You need to ship the logs from the VMs to ES.
>>>
>>> Take a look at Logstash and feel free to ask questions on 
>>> https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users
>>>
>>> On 26 March 2015 at 04:32, kelnrluierhfeulne  wrote:
>>>
 This is a beginner question but how would you get virtual machines to 
 connect to ELK so you can see the logs of those VMs on Kibana? Is there a 
 place to input the IP of the VMs so it is displayed in Kibana?

 -- 
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/54e04ffa-9bec-4c99-954c-f0a866454faa%
 40googlegroups.com 
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>
>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/15f4ff6d-6eb6-46de-ad0b-c8046bb8c822%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/daf2a092-2ca0-4cdf-be1c-3c12984436d7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Need Help with Japanese analyzer - (Kuromoji)

2015-03-26 Thread Mangesh Ralegankar
Hi All, 
I need help with how to specify an analyzer in the search query (curl).

{
  "from" : 0,
  "size" : 1000,
  "query" : {
"match" : {
  "text" : { 
"query" : "somejapanesetext"
"type" : "phrase", 
"analyzer" : "kuromoji" 
 }
}
  },
  "explain" : true
}

How do I understand which analyzer is applied for this search query? I 
have applied settings while creating the index, as below, e.g.: 

{ 
  "index":{
"analysis":{
  "tokenizer" : {
"kuromoji" : {
  "type":"kuromoji_tokenizer",
  "mode":"search" 
}
  },
  "analyzer" : {
"kuromoji_analyzer" : {
  "type" : "custom",
  "tokenizer" : "kuromoji_tokenizer"
}
  }
}
  }
}'
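
(A hedged observation on the settings above: the custom tokenizer is
registered under the name "kuromoji", yet the analyzer references
"kuromoji_tokenizer", which is its type rather than its registered name; also
note the plugin itself ships a built-in analyzer named "kuromoji", so the
search request may be resolving to that one. A sketch of a consistent pairing
might be:

"analyzer" : {
  "kuromoji_analyzer" : {
    "type" : "custom",
    "tokenizer" : "kuromoji"
  }
}

with the search request then asking for "analyzer" : "kuromoji_analyzer".)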

Can someone please point me to the right link or let me know what I am missing. 

Thanks 
Mangesh R

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/61a1a948-897c-4beb-ba70-ec67aa6396f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Unable to preserve special characters in search results of ElasticSearch.

2015-03-26 Thread Muddadi Hemaanusha
Hi, thanks. 

I resolved this issue. 

In order to preserve the special characters and to search the query term in 
multiple fields for an exact match, it is better to change the settings as 
shown below. 

Settings I updated: 

PUT /my_index/_settings?pretty=true 
{ 
"settings" : { 
"analysis": { 
  "analyzer": { 
"wordAnalyzer": { 
  "type": "custom", 
  "tokenizer": "whitespace", 
  "filter": [ 
"word_delimiter_for_phone","nGram_filter" 
  ] 
} 
  }, 
  "filter": { 
"word_delimiter_for_phone": { 
  "type": "word_delimiter", 
  "catenate_all": true, 
  "generate_number_parts ": false, 
  "split_on_case_change": false, 
  "generate_word_parts": false, 
  "split_on_numerics": false, 
  "preserve_original": true 
}, 
  
"nGram_filter": { 
   "type": "nGram", 
   "min_gram": 1, 
   "max_gram": 20, 
   "token_chars": [ 
  "letter" 
   ] 
} 
  } 
} 
} 
} 


Mapping settings: 
{ 
"mappings": { 

   "face" : { 
 "properties" : { 
"{field-1}id" : { 
   "type" : "string", 
   "index_name" : "change", 
   "analyzer": "wordAnalyzer" 

}, 
"{field-2}name" : { 
"type" : "string", 
   "index_name" : "change", 
   "analyzer": "wordAnalyzer" 
}, 
"{field-3}Year": 
{ 
"type" : "string", 
   "index_name" : "change", 
   "analyzer": "wordAnalyzer" 
}, 
"{field-4}Make": 
{ 
"type" : "string", 
   "index_name" : "change", 
   "analyzer": "wordAnalyzer" 
} 
      }
    }
  }
}



and the query we can use: 

GET my_index/face/_search 
{ 
   "query": { 
   "match": { 
  "change": 
  { 
  "query": "A/T o", 
  "type": "phrase_prefix" 
  } 
   } 
   } 
} 


By this we can search for that term in all the fields. In order to search 
in only a single field we can give that field name in place of "change" 
in the match query. 

As for changing the mappings: I was able to update the analyzers, but not 
the index_name; to add the index_name I deleted the index and applied the 
mapping again as above. 
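
(To verify how a term is tokenized with these settings - a quick check via the
_analyze API, assuming the analyzer name above:

curl 'localhost:9200/my_index/_analyze?analyzer=wordAnalyzer' -d 'A/T o'
)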


On Tuesday, March 24, 2015 at 3:50:52 PM UTC+5:30, Muddadi Hemaanusha wrote:
>
> If I escape that character '/' I will get the data irrelevant to the 
> search term 
>
> eg: if a/b, a/c, a/d, a(b), a(c), a-b, ab  is my data
>
> My requirement is if I enter the term a( then I have to get only a( as a 
> result but not ab,a-b,...
>
> I would like to get the results without escaping; the results need to 
> preserve the special character '/'. Any query relevant to that 
> would be helpful. The search term needs to be searched in different fields, but 
> it does not need to match in all fields - a match in any one field is enough.
>
> On Monday, March 23, 2015 at 7:50:08 PM UTC+5:30, Periyandavar wrote:
>>
>> Hi 
>>  I think You need to escape those spl char in search string, like 
>> { 
>>   "query": { 
>> "bool": { 
>>   "must": [ 
>> { 
>>   "query_string": { 
>> "fields": [ 
>>   "msg" 
>> ], 
>> "query": "a\\/" 
>>   } 
>> } 
>>   ] 
>> } 
>>   } 
>> } 
>>
>>
>>
>> -- 
>> View this message in context: 
>> http://elasticsearch-users.115913.n3.nabble.com/Unable-to-preserve-special-characters-in-search-results-of-ElasticSearch-tp4072409p4072418.html
>>  
>> Sent from the ElasticSearch Users mailing list archive at Nabble.com. 
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/db074925-217f-48a9-8d19-6960282e4a7e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Indexing large pdf document

2015-03-26 Thread Jakko Sikkar
Thank you very much for pointing that out. I read the documentation but skipped 
that part somehow :)


neljapäev, 26. märts 2015 12:51.50 UTC+2 kirjutas David Pilato:
>
> There is a limit on the number of extracted characters.
>
> See 
> https://github.com/elastic/elasticsearch-mapper-attachments#indexed-characters
>
>
> -- 
> *David Pilato* - Developer | Evangelist 
> *elastic.co *
> @dadoonet  | @elasticsearchfr 
>  | @scrutmydocs 
> 
>
>
>
>
>  
> Le 26 mars 2015 à 10:51, Jakko Sikkar > 
> a écrit :
>
> Hi,
>
> I'm trying to index a big document with ES and the Mapper Attachment plugin (
> https://github.com/elastic/elasticsearch-mapper-attachments). The document 
> has 719 pages, but after indexing I can search phrases only up to page 33. 
> When I index a document I base64-encode the file contents and the file gets 
> successfully added to the index. Are there any limits on the size of the 
> file?
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/04dd35e4-1caf-4a30-8f24-13cf47907067%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ff655c88-1e8a-4703-935a-f0136deee442%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: elastic.co blog RSS URL missing

2015-03-26 Thread Ivan Brusic
I noticed the same thing. The link is redirecting for me, but my reader
(AOL Reader) appears not to handle redirects.

Ivan
On Mar 25, 2015 9:11 AM, "Magnus Bäck"  wrote:

> The not too widely announced move from elasticsearch.(com|org) to
> elastic.co the other week seems to have broken the old Elasticsearch
> blog RSS feed, and I can’t find the RSS URL for the replacement
> elastic.co blog. Please say there is one.
>
> A final post to the old blog referring to the new one would've been
> nice. I'm probably not the only one who's missed the updates from
> the last week or two.
>
> --
> Magnus Bäck| Software Engineer, Development Tools
> magnus.b...@sonymobile.com | Sony Mobile Communications
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/20150325071127.GA22589%40seldlx20533.corpusers.net
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQAyYcjKa8XfFBQCz7w34_GrZBnRDtA7uZKJARAZDXMsdA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Indexing large pdf document

2015-03-26 Thread David Pilato
There is a limit on the number of extracted characters.

See 
https://github.com/elastic/elasticsearch-mapper-attachments#indexed-characters 
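
(Per that README the default is 100000 extracted characters; setting
index.mapping.attachment.indexed_chars to -1 removes the limit. A sketch,
assuming the setting is applied at index creation and an index named docs:

curl -XPUT 'localhost:9200/docs' -d '{
  "settings" : {
    "index.mapping.attachment.indexed_chars" : -1
  }
}'
)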



-- 
David Pilato - Developer | Evangelist 
elastic.co
@dadoonet  | @elasticsearchfr 
 | @scrutmydocs 






> Le 26 mars 2015 à 10:51, Jakko Sikkar  a écrit :
> 
> Hi,
> 
> I'm trying to index a big document with ES and the Mapper Attachment plugin 
> (https://github.com/elastic/elasticsearch-mapper-attachments). The document has 
> 719 pages, but after indexing I can search phrases only up to page 33. When I 
> index a document I base64-encode the file contents and the file gets 
> successfully added to the index. Are there any limits on the size of the file?
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/04dd35e4-1caf-4a30-8f24-13cf47907067%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/68560613-23C5-4398-A7F0-FEFBACF83DEA%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: encountered monitor.jvm warning

2015-03-26 Thread Abid Hussain
Maybe aggregations are a cause of the memory problems. Following the docs, 
we set the fielddata cache property to
indices.fielddata.cache.size: 40%
... hoping this will help avoid this kind of issue.
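
(To see which fields are actually holding fielddata - a quick check, assuming
a version where per-field fielddata stats are exposed:

curl 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'
)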

Regards,

Abid

Am Mittwoch, 25. März 2015 00:47:09 UTC+1 schrieb Jason Wee:
>
> Few filters should be fine but aggregations... hm 
>
> You could dump a stack trace and/or a heap dump if it happens again and 
> start to analyze from there. Try as well, 
>
> http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
>  
>
> hth 
>
> jason 
>
> On Tue, Mar 24, 2015 at 11:59 PM, Abid Hussain 
> > wrote: 
> > All settings except from ES_HEAP (set to 10 GB) are defaults, so I 
> actually 
> > am not sure about the new gen setting. 
> > 
> > The host has 80 GB memory in total and 24 CPU cores. All ES indices 
> together 
> > sum up to ~32 GB where the biggest indices are of size ~8 GB. 
> > 
> > We are using queries mostly together with filters and also aggregations. 
> > 
> > We "solved" the problem with a restart of the cluster. Are there any 
> > recommended diagnostics to be performed when this problem occurs next 
> time? 
> > 
> > Regards, 
> > 
> > Abid 
> > 
> > Am Dienstag, 24. März 2015 15:24:43 UTC+1 schrieb Jason Wee: 
> >> 
> >> what is the new gen setting? how much is the system memory in total? 
> >> how many cpu cores in the node? what query do you use? 
> >> 
> >> jason 
> >> 
> >> On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain 
> >>  wrote: 
> >> > Hi all, 
> >> > 
> >> > we today got (for the first time) warning messages which seem to 
> >> > indicate a 
> >> > memory problem: 
> >> > [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
> >> > [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], 
> total 
> >> > [5m]/[16.7m], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] 
> >> > [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] 
> [149.7mb]->[0b]/[149.7mb]}{[old] 
> >> > [6.9gb]->[3.7gb]/[8.5gb]} 
> >> > [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
> >> > [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], 
> total 
> >> > [18.4s]/[58s], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] 
> >> > [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] 
> [149.7mb]->[0b]/[149.7mb]}{[old] 
> >> > [6.9gb]->[3.7gb]/[8.5gb]} 
> >> > [2015-03-24 09:08:15,372][WARN ][monitor.jvm  ] [Danger] 
> >> > [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], 
> >> > total 
> >> > [1.4s]/[16.7m], memory [3.7gb]->[5gb]/[9.8gb], all_pools {[young] 
> >> > [6.1mb]->[2.7mb]/[1.1gb]}{[survivor] [0b]->[149.7mb]/[149.7mb]}{[old] 
> >> > [3.7gb]->[4.9gb]/[8.5gb]} 
> >> > [2015-03-24 09:08:18,192][WARN ][monitor.jvm  ] [Danger] 
> >> > [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], 
> >> > total 
> >> > [1.4s]/[16.7m], memory [5.8gb]->[6.2gb]/[9.8gb], all_pools {[young] 
> >> > [845.4mb]->[1.2mb]/[1.1gb]}{[survivor] 
> >> > [149.7mb]->[149.7mb]/[149.7mb]}{[old] 
> >> > [4.9gb]->[6gb]/[8.5gb]} 
> >> > [2015-03-24 09:08:21,506][WARN ][monitor.jvm  ] [Danger] 
> >> > [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], 
> >> > total 
> >> > [1.2s]/[16.7m], memory [7gb]->[7.3gb]/[9.8gb], all_pools {[young] 
> >> > [848.6mb]->[2.1mb]/[1.1gb]}{[survivor] 
> >> > [149.7mb]->[149.7mb]/[149.7mb]}{[old] 
> >> > [6gb]->[7.2gb]/[8.5gb]} 
> >> > 
> >> > We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g, 
> >> > other 
> >> > settings are defaults. From previous posts related to this issue, it 
> is 
> >> > said 
> >> > that field data cache may be a problem. 
> >> > 
> >> > Requesting /_nodes/stats/indices/fielddata says: 
> >> > { 
> >> >"cluster_name": "my_cluster", 
> >> >"nodes": { 
> >> >   "ILUggMfTSvix8Kc0nfNVAw": { 
> >> >  "timestamp": 1427188716203, 
> >> >  "name": "Danger", 
> >> >  "transport_address": "inet[/192.168.110.91:9300]", 
> >> >  "host": "xxx", 
> >> >  "ip": [ 
> >> > "inet[/192.168.110.91:9300]", 
> >> > "NONE" 
> >> >  ], 
> >> >  "indices": { 
> >> > "fielddata": { 
> >> >"memory_size_in_bytes": 6484, 
> >> >"evictions": 0 
> >> > } 
> >> >  } 
> >> >   } 
> >> >} 
> >> > } 
> >> > 
> >> > Running top results in: 
> >> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND 
> >> > 12735 root  20   0 15.8g  10g    0 S   74 13.2  2485:26 java 
> >> > 
> >> > Any ideas what to do? If possible I would rather avoid increasing 
> >> > ES_HEAP as 
> >> > there isn't that much free memory left on the host. 
> >> > 
> >> > Regards, 
> >> > 
> >> > Abid 
> >> > 
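
For reference, a minimal set of diagnostics worth capturing the next time this
happens (a sketch assuming the default HTTP port 9200; the 40% breaker limit
below is only an illustrative value, not a recommendation):

# Snapshot the busiest threads while the node is struggling
curl -s 'localhost:9200/_nodes/hot_threads?threads=5'

# Break down field data and segment memory per node
curl -s 'localhost:9200/_nodes/stats/indices/fielddata,segments?pretty'

# Optionally cap the field data circuit breaker (a dynamic setting in ES 1.4)
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": { "indices.breaker.fielddata.limit": "40%" }
}'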

Indexing large pdf document

2015-03-26 Thread Jakko Sikkar
Hi,

I'm trying to index a big document with ES and the Mapper Attachments plugin 
(https://github.com/elastic/elasticsearch-mapper-attachments). The document has 
719 pages, but after indexing I can only search phrases up to page 33. When 
I index a document I base64-encode the file contents, and the file gets 
successfully added to the index. Is there some limit on the size of the 
file?
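
If memory serves, the plugin extracts only the first 100,000 characters of a
file by default, controlled by the "index.mapping.attachment.indexed_chars"
setting, which would explain searches working only up to page 33. A sketch of
raising the limit at index-creation time; the index, type, and field names
here are illustrative:

curl -XPUT 'localhost:9200/docs' -d '{
  "settings": {
    "index.mapping.attachment.indexed_chars": -1
  },
  "mappings": {
    "doc": {
      "properties": {
        "file": { "type": "attachment" }
      }
    }
  }
}'

Setting -1 should remove the limit entirely; note that extracting very large
files takes correspondingly more memory at index time.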



Re: Cluster state storage question

2015-03-26 Thread Robert Gardam
All ok! 
Thanks for the explanation! I just found it odd that the same information 
was displayed twice, but if it's only sent one way, that is good. Makes 
sense!

Elasticsearch has already become far more stable in the past 12 
months! 
The future of ES is really exciting! 

Thanks again!
Rob




On Wednesday, March 25, 2015 at 11:22:14 PM UTC+1, Mark Walkom wrote:
>
> I did some more digging here to understand things a bit more than my last 
> (lame, sorry!) email.
>
> We only send routing_table over the network and then we build 
> routing_nodes out of it on the other side. However, routing_nodes is 
> built lazily, on access: unless something uses it, which happens on 
> master nodes or if you GET /_cluster/state, it's never actually built.
>
> We need routing_nodes for quick access to the node-level view of shard 
> allocation, and routing_table for the index-level view of shard 
> allocation. This is done to ensure performance.
>
> There is work happening to improve the overall transfer speed of the 
> cluster state between nodes, essentially to send the delta of a state to 
> all nodes, rather than the whole state as currently happens.
>
> On 26 March 2015 at 09:02, Mark Walkom > 
> wrote:
>
>> Cluster state is compressed by default, but for large clusters, or those 
>> with lots of large mappings, it can also be a problem.
>>
>> The cluster needs to know which shards make up an index, as well as 
>> where they are located, which is why both views exist. As you mentioned, 
>> this is currently stored under two separate areas of the cluster state, 
>> though it's possible this could be combined to reduce the size.
>>
>> On 25 March 2015 at 04:36, Robert Gardam > > wrote:
>>
>>> Hi,
>>> I am starting to look at the size of my cluster state and I started to 
>>> notice that the shard information is duplicated. 
>>>
>>> One grouping seems to be from the view of the index, and the other shows 
>>> which shards live on which host.
>>>
>>> I'm sure there's a logical reason for this, i'm just interested to know 
>>> why?
>>>
>>> This probably also compresses pretty well.
>>>
>>> Cheers,
>>> Rob
>>>
>>>
>>>
>>>
>>>
>>
>>
>
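
Both views can be inspected directly; on a 1.x cluster the cluster state API
accepts a metric filter (a sketch assuming the default HTTP port):

# The full state carries both views (routing_table and routing_nodes)
curl -s 'localhost:9200/_cluster/state?pretty'

# Or fetch just the index-level routing information
curl -s 'localhost:9200/_cluster/state/routing_table?pretty'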



Re: question for tokenizer and synonym

2015-03-26 Thread Masaru Hasegawa
Hi,

I guess you are using the query_string query, which splits the input on 
whitespace before analysis, so multi-token synonyms never get a chance to 
match. If you use the match query instead, the whole input goes through the 
analyzer, so it *should* work.


Masaru


On March 25, 2015 at 06:41:11, Prateek Asthana (pary...@gmail.com) wrote:


I have a requirement similar to the below:
a search for "chevy" should map to a search for "chevrolet". I am using synonyms to 
accomplish this.
a search for "BMW 6 series" should map to "61i", "62i", "63i" and so on. I am 
planning to use synonyms to accomplish this too.
The field that contains these values (chevrolet, 61i, 62i, 63i) has the following 
index analyzer:

 "synonym_analyzer" : {
"type":  "custom",
"tokenizer": "standard",
"filter":  [ "lowercase", "synonym_filter" ]
}

My issue with the standard tokenizer above is that I cannot map "BMW 6 series" to 
"61i", "62i", "63i". I cannot use the "keyword" tokenizer either, as the search 
could be "chevy red" or "BMW 6 series v8".  
I am open to changing tokenizers and other elements if required. Please advise.

Thanks,
Prateek
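
Following Masaru's suggestion, one possible direction is to keep the standard
tokenizer, put the multi-word expansion into the synonym filter itself, and
search with a match query so the whole input string passes through the
analyzer. A sketch only; the rule contents, index, and field names below are
assumptions:

"filter": {
  "synonym_filter": {
    "type": "synonym",
    "synonyms": [
      "chevy => chevrolet",
      "bmw 6 series => 61i, 62i, 63i"
    ]
  }
}

With that filter in the field's search analyzer, a match query such as

curl -XPOST 'localhost:9200/cars/_search' -d '{
  "query": {
    "match": {
      "model": "BMW 6 series v8"
    }
  }
}'

analyzes the full text "bmw 6 series v8", so the multi-token rule can fire and
expand to 61i/62i/63i, while extra words like "v8" are left untouched.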
  






Re: DocValues overhead in ES

2015-03-26 Thread Paweł Róg
Hi,
Thank you for your response. Unfortunately I think we misunderstood each 
other. I was NOT asking whether the described case can happen, because I can 
see it can :-) I was rather asking about ES internals, and whether there is 
any way to optimize such a case (including source code modifications).

--
Paweł Róg

On Thursday, March 26, 2015 at 3:31:51 AM UTC+1, Mark Walkom wrote:
>
> If you have a lot of unique values and you ask for aggregations looking 
> for unique values amongst those, then what you are seeing can happen.
>
> On 26 March 2015 at 03:05, Paweł Róg > 
> wrote:
>
>> Hi,
>> I analyzed a heap dump taken from Elasticsearch and I can see that a lot of 
>> space in the heap is occupied by structures and references related to doc 
>> values. I can see tons of hash maps with weak references pointing to 
>> objects representing some values in DV. I was wondering if this is somehow 
>> cached on the ES side or if it is a purely Lucene-internal mechanism. Can we 
>> influence the size/number of instances of objects connected to field data?
>>
>> I'd be glad if someone can explain it to me.
>>
>> --
>> Paweł Róg
>>
>>
>
>
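
Not an answer on the internals, but for seeing which fields account for the
loaded field data structures, the cat API breaks the memory down per field and
per node (available since ES 1.2; assumes the default HTTP port):

curl -s 'localhost:9200/_cat/fielddata?v'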



Re: Elasticsearch with JSON-array, causing serialize -error

2015-03-26 Thread Masaru Hasegawa
Hi,

It looks like "random_point" is defined as an object type but received an array of 
numbers. There might be an inconsistency in the data, or the mapping wasn't defined 
correctly.
You may want to apply the correct mapping, "float" or "double".


Masaru
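
A sketch of what that mapping could look like on a freshly created index (the
index and type names are guessed from the log below and may differ); an array
of numbers needs no special array mapping, only the correct element type:

curl -XPUT 'localhost:9200/mediacontent' -d '{
  "mappings": {
    "doc": {
      "properties": {
        "random_point": { "type": "double" }
      }
    }
  }
}'

Alternatively, if the field never needs to be searchable, mapping it as an
object with "enabled": false may be an option: it is kept in _source but not
indexed.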


On March 25, 2015 at 05:07:27, sebastian (sebastia...@gmail.com) wrote:

same issue here. Any clues?

On Sunday, April 20, 2014 at 2:47:40 PM UTC-3, PyrK wrote:
I'm using elasticsearch with a mongodb collection via elmongo. I have a 
collection that contains a field which, from the elasticsearch index's point of 
view, is a JSON array, for example: 

"random_point": [  0.10007477086037397,  0 ]

That's most likely the reason I get this error when trying to index my 
collection.

[2014-04-20 16:48:51,228][DEBUG][action.bulk              ] [Emma Frost] [mediacontent-2014-04-20t16:48:44.116z][4] failed to execute bulk item (index) index {[mediacontent-2014-04$
org.elasticsearch.index.mapper.MapperParsingException: object mapping [random_point] trying to serialize a value with no field associated with it, current value [0.1000747708603739$
        at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:595)
        at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:467)
        at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:599)
        at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:587)
        at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:459)
        at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:506)
        at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:450)
        at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:327)
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:381)
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:155)
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction$
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)

[2014-04-20 16:48:54,129][INFO ][cluster.metadata         ] [Emma Frost] [mediacontent-2014-04-20t16:39:09.348z] deleting index

Is there any way to bypass this? That array is a needed value in my 
collection. Is there any option in elasticsearch to not 
index that JSON field, since it's not going to be a searchable field at all?



Best regards,

PK




Re: Elasticsearch using huge amount of processes

2015-03-26 Thread Yogesh
My single-machine setup (50 GB memory, 4 cores, RHEL) has 3 data indices (15 
shards each) and a bunch of Marvel indices (~20). I think the same issue is 
happening with my setup too.

@David / @Oliver, did you find a solution to this issue?

Thanks
Yogesh

On Wednesday, September 11, 2013 at 9:21:19 PM UTC+5:30, Oliver wrote:
>
> I have disabled all indices except the one from today. Elasticsearch still 
> uses the maximum of 10240 processes after several hours.
>
> Am Dienstag, 10. September 2013 16:30:12 UTC+2 schrieb David Pilato:
>>
>> So it means 13 indices with 5 shards. That means 65 Lucene instances 
>> running on a single box.
>> Could you try to close older indices and see how it goes?
>>
>> http://www.elasticsearch.org/guide/reference/api/admin-indices-open-close/
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
>> *
>> @dadoonet  | @elasticsearchfr 
>>  | @scrutmydocs 
>> 
>>
>>
>>  
>> Le 10 sept. 2013 à 16:27, Oliver  a écrit :
>>
>> at the moment, there are these indices written in node 0
>>
>> ls -la  /var/lib/elasticsearch/elasticsearch/nodes/0/indices/
>> total 60
>> 4 drwxr-xr-x 15 elasticsearch elasticsearch 4096 Sep 10 02:00 .
>> 4 drwxr-xr-x  4 elasticsearch elasticsearch 4096 Sep  9 11:55 ..
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Aug 29 10:50 
>> logstash-2013.08.29
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Aug 30 09:37 
>> logstash-2013.08.30
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  1 11:54 
>> logstash-2013.08.31
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  1 11:54 
>> logstash-2013.09.01
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  2 14:03 
>> logstash-2013.09.02
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  3 02:00 
>> logstash-2013.09.03
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  4 14:04 
>> logstash-2013.09.04
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  5 02:00 
>> logstash-2013.09.05
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  6 02:00 
>> logstash-2013.09.06
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  9 10:08 
>> logstash-2013.09.07
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  9 10:08 
>> logstash-2013.09.08
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep  9 10:08 
>> logstash-2013.09.09
>> 4 drwxr-xr-x  8 elasticsearch elasticsearch 4096 Sep 10 02:00 
>> logstash-2013.09.10
>>
>> We could raise the setting to unlimited, but why are no threads being 
>> closed after a while? 
>> Anyway, I will try your suggestion.
>>
>>
>> Thanks and Regards
>> Oliver
>>
>> Am Dienstag, 10. September 2013 16:13:34 UTC+2 schrieb David Pilato:
>>>
>>> And how many index did you create until now? Do you have rolling indexes?
>>>
>>> May be you should increase your values?
>>>
>>> echo "elasticsearch soft nproc unlimited" | sudo tee -a 
>>> /etc/security/limits.conf
>>> echo "elasticsearch hard nproc unlimited" | sudo tee -a 
>>> /etc/security/limits.conf
>>>
>>>
>>> I think you will need to logout and login again.
>>>
>>> -- 
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
>>> *
>>> @dadoonet  | @elasticsearchfr 
>>>  | @scrutmydocs 
>>> 
>>>
>>>
>>>  
>>> Le 10 sept. 2013 à 11:48, Oliver  a écrit :
>>>
>>> Hi David,
>>>
>>> thank you for this fast response. At the moment we are running 
>>> elasticsearch with the default settings, means "index.number_of_shards: 5". 
>>> We have only one (master) node running.
>>> We are sending the log content (catalina.out) from only one machine, so 
>>> I am wondering why this setup creates so many threads and why no thread 
>>> closes after some period of time. The count grows to the max ulimit setting 
>>> and then elasticsearch crashes. 
>>>
>>>
>>> Regards
>>> Oliver
>>>
>>> Am Dienstag, 10. September 2013 09:49:06 UTC+2 schrieb David Pilato:

 How many shards do you create per node?
 May be you should create less shards or add more machines?

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 *
 @dadoonet  | @elasticsearchfr 
  | @scrutmydocs 
 


  
 Le 10 sept. 2013 à 09:32, Oliver  a écrit :

 Hello,

 we send logfiles (catalina.out) with logstash (version 1.1.13-flatjar) 
 using multiline{} to our elasticsearch server (version 0.90.2).

 After several hours elasticsearch crashes because it runs out of resources; 
 more precisely, the limit of 10240 open processes for the 
 elasticsearch user has been reached. No more threads can be created.

 The actual setting of ulimit is as follows:

 ela
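
For anyone debugging this class of problem, a quick way to watch the JVM's
thread count as it grows (the PID below is a placeholder; find the real one
with ps or jps):

# Number of lightweight processes (threads) owned by the Elasticsearch JVM
ps -o nlwp= -p 12345

# The same figure via /proc
grep Threads /proc/12345/status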

how to create custom mapping for the same field

2015-03-26 Thread 任东宁


When I PUT the following JSON to elasticsearch, I get an error containing 
"nested: NumberFormatException":
{
    "id": "eee",
    "crosstab": {
        "dims": [
            { "id": 1 },
            { "id": "asd" }
        ]
    }
}

I know dims.id values have different types. What should I do to avoid indexing 
dims.id, or how should I create a custom mapping for it?
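
One way out, assuming string values are acceptable for every dims.id: map the
field explicitly as a not_analyzed string before indexing anything, so numbers
like 1 are coerced to "1". A sketch with made-up index and type names:

curl -XPUT 'localhost:9200/myindex' -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "crosstab": {
          "properties": {
            "dims": {
              "properties": {
                "id": { "type": "string", "index": "not_analyzed" }
              }
            }
          }
        }
      }
    }
  }
}'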



Re: Elasticsearch 1.4.4 not getting stopped

2015-03-26 Thread David Pilato
You might have another node running?
Or the service did not stop Elasticsearch? Did you run it with sudo?

David
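
A couple of quick checks along those lines (assuming the default HTTP port):

# Is any Elasticsearch JVM still alive?
ps aux | grep -i elasticsearch | grep -v grep

# Ask whichever node is answering on 9200 to identify itself
curl -s 'localhost:9200/_nodes/_local?pretty'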

> Le 26 mars 2015 à 08:20, Chetan Dev  a écrit :
> 
> hi,
> 
> I am stopping the elasticsearch service, but even after stopping it 
> I am still able to successfully execute the request "curl localhost:9200/". 
> What can be the reason behind it?
> 
> Thanks
> 



Elasticsearch 1.4.4 not getting stopped

2015-03-26 Thread Chetan Dev
hi,

I am stopping the elasticsearch service, but even after stopping it 
I am still able to successfully execute the request "curl localhost:9200/". 
What can be the reason behind it?

Thanks
