Re: delete by query string query is not working with 1.2.0

2014-05-29 Thread hongsgo
Yes, it works.
I learned a lot from you.
Thank you. ^^



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/delete-by-query-string-query-is-not-working-with-1-2-0-tp4056754p4056774.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1401432446694-4056774.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread Michael Delaney
Are you using internal fully qualified domain names, e.g. 
es01.myelasticsearcservice.f3.internal.net? 
If you use public load balancer endpoints, you'll get timeouts. 



Re: alerts from Kibana/ES

2014-05-29 Thread NF
That's right, Otis.

On Friday, May 30, 2014 7:20:27 AM UTC+2, Otis Gospodnetic wrote:
>
> Hi,
>
> There's no alerting in Kibana.  Have a look at SPM - it has ES monitoring, 
> threshold and heartbeat alerting, anomaly detection, and a number of other 
> features.  Actually, re-reading your email - you are looking to get 
> notified when a certain event is captured?  By that do you mean having 
> something like a "saved query" that matches incoming logs?
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tuesday, May 27, 2014 5:02:35 AM UTC-4, NF wrote:
>>
>> Hi,
>>
>> We’re using Kibana/Elasticsearch to visualize different kinds of logs in 
>> our company. Now, we need a feature that would allow us to send an 
>> alert/notification (email or other) when a certain event/trigger is 
>> captured.
>>
>> I’d like to know whether such a feature is planned in the 
>> Kibana/Elasticsearch backlog. If so, when might we expect it to be 
>> available? 
>>
>> If not, could you please suggest an (open source) solution that satisfies 
>> our need?
>>
>> Thanks,
>>
>> Natalia
>>
>



Re: Search issue with snowball stemmer

2014-05-29 Thread Александр Шаманов
At first I thought the cause was the analyzer configuration, but I have 
the same issue with other stemmers. If I don't use any stemmers, the 
request http://epbyvitw0052:9200/some_content/_search?q=sampling 
returns the expected document. I think the problem is in how the 
stemmers are added.

On Thursday, May 29, 2014 at 7:45:12 PM UTC+3, Ivan Brusic wrote:
>
> You should use the Analyze API to ensure that the tokens you are producing 
> are correct:
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-analyze.html
>
> -- 
> Ivan
>
>
>
> On Thu, May 29, 2014 at 7:13 AM, Александр Шаманов wrote:
>
>> Hello everyone,
>>
>> I have the following index mapping:
>>
>> 
>> curl -XPUT 'http://localhost:9200/some_content/' -d '
>> {
>>"settings":{
>>   "query_string":{
>>  "default_field":"content",
>>  "default_operator":"AND"
>>   },
>>   "index":{
>>  "analysis":{
>> "analyzer":{
>>"en_analyser":{
>>   "filter":[
>>  "snowBallFilter"
>>   ],
>>   "type":"custom",
>>   "tokenizer":"standard"
>>}
>> },
>> "filter":{
>>"en_stopFilter":{
>>   "type":"stop",
>>   "stopwords_path":"lang/stopwords_en.txt"
>>},
>>"snowBallFilter":{
>>   "type":"snowball",
>>   "language":"English"
>>},
>>"wordDelimiterFilter":{
>>   "catenate_all":false,
>>   "catenate_words":true,
>>   "catenate_numbers":true,
>>   "generate_word_parts":true,
>>   "generate_number_parts":true,
>>   "preserve_original":true,
>>   "type":"word_delimiter",
>>   "split_on_case_change":true
>>},
>>"en_synonymFilter":{
>>   "synonyms_path":"lang/synonyms_en.txt",
>>   "ignore_case":true,
>>   "type":"synonym",
>>   "expand":false
>>},
>>"lengthFilter":{
>>   "max":250,
>>   "type":"length",
>>   "min":3
>>}
>> }
>>  }
>>   }
>>},
>>"mappings":{
>>   "docs":{
>>  "_source":{
>> "enabled":false
>>  },
>>  "analyzer":"en_analyser",
>>  "properties":{
>>  "content":{
>> "type":"string",
>> "index":"analyzed",
>> "term_vector":"with_positions_offsets",
>> "omit_norms":"true"
>>  }
>>  }
>>   }
>>}
>> }'
>>
>> and I indexed the following document:
>>
>> curl -XPOST http://localhost:9200/some_content/docs/ -d '
>> {  
>>   "content" : "Some sampling text formatted for text data" 
>> }'
>>
>> When I make this request:
>> http://epbyvitw0052:9200/some_content/docs/_search?q=sampling
>>  
>>  I'm getting result:
>> 
>> {
>> "took": 1,
>> "timed_out": false,
>> "_shards": {
>> "total": 1,
>> "successful": 1,
>> "failed": 0
>> },
>> "hits": {
>> "total": 1,
>> "max_score": 0.095891505,
>> "hits": [
>> {
>> "_index": "some_content",
>> "_type": "docs",
>> "_id": "saLfx6PYR82YR69je0JbAA",
>> "_score": 0.095891505
>> }
>> ]
>> }
>> }
>>  
>>
>> but when I send the request without the type:
>> http://epbyvitw0052:9200/some_content/_search?q=sampling
>>
>> then I'm getting nothing:
>> 
>> {
>> "took": 1,
>> "timed_out": false,
>> "_shards": {
>> "total": 1,
>> "successful": 1,
>> "failed": 0
>> },
>> "hits": {
>> "total": 0,
>> "max_score": null,
>> "hits": []
>> }
>> }
>> 
>>
>> However, a request with the stemmed term:
>> http://epbyvitw0052:9200/some_content/_search?q=sampl
>>
>> the system found it:
>> 
>> {
>> "took": 1,
>> "timed_out": false,
>> "_shards": {
>> "total": 1,
>> "successful": 1,
>> "failed": 0
>> },
>> "hits": {
>> "total": 1,
>> "max_score": 0.095891505,
>> "hits": [
>> {
>> "_index": "some_content",
>> "_type": "docs",
>> "_id": "saLfx6PYR82YR69je0JbAA",
>> "_score": 0.095891505
>> }
>> ]
>> }
>> }
>>  
>>
>> This issue appears when I put the snowball filter into the analyzer. 
>> Could you explain why the system behaves this way? 
>> Maybe I'm doing something wrong.
>>
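As a concrete sketch of the Analyze API suggestion above (assumptions: a node reachable on localhost, the mapping from this thread; substitute your own host and index), this compares the tokens the custom analyzer emits against what the default analyzer emits:

```shell
# Tokens produced by the custom analyzer at index time;
# "sampling" should come back stemmed to "sampl":
curl 'http://localhost:9200/some_content/_analyze?analyzer=en_analyser&pretty' \
  -d 'Some sampling text'

# Tokens produced by the index default (standard) analyzer,
# which an untyped URI search may fall back to at search time:
curl 'http://localhost:9200/some_content/_analyze?pretty' \
  -d 'sampling'
```

If the first call returns "sampl" while the second returns "sampling", the mismatch between index-time and search-time analysis would explain why the untyped search finds nothing.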

Re: Stats aggregation of a date_histogram

2014-05-29 Thread Adrien Grand
No concrete plans so far, but this is indeed a feature we are thinking of.


On Tue, May 27, 2014 at 3:17 PM, John Smith  wrote:

> Thanks, is this something that is planned?
>



-- 
Adrien Grand



Re: Filtering *before* a query

2014-05-29 Thread Adrien Grand
Hi Shawn,

On Wed, May 28, 2014 at 11:29 PM, Shawn O'Banion wrote:

> My question is: how do I execute the geo_bounding_box filter *before* 
> executing
> the terms query so that I reduce the number of documents that I have to
> query over.
>

This is why I pointed out the link about filter strategies: whether the
query is applied before, at the same time as, or after the filter can be
controlled using the `strategy` parameter of the `filtered` query:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html#_filter_strategy

You can try them out to see how they influence response times.

You might also want to try out the `indexed` type for the geo bounding
filter that might be faster than the default `memory` if your filter is
very selective:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geo-bounding-box-filter.html#_type_2
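Combining the two suggestions, a request-body sketch (hedged: the index and field names `tags` and `location` are hypothetical, and the `indexed` type requires the geo_point field to be mapped with lat/lon indexing enabled):

```shell
curl -XPOST 'http://localhost:9200/my_index/_search?pretty' -d '
{
  "query": {
    "filtered": {
      "query":  { "terms": { "tags": ["restaurant", "cafe"] } },
      "filter": {
        "geo_bounding_box": {
          "type": "indexed",
          "location": {
            "top_left":     { "lat": 40.8, "lon": -74.1 },
            "bottom_right": { "lat": 40.7, "lon": -73.9 }
          }
        }
      },
      "strategy": "leap_frog_filter_first"
    }
  }
}'
```

Swapping `strategy` between values such as `leap_frog_filter_first` and `query_first` and measuring response times is the experiment suggested above.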

-- 
Adrien Grand



Re: delete by query string query is not working with 1.2.0

2014-05-29 Thread David Pilato
It sounds good. Does it work?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 30 May 2014 at 07:52, hongsgo wrote:

Thank you.
Is this right?

# delete by query string query
curl -XDELETE
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_query?pretty=true'
-d '
{
  "query" : { // added
    "query_string" : {
      "default_field" : "playId",
      "query" : "1395395f-1543-44f7-ba11-fd8eff137af2 OR 1395395f-1543-44f7-ba11-fd8eff137af1"
    }
  }
}
'








Re: delete by query string query is not working with 1.2.0

2014-05-29 Thread hongsgo
Thank you.
Is this right?

# delete by query string query
curl -XDELETE
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_query?pretty=true'
-d '
{
  "query" : { // added
    "query_string" : {
      "default_field" : "playId",
      "query" : "1395395f-1543-44f7-ba11-fd8eff137af2 OR 1395395f-1543-44f7-ba11-fd8eff137af1"
    }
  }
}
'








Re: alerts from Kibana/ES

2014-05-29 Thread Otis Gospodnetic
Hi,

There's no alerting in Kibana.  Have a look at SPM - it has ES monitoring, 
threshold and heartbeat alerting, anomaly detection, and a number of other 
features.  Actually, re-reading your email - you are looking to get 
notified when a certain event is captured?  By that do you mean having 
something like a "saved query" that matches incoming logs?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tuesday, May 27, 2014 5:02:35 AM UTC-4, NF wrote:
>
> Hi,
>
> We’re using Kibana/Elasticsearch to visualize different kinds of logs in 
> our company. Now, we need a feature that would allow us to send an 
> alert/notification (email or other) when a certain event/trigger is 
> captured.
>
> I’d like to know whether such a feature is planned in the 
> Kibana/Elasticsearch backlog. If so, when might we expect it to be 
> available? 
>
> If not, could you please suggest an (open source) solution that satisfies 
> our need?
>
> Thanks,
>
> Natalia
>



Re: Optimizations for nested aggregations

2014-05-29 Thread Otis Gospodnetic
Massive JSON responses could indeed be a problem.  You can easily see 
whether CPU, disk, or network is the bottleneck using almost any monitoring 
tool.  Even dstat --cpu --mem --disk --net will give you an idea. :)

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tuesday, May 27, 2014 6:44:45 AM UTC-4, nil...@gmail.com wrote:
>
> Thank you for your reply.
>
> Here are some observations from a couple of days testing:
>
> - Setting up routing manually reduced the aggregation time about 40%!
> - ... however, manual routing caused data to distribute unevenly. I assume 
> we could have taken steps to improve the distribution, but we didn't 
> investigate any further
> - Upgrading from 1.1.0 to 1.2.0 didn't seem to improve speed nor memory 
> usage, although we didn't do any accurate measurements of RAM usage
> - Changing the compression value for percentiles did indeed have an effect
> - Increasing number of nodes from 3 to 5 didn't seem to improve performance
>
> Since adding nodes didn't seem to improve the performance, there seems to 
> be a bottleneck somewhere. The result of the aggregation is very large (as 
> JSON, about a million lines of text), so maybe the data transfer or 
> constructing the result is the bottleneck?
>



Re: Managing Snapshot Files from Outside ES

2014-05-29 Thread Otis Gospodnetic
Hi,

I believe once you take snapshots, it's up to you what you do with them... 
if I understood your question correctly.

What you describe - sending old snapshots to Glacier and deleting old 
snapshot files - sounds interesting.  It would be great to see a write-up 
about how to do this published somewhere.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tuesday, May 27, 2014 11:01:12 PM UTC-4, David Severski wrote:
>
> I'm currently carrying out snapshots to S3 via the cloud-aws plugin under 
> ES 1.2. I'd like to use Amazon's lifecycle policies to at least send older 
> snapshot files to Glacier (saving money) and ideally delete snapshot files 
> over a certain period. Is managing snapshot files outside of the ES API 
> supported (or advisable)?
>
> Thanks for the help,
>
> David
>



Re: delete by query string query is not working with 1.2.0

2014-05-29 Thread David Pilato
I answered on GitHub. The API changed in 1.0.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 30 May 2014 at 05:31, hongsgo wrote:

I think I wrote this according to the guidelines:
https://github.com/elasticsearch/elasticsearch/blob/master/CONTRIBUTING.md

Since Elasticsearch 0.90.7 we have been using delete-by-query with a 
query_string query, deleting approximately 250 docs at a time.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

Then I upgraded to version 1.2.0, and after that I got the error below.
Please help. Thank you.

# create index
curl -XPOST 'http://10.99.196.141:22000/delete_by_query_string_query_test1'

# create mapping
curl -XPUT
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_mapping'
-d '{
  "properties": {
    "playId": {
      "type": "string",
      "index": "not_analyzed",
      "store": "true"
    }
  }
}'

# make doc
curl -XPOST
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1' -d '{
 "playId": "1395395f-1543-44f7-ba11-fd8eff137afe"
}'


# delete by query string query
curl -XDELETE
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_query?pretty=true'
-d '
{
  "query_string" : {
    "default_field" : "playId",
    "query" : "1395395f-1543-44f7-ba11-fd8eff137afe"
  }
}
'

# response - an error occurred:
{
 "_indices" : {
   "delete_by_query_string_query_test1" : {
 "_shards" : {
   "total" : 2,
   "successful" : 0,
   "failed" : 2,
   "failures" : [ {
 "index" : "delete_by_query_string_query_test1",
 "shard" : 1,
 "reason" :
"RemoteTransportException[[10.99.196.141_21001][inet[/10.99.196.141:21001]][deleteByQuery/shard]];
nested: QueryParsingException[[delete_by_query_string_query_test1] request
does not support [query_string]]; "
   }, {
 "index" : "delete_by_query_string_query_test1",
 "shard" : 0,
 "reason" :
"RemoteTransportException[[10.101.63.182_21000][inet[/10.101.63.182:21000]][deleteByQuery/shard]];
nested: QueryParsingException[[delete_by_query_string_query_test1] request
does not support [query_string]]; "
   } ]
 }
   }
 }
}


somebody help me please.





Re: looking for heavy write optimization

2014-05-29 Thread Otis Gospodnetic
Hi,

I see Jorg already provided a nice list of suggestions.  But check that FS 
type - ext2!  That's ancient!  Try ext3, ext4, or xfs.  If you turn off 
journaling things will be faster.  You are using UDP you said, so you must 
be OK with some data loss.

Btw. SPM can monitor all 3 things you mentioned - Kafka, Storm, and ES. 
It may be helpful to have all these metrics in a single UI when optimizing 
things.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



On Tuesday, May 27, 2014 11:55:49 PM UTC-4, Zealot Yin wrote:
>
> CPU E5-2620 12 cores
> MEM 64G
> df  -lhT
> FilesystemTypeSize  Used Avail Use% Mounted on
> /dev/sda3 ext21.4T  379G  962G  29% /home
>
> uname -a
> 2.6.32_1-9-0-0  10 17:22:16 CST 2013 x86_64 x86_64 x86_64 GNU/Linux
>
> elasticsearch started with  *bin/elasticsearch -Xms10g -Xmx10g 
> -Des.index.store.type=niofs -d*
>
> On Wednesday, May 28, 2014 11:49:17 AM UTC+8, Mark Walkom wrote:
>>
>> What sort of hardware are you on?
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 28 May 2014 13:44, Zealot Yin <0xma...@gmail.com> wrote:
>>
>>> hi
>>>
>>> I am writing a real-time analytics tool using Kafka, Storm, and 
>>> Elasticsearch, and I want Elasticsearch write-optimized for about 
>>> 80K inserts/sec (7-machine cluster).
>>>
>>> For high performance I am using bulk UDP to batch-insert my docs (each 
>>> doc is about 300B; only 4 fields are indexed), and I'm using 
>>> index.store.type=niofs for the index store (mmap caused 100% io-util), 
>>> but it still doesn't seem good enough for my scenario. *All I need is 
>>> write performance - anybody have an idea for this problem?*
>>>
>>>
>>> here is my conf:
>>>
>>>
>>> bootstrap.mlockall: true
>>>
>>> threadpool.bulk.type: fixed
>>>
>>> threadpool.bulk.size: 30
>>>
>>> threadpool.bulk.queue_size: 500
>>>
>>>
>>>  index.refresh_interval: 50s
>>>
>>> indices.store.throttle.type: merge
>>>
>>> indices.store.throttle.max_bytes_per_sec: 80mb
>>>
>>> indices.memory.index_buffer_size: 30%
>>>
>>> indices.ttl.bulk_size: 10
>>>
>>> indices.memory.min_shard_index_buffer_size: 200mb
>>>
>>>
>>> bulk.udp.enabled : true 
>>>
>>> bulk.udp.bulk_actions : 1
>>>
>>> bulk.udp.bulk_size : 20mb
>>>
>>> bulk.udp.flush_interval : 10s
>>>
>>> bulk.udp.concurrent_requests : 4000
>>>
>>> bulk.udp.receive_buffer_size : 10mb
>>>
>>>
>>> index.cache.field.expire: 10m
>>>
>>> index.cache.field.max_size: 50
>>>
>>> index.cache.field.type: soft 
>>>



Re: Print nearby lines after query result

2014-05-29 Thread Otis Gospodnetic
I'd be curious to hear if anyone has any clever suggestions for this, too.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Thursday, May 29, 2014 11:29:18 AM UTC-4, Senthil Raja wrote:
>
>
> In Unix we use grep -A 5 -B 5 "" . I'm looking for similar 
> functionality. 
>
> On Thursday, May 29, 2014 8:48:47 PM UTC+5:30, Senthil Raja wrote:
>>
>>
>> Team,
>>
>> As per our requirement, I have to search for a string in ES logs and 
>> print five lines before and after each match. 
>>
>> Is there any way to do that ?
>>
>>
>>
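For reference, here is the Unix behavior being asked for, as a self-contained demo. (In ES each log line is typically its own document, so a rough equivalent usually means indexing a line or sequence number and fetching neighbors with a range query; the demo below only shows the target grep behavior.)

```shell
# Build a small sample log, then print 2 lines of context
# before (-B) and after (-A) each match of "ERROR".
printf 'line1\nline2\nERROR here\nline4\nline5\n' > /tmp/ctx_demo.log
grep -B 2 -A 2 'ERROR' /tmp/ctx_demo.log
# prints line1 through line5 (the match plus 2 lines either side)
```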



Re: Red status unassigned shards help

2014-05-29 Thread Mark Walkom
It could also be the integrated elasticsearch output in Logstash, which adds
the LS instance as a client node to the cluster.
And you probably don't want to kill that.
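(For reference: the "redirect command" Jason mentions in the quoted thread below is presumably the cluster reroute API. A minimal sketch, assuming the 0.90/1.x API; the index and node names are placeholders, and allocating a primary this way can lose that shard's data:)

```shell
# Manually assign an unassigned shard to a node.
# "logstash-2014.05.26" and "SOME_NODE_NAME" are placeholders.
# allow_primary:true accepts data loss for that shard; use with care.
curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d '
{
  "commands": [
    {
      "allocate": {
        "index": "logstash-2014.05.26",
        "shard": 0,
        "node": "SOME_NODE_NAME",
        "allow_primary": true
      }
    }
  ]
}'
```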

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 30 May 2014 14:11, Pawan Sharma  wrote:

> Another instance of Elasticsearch has started on the node, so the solution
> is: first find the PID of the other ES instance with
>
>
> *netstat -lnp | grep 920*
> and kill that PID if another ES instance is running on port 9201
>
> Thanks
>
>
> On Fri, May 30, 2014 at 4:03 AM, Mark Walkom 
> wrote:
>
>> Install a visual monitoring plugin like kopf and ElasticHQ, you will be
>> able to see which shards are unassigned.
>> However I think you may have replicas set, which, given you only have
>> one node, will always result in a yellow state as the cluster cannot
>> assign replicas to another node.
>>
>> You should also upgrade ES to a newer version if you can :)
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 29 May 2014 23:45, Jason Weber  wrote:
>>
>>> I rebooted several times and I believe it's collecting the correct data
>>> now. I still show 520 unassigned shards, but it's collecting all my logs
>>> now. Is this something I can use the redirect command for, to assign them
>>> to a new index?
>>>
>>> Jason
>>>
>>> On Tuesday, May 27, 2014 11:39:49 AM UTC-4, Jason Weber wrote:

 Could someone walk me through getting my cluster up and running? I came
 in from a long weekend and my cluster was in red status; I am seeing a lot
 of unassigned shards.

 jmweber@MIDLOG01:/var/log/logstash$ curl localhost:9200/_cluster/
 health?pretty
 {
   "cluster_name" : "midlogcluster",
   "status" : "red",
   "timed_out" : false,
   "number_of_nodes" : 2,
   "number_of_data_nodes" : 1,
   "active_primary_shards" : 512,
   "active_shards" : 512,
   "relocating_shards" : 0,
   "initializing_shards" : 0,
   "unassigned_shards" : 520
 }


 I am running ES 0.90.11

 LS and ES are on a single server. I only have 1 node, although it shows
 2. I normally get yellow status, and it works fine with that. But I am only
 collecting about 43 events per minute vs my usual 50K.

 I have seen several write-ups, but I get a lot of "no handler found for
 uri" statements when I try to run their commands.

 Thanks,
 Jason



Re: Elasticsearch and Smile encoded JSON

2014-05-29 Thread Drew Kutcharian
Hi Jörg,

Thanks for the comments. If I understand you correctly, you're saying that if I 
use SMILE and/or CBOR, the communication/storage won't be compressed using LZF?

- Drew


On May 29, 2014, at 2:52 PM, joergpra...@gmail.com wrote:

> 1. No (the cluster state of ES - not part of Lucene -  is saved to disk in 
> SMILE format)
> 
> 2. No.
> 
> 3. Yes, you can use SMILE on XContentBuilder classes. The result can 
> transported to the cluster, the decoding of SMILE is done transparently.
> 
> Because the transport is LZF-compressed by default, you should consider 
> whether disabling that for SMILE is worth it. SMILE is a compact JSON 
> encoding, but I don't have numbers on whether it has any advantage over 
> plain JSON with LZF compression (I doubt that SMILE is better).
> 
> Also note, there is CBOR in latest ES releases, which seems superior to SMILE 
> in many aspects (compactness, speed, standardization in RFC 7049)
> 
> https://github.com/elasticsearch/elasticsearch/pull/5509
> 
> Jörg
> 
> 
> 
> On Thu, May 29, 2014 at 7:07 AM, Drew Kutcharian wrote:
> Hey Guys,
> 
> I wanted to get some clarification on how Elasticsearch handles/uses Smile 
> binary JSON. Mainly:
> 
> 1. Does ES convert JSON to Smile before saving into Lucene?
> 
> 2. Does ES use Smile as the wire protocol for the Java Client?
> 
> 3. If I wanted to have everything in Smile format (What's stored in Lucene, 
> fieldata, and the communication between server and client) how should I do 
> it? Should I just set the "source" to Smile byte array using the Java Client?
> 
> Note that I use the Java Client and don't really use the REST API, except for 
> debugging.
> 
> Best,
> 
> Drew
> 


Re: Red status unassigned shards help

2014-05-29 Thread Pawan Sharma
Another instance of Elasticsearch has been started on the node. The solution
is to first find the PID of the other instance with


*netstat -lnp | grep 920*
and kill that PID if another Elasticsearch instance is running on port 9201.

Thanks
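
If shards stay unassigned even after the duplicate instance is killed, they
can be assigned by hand with the cluster reroute API (the "redirect command"
asked about below). A sketch of the request body you would POST to
/_cluster/reroute — the index, shard number, and node name here are
placeholders, not values from this thread:

```python
import json

# Body for POST /_cluster/reroute that force-allocates one unassigned shard.
# Index, shard number, and node name are placeholders -- fill in your own.
# "allow_primary": True can lose data if no copy of the shard exists anywhere,
# so treat it as a last resort.
reroute_body = {
    "commands": [
        {
            "allocate": {
                "index": "logstash-2014.05.27",
                "shard": 0,
                "node": "SomeNodeName",
                "allow_primary": True,
            }
        }
    ]
}
print(json.dumps(reroute_body))
```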


On Fri, May 30, 2014 at 4:03 AM, Mark Walkom 
wrote:

> Install a visual monitoring plugin like kopf and ElasticHQ, you will be
> able to see which shards are unassigned.
> However I think you may have replicas set, which, given you only have one
> node, will always result in a yellow state as the cluster cannot assign
> replicas to another node.
>
> You should also upgrade ES to a newer version if you can :)
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 29 May 2014 23:45, Jason Weber  wrote:
>
>> I rebooted several times and I believe its collecting the correct data
>> now. I still show 520 unassigned shards, but its collecting all my logs
>> now. Is this something I can use the redirect command for to assign it to a
>> new index?
>>
>> Jason
>>
>> On Tuesday, May 27, 2014 11:39:49 AM UTC-4, Jason Weber wrote:
>>>
>>> Could someone walk me through getting my cluster up and running. Came in
>>> from long weekend and my cluster was red status, I am showing a lot of
>>> unassigned shards.
>>>
>>> jmweber@MIDLOG01:/var/log/logstash$ curl localhost:9200/_cluster/
>>> health?pretty
>>> {
>>>   "cluster_name" : "midlogcluster",
>>>   "status" : "red",
>>>   "timed_out" : false,
>>>   "number_of_nodes" : 2,
>>>   "number_of_data_nodes" : 1,
>>>   "active_primary_shards" : 512,
>>>   "active_shards" : 512,
>>>   "relocating_shards" : 0,
>>>   "initializing_shards" : 0,
>>>   "unassigned_shards" : 520
>>> }
>>>
>>>
>>> I am running ES 0.90.11
>>>
>>> LS and ES are on a single server, I only have 1 node, although it shows
>>> 2, I get yellow status normally, it works fine with that. But I am only
>>> collecting like 43 events per minute vs my usual 50K.
>>>
>>> I have seen several write ups but I seem to get a lot of no handler
>>> found for uri statements when I try to run them.
>>>
>>> Thanks,
>>> Jason
>>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/1307dd8d-411e-4690-a6d1-8e27ce26ecec%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEM624Y%2BPsF8a4C0mh-Jsi%3Dc6ogiXctAuA-Hn2oO6MVvv7SkBQ%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAMUueYn0EkSL1qAH%2Bb5s0PHMW%3Ds5dK48n3dLgFAuEDziSpBfDg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


delete by query string query is not working with 1.2.0

2014-05-29 Thread hongsgo
I think I wrote this according to the guidelines:
https://github.com/elasticsearch/elasticsearch/blob/master/CONTRIBUTING.md

Since elasticsearch version 0.90.7 I have been using the delete-by-query-string
query, deleting approximately 250 docs at a time. 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
 

Then I upgraded to version 1.2.0.
After that, I started getting the error below. 
Please help me. 
Thank you.

# create index
curl -XPOST 'http://10.99.196.141:22000/delete_by_query_string_query_test1'

# create mapping
curl -XPUT
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_mapping'
-d '{
"properties":{
"playId":{
"type":"string",
"index":"not_analyzed",
"store":"true"
}
}
}'

# make doc
curl -XPOST
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1' -d '{
  "playId": "1395395f-1543-44f7-ba11-fd8eff137afe"
}'


# delete by query string query
curl -XDELETE
'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_query?pretty=true'
-d '
{
"query_string" : {
"default_field" : "playId",
 "query" : "1395395f-1543-44f7-ba11-fd8eff137afe"
}
}
'

# response: an error occurred -
{
  "_indices" : {
"delete_by_query_string_query_test1" : {
  "_shards" : {
"total" : 2,
"successful" : 0,
"failed" : 2,
"failures" : [ {
  "index" : "delete_by_query_string_query_test1",
  "shard" : 1,
  "reason" :
"RemoteTransportException[[10.99.196.141_21001][inet[/10.99.196.141:21001]][deleteByQuery/shard]];
nested: QueryParsingException[[delete_by_query_string_query_test1] request
does not support [query_string]]; "
}, {
  "index" : "delete_by_query_string_query_test1",
  "shard" : 0,
  "reason" :
"RemoteTransportException[[10.101.63.182_21000][inet[/10.101.63.182:21000]][deleteByQuery/shard]];
nested: QueryParsingException[[delete_by_query_string_query_test1] request
does not support [query_string]]; "
} ]
  }
}
  }
}


somebody help me please.
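
A likely fix (a sketch, based on the 1.x API change): in Elasticsearch 1.x the
delete-by-query body must nest the query under a top-level "query" key, so a
bare query_string is rejected with "request does not support [query_string]".
The same request rewritten with the wrapper:

```python
import json

# Delete-by-query body for Elasticsearch 1.x: the query must live under a
# top-level "query" key; the bare form that 0.90 accepted is rejected in 1.2.0.
body = {
    "query": {
        "query_string": {
            "default_field": "playId",
            "query": "1395395f-1543-44f7-ba11-fd8eff137afe",
        }
    }
}

# This is the payload to send with, e.g.:
# curl -XDELETE 'http://10.99.196.141:22000/delete_by_query_string_query_test1/1/_query' -d '...'
print(json.dumps(body))
```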



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/delete-by-query-string-query-is-not-working-with-1-2-0-tp4056754.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1401420702674-4056754.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread Eric Brandes
I'm using the unicast list of nodes at the moment. I have multicast turned 
off as well.  I have not changed the default ping timeout or anything.

On Thursday, May 29, 2014 7:37:38 PM UTC-5, David Pilato wrote:
>
> Just checking: are you using azure cloud plugin or unicast list of nodes?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 30 mai 2014 à 02:12, Eric Brandes > 
> a écrit :
>
> I have a 3 node cluster running ES 1.0.1 in Azure.  They're windows VMs 
> with 7GB of RAM.  The JVM heap size is allocated at 4GB per node.  There is 
> a single index in the cluster with 50 shards and 1 replica.  The total 
> number of documents on primary shards is 29 million with a store size of 
> 60gb (including replicas).
>
> Almost every day now I get a random node disconnecting from the cluster.  
> The usual suspect is a ping timeout.  The longest GC in the logs is about 1 
> sec, and the boxes don't look resource constrained really at all. CPU never 
> goes above 20%. The used JVM heap size never goes above 6gb (the total on 
> the cluster is 12gb) and the field data cache never gets over 1gb.  The 
> node that drops out is different every day.  I have 
> minimum_number_master_nodes set so there's not any kind of split brain 
> scenario, but there are times where the disconnected node NEVER rejoins 
> until I bounce the process.
>
> Has anyone seen this before?  Is it an Azure networking issue?  How can I 
> tell?  If it's resource problems, what's the best way for me to turn on 
> logging to diagnose them?  What else can I tell you or what other steps can 
> I take to figure this out?  It's really quite maddening :(
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/8f85c254-9d53-4507-a340-4c8f2a4a078d%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7671194d-3059-4220-9da5-c4e1aa169072%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread David Pilato
Just checking: are you using azure cloud plugin or unicast list of nodes?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 30 mai 2014 à 02:12, Eric Brandes  a écrit :

I have a 3 node cluster running ES 1.0.1 in Azure.  They're windows VMs with 
7GB of RAM.  The JVM heap size is allocated at 4GB per node.  There is a single 
index in the cluster with 50 shards and 1 replica.  The total number of 
documents on primary shards is 29 million with a store size of 60gb (including 
replicas).

Almost every day now I get a random node disconnecting from the cluster.  The 
usual suspect is a ping timeout.  The longest GC in the logs is about 1 sec, 
and the boxes don't look resource constrained really at all. CPU never goes 
above 20%. The used JVM heap size never goes above 6gb (the total on the 
cluster is 12gb) and the field data cache never gets over 1gb.  The node that 
drops out is different every day.  I have minimum_number_master_nodes set so 
there's not any kind of split brain scenario, but there are times where the 
disconnected node NEVER rejoins until I bounce the process.

Has anyone seen this before?  Is it an Azure networking issue?  How can I tell? 
 If it's resource problems, what's the best way for me to turn on logging to 
diagnose them?  What else can I tell you or what other steps can I take to 
figure this out?  It's really quite maddening :(
-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8f85c254-9d53-4507-a340-4c8f2a4a078d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/DE1520AB-0E38-440A-869C-A69ECE9A5295%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Kibana User Flow Graphs similar to Google Analytics

2014-05-29 Thread Bill Clark
Hello,
I would like to use ELK to analyze user navigation patterns (à la Apache
access logs) and am wondering if Kibana has a User Flow graph similar to
Google Analytics.

Regards,
Bill Clark 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/15437927-7d91-4672-aedc-e40a388862dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: What OS memory does es use other than Java?

2014-05-29 Thread Edward Sargisson
For those following along at home I thought I'd provide an update.

Jörg provided a big hint and all his advice was useful. Based on what he 
said we discovered that the VM host was performing memory ballooning on the 
guest. Briefly, this is a process where the host can reclaim memory from 
the guest by inflating a memory balloon that grabs memory from the OS and 
gives it back to the host. It transfers memory pressure the host might be 
under to the guest. (Google it for more.)

We showed that our failures were happening when ballooning occurred. By 
default, the balloon is designed to inflate to 60% (from memory) of 
configured memory. Given that 50% of configured memory is mlocked, these two 
settings are incompatible.

Our fix was to configure VMWare to reserve the entire configured memory. 
This means that the host doesn't try to take the memory back. It seemed 
sensible to reserve all of the configured memory as we want elasticsearch 
to keep its buffers and memory maps in place just as it would be on a 
hardware instance in production. If placed under memory pressure, the OS 
would start to reclaim these things.

After making the change, we've been running for a few weeks with no further 
failures.

Cheers,
Edward
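
For reference, the mlockall/memlock settings discussed below usually come
together like this (a sketch; file paths vary by distro and ES version, so
treat the names as assumptions):

```conf
# elasticsearch.yml
bootstrap.mlockall: true

# /etc/security/limits.conf -- RHEL/CentOS cap lockable memory by default,
# so raise the limit for the user running Elasticsearch:
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

# /etc/sysconfig/elasticsearch (RHEL) or /etc/default/elasticsearch (Debian):
# heap at ~50% of the memory available to the ES process, per the thread
ES_HEAP_SIZE=4g
MAX_LOCKED_MEMORY=unlimited
```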

On Wednesday, May 7, 2014 4:09:16 PM UTC-7, Edward Sargisson wrote:
>
> Hi Jörg,
> Thanks for your reply - that's given me a number of leads to follow up on.
>
> > Errors in allocating direct buffers will result in Java errors. You 
> mention Linux memory errors but unfortunately you do not quote it, so I 
> have to guess.
> We see nothing useful in elasticsearch logs. What we do see is either the 
> console saying, "Out of memory: Kill process ... score 1 or sacrifice 
> child" or, once, we saw, "Loading dm-mirror.ko module, Waiting for required 
> block device discovery, Waiting for 2 sda-like device(s)...Kernel panic - 
> not syncing: Out of memory and no killable processes".
> The first message I understand as the OOM-Killer coming out to whack a 
> process on the head. I don't understand the last one. I have screenshots of 
> these if required.
>
> > You should have enabled memory mapped files by index store mmapfs 
> (default on RHEL)
> We haven't changed this setting so I expect it is the default. I looked 
> for a way to verify this but the es api appears not to return it.
>
> > bootstrap.mlockall = true...set memlock to unlimited
> Yes - both done.
>
> > If you still encounter issues from Linux OS errors it is most probably 
> because of VMware limitations
> Is there a way to get evidence to show this? I reviewed the VMWare event 
> log and there was no ballooning in there (assuming we were looking at the 
> right spot).
>
> >  If you run a VM, you should assign at most 50% of the configured guest 
> OS memory to ES.
> We use the elasticsearch Puppet module but I modified it with a version of 
> the code in the elasticsearch Chef cookbook to automatically assign this - 
> where it appears to be assigning 60%. I was surprised by this too but I 
> copied it on the assumption that the cookbook writer knew what they were 
> doing. I've raised an issue to ask the question: 
> https://github.com/elasticsearch/cookbook-elasticsearch/issues/209
>
> For the curious: I've setup some monitoring to capture /proc/meminfo, the 
> count of the /proc//maps for elasticsearch and Flume as well as the 
> top few entries in top by memory usage. Now I'm just waiting for the next 
> failure.
>
> Thanks for any help provided.
>
> Cheers,
> Edward
>
> On Tuesday, May 6, 2014 3:23:10 PM UTC-7, Jörg Prante wrote:
>>
>> Yes, of course Elasticsearch is using off-heap memory. All the Lucene 
>> index I/O is using direct buffers in native OS memory.
>>
>> Errors in allocating direct buffers will result in Java errors. You 
>> mention Linux memory errors but unfortunately you do not quote it, so I 
>> have to guess.
>>
>> You should have enabled memory mapped files by index store mmapfs 
>> (default on RHEL) so all files that are read by ES are mapped into virtual 
>> address space of the OS VM management.
>>
>> And also bootstrap.mlockall = true, so you also need to set memlock to 
>> unlimited in /etc/security/limits.conf, because RHEL/Centos memlockable 
>> memory is limited to 25% of RAM by default. In that case, Java should throw 
>> an IOException "Map failed".
>>
>> Note, because of the memory page lock support of the host OS, you should 
>> also check what kind of virtualization you have enabled for the guest, it 
>> should be HW (full) virtualization, not paravirtualization.
>>
>> If you still encounter issues from Linux OS errors it is most probably 
>> because of VMware limitations, so you should disable the bootstrap.mlockall 
>> setting.
>>
>> As a side note, the recommended heap size is 50% of the RAM that is 
>> available to the ES process. If you run a VM, you should assign at most 50% 
>> of the configured guest OS memory to ES.
>>
>> Jörg
>>
>>
>> On Tue, May 6, 2014 at 10:35 PM, Edward Sargi

Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread Eric Brandes
I have a 3 node cluster running ES 1.0.1 in Azure.  They're windows VMs 
with 7GB of RAM.  The JVM heap size is allocated at 4GB per node.  There is 
a single index in the cluster with 50 shards and 1 replica.  The total 
number of documents on primary shards is 29 million with a store size of 
60gb (including replicas).

Almost every day now I get a random node disconnecting from the cluster.  
The usual suspect is a ping timeout.  The longest GC in the logs is about 1 
sec, and the boxes don't look resource constrained really at all. CPU never 
goes above 20%. The used JVM heap size never goes above 6gb (the total on 
the cluster is 12gb) and the field data cache never gets over 1gb.  The 
node that drops out is different every day.  I have 
minimum_number_master_nodes set so there's not any kind of split brain 
scenario, but there are times where the disconnected node NEVER rejoins 
until I bounce the process.

Has anyone seen this before?  Is it an Azure networking issue?  How can I 
tell?  If it's resource problems, what's the best way for me to turn on 
logging to diagnose them?  What else can I tell you or what other steps can 
I take to figure this out?  It's really quite maddening :(
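
One common mitigation for flaky cloud networking is to pin unicast discovery
and raise the fault-detection timeouts so a transient stall doesn't evict a
node. A sketch, assuming 1.0.x setting names (hostnames are placeholders):

```yaml
# elasticsearch.yml -- sketch for a 3-node cluster; hostnames are placeholders
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es01.internal:9300", "es02.internal:9300", "es03.internal:9300"]
discovery.zen.minimum_master_nodes: 2

# fault detection defaults to a 30s ping timeout with 3 retries; raising
# these tolerates longer transient stalls before a node is dropped
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 6
```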

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8f85c254-9d53-4507-a340-4c8f2a4a078d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Run native script on non-data node

2014-05-29 Thread Fei Xie
Hi,

Does anyone know if it's possible to run a native script on a non-data node
instead of running it on all the data nodes?

Thank you!
Fei

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPRQSRfzsos3h%3DdQktpcZY3N_E%3D8B9sJDHEbUR%2BBJkt7kMA6yA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Standing up new instance want to copy indexes/documents from old instance

2014-05-29 Thread Mark Walkom
If you want to keep the same cluster name, just add the new nodes to the
existing cluster, let it rebalance, then remove the olds nodes.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 29 May 2014 22:44, Didjit  wrote:

> Hi,
>
> I dug around and can't seem to find the answer. I have an existing instance
> of ELK running. I'm currently setting up a new instance on separate
> hardware. I want to copy the documents/indexes over to the new instance so
> I can preserve the history. Can someone point me to a doc or give some tips
> on how to accomplish this?
>
> Thank you,
>
> Didjit
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/c5481c18-4550-4e64-b069-9122da4643e4%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624b0X5PXETj-YsY0UAaFur1T-q0cyTuA3tAYeXeFR3WpBw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


How to delete all entries based on the contents of two fields

2014-05-29 Thread David Reagan
I imported a LOT of apache logs the other day. Via Logstash. 'Course, I 
messed up and didn't set the timestamp correctly. Now that I've figured out 
how to set the timestamp correctly, I want to remove the logs I imported.

For the life of me I can't figure it out.

I've been looking 
at 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/docs-delete-by-query.html#docs-delete-by-query
 
(Yes, I'm running 0.90.9) to figure out what to do, but I'm obviously 
missing something.

This is what I've tried so far:

curl -XDELETE 'http://node01.domain.tld:9200/logstash-2014.05.27/_query' -d 
> '{
> "query": {
> "filtered" : {
> "query" : {
> "query_string" : {
> "query" : "message:\"*subdomain.main.tld*\" AND 
> host:\"hostimportedon\""
> }
> }
> }
> }
> }
> '


the results:

{"ok":true,"_indices":{"logstash-2014.05.27":{"_shards":{"total":5,"successful":0,"failed":5


So, how would I delete something based on two criteria? The host field 
matches "hostimportedon" and the messaged field has "subdomain.main.tld" in 
it.

I have a total of 4 elasticsearch nodes.

Thanks!
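
One thing worth trying (a sketch, hedged: on 0.90 the delete-by-query body is,
per the 0.90 reference, the query itself rather than a search-style
{"query": ...} envelope, which could be why all five shards report failed).
The two criteria combined into a single bool query:

```python
import json

# Delete-by-query body combining both criteria; the field names and values
# come from the post above.  Note there is no top-level "query" wrapper
# here -- on 0.90 the body is the query itself.
body = {
    "bool": {
        "must": [
            {"query_string": {"default_field": "message",
                              "query": "*subdomain.main.tld*"}},
            {"term": {"host": "hostimportedon"}},
        ]
    }
}
print(json.dumps(body))
```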

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5fb3ec86-76b3-4536-9605-6774784f9d31%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Hide some system fields

2014-05-29 Thread Gail Long
You can use jq to selectively pull JSON values, arrays, and objects from 
elasticsearch returns.  It lets you filter out everything except the things 
you want and formats it decently as well.  The syntax takes a bit of 
getting used to but once you get it the product makes working with JSON 
output much nicer.

http://stedolan.github.io/jq/
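
If installing jq isn't an option, the same trimming can be done in a few lines
of Python on the decoded response (a sketch; the sample below is a cut-down
version of the response structure quoted underneath):

```python
import json

# A cut-down ES search response, matching the shape quoted in the question.
response_json = """
{"took": 2,
 "_shards": {"total": 2, "successful": 2, "failed": 0},
 "hits": {"total": 1, "max_score": 1.0,
          "hits": [{"_index": "testindex", "_type": "testmapping",
                    "_id": "mo7vQrWUTquBRowjq2AVkw", "_score": 1.0,
                    "_source": {"source": 5,
                                "user_id": "2fdfdf0fbbce603cf24c0eee7dabf28c"}}]}}
"""

# Keep only the documents themselves, dropping _shards/_index/_type/_id/_score.
docs = [hit["_source"] for hit in json.loads(response_json)["hits"]["hits"]]
print(json.dumps(docs))
```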

On Thursday, May 29, 2014 5:11:07 AM UTC-6, Сергей Шилов wrote:
>
> Hi all!
> I use elasticsearch in a high-load project where there is an urgent need to
> save traffic.
> I have a some queries like this:
>
> curl -XGET 
> 'http://localhost:9200/testindex/testmapping/_search?pretty&scroll=5m' -d 
> '{"from":0, "size":1000}'
>
> {
>   "_scroll_id" : 
> "cXVlcnlUaGVuRmV0Y2g7MjszMjp6TmdjNmxkM1NtV1NOeTl5X3dab1FnOzMxOnpOZ2M2bGQzU21XU055OXlfd1pvUWc7MDs=",
>   "took" : 2,
>   "timed_out" : false,
>   "_shards" : {
> "total" : 2,
> "successful" : 2,
> "failed" : 0
>   },
>   "hits" : {
> "total" : 15457332,
> "max_score" : 1.0,
> "hits" : [ {
>   "_index" : "testindex",
>   "_type" : "testmapping",
>   "_id" : "mo7vQrWUTquBRowjq2AVkw",
>   "_score" : 1.0, "_source" : 
> {"reffer_id":"","date":"2013-05-31T00:00:00","source":5,"user_id":"2fdfdf0fbbce603cf24c0eee7dabf28c"}
> }, ]
>   }
> }
>
>
> Can I exclude some system fields (like _shards.*, hits._index, hits._type, 
> hits._id, hits._score)? I found how exclude source fields, but not system.
> Also I need to get _timestamp field in _source rows. It generated from 
> 'date' field:
>
> '_timestamp' => array(
> 'enabled' => true,
> 'path' => 'date',
> 'format' => "-MM-dd'T'HH:mm:ss"
> )
>
>
> Thanks
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/73a11577-11c9-4931-a0cd-a55bf36b8739%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES and logstash are not working well with each other

2014-05-29 Thread Mark Walkom
Use the HTTP output and save yourself some trouble.

Also, upgrade ES :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 30 May 2014 01:38, David Pilato  wrote:

> Elasticsearch output is using es 1.x IIRC. You can not mix versions.
> You need to upgrade es or downgrade logstash or use elasticsearch HTTP
> output.
>
> My 2 cents
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 29 mai 2014 à 17:21, David Montgomery  a
> écrit :
>
> Before I give up on logstash... I now placed Elasticsearch on the same
> server using the below.
>
>
> input {
>   redis {
> host => "redis.queue.do.development.sf.test.com"
> data_type => "list"
> key => "logstash"
> codec => json
>   }
> }
>
>
> output {
> stdout { }
> elasticsearch {
> bind_host => "127.0.0.1"
> }
> }
>
>
> I get this error. I am using 1.4.1 for logstash and
> 0.90.9 for ES.
>
>
> '/usr/local/share/logstash-1.4.1/bin/logstash -f
> /usr/local/share/logstash.indexer.config
> Using milestone 2 input plugin 'redis'. This plugin should be stable, but
> if you see strange behavior, please let us know! For more information on
> plugin milestones, see http://logstash.net/docs/1.4.1/plugin-milestones
> {:level=>:warn}
> log4j, [2014-05-29T11:18:31.923]  WARN: org.elasticsearch.discovery:
> [logstash-do-logstash-sf-development-20140527082230-16645-2010] waited for
> 30s and no initial state was set by the discovery
> Exception in thread ">output"
> org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
> at
> org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(org/elasticsearch/action/support/master/TransportMasterNodeOperationAction.java:180)
> at
> org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(org/elasticsearch/cluster/service/InternalClusterService.java:492)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(java/lang/Thread.java:744)
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Thu, May 29, 2014 at 1:15 PM, David Montgomery <
> davidmontgom...@gmail.com> wrote:
>
>> PS...and how is this possible?  I feel so bad I bought the kindle
>> logstash book:(
>>
>>  I changed to host rather than bind host.  I mean..wow..I have ports
>> open.  See?  ufw status ===> 9200:9400/tcp  ALLOW
>> my.logstash.ipaddress
>>
>> I have ES running
>> service elasticsearch status
>>  * elasticsearch is running
>>
>>
>>
>>  /usr/local/share/logstash-1.4.1/bin/logstash -f
>> /usr/local/share/logstash.indexer.configUsing milestone 2 input plugin
>> 'redis'. This plugin should be stable, but if you see strange behavior,
>> please let us know! For more information on plugin milestones, see
>> http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
>> Exception in thread ">output"
>> org.elasticsearch.transport.BindTransportException: Failed to bind to
>> [9300-9400]
>> at
>> org.elasticsearch.transport.netty.NettyTransport.doStart(org/elasticsearch/transport/netty/NettyTransport.java:380)
>> at
>> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
>> at
>> org.elasticsearch.transport.TransportService.doStart(org/elasticsearch/transport/TransportService.java:92)
>> at
>> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
>> at
>> org.elasticsearch.node.internal.InternalNode.start(org/elasticsearch/node/internal/InternalNode.java:229)
>> at
>> org.elasticsearch.node.NodeBuilder.node(org/elasticsearch/node/NodeBuilder.java:166)
>> at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
>> at
>> RUBY.build_client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:198)
>> at
>> RUBY.client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:15)
>> at
>> RUBY.initialize(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:157)
>> at
>> RUBY.register(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch.rb:250)
>> at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
>> at
>> RUBY.outputworker(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:220)
>> at
>> RUBY.start_outputs(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:152)
>> at java.lang.Thread.run(java/lang/Thread.java:744)
>>
>>
>>
>> On Thu, May 29, 2014 at 12:43 PM, David Montgomery <
>> davidmontgom...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am rather concerned that ES is not working with the logstash indexer server.
>

Re: Red status unassigned shards help

2014-05-29 Thread Mark Walkom
Install a visual monitoring plugin like kopf and ElasticHQ, you will be
able to see which shards are unassigned.
However I think you may have replicas set, which, given you only have one
node, will always result in a yellow state as the cluster cannot assign
replicas to another node.

You should also upgrade ES to a newer version if you can :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 29 May 2014 23:45, Jason Weber  wrote:

> I rebooted several times and I believe its collecting the correct data
> now. I still show 520 unassigned shards, but its collecting all my logs
> now. Is this something I can use the redirect command for to assign it to a
> new index?
>
> Jason
>
> On Tuesday, May 27, 2014 11:39:49 AM UTC-4, Jason Weber wrote:
>>
>> Could someone walk me through getting my cluster up and running. Came in
>> from long weekend and my cluster was red status, I am showing a lot of
>> unassigned shards.
>>
>> jmweber@MIDLOG01:/var/log/logstash$ curl localhost:9200/_cluster/
>> health?pretty
>> {
>>   "cluster_name" : "midlogcluster",
>>   "status" : "red",
>>   "timed_out" : false,
>>   "number_of_nodes" : 2,
>>   "number_of_data_nodes" : 1,
>>   "active_primary_shards" : 512,
>>   "active_shards" : 512,
>>   "relocating_shards" : 0,
>>   "initializing_shards" : 0,
>>   "unassigned_shards" : 520
>> }
>>
>>
>> I am running ES 0.90.11
>>
>> LS and ES are on a single server, I only have 1 node, although it shows
>> 2, I get yellow status normally, it works fine with that. But I am only
>> collecting like 43 events per minute vs my usual 50K.
>>
>> I have seen several write ups but I seem to get a lot of no handler found
>> for uri statements when I try to run them.
>>
>> Thanks,
>> Jason
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624Y%2BPsF8a4C0mh-Jsi%3Dc6ogiXctAuA-Hn2oO6MVvv7SkBQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: OutOfMemory Exception on client Node

2014-05-29 Thread Mark Walkom
Then you need to add more nodes or more RAM to existing nodes or delete
some data, essentially you are hitting the limits of what you can add to
the cluster.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 30 May 2014 03:18, VB  wrote:

> These exceptions are happening on client node and not on data node. I do
> not think we need any restart of cluster.
>
> On Tuesday, 27 May 2014 14:03:29 UTC-7, VB wrote:
>>
>> Hi all,
>>
>> We are running 90.11 and we have a cluster with client, master and data
>> nodes.
>>
>> Our client nodes are using dedicated 10g memory.
>>
>> But we are seeing these outofmemory exceptions.
>>
>> I tried to correlate this log time with logs in our exception but I did
>> not find any query which we could be causing this issue.
>>
>> Our cluster has 37 indexes with 50 shards and 1 replica.
>>
>> Some indexes have data and some don't.
>>
>>
>>
>> [2014-05-27 16:26:34,688][WARN ][monitor.jvm  ] [BUS9364B62]
>> [gc][old][327409][8] duration [33s], collections [1]/[33.5s], total
>> [33s]/[3.8m], memory [9gb]->[9.4gb]/[9.9gb], all_pools {[young]
>> [520.2kb]->[112.3mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old]
>> [9gb]->[9.3gb]/[9.3gb]}
>> [2014-05-27 16:27:19,992][INFO ][cluster.service  ] [BUS9364B62]
>> detected_master [ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/
>> 10.76.121.130:9300]]{data=false, max_local_storage_nodes=1,
>> master=true}, added {[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/
>> 10.76.121.130:9300]]{data=false, max_local_storage_nodes=1,
>> master=true},}, reason: zen-disco-receive(from master [[ELS-10.76.121.130][
>> BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>> max_local_storage_nodes=1, master=true}])
>> [2014-05-27 16:27:20,008][INFO ][discovery.zen] [BUS9364B62]
>> master_left [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/
>> 10.76.121.130:9300]]{data=false, max_local_storage_nodes=1,
>> master=true}], reason [failed to perform initial connect
>> [[ELS-10.76.121.130][inet[/10.76.121.130:9300]] connect_timeout[30s]]]
>> [2014-05-27 16:27:20,008][WARN ][monitor.jvm  ] [BUS9364B62]
>> [gc][old][327410][9] duration [44.3s], collections [1]/[45.3s], total
>> [44.3s]/[4.6m], memory [9.4gb]->[9.8gb]/[9.9gb], all_pools {[young]
>> [112.3mb]->[475mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old]
>> [9.3gb]->[9.3gb]/[9.3gb]}
>> [2014-05-27 16:28:06,856][WARN ][cluster.service  ] [BUS9364B62]
>> failed to connect to node [[ELS-10.76.121.130][
>> BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>> max_local_storage_nodes=1, master=true}]
>> org.elasticsearch.transport.ConnectTransportException:
>> [ELS-10.76.121.130][inet[/10.76.121.130:9300]] connect_timeout[30s]
>> at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(
>> NettyTransport.java:727)
>>  at org.elasticsearch.transport.netty.NettyTransport.
>> connectToNode(NettyTransport.java:647)
>> at org.elasticsearch.transport.netty.NettyTransport.
>> connectToNode(NettyTransport.java:615)
>>  at org.elasticsearch.transport.TransportService.connectToNode(
>> TransportService.java:129)
>> at org.elasticsearch.cluster.service.InternalClusterService$
>> UpdateTask.run(InternalClusterService.java:396)
>>  at org.elasticsearch.common.util.concurrent.
>> PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(
>> PrioritizedEsThreadPoolExecutor.java:135)
>>  at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoolExecutor.java:615)
>>  at java.lang.Thread.run(Thread.java:744)
>> [2014-05-27 16:28:06,856][WARN ][monitor.jvm  ] [BUS9364B62]
>> [gc][old][327412][10] duration [45.3s], collections [1]/[45.8s], total
>> [45.3s]/[5.3m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young]
>> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] [41mb]->[33.2mb]/[66.5mb]}{[old]
>> [9.3gb]->[9.3gb]/[9.3gb]}
>> [2014-05-27 16:28:53,876][WARN ][monitor.jvm  ] [BUS9364B62]
>> [gc][old][327413][11] duration [46.4s], collections [1]/[47s], total
>> [46.4s]/[6.1m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young]
>> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] [33.2mb]->[54.7mb]/[66.5mb]}{[old]
>> [9.3gb]->[9.3gb]/[9.3gb]}
>> [2014-05-27 16:29:40,990][INFO ][discovery.zen] [BUS9364B62]
>> master_left [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/
>> 10.76.121.130:9300]]{data=false, max_local_storage_nodes=1,
>> master=true}], reason [failed to perform initial connect
>> [[ELS-10.76.121.130][inet[/10.76.121.130:9300]] connect_timeout[30s]]]
>> [2014-05-27 16:29:40,990][WARN ][monitor.jvm  ] [BUS9364B62]
>> [gc][old][327414][12] duration [46.8s], collections [1]/[47.1s], total
>> [46.8s]/[6.9m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young]
>> [532.5mb]->[532.5mb]/[532.5mb]}{[survi

Re: Elasticsearch and Smile encoded JSON

2014-05-29 Thread joergpra...@gmail.com
1. No (the cluster state of ES - not part of Lucene -  is saved to disk in
SMILE format)

2. No.

3. Yes, you can use SMILE on XContentBuilder classes. The result can be
transported to the cluster; the decoding of SMILE is done transparently.

Because the transport is LZF-compressed by default, you should consider
whether using SMILE is worth it. SMILE is a compressed JSON format, but I
don't have numbers showing any advantage over plain JSON with LZF
compression (I doubt that SMILE is better).

Also note that there is CBOR in the latest ES releases, which seems superior
to SMILE in many aspects (compactness, speed, standardization in RFC 7049):

https://github.com/elasticsearch/elasticsearch/pull/5509

Jörg
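To illustrate the point that a compressed transport already shrinks verbose
JSON considerably, here is a rough stdlib sketch (zlib is used as a stand-in
for LZF, which is not in the Python standard library; this is an
illustration, not a benchmark):

```python
import json
import zlib

# A verbose JSON document with lots of repetition, as log-style data tends to have.
doc = json.dumps({"tags": ["elasticsearch"] * 100}).encode("utf-8")

packed = zlib.compress(doc)  # zlib standing in for the transport's LZF

# The compressed transport already removes most of JSON's redundancy,
# which is why a binary encoding like SMILE may not buy much on top.
assert len(packed) < len(doc)
```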



On Thu, May 29, 2014 at 7:07 AM, Drew Kutcharian  wrote:

> Hey Guys,
>
> I wanted to get some clarification on how Elasticsearch handles/uses Smile
> binary JSON. Mainly:
>
> 1. Does ES convert JSON to Smile before saving into Lucene?
>
> 2. Does ES use Smile as the wire protocol for the Java Client?
>
> 3. If I wanted to have everything in Smile format (What's stored in
> Lucene, fieldata, and the communication between server and client) how
> should I do it? Should I just set the "source" to Smile byte array using
> the Java Client?
>
> Note that I use the Java Client and don't really use the REST API, except
> for debugging.
>
> Best,
>
> Drew
>



Re: Hide some system fields

2014-05-29 Thread Nik Everett
Source filtering is similar but normally returns nicer results. 
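Since the per-hit metadata cannot be suppressed server-side, one workaround
is to strip it client-side after the response arrives. A minimal sketch in
Python, assuming the standard search response shape (the values and the
`sources_only` helper are illustrative):

```python
# Raw search response shape as returned by Elasticsearch (values are illustrative).
response = {
    "took": 2,
    "_shards": {"total": 2, "successful": 2, "failed": 0},
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "testindex", "_type": "testmapping", "_id": "a1",
             "_score": 1.0, "_source": {"date": "2013-05-31T00:00:00"}},
            {"_index": "testindex", "_type": "testmapping", "_id": "a2",
             "_score": 1.0, "_source": {"date": "2013-06-01T00:00:00"}},
        ],
    },
}

def sources_only(response):
    """Drop the per-request and per-hit metadata, keeping just the documents."""
    return [hit["_source"] for hit in response["hits"]["hits"]]

docs = sources_only(response)
assert docs == [{"date": "2013-05-31T00:00:00"}, {"date": "2013-06-01T00:00:00"}]
```

This saves bandwidth between the application and its consumers, though not
between Elasticsearch and the application itself.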

Sent from my iPhone

> On May 29, 2014, at 8:29 AM, Florentin Zorca  wrote:
> 
> Hi,
> 
> try using fields to specify only the fields you are interested in:
> 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html#search-request-fields
> 
> Your query should then be something like this:
> {"fields":["date"], "from":0, "size":1000}
> Kind Regards,
> Florentin
> 
> Am Donnerstag, 29. Mai 2014 13:11:07 UTC+2 schrieb Сергей Шилов:
>> 
>> Hi all!
>> I use elasticsearch in a high-load project where there is an urgent need
>> to save traffic.
>> I have some queries like this:
>> curl -XGET 
>> 'http://localhost:9200/testindex/testmapping/_search?pretty&scroll=5m' -d 
>> '{"from":0, "size":1000}'
>> 
>> {
>>   "_scroll_id" : 
>> "cXVlcnlUaGVuRmV0Y2g7MjszMjp6TmdjNmxkM1NtV1NOeTl5X3dab1FnOzMxOnpOZ2M2bGQzU21XU055OXlfd1pvUWc7MDs=",
>>   "took" : 2,
>>   "timed_out" : false,
>>   "_shards" : {
>> "total" : 2,
>> "successful" : 2,
>> "failed" : 0
>>   },
>>   "hits" : {
>> "total" : 15457332,
>> "max_score" : 1.0,
>> "hits" : [ {
>>   "_index" : "testindex",
>>   "_type" : "testmapping",
>>   "_id" : "mo7vQrWUTquBRowjq2AVkw",
>>   "_score" : 1.0, "_source" : 
>> {"reffer_id":"","date":"2013-05-31T00:00:00","source":5,"user_id":"2fdfdf0fbbce603cf24c0eee7dabf28c"}
>> }, ]
>>   }
>> }
>> 
>> Can I exclude some system fields (like _shards.*, hits._index, hits._type,
>> hits._id, hits._score)? I found how to exclude source fields, but not the
>> system ones.
>> Also I need to get the _timestamp field in the _source rows. It is
>> generated from the 'date' field:
>> '_timestamp' => array(
>> 'enabled' => true,
>> 'path' => 'date',
>> 'format' => "-MM-dd'T'HH:mm:ss"
>> )
>> 
>> Thanks
> 



Re: Hide some system fields

2014-05-29 Thread Ivan Brusic
I do not think you can exclude the search metadata fields.

-- 
Ivan


On Thu, May 29, 2014 at 5:29 AM, Florentin Zorca  wrote:

> Hi,
>
> try using fields to specify only the fields you are interested in:
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html#search-request-fields
>
> Your query should then be something like this:
>
> {"fields":["date"], "from":0, "size":1000}
>
> Kind Regards,
>
> Florentin
>
>
> Am Donnerstag, 29. Mai 2014 13:11:07 UTC+2 schrieb Сергей Шилов:
>
>> Hi all!
>> I use elasticsearch in a high-load project where there is an urgent need
>> to save traffic.
>> I have some queries like this:
>>
>> curl -XGET 
>> 'http://localhost:9200/testindex/testmapping/_search?pretty&scroll=5m' -d 
>> '{"from":0, "size":1000}'
>>
>> {
>>   "_scroll_id" : 
>> "cXVlcnlUaGVuRmV0Y2g7MjszMjp6TmdjNmxkM1NtV1NOeTl5X3dab1FnOzMxOnpOZ2M2bGQzU21XU055OXlfd1pvUWc7MDs=",
>>   "took" : 2,
>>   "timed_out" : false,
>>   "_shards" : {
>> "total" : 2,
>> "successful" : 2,
>> "failed" : 0
>>   },
>>   "hits" : {
>> "total" : 15457332,
>> "max_score" : 1.0,
>> "hits" : [ {
>>   "_index" : "testindex",
>>   "_type" : "testmapping",
>>   "_id" : "mo7vQrWUTquBRowjq2AVkw",
>>   "_score" : 1.0, "_source" : 
>> {"reffer_id":"","date":"2013-05-31T00:00:00","source":5,"user_id":"2fdfdf0fbbce603cf24c0eee7dabf28c"}
>> }, ]
>>   }
>> }
>>
>>
>> Can I exclude some system fields (like _shards.*, hits._index,
>> hits._type, hits._id, hits._score)? I found how to exclude source fields,
>> but not the system ones.
>> Also I need to get the _timestamp field in the _source rows. It is
>> generated from the 'date' field:
>>
>> '_timestamp' => array(
>> 'enabled' => true,
>> 'path' => 'date',
>> 'format' => "-MM-dd'T'HH:mm:ss"
>> )
>>
>>
>> Thanks
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/16245f90-64fc-4a39-adf4-71663e91d950%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



encoding is longer than the max length 32766

2014-05-29 Thread Jeff Dupont
 

We’re running into a peculiar issue when updating indexes with content for 
the document.


"document contains at least one immense term in (whose utf8 encoding is 
longer than the max length 32766), all of which were skipped. please 
correct the analyzer to not produce such terms"


I’m hoping that there’s a simple fix or setting that can resolve this.
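Until the analyzer is fixed (for example with `ignore_above` on not_analyzed
string fields, or a truncate token filter), a client-side guard can trim
values before indexing. A sketch assuming Lucene's 32766-byte term limit; the
`truncate_utf8` helper name is made up for illustration:

```python
MAX_BYTES = 32766  # Lucene's hard limit on a single indexed term's UTF-8 length

def truncate_utf8(value, max_bytes=MAX_BYTES):
    """Trim a string so its UTF-8 encoding fits within max_bytes,
    without splitting a multi-byte character at the cut point."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # Decode the prefix back, dropping any partial trailing character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

huge = "é" * 20000            # 40,000 bytes once UTF-8 encoded
safe = truncate_utf8(huge)
assert len(safe.encode("utf-8")) <= MAX_BYTES
```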



Re: Interpolation of discovery.zen.ping.unicast.hosts

2014-05-29 Thread InquiringMind
I believe that the host names must be comma-separated and no quotes or 
other punctuation should be present. The config within the environment 
variable or java -D (system properties) is YAML-like, not JSON. No blanks 
between the names, for example:

export ES_HOSTS=host1,host2,host3

When configuring Unicast, here are the java -D options I use in my own 
custom Elasticsearch start-up / shutdown / status script:

-Des.discovery.zen.ping.multicast.enabled=false 
-Des.discovery.zen.ping.unicast.hosts=host1,host2,host3 
-Des.discovery.zen.minimum_master_nodes=2

Brian
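The join rule above can be sketched as a small helper (the function name is
hypothetical; it just makes the no-blanks, no-quotes constraint concrete):

```python
def format_unicast_hosts(hosts):
    """Build the value for discovery.zen.ping.unicast.hosts:
    comma-separated, no blanks, no quotes or other punctuation."""
    cleaned = [h.strip() for h in hosts]
    bad = [h for h in cleaned if not h or "," in h or " " in h or '"' in h]
    if bad:
        raise ValueError("invalid host names: %r" % bad)
    return ",".join(cleaned)

assert format_unicast_hosts(["host1", " host2", "host3"]) == "host1,host2,host3"
```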



Re: Adding NGram to language analyzer

2014-05-29 Thread Panagiotis Nikitopoulos
I have the exact same problem with the Greek language.
Have you figured out how to solve it?
Thanks!

On Monday, July 29, 2013 9:42:09 AM UTC+3, Ido Shamun wrote:
>
> Hi,
> I use the built-in Arabic analyzer to index my Arabic text.
> I want to add auto complete feature to my search, so I thought about adding
> NGram filter.
> Is it possible to extend existing analyzer?
> If no, what is the configuration of the Arabic analyzer?
>
> Thanks!
>



Re: Elasticsearch and Smile encoded JSON

2014-05-29 Thread InquiringMind
Drew,

This may not help you, but it's based on my own experience.

Using the Java API (I also don't use the REST API except for exploration 
and problem reporting), I just use the JSON string as the source for every 
document.

For one of my applications (more generic), I have my own generic Record 
object that stores a mapping of field name to one or more field values. I 
then use the JSON stream parser to set it, and the XContentBuilder to 
generate it. Very quick, and very generic.

For even faster processing, I include the Jackson 2.0 libraries and then 
use the Data Binding model to serialize a Java object to JSON and 
deserialize back into a Java object. This is not as generic, but it's 
application-specific and easy to adapt to enhancements or new applications. 
To measure the performance, I created a test driver that performed the 
following 4 steps:

1. Serialize a moderately complex Java object into JSON.
2. Publish the JSON string to an LMAX Disruptor ring buffer.
3. Consume the JSON string from the LMAX Disruptor ring buffer.
4. Deserialize the JSON string back into a Java object.

Steps 1-4, inclusive were performed on two threads on my i7 MacBook at 2 
million per second. So I have no worries at all about performance when 
using JSON! Therefore, I am smiling already and have never felt the need to 
SMILE more. :-)

Brian
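For anyone who wants to try the same round-trip outside Java, here is a
minimal Python sketch of steps 1-4, with a plain queue standing in for the
Disruptor ring buffer (the record is made up; this illustrates the
serialize/publish/consume/deserialize cycle, not Jackson data binding):

```python
import json
from queue import Queue

record = {"user_id": "u1", "date": "2013-05-31T00:00:00", "source": 5}

ring = Queue()                     # stand-in for the LMAX Disruptor ring buffer
ring.put(json.dumps(record))       # steps 1-2: serialize and publish
restored = json.loads(ring.get())  # steps 3-4: consume and deserialize

assert restored == record          # the round-trip is lossless
```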



Re: [ANN] Elasticsearch experimental highlighter

2014-05-29 Thread Bruce Ritchie
Hi Nikolas,

I'm likely to test this in the next couple of weeks (I'm still on 0.90.9),
but I have a question about performance. Does 'it's pretty quick' mean
performance comparable to the postings highlighter or the fast vector
highlighter, or just quick enough for your use case?

The reason why I'm asking is because highlighting performance is the 
largest issue I face currently. Our documents have hundreds of very short 
fields (well over a thousand if you count the sub fields in a multi-field 
field) and listing every field/sub field to highlight causes queries to be 
10-20x slower than highlighting just a single field (100ms -> 2100ms for 
example). I can't use the _all field because I need to know the actual 
field that was highlighted and only the fvh highlighter returns the high 
quality results we need. I'm actually toying with the idea of doing a 
two-phase search: the first phase highlights only a few fields that commonly 
hit, and a second phase searches only the remaining hits that didn't 
highlight on the first pass. That approach may work, but I'd rather just 
have a highlighter that was faster :) 


All the best,

Bruce Ritchie



On Thursday, April 10, 2014 4:04:57 PM UTC-4, Nikolas Everett wrote:
>
> I've been working on a new highlighter on and off for a few weeks and I'd 
> love for other folks to try it out: 
> https://github.com/wikimedia/search-highlighter
>
> You should try it because:
> 1.  Its pretty quick.
> 2.  It supports many of the features of the other highlighters and lets 
> you combine them in new ways.
> 3.  Has a few tricks that no other highlighter has.
> 4.  It doesn't require that you store any extra data information but will 
> use what it can to speed itself up.
>
> I've installed it on our beta site so you can see it in action without
> installing it.
>
> Let me expand on my list above:
> It doesn't require any extra data and is nice and fast that way for short 
> fields.  Once fields get longer [0] reanalyzing them starts to take too 
> long so it is best to store offsets in the postings just like the postings 
> highlighter.  It can use term vectors the same way that the fast vector 
> highlighter can but that is slower than postings and takes up more space.
>
> It supports three fragmenters: one that mimics the postings highlighter, 
> one that mimics the fast vector highlighter, and one that always highlights 
> the whole value.
>
> It supports matched_fields, no_match_size, and most everything else in the 
> highlight api.  It doesn't support require_field_match though.
>
> It adds a handful of tricks, like returning the top-scoring snippets in 
> document order and weighting terms that appear early in the document more 
> heavily.  Nothing difficult, but still cute tricks.  It's reasonably easy to 
> implement new tricks, so if you have any ideas I'd love to hear them.
>
> I don't think it is really ready for production usage yet but I'd like to 
> get there in a week or two.
>
> Thanks for reading,
>
> Nik
>
> [0]: I haven't done the measurements to figure out how long the field has 
> to be before it is faster to use postings than to reanalyze it.  I did the 
> math a few months ago for how long the field has to be before vectors 
> become faster.  It was a couple of KB for my analysis chain but I'm not 
> sure any of that holds true for this highlighter.  It could be more or less.
>  



Re: Implicit Custom Filter?

2014-05-29 Thread W Shaib
Thanks for the quick reply. Unless I am missing something, these 
suggestions do not quite do what I need...

1) Alias filters would not work since the filters associated with them are 
static, whereas I need a dynamic filter (one with params, so each query 
will include the specific values to filter for, depending on who is 
querying).
2) Not sure how templates work for this... an elaboration would be 
appreciated.

On Thursday, 29 May 2014 14:16:39 UTC-4, Ivan Brusic wrote:
>
> Two options come to mind:
>
> 1) Filtered aliases: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html#filtered
>
> 2) Search template and Template queries: 
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.html
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-template-query.html
>
> The last feature is new and I have yet to try it out.
>
> Cheers,
>
> Ivan
>
>
>
>
On Thu, May 29, 2014 at 11:09 AM, W Shaib wrote:
>
>> I am trying to set up document-level security for my index. The documents 
>> have fields which will be filtered on to enforce access permissions. 
>>
>> My question is: given a query, is it possible to set things up so that ES 
>> will invoke a custom script filter on *every* clause in said query without 
>> having to munge the query myself to insert the filter explicitly? 
>>
>> For example, if a query is: 
>>
>> filtered: { 
>> query: { 
>>term: { foo: "bar" } 
>> }, 
>> filter: { 
>>has_parent: { 
>>   type: "some_type", 
>>   query: { 
>>  term: { blah: "xyz" } 
>>   } 
>>   } 
>>} 
>> } 
>>
>> then, I would want my custom filter invoked (implicitly) on both term 
>> queries above. 
>> Is there an alternative to doing the above without preprocessing the 
>> query and explicitly inserting my custom filter everywhere? 
>>  
>
>



Re: Implicit Custom Filter?

2014-05-29 Thread Ivan Brusic
Two options come to mind:

1) Filtered aliases:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html#filtered

2) Search template and Template queries:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-template-query.html

The last feature is new and I have yet to try it out.

Cheers,

Ivan




On Thu, May 29, 2014 at 11:09 AM, W Shaib  wrote:

> I am trying to set up document-level security for my index. The documents
> have fields which will be filtered on to enforce access permissions.
>
> My question is: given a query, is it possible to set things up so that ES
> will invoke a custom script filter on *every* clause in said query without
> having to munge the query myself to insert the filter explicitly?
>
> For example, if a query is:
>
> filtered: {
> query: {
>term: { foo: "bar" }
> },
> filter: {
>has_parent: {
>   type: "some_type",
>   query: {
>  term: { blah: "xyz" }
>   }
>   }
>}
> }
>
> then, I would want my custom filter invoked (implicitly) on both term
> queries above.
> Is there an alternative to doing the above without preprocessing the query
> and explicitly inserting my custom filter everywhere?
>



Implicit Custom Filter?

2014-05-29 Thread W Shaib
I am trying to set up document-level security for my index. The documents 
have fields which will be filtered on to enforce access permissions. 

My question is: given a query, is it possible to set things up so that ES 
will invoke a custom script filter on *every* clause in said query without 
having to munge the query myself to insert the filter explicitly? 

For example, if a query is: 

filtered: { 
query: { 
   term: { foo: "bar" } 
}, 
filter: { 
   has_parent: { 
  type: "some_type", 
  query: { 
 term: { blah: "xyz" } 
  } 
  } 
   } 
} 

then, I would want my custom filter invoked (implicitly) on both term 
queries above. 
Is there an alternative to doing the above without preprocessing the query 
and explicitly inserting my custom filter everywhere? 



Weighted random sampling and score normalization

2014-05-29 Thread Dhruv Garg
Hey all,

I am using a function score to compute a score for my documents. I'd like 
to now sample from these elements at in a weighted fashion. So if my 
function_score returns:

d1: s = 1
d2: s = 2
d3: s = 3

I'd like to see d3 show up as the first result 50% of the time, d2 show up 
as first result 33% of the time, and d3 show up as the first result 16% of 
the time.

Any ideas on how to implement such a scheme? I couldn't find a built in 
function to achieve this, but I may be wrong.

I think a pre-requisite to this is having the ability to normalize the 
scores that come out of my function_score query. Any tips on how to do that 
would also be useful. 

Thanks!
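There is no built-in weighted-sampling function I know of in that version,
but one client-side approach is to normalize the returned scores and re-draw
the first result with probability proportional to its score. A sketch in
Python (the names are made up; this re-ranks a fetched result page rather
than sampling inside ES itself):

```python
import random

def weighted_first(docs_with_scores, rng):
    """Return one doc id, picked with probability proportional to its score."""
    docs, scores = zip(*docs_with_scores)
    total = float(sum(scores))
    weights = [s / total for s in scores]  # normalize so the weights sum to 1
    return rng.choices(docs, weights=weights, k=1)[0]

rng = random.Random(42)
scored = [("d1", 1), ("d2", 2), ("d3", 3)]
counts = {"d1": 0, "d2": 0, "d3": 0}
for _ in range(60000):
    counts[weighted_first(scored, rng)] += 1
# counts approximate the 1/6 : 2/6 : 3/6 split described above
```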



ElasticSearch: search document pairs by specific query?

2014-05-29 Thread Alexey Shevchenko


How to search for the document *pairs* that matched a specific query (in my 
case it is a more_like_this query) and *order* them using a specific 
field(s)?

Probably *pairs* can be formed using aggregations, but I can't figure out 
how.

Also, even if we can form such aggregated pairs, I have no idea how to 
*order* them.
Question is posted to stackoverflow, will appreciate if you answer there:
http://stackoverflow.com/questions/23938975/elasticsearch-search-document-pairs-by-specific-query

Thanks.



significant terms aggregation too slow for me

2014-05-29 Thread Srinivasan Ramaswamy
I am trying to use the significant terms aggregation feature, but it's 
making the search very slow. Is there any optimization that I can do to 
make it faster? I have an index with 24 shards and 1 replica, where each 
shard size is 2.5G. With the significant terms feature turned on, many 
searches take ~5s (even when the same search is repeated); with this 
feature disabled, it takes only ~150ms.

I am using it like the following 

SearchRequestBuilder srb = ...;
SignificantTermsBuilder tags = 
significantTerms("st_name").field("tags").size(11);
srb.addAggregation(tags);


Does anyone have any hints on how to optimize this feature? Is there some 
level of caching involved? If there is, it shouldn't take ~5s when the same 
query is executed again and again, should it?

Thanks
Srini
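
For reference, a hedged sketch of the equivalent request in JSON form, with the shard_size knob that significant_terms exposes to control how many candidate terms each shard returns; whether tuning it helps here depends on the data. The field and aggregation names are taken from the Java snippet above; the index name in the comment and the shard_size value are placeholders:

```shell
# Hedged sketch: JSON body for a significant_terms aggregation with an
# explicit shard_size (agg/field names from the Java snippet above).
cat <<'EOF' > /tmp/sig_terms.json
{
  "aggs": {
    "st_name": {
      "significant_terms": {
        "field": "tags",
        "size": 11,
        "shard_size": 100
      }
    }
  }
}
EOF
# Against a live cluster one might run, e.g.:
#   curl -s 'localhost:9200/myindex/_search?search_type=count' -d @/tmp/sig_terms.json
# Validate the body locally (no server needed):
python -c 'import json; print(json.load(open("/tmp/sig_terms.json"))["aggs"]["st_name"]["significant_terms"]["field"])'
```

Using `search_type=count` avoids fetching hits when only the aggregation result is needed.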

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/500cd549-cb72-4409-a93b-33789fd18fbe%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ClassCastException on Sort

2014-05-29 Thread VB
Can anyone please provide us some insight into this, especially someone from the 
ES core dev team? This is a bug in the Elasticsearch code. 

On Tuesday, 27 May 2014 14:55:18 UTC-7, VB wrote:
>
> We are using 0.90.11 and we have a use case with the following type:
>
>- accountsearch: {
>   - dynamic: strict
>   - properties: {
>  - Name: {
> - index: not_analyzed
> - norms: {
>- enabled: false
> }
> - index_options: docs
> - type: string
>  }
>  - Description: {
> - index: not_analyzed
> - norms: {
>- enabled: false
> }
> - index_options: docs
> - type: string
>  }
>  - TransactionIdList: {
> - properties: {
>- transactionId: {
>   - type: long
>}
> }
> - type: nested
>  }
>  - Number: {
> - index: not_analyzed
> - norms: {
>- enabled: false
> }
> - index_options: docs
> - type: string
>  }
>  - CreateUserId: {
> - type: integer
>  }
>  - Id: {
> - type: long
>  }
>  - ExposedPartyName: {
> - index: not_analyzed
> - norms: {
>- enabled: false
> }
> - index_options: docs
> - type: string
>  }
>  - RiskItemCount: {
> - type: long
>  }
>  - SourceId: {
> - index: not_analyzed
> - norms: {
>- enabled: false
> }
> - index_options: docs
> - type: string
>  }
>  - ContractCount: {
> - type: long
>  }
>   }
>}
>
>
> Intermittently we get a ClassCastException (converting Long to String) when we 
> run the following query. When we remove the sort from the query it works fine.  
> I checked all my other types; none of them has a field "Id" with a String data type. 
>
> {
>   "sort": "Id",
>   "query": {
> "bool": {
>   "must": [
> {
>   "range": {
> "Id": {
>   "gt": "0"
> }
>   }
> }
>   ]
> }
>   }
> }
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/52122b0a-9304-4e65-b212-11cb3cfc43c2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: OutOfMemory Exception on client Node

2014-05-29 Thread VB
These exceptions are happening on the client node, not on a data node. I do 
not think we need to restart the cluster.

On Tuesday, 27 May 2014 14:03:29 UTC-7, VB wrote:
>
> Hi all,
>
> We are running 0.90.11 and we have a cluster with client, master, and data 
> nodes.
>
> Our client nodes are using dedicated 10g memory.
>
> But we are seeing these outofmemory exceptions. 
>
> I tried to correlate the timestamps in this log with our exception logs, but 
> I did not find any query that could be causing this issue.
>
> Our cluster has 37 indexes with 50 shards and 1 replica. 
>
> Some indexes have data and some don't.
>
>
>
> [2014-05-27 16:26:34,688][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327409][8] duration [33s], collections [1]/[33.5s], total 
> [33s]/[3.8m], memory [9gb]->[9.4gb]/[9.9gb], all_pools {[young] 
> [520.2kb]->[112.3mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] 
> [9gb]->[9.3gb]/[9.3gb]}
> [2014-05-27 16:27:19,992][INFO ][cluster.service  ] [BUS9364B62] 
> detected_master 
> [ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true}, added 
> {[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true},}, reason: zen-disco-receive(from 
> master 
> [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true}])
> [2014-05-27 16:27:20,008][INFO ][discovery.zen] [BUS9364B62] 
> master_left 
> [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true}], reason [failed to perform initial 
> connect [[ELS-10.76.121.130][inet[/10.76.121.130:9300]] 
> connect_timeout[30s]]]
> [2014-05-27 16:27:20,008][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327410][9] duration [44.3s], collections [1]/[45.3s], total 
> [44.3s]/[4.6m], memory [9.4gb]->[9.8gb]/[9.9gb], all_pools {[young] 
> [112.3mb]->[475mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] 
> [9.3gb]->[9.3gb]/[9.3gb]}
> [2014-05-27 16:28:06,856][WARN ][cluster.service  ] [BUS9364B62] 
> failed to connect to node 
> [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true}]
> org.elasticsearch.transport.ConnectTransportException: 
> [ELS-10.76.121.130][inet[/10.76.121.130:9300]] connect_timeout[30s]
> at 
> org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:727)
> at 
> org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:647)
> at 
> org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:615)
> at 
> org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
> at 
> org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:396)
> at 
> org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> [2014-05-27 16:28:06,856][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327412][10] duration [45.3s], collections [1]/[45.8s], total 
> [45.3s]/[5.3m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young] 
> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] [41mb]->[33.2mb]/[66.5mb]}{[old] 
> [9.3gb]->[9.3gb]/[9.3gb]}
> [2014-05-27 16:28:53,876][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327413][11] duration [46.4s], collections [1]/[47s], total 
> [46.4s]/[6.1m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young] 
> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] 
> [33.2mb]->[54.7mb]/[66.5mb]}{[old] [9.3gb]->[9.3gb]/[9.3gb]}
> [2014-05-27 16:29:40,990][INFO ][discovery.zen] [BUS9364B62] 
> master_left 
> [[ELS-10.76.121.130][BlGygpFmRn6uQNbgiEfl0A][inet[/10.76.121.130:9300]]{data=false,
>  
> max_local_storage_nodes=1, master=true}], reason [failed to perform initial 
> connect [[ELS-10.76.121.130][inet[/10.76.121.130:9300]] 
> connect_timeout[30s]]]
> [2014-05-27 16:29:40,990][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327414][12] duration [46.8s], collections [1]/[47.1s], total 
> [46.8s]/[6.9m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young] 
> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] 
> [54.7mb]->[65.5mb]/[66.5mb]}{[old] [9.3gb]->[9.3gb]/[9.3gb]}
> [2014-05-27 16:30:27,589][WARN ][monitor.jvm  ] [BUS9364B62] 
> [gc][old][327415][13] duration [46.5s], collections [1]/[46.5s], total 
> [46.5s]/[7.7m], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young] 
> [532.5mb]->[532.5mb]/[532.5mb]}{[survivor] 
> [65.5mb]->[66.4mb]

Re: How to run example of context suggester in elasticsearch doc?

2014-05-29 Thread Cédric Warny
I downloaded Elasticsearch 1.2.0 and have Java 1.7.0_25. I still get the 
same error as Johnson. Is it because of the Java version or is it because I 
still have Elasticsearch 1.1.1 in another folder or is it for some other 
reason?
Thanks in advance for your help.

Cedric
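
One quick check worth doing (a suggestion, not from the thread): ask the node you are actually talking to which version it runs, since the root endpoint reports it. The response below is a canned illustration of the shape, not real output:

```shell
# With a live node one would run:
#   curl -s 'localhost:9200/'
# The response carries the running version; canned example for illustration:
cat <<'EOF' > /tmp/es_root.json
{ "status": 200, "name": "SomeNode", "version": { "number": "1.2.0" } }
EOF
# Extract the version number from the (canned) response:
python -c 'import json; print(json.load(open("/tmp/es_root.json"))["version"]["number"])'
```

If the node reports 1.1.1, the old installation in the other folder is the one actually running, which would explain the `Unknown field [context]` error.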

On Tuesday, April 15, 2014 at 06:40:36 UTC-4, Adrien Grand wrote:
>
> Hi,
>
> The context suggester will be available in Elasticsearch 1.2.0, it has not 
> been released yet. (see note at the top of the documentation)
>
>
> On Tue, Apr 15, 2014 at 8:14 AM, > wrote:
>
>> I am learning Elasticsearch, and I want to run the context suggester 
>> example following the doc: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/suggester-context.html
>>
>> But I always get the error when I create the mapping, 
>>
>> error: {"error":"MapperParsingException[Unknown field 
>> [context]]","status":400}
>>
>> my dsl:
>> curl -X PUT localhost:9200/sale
>> curl -X PUT "localhost:9200/sale/product/_mapping" -d '
>> {
>> "product": {
>> "properties": {
>> "name": {
>> "type" : "string"
>> },
>> "tag": {
>> "type" : "string"
>> },"colorField": {"type":"string" }, 
>> "suggest": {
>> "type": "completion",
>> "context": {
>> "color": { 
>> "type": "category",
>> "path": "colorField",
>> "default": ["red", "green", "blue"]
>> },
>> "location": { 
>> "type": "geo",
>> "precision": "5m",
>> "neighbors": true,
>> "default": "u33"
>> }
>> }
>> }
>> }
>> }
>> }'
>>
>> does any one know the reason?
>>
>> thanks
>> Johnson
>>  
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/d6a1b427-08ab-41a3-88f5-74d563bc6946%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Adrien Grand
>  

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cf1de8ad-4061-41bb-83a0-4b6390b99d49%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Search issue with snowball stemmer

2014-05-29 Thread Ivan Brusic
You should use the Analyze API to ensure that the tokens you are producing
are correct:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-analyze.html

-- 
Ivan
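
For example, a hedged sketch using the index and analyzer names from the mapping quoted below; the response shown is a hand-written illustration of the shape to expect, not real output:

```shell
# Against a live node one would run:
#   curl -s 'localhost:9200/some_content/_analyze?analyzer=en_analyser' \
#        -d 'Some sampling text formatted for text data'
# With a snowball filter in the chain, "sampling" should come back stemmed
# to "sampl"; a canned response of roughly this shape:
cat <<'EOF' > /tmp/analyze_sample.json
{ "tokens": [ { "token": "Some" }, { "token": "sampl" }, { "token": "text" } ] }
EOF
# Pull out the stemmed token from the (canned) response:
python -c 'import json; print(json.load(open("/tmp/analyze_sample.json"))["tokens"][1]["token"])'
```

Comparing the tokens stored at index time with the terms produced at query time usually explains why one request form matches and another does not.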



On Thu, May 29, 2014 at 7:13 AM, Александр Шаманов wrote:

> Hello everyone,
>
> I have the following index mapping:
>
> 
> curl -XPUT 'http://localhost:9200/some_content/' -d '
> {
>"settings":{
>   "query_string":{
>  "default_con":"content",
>  "default_operator":"AND"
>   },
>   "index":{
>  "analysis":{
> "analyzer":{
>"en_analyser":{
>   "filter":[
>  "snowBallFilter"
>   ],
>   "type":"custom",
>   "tokenizer":"standard"
>}
> },
> "filter":{
>"en_stopFilter":{
>   "type":"stop",
>   "stopwords_path":"lang/stopwords_en.txt"
>},
>"snowBallFilter":{
>   "type":"snowball",
>   "language":"English"
>},
>"wordDelimiterFilter":{
>   "catenate_all":false,
>   "catenate_words":true,
>   "catenate_numbers":true,
>   "generate_word_parts":true,
>   "generate_number_parts":true,
>   "preserve_original":true,
>   "type":"word_delimiter",
>   "split_on_case_change":true
>},
>"en_synonymFilter":{
>   "synonyms_path":"lang/synonyms_en.txt",
>   "ignore_case":true,
>   "type":"synonym",
>   "expand":false
>},
>"lengthFilter":{
>   "max":250,
>   "type":"length",
>   "min":3
>}
> }
>  }
>   }
>},
>"mappings":{
>   "docs":{
>  "_source":{
> "enabled":false
>  },
>  "analyzer":"en_analyser",
>  "properties":{
>  "content":{
> "type":"string",
> "index":"analyzed",
> "term_vector":"with_positions_offsets",
> "omit_norms":"true"
>  }
>  }
>   }
>}
> }'
>
> and I indexed the following document:
>
> curl -XPOST http://localhost:9200/some_content/docs/ -d '
> {
>   "content" : "Some sampling text formatted for text data"
> }'
>
> When I make this request:
> http://epbyvitw0052:9200/some_content/docs/_search?q=sampling
>
> I get this result:
> 
> {
> "took": 1,
> "timed_out": false,
> "_shards": {
> "total": 1,
> "successful": 1,
> "failed": 0
> },
> "hits": {
> "total": 1,
> "max_score": 0.095891505,
> "hits": [
> {
> "_index": "some_content",
> "_type": "docs",
> "_id": "saLfx6PYR82YR69je0JbAA",
> "_score": 0.095891505
> }
> ]
> }
> }
> 
>
> but when I send the request without the type:
> http://epbyvitw0052:9200/some_content/_search?q=sampling
>
> then I get nothing:
> 
> {
> "took": 1,
> "timed_out": false,
> "_shards": {
> "total": 1,
> "successful": 1,
> "failed": 0
> },
> "hits": {
> "total": 0,
> "max_score": null,
> "hits": []
> }
> }
> 
>
> However, I can make the request with the stemmed term:
> http://epbyvitw0052:9200/some_content/_search?q=sampl
>
> and the system finds it:
> 
> {
> "took": 1,
> "timed_out": false,
> "_shards": {
> "total": 1,
> "successful": 1,
> "failed": 0
> },
> "hits": {
> "total": 1,
> "max_score": 0.095891505,
> "hits": [
> {
> "_index": "some_content",
> "_type": "docs",
> "_id": "saLfx6PYR82YR69je0JbAA",
> "_score": 0.095891505
> }
> ]
> }
> }
> 
>
> This issue appears when I add the snowball filter to the analyzer.
> Could you explain why the system behaves this way?
> Maybe I am doing something wrong.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/9b919926-3384-4d72-845a-c73790d05281%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, 

Kibana "bettermap" widget

2014-05-29 Thread Steven Pisarski
Hello,

I have been trying to configure a bettermap widget in Kibana using some 
custom (non-Logstash) data, but the widget never renders the map; it only 
shows a graphic that makes one believe it is working. I have tested the curl 
command issued by bettermap, which appears to work properly. The 
"Coordinate Field" value is "pin.location", which should be set up properly 
as an array ([lon,lat]), as this field contains data like "pin.location" : 
[ "[-73.63,42.68]" ].

I understand that this widget is experimental; however, is it only designed 
to work with Logstash? If not, are there any bettermap reference documents 
other than the page on the ES site? That page is extremely lean and I have 
exhausted all of my other options.

Best,
Steve
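
One thing that may be worth double-checking (a guess, not a confirmed diagnosis): in the sample above, the coordinate pair is a string inside the array ("[-73.63,42.68]") rather than a numeric [lon,lat] array. A sketch of the numeric form, with placeholder index/type names:

```shell
# Hedged sketch: pin.location as a true numeric [lon, lat] array,
# rather than a string that merely looks like one.
cat <<'EOF' > /tmp/geo_doc.json
{ "pin": { "location": [ -73.63, 42.68 ] } }
EOF
# e.g. curl -XPOST 'localhost:9200/myindex/mytype/' -d @/tmp/geo_doc.json
# Confirm both coordinates are numbers, not strings:
python -c 'import json; loc=json.load(open("/tmp/geo_doc.json"))["pin"]["location"]; print(all(isinstance(x, float) for x in loc))'
```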

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/25e021eb-ea15-4441-905f-c98d7fd794c7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Interpolation of discovery.zen.ping.unicast.hosts

2014-05-29 Thread Matt Hughes
I'm trying to set up unicast discovery.  I want to pass in the hosts via 
environment variable and am relying on elasticsearch.yml support of 
environment variable interpolation.

Tried two formats without any luck:  

First approach (pass in contents of array):

export ES_HOSTS='"one", "two"'
discovery.zen.ping.unicast.hosts: [${ES_HOSTS}]

org.elasticsearch.common.settings.SettingsException: Failed to load 
settings from [file:/etc/elasticsearch/elasticsearch.yml]
  at 
org.elasticsearch.common.settings.ImmutableSettings$Builder.loadFromStream(ImmutableSettings.java:920)
  at 
org.elasticsearch.common.settings.ImmutableSettings$Builder.loadFromUrl(ImmutableSettings.java:904)
  at 
org.elasticsearch.node.internal.InternalSettingsPreparer.prepareSettings(InternalSettingsPreparer.java:77)
  at 
org.elasticsearch.bootstrap.Bootstrap.initialSettings(Bootstrap.java:106)
  at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:177)
  at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: while parsing a flow sequence
 in 'reader', line 326, column 35:
discovery.zen.ping.unicast.hosts: [${ES_HOSTS}]
  ^
expected ',' or ']', but got FlowMappingStart
 in 'reader', line 326, column 37:
discovery.zen.ping.unicast.hosts: [${ES_HOSTS}]


Second Approach (pass in full quoted array):

export ES_HOSTS='["one", "two"]'
discovery.zen.ping.unicast.hosts: ${ES_HOSTS}

1) ElasticsearchIllegalArgumentException[Failed to resolve address for 
[["es1"]]]
  NumberFormatException[For input string: ""es1""]2) 
IllegalStateException[This is a proxy used to support circular references 
involving constructors. The object we're proxying is not constructed yet. 
Please wait until after injection has completed to use this object.]



Any ideas?  Is there a doc somewhere that describes how the interpolation 
occurs?  The first error scares me a bit as it seems interpolation happens 
after some other parsing.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/701b9022-b4e5-4422-b376-ab01fba3c233%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Call _optimize during production work

2014-05-29 Thread David Pilato
Exactly. An update is a delete and a create behind the scenes.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On May 29, 2014 at 17:57:40, Kirill Teplinskiy (tkirill...@gmail.com) wrote:

Oh, that is interesting.  I really did have some suggestions with new payloads and 
some with old!  Thank you for the explanation!  As I understand it, Lucene performs 
an update operation as deleting the old document and creating a new one?

Your idea with alias looks great.  I think we will use it for the next full 
reindex, thank you again.

On Thursday, May 29, 2014 9:40:36 PM UTC+6, David Pilato wrote:
So it creates new segments and at some point expunges deletes, but not all of them.
TBH, I'd prefer using another index with an alias on top of the older one and 
at the end, switch the alias to the new one and delete old index.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On May 29, 2014 at 17:29, Kirill Teplinskiy wrote:

Yes, exactly.  I update docs in the same index.

On Thursday, May 29, 2014 9:10:40 PM UTC+6, David Pilato wrote:
So you reindex into the same index and not in another clean one?
So you "update" docs, right?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On May 29, 2014 at 16:54, Kirill Teplinskiy wrote:

Thank you, David!

I need to call _optimize by hand to refresh the payloads in the completion 
suggester.  This method is recommended here: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
  For some reason _suggest returns old payloads after reindexing on our stage 
server.  I don't know why; the completion suggester on my local instance of 
ElasticSearch updates on the fly.  Maybe it is because our stage index is 10Gb in 
size and contains 10 million documents, while my local index has only 180,000 
documents. 

On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
In 99.9% of cases you should not call the optimize API; let Elasticsearch/Lucene 
do it for you when needed.

To answer your question: yes, search and index operations will still be possible 
during that time.


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On May 29, 2014 at 14:37, Kirill Teplinskiy wrote:

Hello!

Can anyone tell me whether it is safe to call _optimize during normal production 
work?  Will search requests still be answered and indexing still be performed?
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearc...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5c75d64c-4fb6-4fe0-aaec-8dc3c5cf68f4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearc...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/95b0bfcf-38b5-4880-ae50-efaa11aaddb1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearc...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f830d7a1-8ad8-4390-9860-dc04029c7cd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2d0e601e-55f6-456c-8901-42e074705d26%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/etPan.5387591c.1e7ff521.28b%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/d/optout.


Re: Call _optimize during production work

2014-05-29 Thread Kirill Teplinskiy
Oh, that is interesting.  I really did have some suggestions with new payloads and 
some with old!  Thank you for the explanation!  As I understand it, Lucene performs 
an update operation as deleting the old document and creating a new one?

Your alias idea looks great.  I think we will use it for the next full 
reindex; thank you again.

On Thursday, May 29, 2014 9:40:36 PM UTC+6, David Pilato wrote:
>
> So it created new segments and at some point expunges deletes but not all.
> TBH, I'd prefer using another index with an alias on top of the older one 
> and at the end, switch the alias to the new one and delete old index.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On May 29, 2014 at 17:29, Kirill Teplinskiy wrote:
>
> Yes, exactly.  I update docs in the same index.
>
> On Thursday, May 29, 2014 9:10:40 PM UTC+6, David Pilato wrote:
>>
>> So you reindex into the same index and not in another clean one?
>> So you "update" docs, right?
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>>
>> On May 29, 2014 at 16:54, Kirill Teplinskiy wrote:
>>
>> Thank you, David!
>>
>> I need to call _optimize by hands to refresh payloads in completion 
>> suggester.  This method is recommended here: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
>>   
>> For some reason _suggest returns old payloads after reindexing on our stage 
>> server.  I don't know why, completion suggester on my local instance of 
>> ElasticSearch updates on the fly.  Maybe because our stage index is 10Gb 
>> size and contains 10 millions documents and my local index has only 180 000 
>> documents.  
>>
>> On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
>>>
>>> In 99.9% you should not call optimize api and let elasticsearch/Lucene 
>>> do it for you when needed.
>>>
>>> To answer to your question, yes search and index operations will still 
>>> possible during that time.
>>>
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>>
>>> On May 29, 2014 at 14:37, Kirill Teplinskiy wrote:
>>>
>>> Hello!
>>>
>>> Can anyone tell is it safe to call _optimize under normal production 
>>> work?  Will search requests be responded and indexing be performed?
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/5c75d64c-4fb6-4fe0-aaec-8dc3c5cf68f4%40googlegroups.com
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/95b0bfcf-38b5-4880-ae50-efaa11aaddb1%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/f830d7a1-8ad8-4390-9860-dc04029c7cd8%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2d0e601e-55f6-456c-8901-42e074705d26%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Call _optimize during production work

2014-05-29 Thread David Pilato
So it creates new segments and at some point expunges deletes, but not all of them.
TBH, I'd prefer using another index with an alias on top of the old one and, 
at the end, switching the alias to the new one and deleting the old index.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
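
The alias switch described above can be sketched with the _aliases API, which applies the remove/add pair in one atomic request (the index and alias names are placeholders):

```shell
# Hedged sketch: atomically point the alias at the freshly built index.
cat <<'EOF' > /tmp/alias_switch.json
{
  "actions": [
    { "remove": { "index": "products_v1", "alias": "products" } },
    { "add":    { "index": "products_v2", "alias": "products" } }
  ]
}
EOF
# e.g. curl -XPOST 'localhost:9200/_aliases' -d @/tmp/alias_switch.json
# Validate the body locally (no server needed):
python -c 'import json; print(len(json.load(open("/tmp/alias_switch.json"))["actions"]))'
```

Searchers keep using the alias name throughout, so the cut-over is invisible to clients, and the old index can be deleted afterwards.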


On May 29, 2014 at 17:29, Kirill Teplinskiy wrote:

Yes, exactly.  I update docs in the same index.

> On Thursday, May 29, 2014 9:10:40 PM UTC+6, David Pilato wrote:
> So you reindex into the same index and not in another clean one?
> So you "update" docs, right?
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
> On May 29, 2014 at 16:54, Kirill Teplinskiy wrote:
> 
> Thank you, David!
> 
> I need to call _optimize by hands to refresh payloads in completion 
> suggester.  This method is recommended here: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
>   For some reason _suggest returns old payloads after reindexing on our stage 
> server.  I don't know why, completion suggester on my local instance of 
> ElasticSearch updates on the fly.  Maybe because our stage index is 10Gb size 
> and contains 10 millions documents and my local index has only 180 000 
> documents.  
> 
>> On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
>> In 99.9% you should not call optimize api and let elasticsearch/Lucene do it 
>> for you when needed.
>> 
>> To answer to your question, yes search and index operations will still 
>> possible during that time.
>> 
>> 
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>> 
>> 
>> On May 29, 2014 at 14:37, Kirill Teplinskiy wrote:
>> 
>> Hello!
>> 
>> Can anyone tell is it safe to call _optimize under normal production work?  
>> Will search requests be responded and indexing be performed?
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/5c75d64c-4fb6-4fe0-aaec-8dc3c5cf68f4%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/95b0bfcf-38b5-4880-ae50-efaa11aaddb1%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f830d7a1-8ad8-4390-9860-dc04029c7cd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4FD36226-3CDC-4AF2-9327-60C72CD2B196%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: ES and logstash are not working well with each other

2014-05-29 Thread David Pilato
The elasticsearch output is using ES 1.x IIRC. You cannot mix versions.
You need to upgrade ES, downgrade logstash, or use the elasticsearch HTTP output.

My 2 cents

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
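
A hedged sketch of the HTTP route mentioned above, using the separate elasticsearch_http output that ships with logstash 1.4; it talks to ES over port 9200, which avoids the node/transport protocol's version coupling. The host value is a placeholder:

```shell
# Hedged sketch: logstash output section using HTTP instead of the
# node/transport protocol, so logstash and ES versions need not match.
cat <<'EOF' > /tmp/logstash-output.conf
output {
  elasticsearch_http {
    host => "127.0.0.1"
    port => 9200
  }
}
EOF
grep 'elasticsearch_http' /tmp/logstash-output.conf
```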


Le 29 mai 2014 à 17:21, David Montgomery  a écrit :

Before I give up on logstash... I have now placed Elasticsearch on the same server, 
using the config below.  


input {
  redis {
host => "redis.queue.do.development.sf.test.com"
data_type => "list"
key => "logstash"
codec => json
  }
}


output {
stdout { }
elasticsearch {
bind_host => "127.0.0.1"
}
}


I get this error. I am using 1.4.1 for logstash and 
0.90.9 for ES.  


/usr/local/share/logstash-1.4.1/bin/logstash -f 
/usr/local/share/logstash.indexer.config
Using milestone 2 input plugin 'redis'. This plugin should be stable, but if 
you see strange behavior, please let us know! For more information on plugin 
milestones, see http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
log4j, [2014-05-29T11:18:31.923]  WARN: org.elasticsearch.discovery: 
[logstash-do-logstash-sf-development-20140527082230-16645-2010] waited for 30s 
and no initial state was set by the discovery
Exception in thread ">output" 
org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
at 
org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(org/elasticsearch/action/support/master/TransportMasterNodeOperationAction.java:180)
at 
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(org/elasticsearch/cluster/service/InternalClusterService.java:492)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
at java.lang.Thread.run(java/lang/Thread.java:744)















> On Thu, May 29, 2014 at 1:15 PM, David Montgomery  
> wrote:
> PS...and how is this possible?  I feel so bad I bought the kindle 
> logstash book:(
> 
> I changed to host rather than bind_host.  I mean..wow..I have ports open.  
> See?  ufw status ===> 9200:9400/tcp  ALLOW   
> my.logstash.ipaddress
> 
> I have ES running
> service elasticsearch status
>  * elasticsearch is running
> 
> 
> 
>  /usr/local/share/logstash-1.4.1/bin/logstash -f 
> /usr/local/share/logstash.indexer.configUsing milestone 2 input plugin 
> 'redis'. This plugin should be stable, but if you see strange behavior, 
> please let us know! For more information on plugin milestones, see 
> http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
> Exception in thread ">output" 
> org.elasticsearch.transport.BindTransportException: Failed to bind to 
> [9300-9400]
> at 
> org.elasticsearch.transport.netty.NettyTransport.doStart(org/elasticsearch/transport/netty/NettyTransport.java:380)
> at 
> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
> at 
> org.elasticsearch.transport.TransportService.doStart(org/elasticsearch/transport/TransportService.java:92)
> at 
> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
> at 
> org.elasticsearch.node.internal.InternalNode.start(org/elasticsearch/node/internal/InternalNode.java:229)
> at 
> org.elasticsearch.node.NodeBuilder.node(org/elasticsearch/node/NodeBuilder.java:166)
> at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
> at 
> RUBY.build_client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:198)
> at 
> RUBY.client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:15)
> at 
> RUBY.initialize(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:157)
> at 
> RUBY.register(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch.rb:250)
> at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
> at 
> RUBY.outputworker(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:220)
> at 
> RUBY.start_outputs(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:152)
> at java.lang.Thread.run(java/lang/Thread.java:744)
> 
> 
> 
>> On Thu, May 29, 2014 at 12:43 PM, David Montgomery 
>>  wrote:
>> Hi,
>> 
>> I am, rather concerned that ES is not working with logstash index server.
>> 
>> I start logstash index server like this:
>> 
>> /usr/local/share/logstash-1.4.1/bin/logstash -f 
>> /usr/local/share/logstash.indexer.config
>> 
>> Using milestone 2 input plugin 'redis'. This plugin should be stable, but if 
>> you see strange behavior, please let us know! For more information on plugin 
>> milestones, see http://logstash.net/docs/1.4.1/plugin-milestones 
>> {:level=>:warn}
>> log4j, [2014-05-29T00:33:

Re: Call _optimize during production work

2014-05-29 Thread Kirill Teplinskiy
Yes, exactly.  I update docs in the same index.

On Thursday, May 29, 2014 9:10:40 PM UTC+6, David Pilato wrote:
>
> So you reindex into the same index and not in another clean one?
> So you "update" docs, right?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 29 mai 2014 à 16:54, Kirill Teplinskiy > 
> a écrit :
>
> Thank you, David!
>
> I need to call _optimize by hand to refresh payloads in the completion 
> suggester.  This method is recommended here: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
>   
> For some reason _suggest returns old payloads after reindexing on our stage 
> server.  I don't know why; the completion suggester on my local instance of 
> ElasticSearch updates on the fly.  Maybe because our stage index is 10GB 
> in size and contains 10 million documents while my local index has only 
> 180,000 documents.  
>
> On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
>>
>> In 99.9% of cases you should not call the optimize API; let 
>> Elasticsearch/Lucene do it for you when needed.
>>
>> To answer your question: yes, search and index operations will still be 
>> possible during that time.
>>
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>>
>> Le 29 mai 2014 à 14:37, Kirill Teplinskiy  a écrit :
>>
>> Hello!
>>
>> Can anyone tell me whether it is safe to call _optimize during normal 
>> production work?  Will search requests still be served and indexing still 
>> be performed?
>>



Re: Print nearby lines after query result

2014-05-29 Thread Senthil Raja

In Unix we use grep -A 5 -B 5 "" . I am looking for the equivalent 
capability here. 

On Thursday, May 29, 2014 8:48:47 PM UTC+5:30, Senthil Raja wrote:
>
>
> Team,
>
> As per our requirement, I have to search for a string in ES logs and print 
> the five lines before and after each matching line. 
>
> Is there any way to do that?
>
>
>



Re: ES and logstash are not working well with each other

2014-05-29 Thread David Montgomery
Before I give up on logstash..I now placed elastic search on the same
server using the below.


input {
  redis {
host => "redis.queue.do.development.sf.test.com"
data_type => "list"
key => "logstash"
codec => json
  }
}


output {
stdout { }
elasticsearch {
bind_host => "127.0.0.1"
}
}


I get this error. I am using 1.4.1 for Logstash and
0.90.9 for ES.


'/usr/local/share/logstash-1.4.1/bin/logstash -f
/usr/local/share/logstash.indexer.config
Using milestone 2 input plugin 'redis'. This plugin should be stable, but
if you see strange behavior, please let us know! For more information on
plugin milestones, see
http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
log4j, [2014-05-29T11:18:31.923]  WARN: org.elasticsearch.discovery:
[logstash-do-logstash-sf-development-20140527082230-16645-2010] waited for
30s and no initial state was set by the discovery
Exception in thread ">output"
org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
at
org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(org/elasticsearch/action/support/master/TransportMasterNodeOperationAction.java:180)
at
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(org/elasticsearch/cluster/service/InternalClusterService.java:492)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(java/util/concurrent/ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(java/util/concurrent/ThreadPoolExecutor.java:615)
at java.lang.Thread.run(java/lang/Thread.java:744)















On Thu, May 29, 2014 at 1:15 PM, David Montgomery  wrote:

> PS...and how is this possible?  I feel so bad I bought the kindle
> logstash book:(
>
> I changed to host rather than bind_host.  I mean..wow..I have ports open.
> See?  ufw status ===> 9200:9400/tcp  ALLOW
> my.logstash.ipaddress
>
> I have ES running
> service elasticsearch status
>  * elasticsearch is running
>
>
>
>  /usr/local/share/logstash-1.4.1/bin/logstash -f
> /usr/local/share/logstash.indexer.configUsing milestone 2 input plugin
> 'redis'. This plugin should be stable, but if you see strange behavior,
> please let us know! For more information on plugin milestones, see
> http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
> Exception in thread ">output"
> org.elasticsearch.transport.BindTransportException: Failed to bind to
> [9300-9400]
> at
> org.elasticsearch.transport.netty.NettyTransport.doStart(org/elasticsearch/transport/netty/NettyTransport.java:380)
> at
> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
> at
> org.elasticsearch.transport.TransportService.doStart(org/elasticsearch/transport/TransportService.java:92)
> at
> org.elasticsearch.common.component.AbstractLifecycleComponent.start(org/elasticsearch/common/component/AbstractLifecycleComponent.java:85)
> at
> org.elasticsearch.node.internal.InternalNode.start(org/elasticsearch/node/internal/InternalNode.java:229)
> at
> org.elasticsearch.node.NodeBuilder.node(org/elasticsearch/node/NodeBuilder.java:166)
> at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:606)
> at
> RUBY.build_client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:198)
> at
> RUBY.client(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:15)
> at
> RUBY.initialize(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch/protocol.rb:157)
> at
> RUBY.register(/usr/local/share/logstash-1.4.1/lib/logstash/outputs/elasticsearch.rb:250)
> at org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)
> at
> RUBY.outputworker(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:220)
> at
> RUBY.start_outputs(/usr/local/share/logstash-1.4.1/lib/logstash/pipeline.rb:152)
> at java.lang.Thread.run(java/lang/Thread.java:744)
>
>
>
> On Thu, May 29, 2014 at 12:43 PM, David Montgomery <
> davidmontgom...@gmail.com> wrote:
>
>> Hi,
>>
>> I am, rather concerned that ES is not working with logstash index server.
>>
>> I start logstash index server like this:
>>
>> /usr/local/share/logstash-1.4.1/bin/logstash -f
>> /usr/local/share/logstash.indexer.config
>>
>> Using milestone 2 input plugin 'redis'. This plugin should be stable, but
>> if you see strange behavior, please let us know! For more information on
>> plugin milestones, see 
>> http://logstash.net/docs/1.4.1/plugin-milestones {:level=>:warn}
>> log4j, [2014-05-29T00:33:12.473]  WARN: org.elasticsearch.discovery:
>> [logstash-do-logstash-sf-development-20140527082230-2162-2010] waited for
>> 30s and no initial state was set by the discovery
>> Exception in thread ">output"
>> org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
>> at
>> org.elasticsearch.actio

Print nearby lines after query result

2014-05-29 Thread Senthil Raja

Team,

As per our requirement, I have to search for a string in ES logs and print 
the five lines before and after each matching line. 

Is there any way to do that?
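
For reference, the Unix behaviour being asked about looks like this (a runnable
sketch with a hypothetical log file):

```shell
# Build a small sample log (hypothetical content)
printf 'line1\nline2\nERROR here\nline4\nline5\n' > /tmp/sample.log

# Print each match plus one line of context before (-B) and after (-A)
grep -B 1 -A 1 'ERROR' /tmp/sample.log
```

If each log line is indexed as its own Elasticsearch document, there is no
direct equivalent; the usual workaround is a follow-up range query on a
timestamp or line-number field around each hit, since highlighting only returns
fragments from within the matching document.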




Re: Call _optimize during production work

2014-05-29 Thread David Pilato
So you reindex into the same index and not in another clean one?
So you "update" docs, right?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 29 mai 2014 à 16:54, Kirill Teplinskiy  a écrit :

Thank you, David!

I need to call _optimize by hand to refresh payloads in the completion suggester.  
This method is recommended here: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
  For some reason _suggest returns old payloads after reindexing on our stage 
server.  I don't know why; the completion suggester on my local instance of 
ElasticSearch updates on the fly.  Maybe because our stage index is 10GB in size 
and contains 10 million documents while my local index has only 180,000 
documents.  

> On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
> In 99.9% of cases you should not call the optimize API; let 
> Elasticsearch/Lucene do it for you when needed.
> 
> To answer your question: yes, search and index operations will still be 
> possible during that time.
> 
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
> Le 29 mai 2014 à 14:37, Kirill Teplinskiy  a écrit :
> 
> Hello!
> 
> Can anyone tell me whether it is safe to call _optimize during normal 
> production work?  Will search requests still be served and indexing still be 
> performed?



Re: Call _optimize during production work

2014-05-29 Thread Kirill Teplinskiy
Thank you, David!

I need to call _optimize by hand to refresh payloads in the completion 
suggester.  This method is recommended here: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-suggesters-completion.html.
  
For some reason _suggest returns old payloads after reindexing on our stage 
server.  I don't know why; the completion suggester on my local instance of 
ElasticSearch updates on the fly.  Maybe because our stage index is 10GB 
in size and contains 10 million documents while my local index has only 
180,000 documents.  
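
For completeness, the optimize call being discussed is just (the index name is
a placeholder, and max_num_segments is optional):

```
curl -XPOST 'localhost:9200/myindex/_optimize?max_num_segments=1'
```

Forcing merges rewrites segments on disk, which is I/O-heavy on a 10GB index,
so it is usually best run off-peak.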

On Thursday, May 29, 2014 7:37:24 PM UTC+6, David Pilato wrote:
>
> In 99.9% of cases you should not call the optimize API; let 
> Elasticsearch/Lucene do it for you when needed.
>
> To answer your question: yes, search and index operations will still be 
> possible during that time.
>
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 29 mai 2014 à 14:37, Kirill Teplinskiy > 
> a écrit :
>
> Hello!
>
> Can anyone tell me whether it is safe to call _optimize during normal 
> production work?  Will search requests still be served and indexing still be 
> performed?
>



Script field coercion Out of range

2014-05-29 Thread Evan Borden
I have two ES installations with identical configuration. The following 
script_fields query:

curl -XPOST localhost:9200/ -d '
{"script_fields": {"foo": {"script": "rint(\"32768\")"}}}
'

produces the following error on one node and not the other.

{
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 3,
    "failed": 1,
    "failures": [
      {
        "status": 500,
        "reason": "CompileException[[Error: Value out of range. Value:\"32768\" Radix:10] [Near : {... rint(\"32768\") }] ^ [Line: 1, Column: 1]]; nested: NumberFormatException[Value out of range. Value:\"32768\" Radix:10];"
      }
    ]
  },
  "hits": {
    "total": 1591475,
    "max_score": 1,
    "hits": []
  }
}

The failing node is attempting to coerce to a short, but I cannot find any 
reason for this behavior.



Search issue with snowball stemmer

2014-05-29 Thread Александр Шаманов
Hello everyone,

I have the following index mapping:


curl -XPUT 'http://localhost:9200/some_content/' -d '
{
   "settings":{
  "query_string":{
 "default_con":"content",
 "default_operator":"AND"
  },
  "index":{
 "analysis":{
"analyzer":{
   "en_analyser":{
  "filter":[
 "snowBallFilter"
  ],
  "type":"custom",
  "tokenizer":"standard"
   }
},
"filter":{
   "en_stopFilter":{
  "type":"stop",
  "stopwords_path":"lang/stopwords_en.txt"
   },
   "snowBallFilter":{
  "type":"snowball",
  "language":"English"
   },
   "wordDelimiterFilter":{
  "catenate_all":false,
  "catenate_words":true,
  "catenate_numbers":true,
  "generate_word_parts":true,
  "generate_number_parts":true,
  "preserve_original":true,
  "type":"word_delimiter",
  "split_on_case_change":true
   },
   "en_synonymFilter":{
  "synonyms_path":"lang/synonyms_en.txt",
  "ignore_case":true,
  "type":"synonym",
  "expand":false
   },
   "lengthFilter":{
  "max":250,
  "type":"length",
  "min":3
   }
}
 }
  }
   },
   "mappings":{
  "docs":{
 "_source":{
"enabled":false
 },
 "analyzer":"en_analyser",
 "properties":{
 "content":{
"type":"string",
"index":"analyzed",
"term_vector":"with_positions_offsets",
"omit_norms":"true"
 }
 }
  }
   }
}'

and I indexed the following document:

curl -XPOST http://localhost:9200/some_content/docs/ -d '
{  
  "content" : "Some sampling text formatted for text data" 
}'

When I make this request:
http://epbyvitw0052:9200/some_content/docs/_search?q=sampling

I get this result:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "some_content",
"_type": "docs",
"_id": "saLfx6PYR82YR69je0JbAA",
"_score": 0.095891505
}
]
}
}
 

but when I send the request without the type:
http://epbyvitw0052:9200/some_content/_search?q=sampling

then I get nothing:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}


However, if I send the request with the stemmed term:
http://epbyvitw0052:9200/some_content/_search?q=sampl

the system finds it:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "some_content",
"_type": "docs",
"_id": "saLfx6PYR82YR69je0JbAA",
"_score": 0.095891505
}
]
}
}
 

This issue appears when I put the snowball filter into the analyzer. 
Could you explain why the system behaves this way? 
Maybe I am doing something wrong.
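
One way to narrow this down is the _analyze API, which shows the tokens a given
analyzer emits (a sketch against the index defined above; assumes ES is on
localhost:9200):

```
curl 'http://localhost:9200/some_content/_analyze?analyzer=en_analyser' -d 'sampling'
```

If the snowball filter is active this should return the stemmed token "sampl",
which would be consistent with the stemmed query matching while the unstemmed
one does not when the request is not routed through the type's analyzer.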



scroll aggregations

2014-05-29 Thread Deepjot Singh
hello friends,
I have an index with some 10m records.
When I try to find the distinct values in one field (around 2m of them), my 
Java client runs out of memory.
Can I implement scan and scroll on this aggregation to retrieve the same 
data in smaller parts?


Thanks
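
Note that scan/scroll pages through hits, not aggregation buckets, so it will
not break a terms aggregation into parts. If what is needed is the distinct
count rather than the full list of values, the cardinality aggregation
(available from ES 1.1) computes an approximate count in bounded memory. A
sketch, with hypothetical index and field names:

```
curl -XPOST 'localhost:9200/myindex/_search?search_type=count' -d '
{
  "aggs": {
    "distinct_count": {
      "cardinality": { "field": "myfield" }
    }
  }
}'
```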



Re: Red status unassigned shards help

2014-05-29 Thread Jason Weber
I rebooted several times and I believe it's collecting the correct data now. 
I still show 520 unassigned shards, but it's collecting all my logs now. Is 
this something I can use the redirect command for, to assign them to a new 
index?

Jason
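
The command in question is presumably the cluster reroute API; a sketch with
placeholder index, shard, and node values:

```
curl -XPOST 'localhost:9200/_cluster/reroute' -d '
{
  "commands": [
    {
      "allocate": {
        "index": "logstash-2014.05.26",
        "shard": 0,
        "node": "SOME_NODE_NAME",
        "allow_primary": true
      }
    }
  ]
}'
```

Be warned that allow_primary: true creates an empty primary when no copy of the
shard data exists, so whatever was in that shard is lost.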

On Tuesday, May 27, 2014 11:39:49 AM UTC-4, Jason Weber wrote:
>
> Could someone walk me through getting my cluster up and running. Came in 
> from long weekend and my cluster was red status, I am showing a lot of 
> unassigned shards.
>
> jmweber@MIDLOG01:/var/log/logstash$ curl 
> localhost:9200/_cluster/health?pretty
> {
>   "cluster_name" : "midlogcluster",
>   "status" : "red",
>   "timed_out" : false,
>   "number_of_nodes" : 2,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 512,
>   "active_shards" : 512,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 520
> }
>
>
> I am running ES 0.90.11
>
> LS and ES are on a single server, I only have 1 node, although it shows 2, 
> I get yellow status normally, it works fine with that. But I am only 
> collecting like 43 events per minute vs my usual 50K.
>
> I have seen several write-ups, but I seem to get a lot of "no handler found 
> for uri" errors when I try to run them.
>
> Thanks,
> Jason
>



Re: Call _optimize during production work

2014-05-29 Thread David Pilato
In 99.9% of cases you should not call the optimize API; let Elasticsearch/Lucene 
do it for you when needed.

To answer your question: yes, search and index operations will still be possible 
during that time.


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 29 mai 2014 à 14:37, Kirill Teplinskiy  a écrit :

Hello!

Can anyone tell me whether it is safe to call _optimize during normal production 
work?  Will search requests still be served and indexing still be performed?


Re: Levenshtein distance

2014-05-29 Thread Adrian C
Resolved this by setting transpositions to true on the request. I didn't see 
this option documented, but found it by looking through the source.

{
   "size":50,
   "query":{
  "fuzzy":{
 "surname":{
"value":"arosn",
"transpositions":true,
"fuzziness":2,
"prefix_length":1,
"max_expansions":100
 }
  }
   }
}

On Wednesday, 28 May 2014 13:10:22 UTC+1, Adrian C wrote:
>
> Hi,
>
> I am new to ES and have been doing some simple testing of fuzzy matching. 
> I have a query related to Levenshtein distance. Does ElasticSearch 
> use Levenshtein distance or Damerau–Levenshtein distance?
>
> For example I have the following text stored in an index (analyzer: 
> simple):
> AARONS
>
> When search using 'arosn' the text is not found. The queries that I have 
> been testing with are as follows:
>
> {
>"size":50,
>"query":{
>   "fuzzy":{
>  "surname":{
> "value":"arosn",
> "fuzziness":2,
> "prefix_length":1,
> "max_expansions":100
>  }
>   }
>}
> }
>
> and 
>
> {
>"size":50,
>"query":{
>   "match":{
>  "surname":{
> "query":"arosn",
> "fuzziness":2
>  }
>   }
>}
> }
>
> {
>"size":50,
>"query":{
>   "match":{
>  "surname":{
> "query":"arosn~",
> "fuzziness":2
>  }
>   }
>}
> }
>
> {
>"size":50,
>"query":{
>   "query_string":{
>  "default_field":"surname",
>  "fuzziness":2,
>  "query":"arosn~2"
>   }
>}
> }
>
> If the Damerau–Levenshtein distance algorithm were in use, then I would 
> expect this to match with a distance of two:
>
> arosn + insert (a) → aarosn + swap (n & s) → aarons
>
> I am a little confused as there is reference to Damerau–Levenshtein: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_fuzziness
>
> So any ideas on how I can get Damerau–Levenshtein to work?
>
> Thanks
>



Re: Nested object type and join:false and geo_shape

2014-05-29 Thread horse . badorties . ny
Hello,

I am trying to achieve something very similar, where a nested filter is 
applied on a nested document and I only want the sub doc and not the root 
in the hit source.  Setting the filter's 'join' to false always returns 0 
hits.  Did you find a resolution?

Thank you!

On Monday, February 10, 2014 3:09:48 PM UTC-5, bants wrote:
>
> Hi All, 
>
> I would like to be able to search against documents using a geo_shape 
> filter where the geojson is a nested subdocument and only retrieve the 1 
> sub document that matched the geographical filter (not the whole document). 
> I think the docs (specifically nested object type and nested query/filter 
> docs) state this is possible using join:false. For some reason I can't get 
> it to work though and I'm convinced its a user error or lack of 
> understanding. 
>
> On ES 90.5 and below is a worked example.
>
> Can someone point me in the right direction please?
>
> Thanks
>
> # Clear the deck and create a new index
>
> > curl -XDELETE http://localhost:9200/test
>
> {"ok":true,"acknowledged":true}
>
>
> > curl -XPUT  http://localhost:9200/test
>
> {"ok":true,"acknowledged":true}
>
>
> # Set a new mapping for the testtype
>
> > curl -XPUT http://localhost:9200/test/testtype/_mapping -d '{"testtype": 
> {"properties": {"entities": {"type": "nested", "properties": {"geometry": 
> {"tree": "quadtree", "type": "geo_shape","precision": "10m"}}'
> {"ok":true,"acknowledged":true}
>
> # Index a new document
>
> curl -XPUT http://localhost:9200/test/testtype/doc1 -d '{"id" : "doc1", 
> "entities": [{"geometry": {"type" : "Point", "coordinates": [0.0, 0.0]}}, 
> {"geometry": {"type" : "Point", "coordinates": [180.0, 90.0]}}]}'
>
> # Query WITH join:false
>
> > curl -XGET http://localhost:9200/test/testtype/_search -d '{"query": 
> {"filtered": {"filter": {"nested" : {"path" : "entities", "join":false, 
> "filter" : {"geo_shape": {"entities.geometry": {"shape": {"type": 
> "envelope","coordinates": [[-10.0, 10.0],[10.0, -10.0]]}},"query": 
> {"match_all": {}'
>
>
> {"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
>
> # Query WITHOUT join:false
>
> > curl -XGET http://localhost:9200/test/testtype/_search -d '{"query": 
> {"filtered": {"filter": {"nested" : {"path" : "entities", "filter" : 
> {"geo_shape": {"entities.geometry": {"shape": {"type": 
> "envelope","coordinates": [[-10.0, 10.0],[10.0, -10.0]]}},"query": 
> {"match_all": {}'
>
> {"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"testtype","_id":"doc1","_score":1.0,
>  
> "_source" : {"id" : "doc1", "entities": [{"geometry": {"type" : "Point", 
> "coordinates": [0.0, 0.0]}}, {"geometry": {"type" : "Point", "coordinates": 
> [180.0, 90.0]}}]}}]}}
>
>



dynamic aggregation

2014-05-29 Thread Bogdan Ilie
Please tell me if it is possible to run a dynamic aggregation on all the inner 
properties of a field or nested object. I have many products, and each product 
has different attributes, so is it possible to build the aggregations 
automatically without knowing in advance what the attribute fields are named?



Re: ES OutOfMemory on a 30GB index

2014-05-29 Thread Paul Sanwald
We've narrowed the problem down to a multi_match clause in our query:
 {"multi_match":{"fields":["attachments.*.bodies"], "query":"foobar"}}

This has to do with the way we've structured our index. We are searching an 
index that contains emails, and we are indexing attachments in the 
attachments.*.bodies fields. For example, attachments.1.bodies would 
contain the text body of an attachment.

This structure is clearly sub-optimal in terms of multi_match queries, but 
I need to structure our index in some way that we can search the contents 
of an email and the parsed contents of its attachments, and get back the 
email as a result.

From reading the docs, it seems like the better way to solve this is with 
nested types?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html
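A minimal sketch of what that nested structure might look like (assuming a 
single "body" field per attachment; the field names here are illustrative, 
not our actual mapping):

```python
import json

# Hypothetical mapping: attachments as an array of nested objects, each
# with one "body" text field, instead of attachments.1.bodies, .2.bodies, etc.
mapping = {
    "email": {
        "properties": {
            "subject": {"type": "string"},
            "attachments": {
                "type": "nested",
                "properties": {
                    "body": {"type": "string"}
                }
            }
        }
    }
}

# A single nested query then replaces the multi_match over attachments.*.bodies;
# matching emails come back as whole parent documents.
query = {
    "query": {
        "nested": {
            "path": "attachments",
            "query": {"match": {"attachments.body": "foobar"}}
        }
    }
}

print(json.dumps(query))
```

With this shape the query no longer has to expand a field wildcard per 
attachment, which is where the multi_match clause was hurting us.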

--paul

On Wednesday, May 28, 2014 7:11:05 PM UTC-4, Paul Sanwald wrote:
>
> Sorry, it's Java 7:
>
> jvm: {
> pid: 20424
> version: 1.7.0_09-icedtea
> vm_name: OpenJDK 64-Bit Server VM
> vm_version: 23.7-b01
> vm_vendor: Oracle Corporation
> start_time: 1401309063644
> mem: {
> heap_init_in_bytes: 1073741824
> heap_max_in_bytes: 10498867200
> non_heap_init_in_bytes: 24313856
> non_heap_max_in_bytes: 318767104
> direct_max_in_bytes: 10498867200
> }
> gc_collectors: [
> PS Scavenge
> PS MarkSweep
> ]
> memory_pools: [
> Code Cache
> PS Eden Space
> PS Survivor Space
> PS Old Gen
> PS Perm Gen
> ]
>
> On Wednesday, May 28, 2014 6:58:26 PM UTC-4, Mark Walkom wrote:
>>
>> What java version are you running, it's not in the stats gist.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>  
>>
>> On 29 May 2014 08:33, Paul Sanwald  wrote:
>>
>>> I apologize about the signature, it's automatic. I've created a gist 
>>> with the cluster node stats:
>>> https://gist.github.com/pcsanwald/e11ba02ac591757c8d92
>>>
>>> We are using 1.1.0, using aggregations a lot but nothing crazy. We run 
>>> our app on much much larger indices successfully. But, the problem seems to 
>>> be present itself on even basic search cases. The one thing that's 
>>> different about this dataset is a lot of it is in spanish.
>>>
>>> thanks for your help!
>>>
>>> On Wednesday, May 28, 2014 6:22:59 PM UTC-4, Mark Walkom wrote:

 Can you provide some specs on your cluster, OS, RAM, heap, disk, java 
 and ES versions?
 Are you using parent/child relationships, TTLs, large facet or other 
 queries?


 (Also, your elaborate legalese signature is kind of moot given you're 
 posting to a public mailing list :p)

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 29 May 2014 07:27, Paul Sanwald  wrote:

> Hi Everyone,
>We are seeing continual OOM exceptions on one of our 1.1.0 
> elasticsearch clusters, the index is ~30GB, quite small. I'm trying to 
> work 
> out the root cause via heap dump analysis, but not having a lot of luck. 
> I 
> don't want to include a bunch of unnecessary info, but the stacktrace 
> we're 
> seeing is pasted below. Has anyone seen this before? I've been using the 
> cluster stats and node stats APIs to try and find a smoking gun, but I'm 
> not seeing anything that looks out of the ordinary.
>
> Any ideas?
>
> 14/05/27 20:37:08 WARN transport.netty: [Strongarm] Failed to send 
> error message back to client for action [search/phase/query]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 14/05/27 20:37:08 WARN transport.netty: [Strongarm] Actual Exception
> org.elasticsearch.search.query.QueryPhaseExecutionException: 
> [eventdata][2]: q
> uery[ConstantScore(*:*)],from[0],size[0]: Query Failed [Failed to 
> execute main
>  query]
> at org.elasticsearch.search.query.QueryPhase.execute(
> QueryPhase.java:1
> 27)
> at org.elasticsearch.search.SearchService.executeQueryPhase(
> SearchService.java:257)
> at org.elasticsearch.search.action.
> SearchServiceTransportAction$SearchQueryTransportHandler.
> messageReceived(SearchServiceTransportAction.java:623)
> at org.elasticsearch.search.action.
> SearchServiceTransportAction$SearchQueryTransportHandler.
> messageReceived(SearchServiceTransportAction.java:612)
> at org.elasticsearch.transport.netty.MessageChannelHandler$
> RequestHandler.run(MessageChannelHandler.java:270)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>  
>>>

Standing up new instance want to copy indexes/documents from old instance

2014-05-29 Thread Didjit
Hi,

Dug around and can't seem to find the answer. I have an existing instance of 
ELK running. I'm currently setting up a new instance on separate hardware. 
I want to copy the documents/indexes over to the new instance so I can 
preserve the history. Can someone point me to a doc or give some tips on 
how to accomplish this?

Thank you,

Didjit



Call _optimize during production work

2014-05-29 Thread Kirill Teplinskiy
Hello!

Can anyone tell me whether it is safe to call _optimize during normal 
production work? Will search requests still be answered and indexing still be 
performed?



Re: Hide some system fields

2014-05-29 Thread Florentin Zorca
Hi,

try using fields to specify only the fields you are interested in:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html#search-request-fields

Your query should then be something like this:

{"fields":["date"], "from":0, "size":1000}
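On the client side you would then pull just the values out of each hit. A 
small sketch (the response below is a hypothetical, trimmed example; note 
that in ES 1.x, values under "fields" come back as arrays):

```python
import json

# Body asking ES to return only the "date" field per hit instead of the
# full _source -- this alone trims the response size considerably.
body = {"fields": ["date"], "from": 0, "size": 1000}
print(json.dumps(body))

# Hypothetical trimmed response, for illustration only.
response = {
    "hits": {"hits": [
        {"_id": "mo7vQrWUTquBRowjq2AVkw",
         "fields": {"date": ["2013-05-31T00:00:00"]}},
    ]}
}

# Keep only the values you care about; each field value is a list.
dates = [h["fields"]["date"][0] for h in response["hits"]["hits"]]
print(dates)  # prints ['2013-05-31T00:00:00']
```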

Kind Regards,

Florentin


Am Donnerstag, 29. Mai 2014 13:11:07 UTC+2 schrieb Сергей Шилов:
>
> Hi all!
> I use elasticsearch in a high-load project where an urgent need to save 
> traffic.
> I have a some queries like this:
>
> curl -XGET 
> 'http://localhost:9200/testindex/testmapping/_search?pretty&scroll=5m' -d 
> '{"from":0, "size":1000}'
>
> {
>   "_scroll_id" : 
> "cXVlcnlUaGVuRmV0Y2g7MjszMjp6TmdjNmxkM1NtV1NOeTl5X3dab1FnOzMxOnpOZ2M2bGQzU21XU055OXlfd1pvUWc7MDs=",
>   "took" : 2,
>   "timed_out" : false,
>   "_shards" : {
> "total" : 2,
> "successful" : 2,
> "failed" : 0
>   },
>   "hits" : {
> "total" : 15457332,
> "max_score" : 1.0,
> "hits" : [ {
>   "_index" : "testindex",
>   "_type" : "testmapping",
>   "_id" : "mo7vQrWUTquBRowjq2AVkw",
>   "_score" : 1.0, "_source" : 
> {"reffer_id":"","date":"2013-05-31T00:00:00","source":5,"user_id":"2fdfdf0fbbce603cf24c0eee7dabf28c"}
> }, ]
>   }
> }
>
>
> Can I exclude some system fields (like _shards.*, hits._index, hits._type, 
> hits._id, hits._score)? I found how exclude source fields, but not system.
> Also I need to get _timestamp field in _source rows. It generated from 
> 'date' field:
>
> '_timestamp' => array(
> 'enabled' => true,
> 'path' => 'date',
> 'format' => "-MM-dd'T'HH:mm:ss"
> )
>
>
> Thanks
>



Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-05-29 Thread Niko Nyrhila
Hi,

You can nest aggregations, so in this case you'd first use Date Histogram 
aggregation with an interval of one hour:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html

Then you'd aggregate by "id" field:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html

Here is an example:
http://www.solinea.com/blog/elasticsearch-aggs-save-the-day

This should be very fast, even when running on a single machine.
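A sketch of the nested aggregation body described above. It assumes a single 
datetime field (called "timestamp" here) instead of the separate day/hour 
fields; all field names are illustrative:

```python
# Outer bucket: one-hour date_histogram. Inner bucket: terms on "id".
# Leaf metrics: the three sums from the original SQL-style query.
body = {
    "size": 0,
    "aggs": {
        "per_hour": {
            "date_histogram": {"field": "timestamp", "interval": "hour"},
            "aggs": {
                "per_id": {
                    "terms": {"field": "id"},
                    "aggs": {
                        "sum_views": {"sum": {"field": "views"}},
                        "sum_clicks": {"sum": {"field": "clicks"}},
                        "sum_video_plays": {"sum": {"field": "video_plays"}},
                    },
                }
            },
        }
    },
}
```

This avoids the per-document script concatenation entirely, which is usually 
where the performance goes in the scripted-terms approach.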


On Friday, January 31, 2014 3:36:20 AM UTC+2, Maxime Nay wrote:
>
> Hi,
>
> We are experimenting elasticsearch 1.0.0, and are particularly excited 
> about the new aggregation feature.
>
> Here is one of our use-case that we would like to optimize :
>
> Right now, to imitate a basic SQL group by query that would look like : 
> SELECT day, hour, id, SUM(views), SUM(clicks), SUM(video_plays) FROM 
> events GROUP BY day, hour, id
>
> we are issuing this kind of queries :
>
> {  
> "size" : 0,
> "query":{"match_all":{}},
> "aggs" : {
> "test_aggregation" : {
> "terms" : {
> "script" : "doc['day'].date + '-' + doc['hour'].value + 
> '-' + doc['id'].value",
> "order" : { "_term" : "asc" },
> "size": 
> },
> "aggs" : {
> "sum_click" : { "sum" : { "field" : "clicks" } },
> "sum_views" : { "sum" : { "field" : "views" } },
> "sum_video_plays" : { "sum" : { "field" : "video_plays" } }
> }
> }
> }
> }
>
> But the perfs for this kind of queries are kind of low. Thus, we would 
> like to know if there are a more optimized way to get what we want.
>
> Thanks !
> Maxime
>



Re: Need help on similarity ranking approach

2014-05-29 Thread Alex Ksikes
Also this plugin could provide a solution to your problem:

http://yannbrrd.github.io/

On Thursday, May 29, 2014 10:42:47 AM UTC+2, Rgs wrote:
>
> hi, 
>
> What i did now is, i have created a custom similarity & similarity 
> provider 
> class which extends DefaultSimilarity and AbstractSimilarityProvider 
> classes 
> respectively and overridden the idf() method to return 1. 
>
> Now I'm getting some percentage values like 1, 0.987, 0.876 etc and 
> interpret it as 100%, 98%, 87% etc. 
>
> Can you please confirm whether this approach can be taken for finding the 
> percentage of similarity? 
>
> sorry for the late reply. 
>
> Thanks 
> Rgs 
>
>
>
> -- 
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Need-help-on-similarity-ranking-approach-tp4054847p4056680.html
>  
> Sent from the ElasticSearch Users mailing list archive at Nabble.com. 
>



Hide some system fields

2014-05-29 Thread Сергей Шилов
Hi all!
I use elasticsearch in a high-load project where there is an urgent need to 
save traffic.
I have some queries like this:

curl -XGET 
'http://localhost:9200/testindex/testmapping/_search?pretty&scroll=5m' -d 
'{"from":0, "size":1000}'

{
  "_scroll_id" : 
"cXVlcnlUaGVuRmV0Y2g7MjszMjp6TmdjNmxkM1NtV1NOeTl5X3dab1FnOzMxOnpOZ2M2bGQzU21XU055OXlfd1pvUWc7MDs=",
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
  },
  "hits" : {
"total" : 15457332,
"max_score" : 1.0,
"hits" : [ {
  "_index" : "testindex",
  "_type" : "testmapping",
  "_id" : "mo7vQrWUTquBRowjq2AVkw",
  "_score" : 1.0, "_source" : 
{"reffer_id":"","date":"2013-05-31T00:00:00","source":5,"user_id":"2fdfdf0fbbce603cf24c0eee7dabf28c"}
}, ]
  }
}


Can I exclude some system fields (like _shards.*, hits._index, hits._type, 
hits._id, hits._score)? I found how to exclude source fields, but not system 
fields.
Also, I need to get the _timestamp field in the _source rows. It is generated 
from the 'date' field:

'_timestamp' => array(
'enabled' => true,
'path' => 'date',
'format' => "-MM-dd'T'HH:mm:ss"
)


Thanks



Re: Need help on similarity ranking approach

2014-05-29 Thread Alex Ksikes
Hello,

I am not sure that would work. I'd first index your document, and then use 
mlt with this document's id and include set to true (added in the latest ES 
release). Then you'll know how "far" your documents are from the queried 
document. Also, make sure to pick up most of the terms by 
setting percent_terms_to_match=0, max_query_terms to a high value, 
and min_doc_freq=1. To know which terms from the queried document 
have matched in the response, you can use explain.
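Roughly, the query body could look like this. This is a sketch against the 
1.x more_like_this query; whether "docs" and "include" are available depends 
on your exact release, and the index/type/id/field names are made up for 
illustration:

```python
# more_like_this against an already-indexed document, with loosened
# term-selection settings so most terms of the queried doc are used.
query = {
    "query": {
        "more_like_this": {
            "fields": ["body"],
            # Reference the queried document by id rather than pasting text.
            "docs": [{"_index": "docs", "_type": "doc", "_id": "1"}],
            "include": True,            # return the queried doc itself too
            "percent_terms_to_match": 0,
            "max_query_terms": 500,     # high value: keep most terms
            "min_doc_freq": 1,
        }
    }
}
```

The score of the queried document (returned via include) then gives you a 
reference point for interpreting the scores of the other hits.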

Alex

On Thursday, May 29, 2014 10:42:47 AM UTC+2, Rgs wrote:
>
> hi, 
>
> What i did now is, i have created a custom similarity & similarity 
> provider 
> class which extends DefaultSimilarity and AbstractSimilarityProvider 
> classes 
> respectively and overridden the idf() method to return 1. 
>
> Now I'm getting some percentage values like 1, 0.987, 0.876 etc and 
> interpret it as 100%, 98%, 87% etc. 
>
> Can you please confirm whether this approach can be taken for finding the 
> percentage of similarity? 
>
> sorry for the late reply. 
>
> Thanks 
> Rgs 
>
>
>
> -- 
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Need-help-on-similarity-ranking-approach-tp4054847p4056680.html
>  
> Sent from the ElasticSearch Users mailing list archive at Nabble.com. 
>




Re: Failed to send release search context - SourceLookup fetching cost (?)

2014-05-29 Thread Mateusz Kaczynski
Anyone encountered something similar / related? 

On Tuesday, 27 May 2014 13:46:19 UTC, Mateusz Kaczynski wrote:
>
> We have recently changed some of our code to include additional call to 
> SourceLookup.extractValue(path) in fetch stage. Soon after, we have 
> started experiencing some issues with search stability (with some of our 
> searches failing to finish, others taking very long). 
>
> I can see search lookup queue getting filled up (to 1000) and I can see 
> the following spamming the logs:
>
> es-cluster-3-3 elasticsearch[2014-05-26 18:59:28,584][WARN ][search.action 
> ] [Maeby] Failed to send release search context
> es-cluster-3-3 
> elasticsearchorg.elasticsearch.transport.SendRequestTransportException: 
> [Oscar][inet[/
> 10.0.0.1/ip-10-0-0-1.eu-west-1.compute.internal:9300]][search/freeContext
> ]
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.search.action.SearchServiceTransportAction.sendFreeContext(SearchServiceTransportAction.java:103)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.releaseIrrelevantSearchContexts(TransportSearchTypeAction.java:392)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.finishHim(TransportSearchQueryThenFetchAction.java:191)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.onFetchFailure(TransportSearchQueryThenFetchAction.java:177)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction$3.onFailure(TransportSearchQueryThenFetchAction.java:165)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.search.action.SearchServiceTransportAction$10.handleException(SearchServiceTransportAction.java:426)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.TransportService$Adapter$2$1.run(TransportService.java:316)
> es-cluster-3-3 elasticsearch at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> es-cluster-3-3 elasticsearch at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> es-cluster-3-3 elasticsearch at java.lang.Thread.run(Thread.java:745)
> es-cluster-3-3 elasticsearchCaused by: 
> org.elasticsearch.transport.NodeNotConnectedException: [Oscar][inet[/
> 10.0.0.1/ip-10-0-0-1.eu-west-1.compute.internal:9300]] Node not connected
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
> es-cluster-3-3 elasticsearch at 
> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
> es-cluster-3-3 elasticsearch ... 11 more
>
> There is also a long GC running at the same time, not sure if it might be 
> the cause or the effect. 
> Is it at all possible that this might have been caused by a call to 
> SourceLookup.extractValue() assuming the field that is extracted was not 
> specified in the query?
>



Trying to fetch document ids with a geohash_grid aggregation

2014-05-29 Thread svartalf
Hi!

I'm trying to cluster geo-points and got stuck on two problems.

1. If a bucket contains only one document, is there any way to get its id?
2. Is there any way to get only the aggregated data from ES, without the 
'hits' section? There is really a lot of unnecessary data there.

ES version: 1.1.1

Query example:
 

> {
> "size": 1000,
> "aggregations": {
> 'cells': {
> 'geohash_grid': {
> 'field': 'point',
> 'precision': precision,
> },
> 'aggregations': {
> 'lat': {
> 'avg': {
> 'script': '''doc['point'].lat''',
> }
> },
> 'lon': {
> 'avg': {
> 'script': '''doc['point'].lon''',
> }
> }
> }
> }
> },
> 'query': {
> 'constant_score': {
> 'filter': {
> 'exists': {
> 'field': 'point',
> }
> }
> }
> }
> }



Re: Nodes restarting automatically

2014-05-29 Thread David Pilato
I think, but might be wrong, that this node, being unresponsive, no longer 
collects GC data.
Maybe you could look further into the past, before things started to get worse.


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 29 mai 2014 à 10:43, Jorge Ferrando  a écrit :

This is what Marvel shows for old GC in the last 6 hours for that node:




> On Thu, May 29, 2014 at 10:39 AM, David Pilato  wrote:
> It sounds like the old GC is not able to clean old gen space enough.
> I guess that if you look at your Marvel dashboards, you can see that on old 
> GC.
> 
> So memory pressure is the first guess. You may have too many old GC cycles.
> 
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
> Le 29 mai 2014 à 10:32, Jorge Ferrando  a écrit :
> 
> Thanks for the answer David
> 
> I added this setting to elasticsearch.yml some days ago to see if that what's 
> the problem:
> 
> discovery.zen.ping.timeout: 5s
> discovery.zen.fd.ping_interval: 5s
> discovery.zen.fd.ping_timeout: 60s
> discovery.zen.fd.ping_retries: 3
> 
> If I'm not mistaken, with those settings the node should be marked as 
> unavailable after 3m and most of the times it happens quicker. Am I wrong?
> 
> 
>> On Thu, May 29, 2014 at 10:29 AM, David Pilato  wrote:
>> GC took too much time so your node become unresponsive I think.
>> If you set 30 Gb RAM, you should increase the time out ping setting before a 
>> node is marked as unresponsive.
>> 
>> And if you are under memory pressure, you could try to check your requests 
>> and see if you can have some optimization or start new nodes...
>> 
>> My 2 cents.
>> 
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>> 
>> 
>> Le 29 mai 2014 à 09:56, Jorge Ferrando  a écrit :
>> 
>> I've been analyzing the problem with Marvel and nagios and I managed to get 
>> 2 more details:
>> 
>> - The node restarting/reinitializing it's always the same. Node 3
>> - It always happens quickly after getting the cluster in green state. 
>> Between some seconds and 2-3 minutes
>> 
>> I have debug mode on in logging.yml:
>> 
>> logger:
>>   # log action execution errors for easier debugging
>>   action: DEBUG
>> 
>> But i dont see anything in the log. For instance, this is the last time it 
>> happened at around 9:47 the cluster became green and 9:50 the node restarted
>> 
>> [2014-05-29 09:30:57,235][INFO ][monitor.jvm  ] [elastic ASIC 
>> nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total 
>> [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] 
>> [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] 
>> [463.1mb]->[524.1mb]/[29.3gb]}
>> [2014-05-29 09:45:36,322][WARN ][monitor.jvm  ] [elastic ASIC 
>> nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total 
>> [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] 
>> [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] 
>> [5gb]->[4.2gb]/[29.3gb]}
>> [2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC 
>> nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
>> [2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC 
>> nodo 3] initializing ...
>> [2014-05-29 09:50:41,063][INFO ][plugins  ] [elastic ASIC 
>> nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, 
>> head]
>> [2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC 
>> nodo 3] initialized
>> [2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC 
>> nodo 3] starting ...
>> 
>> ¿Is there any other way of debugging what's going on with that node? 
>> 
>> 
>> 
>> 
>>> On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando  wrote:
>>> I thought about that but It would be strange because they are 3 Virtual 
>>> Machines in the same VMWare cluster with other hundreds of services and 
>>> nobody reported any networking problem.
>>> 
>>> 
 On Thu, May 22, 2014 at 3:16 PM, emeschitc  wrote:
 Hi,
 
 I may be wrong but it seems to me you have a problem with your network. It 
 may be a flaky connection, broken nic or something wrong with your 
 configuration for discovery and/or data transport ? 
 
 Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic 
 ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
at 
 org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at 
 org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at 
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
 
 Check the status of the network on this node.
 
 
 
> On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] 
> <[hidden email]> wrote:
> Hello 
> 
> We 

How to know the query commands that ES server accepted? Any flag?

2014-05-29 Thread Ivan Ji
Hi all,

I am wondering: is there any debug flag that can make the ES server print 
out all the query commands it has accepted?

For debugging reasons, I need to know the sequence of query commands that 
ES accepted.

Or is there any config that can enable this logging?

Any ideas?

Best,

Ivan



Re: ElasticSearch and save the information into a file!! Urgent... Thank you!!

2014-05-29 Thread Francho punto
Thank you David,

I will try it now, but I'm worried because I don't have too much of an idea 
about using this program. I will try to use that API, and I'll keep you informed.

Thank you!!! :)



El jueves, 29 de mayo de 2014 10:33:29 UTC+2, David Pilato escribió:
>
> You should use scan and scroll API because the query will just return by 
> default the 10 more relevant docs, not the whole resultset.
> Though it won't format your result. You need to parse JSON on your client 
> and render it as you need.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 29 mai 2014 à 10:05, Francho punto > 
> a écrit :
>
> Hello everyone!,
>
> I am working on my final project work and I have to use ElasticSearch but 
> I am new so I don`t have enough idea, and I am running out of time...
>
> I have installed the river twitter for elasticsearch and I collect 
> information from there depending on some search terms. What I want to know 
> is how I can dump the collected information to a file. For example .txt, to 
> process it later. 
>
> What I have done is to colect all information from the river (with the 
> code attached putting it on a. Sh) and save the information in a fime using 
> >> file.txt 
>
> The code that I use to get the information is this:
>
>  
>
> -XPOST curl -d '
> {
>
> "query": 
>
> {
>
> "query_string": 
>
> {
>
> query ":" * "/ / To catch all the tweets 
>
> } 
>
> } 
>
> } '
>
>
>  
> The output is the set of tweets with a lot of information like post id, 
> name, text, language...all mixed.
>
> I wonder if you could tell me another option easier or with the output 
> clearer. 
>
> Regards, and thank you very much!
>
>
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/a13a2205-27e5-442b-b200-022b780fe74c%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: Need help on similarity ranking approach

2014-05-29 Thread Rgs
hi,

What I did now is: I have created a custom similarity and similarity provider
class, which extend the DefaultSimilarity and AbstractSimilarityProvider classes
respectively, and overridden the idf() method to return 1.

Now I'm getting percentage-like values such as 1, 0.987, 0.876 etc. and
interpret them as 100%, 98%, 87% etc.

Can you please confirm whether this approach can be taken for finding the
percentage of similarity?

sorry for the late reply.

Thanks
Rgs



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Need-help-on-similarity-ranking-approach-tp4054847p4056680.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: Nodes restarting automatically

2014-05-29 Thread David Pilato
It sounds like the old GC is not able to clean old gen space enough.
I guess that if you look at your Marvel dashboards, you can see that on old GC.

So memory pressure is the first guess. You may have too many old GC cycles.


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 29 mai 2014 à 10:32, Jorge Ferrando  a écrit :

Thanks for the answer David

I added this setting to elasticsearch.yml some days ago to see if that what's 
the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as 
unavailable after 3m and most of the times it happens quicker. Am I wrong?


> On Thu, May 29, 2014 at 10:29 AM, David Pilato  wrote:
> GC took too much time so your node become unresponsive I think.
> If you set 30 Gb RAM, you should increase the time out ping setting before a 
> node is marked as unresponsive.
> 
> And if you are under memory pressure, you could try to check your requests 
> and see if you can have some optimization or start new nodes...
> 
> My 2 cents.
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
On 29 May 2014, at 09:56, Jorge Ferrando  wrote:
> 
> I've been analyzing the problem with Marvel and nagios and I managed to get 2 
> more details:
> 
- The node restarting/reinitializing is always the same: node 3
- It always happens quickly after the cluster reaches green state: between 
a few seconds and 2-3 minutes
> 
> I have debug mode on in logging.yml:
> 
> logger:
>   # log action execution errors for easier debugging
>   action: DEBUG
> 
> But I don't see anything in the log. For instance, this is the last time it 
> happened: around 9:47 the cluster became green, and at 9:50 the node restarted
> 
> [2014-05-29 09:30:57,235][INFO ][monitor.jvm  ] [elastic ASIC 
> nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total 
> [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] 
> [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] 
> [463.1mb]->[524.1mb]/[29.3gb]}
> [2014-05-29 09:45:36,322][WARN ][monitor.jvm  ] [elastic ASIC 
> nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total 
> [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] 
> [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] 
> [5gb]->[4.2gb]/[29.3gb]}
> [2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC 
> nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
> [2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC 
> nodo 3] initializing ...
> [2014-05-29 09:50:41,063][INFO ][plugins  ] [elastic ASIC 
> nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, 
> head]
> [2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC 
> nodo 3] initialized
> [2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC 
> nodo 3] starting ...
> 
> Is there any other way of debugging what's going on with that node? 
> 
> 
> 
> 
>> On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando  wrote:
>> I thought about that, but it would be strange because they are 3 virtual 
>> machines in the same VMware cluster with hundreds of other services, and 
>> nobody reported any networking problem.
>> 
>> 
>>> On Thu, May 22, 2014 at 3:16 PM, emeschitc  wrote:
>>> Hi,
>>> 
>>> I may be wrong, but it seems to me you have a problem with your network. It 
>>> may be a flaky connection, a broken NIC, or something wrong with your 
>>> configuration for discovery and/or data transport?
>>> 
>>> Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic 
>>> ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
>>> at 
>>> org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
>>> at 
>>> org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
>>> at 
>>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
>>> 
>>> Check the status of the network on this node.
>>> 
>>> 
>>> 
 On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] 
 <[hidden email]> wrote:
 Hello 
 
 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and 
 elasticsearch v1.1.1
 
 It's been running flawlessly, but since last week some of the nodes 
 restart randomly and the cluster goes red, then yellow, then green, 
 and it happens again in a loop (sometimes it doesn't even reach green)
 
 I've tried to look at the logs but I can't find an obvious reason for 
 what could be going on
 
 I've found entries like these, but I don't know if they are in some way 
 related to the crash:
 
 [

Re: ElasticSearch and save the information into a file!! Urgent... Thank you!!

2014-05-29 Thread David Pilato
You should use the scan and scroll API, because by default the query will just 
return the 10 most relevant docs, not the whole result set.
Though it won't format your result. You need to parse JSON on your client and 
render it as you need.
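As a rough illustration of the scan-and-scroll loop being suggested (the fetch function here is a hypothetical stub standing in for the HTTP round trips to /_search?scroll and /_search/scroll; the page data is made up):

```python
import json

# Fake pages standing in for successive scroll responses from the cluster;
# a real client would POST the _scroll_id back to /_search/scroll each time.
pages = [
    {"_scroll_id": "s1", "hits": {"hits": [{"_source": {"text": "tweet 1"}},
                                           {"_source": {"text": "tweet 2"}}]}},
    {"_scroll_id": "s2", "hits": {"hits": [{"_source": {"text": "tweet 3"}}]}},
    {"_scroll_id": "s3", "hits": {"hits": []}},  # an empty page ends the scroll
]

def fetch(scroll_id=None):
    # Hypothetical stub for the HTTP round trip.
    return pages.pop(0)

def scroll_all(fetch):
    # Keep pulling pages until one comes back empty, collecting _source docs.
    docs = []
    page = fetch()
    while page["hits"]["hits"]:
        docs.extend(hit["_source"] for hit in page["hits"]["hits"])
        page = fetch(page["_scroll_id"])
    return docs

docs = scroll_all(fetch)
# One JSON object per line -- easy to redirect into file.txt and parse later.
print("\n".join(json.dumps(d) for d in docs))
```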

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 29 May 2014, at 10:05, Francho punto  wrote:

Hello everyone!,

I am working on my final project and I have to use ElasticSearch, but I am 
new to it, so I don't know much about it, and I am running out of time...

I have installed the river twitter for elasticsearch and I collect information 
from there depending on some search terms. What I want to know is how I can 
dump the collected information to a file. For example .txt, to process it 
later. 

What I have done is to collect all the information from the river (with the code 
attached, putting it in a .sh) and save the information to a file using >> 
file.txt 

The code that I use to get the information is this:
 
curl -XPOST -d '
{
"query": 
{
"query_string": 
{
"query": "*" // To catch all the tweets 
} 
} 
} '

 
The output is the set of tweets with a lot of information like post id, name, 
text, language...all mixed.

I wonder if you could suggest an easier option, or one with clearer output. 

Regards, and thank you very much!





Re: TransportClient of ES 1.1.1 on Ubuntu 12.x throws 'No node available' exception

2014-05-29 Thread Chetana
Thanks David, .bashrc had PATH entries for 2 versions; one was overriding the 
other.
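The shadowing described here comes from left-to-right PATH search; a small sketch (the install paths are hypothetical, for illustration only):

```python
# PATH is searched left to right, so whichever install's bin directory
# appears first shadows the rest -- e.g. a stale 0.90.2 entry hiding 1.1.1.
# (Paths below are hypothetical, for illustration only.)
path = "/opt/elasticsearch-0.90.2/bin:/opt/elasticsearch-1.1.1/bin"
first_match = next(d for d in path.split(":") if "elasticsearch" in d)
print(first_match)  # -> /opt/elasticsearch-0.90.2/bin (the stale entry wins)
```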
 

On Wednesday, May 28, 2014 3:02:29 PM UTC+5:30, David Pilato wrote:

> It sounds like you are mixing elasticsearch versions (client and node) or 
> JVM versions.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 28 May 2014, at 10:34, Chetana > wrote:
>
> ES log shows this warning message, 'transport.netty- Message not fully 
> read (request) for [0] and action [], resetting' 
> On Wednesday, May 28, 2014 12:07:39 PM UTC+5:30, Chetana wrote:
>
>> I have upgraded ES from 0.90.2 to ES 1.1.1 recently. I am able to get the 
>> Transport client of ES 1.1.1 and create a mapping on Windows 2010. But I get 
>> a 'No node available' exception on Ubuntu 12.x
>>  
>> Here is the code I use for connecting to ES
>>  
>> ImmutableSettings.Builder settingsBuilder = 
>> ImmutableSettings.settingsBuilder();
>>  
>> // because default cluster name used
>>   //settingsBuilder.put("cluster.name", "elasticsearch");
>>  
>>   TransportClient client = new TransportClient(settingsBuilder.build());
>>   client.addTransportAddress(new InetSocketTransportAddress(
>>   "localhost", 
>>   9300));
>>   
>>client.admin().cluster().prepareHealth()
>>  .setWaitForYellowStatus() 
>> .setTimeout(TimeValue.timeValueMinutes(1)) .execute() .actionGet();
>>  
>> CreateIndexRequestBuilder builder = 
>> client.admin().indices().prepareCreate("sample").setSettings(settings);
>>   client.admin().indices().create(builder.request()).get();
>>   PutMappingRequestBuilder putBuilder = 
>> client.admin().indices().preparePutMapping("sample");
>>   putBuilder.setType("sample");
>>   
>>   XContentBuilder parentMapping 
>> = 
>> XContentFactory.jsonBuilder().startObject().startObject("sample").field("include_in_all",false).endObject().endObject();
>>   
>>   putBuilder.setSource(parentMapping);
>>   client.admin().indices().putMapping(putBuilder.request()).get();
>>  
>> Am I missing any setting? Please suggest how to troubleshoot this issue
>>  
>> Thanks
>>
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/228ae7a0-d688-443f-88c2-127b708810dd%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: Nodes restarting automatically

2014-05-29 Thread Jorge Ferrando
Thanks for the answer David

I added these settings to elasticsearch.yml some days ago to see if that
was the problem:

discovery.zen.ping.timeout: 5s
discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 3

If I'm not mistaken, with those settings the node should be marked as
unavailable after 3m, and most of the time it happens sooner. Am I wrong?


On Thu, May 29, 2014 at 10:29 AM, David Pilato  wrote:

> GC took too much time, so your node became unresponsive, I think.
> If you set 30 Gb RAM, you should increase the time out ping setting before
> a node is marked as unresponsive.
>
> And if you are under memory pressure, you could try to check your requests
> and see if you can have some optimization or start new nodes...
>
> My 2 cents.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 29 May 2014, at 09:56, Jorge Ferrando  wrote:
>
> I've been analyzing the problem with Marvel and nagios and I managed to
> get 2 more details:
>
> - The node restarting/reinitializing is always the same: node 3
> - It always happens quickly after the cluster reaches green state: between
> a few seconds and 2-3 minutes
>
> I have debug mode on in logging.yml:
>
> logger:
>   # log action execution errors for easier debugging
>   action: DEBUG
>
> But I don't see anything in the log. For instance, this is the last time it
> happened: around 9:47 the cluster became green, and at 9:50 the node restarted
>
> [2014-05-29 09:30:57,235][INFO ][monitor.jvm  ] [elastic ASIC
> nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total
> [745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young]
> [421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
> [463.1mb]->[524.1mb]/[29.3gb]}
> [2014-05-29 09:45:36,322][WARN ][monitor.jvm  ] [elastic ASIC
> nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total
> [29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young]
> [29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old]
> [5gb]->[4.2gb]/[29.3gb]}
> [2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC
> nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
> [2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC
> nodo 3] initializing ...
> [2014-05-29 09:50:41,063][INFO ][plugins  ] [elastic ASIC
> nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk,
> head]
> [2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC
> nodo 3] initialized
> [2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC
> nodo 3] starting ...
>
> Is there any other way of debugging what's going on with that node?
>
>
>
>
> On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando 
> wrote:
>
>> I thought about that, but it would be strange because they are 3 virtual
>> machines in the same VMware cluster with hundreds of other services, and
>> nobody reported any networking problem.
>>
>>
>> On Thu, May 22, 2014 at 3:16 PM, emeschitc  wrote:
>>
>>> Hi,
>>>
>>> I may be wrong, but it seems to me you have a problem with your network.
>>> It may be a flaky connection, a broken NIC, or something wrong with your
>>> configuration for discovery and/or data transport?
>>>
>>> Caused by: org.elasticsearch.transport.NodeNotConnectedException:
>>> [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
>>>  at
>>> org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
>>> at
>>> org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
>>>  at
>>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
>>>
>>> Check the status of the network on this node.
>>>
>>>
>>>
>>> On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch
>>> Users] <[hidden email]
>>> > wrote:
>>>
 Hello

 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
 elasticsearch v1.1.1

 It's been running flawlessly, but since last week some of the nodes
 restart randomly and the cluster goes red, then yellow, then green,
 and it happens again in a loop (sometimes it doesn't even reach green)

 I've tried to look at the logs but I can't find an obvious reason for
 what could be going on

 I've found entries like these, but I don't know if they are in some way
 related to the crash:

 [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end.raw] ret

Re: Nodes restarting automatically

2014-05-29 Thread David Pilato
GC took too much time, so your node became unresponsive, I think.
If you set 30 Gb RAM, you should increase the time out ping setting before a 
node is marked as unresponsive.

And if you are under memory pressure, you could try to check your requests and 
see if you can have some optimization or start new nodes...

My 2 cents.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 29 May 2014, at 09:56, Jorge Ferrando  wrote:

I've been analyzing the problem with Marvel and nagios and I managed to get 2 
more details:

- The node restarting/reinitializing is always the same: node 3
- It always happens quickly after the cluster reaches green state: between 
a few seconds and 2-3 minutes

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, this is the last time it 
happened: around 9:47 the cluster became green, and at 9:50 the node restarted

[2014-05-29 09:30:57,235][INFO ][monitor.jvm  ] [elastic ASIC nodo 
3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total 
[745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young] 
[421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old] 
[463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm  ] [elastic ASIC nodo 
3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total 
[29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young] 
[29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old] 
[5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC nodo 
3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC nodo 
3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins  ] [elastic ASIC nodo 
3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC nodo 
3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC nodo 
3] starting ...

Is there any other way of debugging what's going on with that node? 




> On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando  wrote:
> I thought about that, but it would be strange because they are 3 virtual 
> machines in the same VMware cluster with hundreds of other services, and 
> nobody reported any networking problem.
> 
> 
>> On Thu, May 22, 2014 at 3:16 PM, emeschitc  wrote:
>> Hi,
>> 
>> I may be wrong, but it seems to me you have a problem with your network. It 
>> may be a flaky connection, a broken NIC, or something wrong with your 
>> configuration for discovery and/or data transport?
>> 
>> Caused by: org.elasticsearch.transport.NodeNotConnectedException: [elastic 
>> ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
>>  at 
>> org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
>>  at 
>> org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
>>  at 
>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
>> 
>> Check the status of the network on this node.
>> 
>> 
>> 
>>> On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users] 
>>> <[hidden email]> wrote:
>>> Hello 
>>> 
>>> We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and 
>>> elasticsearch v1.1.1
>>> 
>>> It's been running flawlessly, but since last week some of the nodes 
>>> restart randomly and the cluster goes red, then yellow, then green, 
>>> and it happens again in a loop (sometimes it doesn't even reach green)
>>> 
>>> I've tried to look at the logs but I can't find an obvious reason for 
>>> what could be going on
>>> 
>>> I've found entries like these, but I don't know if they are in some way 
>>> related to the crash:
>>> 
>>> [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic ASIC 
>>> nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] 
>>> returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
>>> nodo 3] [logstash-2014.05.22] no index mapper found for field: 
>>> [date_end.raw] returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
>>> nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] 
>>> returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
>>> nodo 3] [logstash-2014.05.22] no index mapper found for field: 
>>> [date_start.raw] returning default postings format
>>> 
>>> 
>>> For instance, right now it was in yellow state, really close to getting to 
>>> green, when suddenly node 3 auto-restarted, and now the cluster is red with 
>>> 2000 sh

ElasticSearch and save the information into a file!! Urgent... Thank you!!

2014-05-29 Thread Francho punto
Hello everyone!,

I am working on my final project and I have to use ElasticSearch, but I 
am new to it, so I don't know much about it, and I am running out of time...

I have installed the river twitter for elasticsearch and I collect 
information from there depending on some search terms. What I want to know 
is how I can dump the collected information to a file. For example .txt, to 
process it later. 

What I have done is to collect all the information from the river (with the code 
attached, putting it in a .sh) and save the information to a file using >> 
file.txt 

The code that I use to get the information is this:

 

curl -XPOST -d '
{
"query": 
{
"query_string": 
{
"query": "*" // To catch all the tweets 
} 
} 
} '


 
The output is the set of tweets with a lot of information like post id, 
name, text, language...all mixed.

I wonder if you could suggest an easier option, or one with clearer 
output. 

Regards, and thank you very much!




Re: Nodes restarting automatically

2014-05-29 Thread Jorge Ferrando
I've been analyzing the problem with Marvel and nagios and I managed to get
2 more details:

- The node restarting/reinitializing is always the same: node 3
- It always happens quickly after the cluster reaches green state:
between a few seconds and 2-3 minutes

I have debug mode on in logging.yml:

logger:
  # log action execution errors for easier debugging
  action: DEBUG

But I don't see anything in the log. For instance, this is the last time it
happened: around 9:47 the cluster became green, and at 9:50 the node restarted

[2014-05-29 09:30:57,235][INFO ][monitor.jvm  ] [elastic ASIC
nodo 3] [gc][young][129][20] duration [745ms], collections [1]/[1s], total
[745ms]/[8.5s], memory [951.1mb]->[598.9mb]/[29.9gb], all_pools {[young]
[421.5mb]->[8.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
[463.1mb]->[524.1mb]/[29.3gb]}
[2014-05-29 09:45:36,322][WARN ][monitor.jvm  ] [elastic ASIC
nodo 3] [gc][old][964][1] duration [29.5s], collections [1]/[30.4s], total
[29.5s]/[29.5s], memory [5.1gb]->[4.3gb]/[29.9gb], all_pools {[young]
[29.4mb]->[34.9mb]/[532.5mb]}{[survivor] [59.9mb]->[0b]/[66.5mb]}{[old]
[5gb]->[4.2gb]/[29.3gb]}
[2014-05-29 09:50:41,040][INFO ][node ] [elastic ASIC
nodo 3] version[1.2.0], pid[7021], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-29 09:50:41,041][INFO ][node ] [elastic ASIC
nodo 3] initializing ...
[2014-05-29 09:50:41,063][INFO ][plugins  ] [elastic ASIC
nodo 3] loaded [marvel], sites [marvel, paramedic, inquisitor, HQ, bigdesk,
head]
[2014-05-29 09:50:47,908][INFO ][node ] [elastic ASIC
nodo 3] initialized
[2014-05-29 09:50:47,909][INFO ][node ] [elastic ASIC
nodo 3] starting ...

Is there any other way of debugging what's going on with that node?




On Tue, May 27, 2014 at 12:49 PM, Jorge Ferrando  wrote:

> I thought about that, but it would be strange because they are 3 virtual
> machines in the same VMware cluster with hundreds of other services, and
> nobody reported any networking problem.
>
>
> On Thu, May 22, 2014 at 3:16 PM, emeschitc  wrote:
>
>> Hi,
>>
>> I may be wrong, but it seems to me you have a problem with your network.
>> It may be a flaky connection, a broken NIC, or something wrong with your
>> configuration for discovery and/or data transport?
>>
>> Caused by: org.elasticsearch.transport.NodeNotConnectedException:
>> [elastic ASIC nodo 2][inet[/158.42.250.79:9301]] Node not connected
>>  at
>> org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
>> at
>> org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
>>  at
>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
>>
>> Check the status of the network on this node.
>>
>>
>>
>> On Thu, May 22, 2014 at 2:07 PM, Jorge Ferrando [via ElasticSearch Users]
>> <[hidden email] >
>> wrote:
>>
>>> Hello
>>>
>>> We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
>>> elasticsearch v1.1.1
>>>
>>> It's been running flawlessly, but since last week some of the nodes
>>> restart randomly and the cluster goes red, then yellow, then green,
>>> and it happens again in a loop (sometimes it doesn't even reach green)
>>>
>>> I've tried to look at the logs but I can't find an obvious reason for
>>> what could be going on
>>>
>>> I've found entries like these, but I don't know if they are in some way
>>> related to the crash:
>>>
>>> [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic
>>> ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
>>> [date_end] returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
>>> ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
>>> [date_end.raw] returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
>>> ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
>>> [date_start] returning default postings format
>>> [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
>>> ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
>>> [date_start.raw] returning default postings format
>>>
>>>
>>> For instance, right now it was in yellow state, really close to getting to
>>> green, when suddenly node 3 auto-restarted, and now the cluster is red
>>> with 2000 shards initializing. The log on that node shows this:
>>>
>>> [2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic
>>> ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
>>> total [735ms]/[1.1m], memory [6.5gb]->[6.1gb]/[19.9gb], all_pools {[young]
>>> [456mb]->[7.2mb]/[532.5mb]}{[survivor] [66.5mb]->[66.5mb]/[66.5mb]}{[old]
>>> [6gb]->[6gb]/[19.3gb]}
>>> [2014-05-22 14:03:44,825][INFO ][node ] [elast

Duplicate Results Following Upgrade to 1.2.0 & SortScript

2014-05-29 Thread Nariman Haghighi
The following query is now returning duplicate documents following the 
upgrade to 1.2.0. Was this an intentional change that's documented 
somewhere?

{
  "from": 0,
  "size": 25,
  "sort": {
"_script": {
  "order": "desc",
  "type": "number",
  "script": "(doc['id'].value + salt).hashCode()",
  "params": {
"salt": "TMJHXYFHHR"
  }
}
  },
  "query": {
"term": {
  "featured": {
"value": "true"
  }
}
  }
}

The cluster has 2 nodes and the index has the default of 5 shards.
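For reference, the sort script's `(doc['id'].value + salt).hashCode()` relies on Java's String.hashCode; a small sketch reproducing it (assuming standard Java semantics, with a made-up document id) shows the salted sort key is deterministic for a fixed salt, so an unstable hash is an unlikely cause of the duplicates:

```python
def java_string_hashcode(s):
    # Java's String.hashCode(): h = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1],
    # wrapped to a signed 32-bit integer.
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

salt = "TMJHXYFHHR"
key = java_string_hashcode("some-doc-id" + salt)  # hypothetical document id
# The key never changes for a fixed id and salt:
print(key == java_string_hashcode("some-doc-id" + salt))  # -> True
print(java_string_hashcode("abc"))  # -> 96354, matching "abc".hashCode() in Java
```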
