Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread Hari Prasad

Is the cluster name the only way to restrict nodes from joining the 
cluster, whether discovery is unicast or multicast? 

I do have a security concern, and my question is whether there is a way to 
address it from Elasticsearch itself rather than at the network layer.
On Friday, 14 March 2014 12:22:52 UTC+5:30, David Pilato wrote:
>
> If the cluster name is different, the new node won't join the cluster.
> Also, if you have a security concern, you should restrict network access 
> to your nodes at the transport layer (port 9300). 
>
>
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 14 March 2014 at 07:11, Hari Prasad wrote:
>
> Hi
> Can you please explain the following?
> I first started one node with only its own IP in the unicast host list and 
> multicast discovery disabled. The cluster started with one node.
> Then I started another node with node one in its unicast host list. This 
> also joined the cluster.
>
> This is in line with what you said. But if this is the case, then what is 
> the use of setting multicast to false and setting the unicast host list?
> Can't any node join the cluster just like that, by putting any existing 
> node in its unicast list? How can this be limited?
>
> On Thursday, 13 March 2014 19:46:08 UTC+5:30, Hari Prasad wrote:
>>
>> Ok Thank you :)
>>
>> On Thursday, 13 March 2014 19:35:52 UTC+5:30, David Pilato wrote:
>>>
>>> yes
>>>
>>> -- 
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>>> @dadoonet | @elasticsearchfr
>>>
>>>
>>> On 13 March 2014 at 15:00:10, Hari Prasad (iamha...@gmail.com) wrote:
>>>
>>> Is this the case even if discovery.zen.ping.multicast.enabled is 
>>> false?
>>>
>>> On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote: 

 Yes. Just launch the new node and set its unicast values to other 
 running nodes.
 It will connect to the cluster, and the cluster will add it as a new 
 node.

 You don't have to modify the existing nodes' settings, although you should 
 update them so the settings are current in case of a restart.

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

  
 On 13 March 2014 at 14:38, Hari Prasad wrote:

 Hi,
 I have an Elasticsearch cluster and am using unicast to 
 discover nodes. Can I add nodes to the list dynamically without restarting 
 the cluster? 
 I tried to do this with prepareUpdateSettings, but I got "ignoring 
 transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
 updateable".
 Is there any other way to do this without restarting the cluster?

 I am not going for multicast because I don't want rogue nodes to join 
 my cluster. I can go for it if I can, in any way, limit which nodes join 
 the cluster, other than by the cluster name.
 Are there any ways to do this?

 Thanks 
 Hari
  --
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/127e4ba6-3fab-4fca-8595-e4506ef8f101%40googlegroups.com
 .
 For more options, visit https://groups.google.com/d/optout.
  


Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread David Pilato
If the cluster name is different, the new node won't join the cluster.
Also, if you have a security concern, you should restrict network access to 
your nodes at the transport layer (port 9300). 
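One way to do that, sketched here under the assumption of a Linux host with iptables and an example cluster subnet of 10.1.4.0/24 (adjust addresses and tooling to your environment):

```shell
# Allow transport-layer (node-to-node) traffic only from known cluster hosts
iptables -A INPUT -p tcp --dport 9300:9400 -s 10.1.4.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 9300:9400 -j DROP
```

Any machine outside the allowed range then cannot open a transport connection at all, regardless of which cluster name it uses.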



--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 14 March 2014 at 07:11, Hari Prasad wrote:

Hi,
Can you please explain the following?
I first started one node with only its own IP in the unicast host list and 
multicast discovery disabled. The cluster started with one node.
Then I started another node with node one in its unicast host list. This also 
joined the cluster.

This is in line with what you said. But if this is the case, then what is the 
use of setting multicast to false and setting the unicast host list?
Can't any node join the cluster just like that, by putting any existing node 
in its unicast list? How can this be limited?
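For reference, the two-node setup described above corresponds to settings like these in each node's elasticsearch.yml (a sketch of the 1.x zen discovery settings discussed in this thread; the IP is an assumption):

```yaml
cluster.name: my-cluster                      # must match across nodes
discovery.zen.ping.multicast.enabled: false   # no automatic LAN discovery
# A joining node only needs ONE reachable cluster member listed here,
# which is why a second node pointing at node one can still join.
discovery.zen.ping.unicast.hosts: ["10.0.0.1:9300"]
```

Disabling multicast stops nodes from discovering the cluster automatically, but it does not authenticate joiners; that has to happen at the network layer.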

> On Thursday, 13 March 2014 19:46:08 UTC+5:30, Hari Prasad wrote:
> Ok Thank you :)
> 
>> On Thursday, 13 March 2014 19:35:52 UTC+5:30, David Pilato wrote:
>> yes
>> 
>> -- 
>> David Pilato | Technical Advocate | Elasticsearch.com
>> @dadoonet | @elasticsearchfr
>> 
>> 
>>> On 13 March 2014 at 15:00:10, Hari Prasad (iamha...@gmail.com) wrote:
>>> 
>>> Is this the case even if discovery.zen.ping.multicast.enabled is false?
>>> 
 On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote:
 Yes. Just launch the new node and set its unicast values to other running 
 nodes.
 It will connect to the cluster, and the cluster will add it as a new node.
 
 You don't have to modify the existing nodes' settings, although you should 
 update them so the settings are current in case of a restart.
 
 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
 
 
 On 13 March 2014 at 14:38, Hari Prasad wrote:
 
 Hi,
 I have an Elasticsearch cluster and am using unicast to discover 
 nodes. Can I add nodes to the list dynamically without restarting the cluster? 
 I tried to do this with prepareUpdateSettings, but I got "ignoring 
 transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
 updateable".
 Is there any other way to do this without restarting the cluster?
 
 I am not going for multicast because I don't want rogue nodes to join my 
 cluster. I can go for it if I can, in any way, limit which nodes join 
 the cluster, other than by the cluster name.
 Are there any ways to do this?
 
 Thanks 
 Hari




Re: ES 1.0.1 snapshot creation fails

2014-03-13 Thread David Pilato
Can you run mkdir /data/backup_es/snapshot-snapshot_6 from the ES machine as 
the elasticsearch user?
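A quick way to check this, assuming a Linux host where the service runs as an `elasticsearch` OS user (a sketch; adjust the user and path to your setup):

```shell
# If this fails with "Permission denied", the snapshot failure is a
# filesystem permission problem, not an Elasticsearch one.
sudo -u elasticsearch touch /data/backup_es/write-test \
  && sudo -u elasticsearch rm /data/backup_es/write-test \
  && echo "elasticsearch user can write to /data/backup_es"
```

Note that with sshfs mounts, local chmod/chown may not be enough: by default only the user who mounted a FUSE filesystem can access it (the `allow_other` mount option and the remote-side permissions are what count).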

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 14 March 2014 at 07:10, Yatish Teddla wrote:

Hi David,

I have given full access permissions to that directory (chmod 777), 
but I am still getting that error.


> On Fri, Mar 14, 2014 at 11:37 AM, David Pilato  wrote:
> The user who is running elasticsearch needs to have full control of your 
> /data/backup_es
> 
> Is it the case?
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
> On 14 March 2014 at 06:48, Yatish Teddla wrote:
> 
> Hi Everyone,
> 
> For the backup directory I used a directory mounted from another machine 
> using sshfs. I changed that directory's owner and group to "elasticsearch".
> 
> When I try to create a snapshot I get the error below.
> 
> {"error":"SnapshotCreationException[[backup:snapshot_6] failed to create 
> snapshot]; nested: FileNotFoundException[/data/backup_es/snapshot-snapshot_6 
> (Permission denied)]; ","status":500}
> 
> Any idea why I am getting this error, or how I can use the mounted directory 
> as backup space?
> 
> 
> 
> 



Re: ES 1.0.1 snapshot creation fails

2014-03-13 Thread Yatish Teddla
Hi David,

I have given full access permissions to that directory (chmod 777),
but I am still getting that error.


On Fri, Mar 14, 2014 at 11:37 AM, David Pilato  wrote:

> The user who is running elasticsearch needs to have full control of your
> /data/backup_es
>
> Is it the case?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 14 March 2014 at 06:48, Yatish Teddla wrote:
>
> Hi Everyone,
>
> For the backup directory I used a directory mounted from another machine
> using sshfs. I changed that directory's owner and group to "elasticsearch".
>
> When I try to create a snapshot I get the error below.
>
> {"error":"SnapshotCreationException[[backup:snapshot_6] failed to create
> snapshot]; nested:
> FileNotFoundException[/data/backup_es/snapshot-snapshot_6 (Permission
> denied)]; ","status":500}
>
> Any idea why I am getting this error, or how I can use the mounted directory
> as backup space?
>



Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread Hari Prasad
Hi,
Can you please explain the following?
I first started one node with only its own IP in the unicast host list and 
multicast discovery disabled. The cluster started with one node.
Then I started another node with node one in its unicast host list. This 
also joined the cluster.

This is in line with what you said. But if this is the case, then what is the 
use of setting multicast to false and setting the unicast host list?
Can't any node join the cluster just like that, by putting any existing node 
in its unicast list? How can this be limited?

On Thursday, 13 March 2014 19:46:08 UTC+5:30, Hari Prasad wrote:
>
> Ok Thank you :)
>
> On Thursday, 13 March 2014 19:35:52 UTC+5:30, David Pilato wrote:
>>
>> yes
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet | @elasticsearchfr
>>
>>
>> On 13 March 2014 at 15:00:10, Hari Prasad (iamha...@gmail.com) wrote:
>>
>> Is this the case even if discovery.zen.ping.multicast.enabled is 
>> false?
>>
>> On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote: 
>>>
>>> Yes. Just launch the new node and set its unicast values to other 
>>> running nodes.
>>> It will connect to the cluster, and the cluster will add it as a new 
>>> node.
>>>
>>> You don't have to modify the existing nodes' settings, although you should 
>>> update them so the settings are current in case of a restart.
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>>  
>>> On 13 March 2014 at 14:38, Hari Prasad wrote:
>>>
>>> Hi,
>>> I have an Elasticsearch cluster and am using unicast to discover 
>>> nodes. Can I add nodes to the list dynamically without restarting the cluster? 
>>> I tried to do this with prepareUpdateSettings, but I got "ignoring 
>>> transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
>>> updateable".
>>> Is there any other way to do this without restarting the cluster?
>>>
>>> I am not going for multicast because I don't want rogue nodes to join my 
>>> cluster. I can go for it if I can, in any way, limit which nodes join the 
>>> cluster, other than by the cluster name.
>>> Are there any ways to do this?
>>>
>>> Thanks 
>>> Hari



Re: ES 1.0.1 snapshot creation fails

2014-03-13 Thread David Pilato
The user who is running elasticsearch needs to have full control of your 
/data/backup_es

Is it the case?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 14 March 2014 at 06:48, Yatish Teddla wrote:

Hi Everyone,

For the backup directory I used a directory mounted from another machine using 
sshfs. I changed that directory's owner and group to "elasticsearch".

When I try to create a snapshot I get the error below.

{"error":"SnapshotCreationException[[backup:snapshot_6] failed to create 
snapshot]; nested: FileNotFoundException[/data/backup_es/snapshot-snapshot_6 
(Permission denied)]; ","status":500}

Any idea why I am getting this error, or how I can use the mounted directory 
as backup space?



Term Filter by document

2014-03-13 Thread Allon
When I read the case for feeding a terms filter from a field of a document, it 
made perfect sense that it would be completely unwieldy to have to fetch 
over the wire, and then submit back over the wire, what could potentially be 
tens or hundreds of thousands of values. But when I read that the solution 
can only point to the values stored within a single document, it wasn't clear 
to me how you could create or manage a document of that size to begin 
with.

Effectively, if you want to follow the tweets of your followers, all of 
your followers need to be stored on the same user document. Eventually 
your list of followers could grow quite big, but to add or remove followers 
you need to get and update that document, with its potentially hundreds of 
thousands of followers, over the wire anyway. Right? So I don't understand 
the value.

Could someone explain it to me, because I would really like to filter 
documents that relate to a large set of changing values. It seems like 
this filter could only be used against types that don't change frequently.

Thanks,

-Allon
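For readers following along, the mechanism under discussion is the terms filter's document lookup (available since Elasticsearch 0.90.6); a sketch of its shape, with the index, type, and field names invented for illustration:

```json
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "terms": {
          "user_id": {
            "index": "users",
            "type": "user",
            "id": "allon",
            "path": "followers"
          }
        }
      }
    }
  }
}
```

The filter fetches the `followers` array from the `users/user/allon` document on the server side, avoiding the round trip of the values through the client, which is the advantage being questioned above.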



ES 1.0.1 snapshot creation fails

2014-03-13 Thread Yatish Teddla
Hi Everyone,

For the backup directory I used a directory mounted from another machine using 
sshfs. I changed that directory's owner and group to "elasticsearch".

When I try to create a snapshot I get the error below.

{"error":"SnapshotCreationException[[backup:snapshot_6] failed to create 
snapshot]; nested: 
FileNotFoundException[/data/backup_es/snapshot-snapshot_6 (Permission 
denied)]; ","status":500}

Any idea why I am getting this error, or how I can use the mounted directory 
as backup space?
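For context, the flow involved here uses the ES 1.0 snapshot/restore API; a sketch, with the repository name and path taken from the error message above and the host assumed:

```shell
# Register a shared-filesystem repository named "backup"
curl -XPUT "localhost:9200/_snapshot/backup" -d '{
  "type": "fs",
  "settings": { "location": "/data/backup_es" }
}'

# Create the snapshot; this is the step that fails with "Permission denied"
# when the OS user running Elasticsearch cannot write to /data/backup_es
curl -XPUT "localhost:9200/_snapshot/backup/snapshot_6?wait_for_completion=true"
```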



Re: Question on script scoring

2014-03-13 Thread Amit Soni
Hi all - After a good amount of debugging, I figured out the problem and
wanted to share it with this group. I had defined the mapping for 'boostValue'
with "index": "no", my bad :)

After I changed the mapping to "index": "not_analyzed" for this boost field, it
worked fine.

-Amit.
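A sketch of the mapping change described above (the index and type names are invented; ES 1.x syntax; note that changing the index setting of an existing field generally requires reindexing):

```json
PUT /myindex/_mapping/product
{
  "product": {
    "properties": {
      "boostValue": {
        "type": "float",
        "index": "not_analyzed"
      }
    }
  }
}
```

With "index": "no" the field is never indexed, so there is no field data for `doc['boostValue']` to read in the script, which would explain why the script saw the field as empty.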


On Thu, Mar 6, 2014 at 9:36 AM, Amit Soni  wrote:

> hello everyone - first of all, my apologies for following up on this so
> soon.
>
> wondering if anyone has got things working with script scoring and can
> share their script, or help me with what might be wrong with the below
> script.
>
>
> "script_score": {
> "script": "doc['boostValue'].empty || (doc['boostValue'].value <=
> 0) ? _score : doc['boostValue'].value * _score"
> }
>
> -Amit.
>
>
> On Mon, Mar 3, 2014 at 7:13 PM, Amit Soni  wrote:
>
>> and yes I am running 1.0.0.
>>
>> @Nikolas and Clint - I tried your suggestions and played with parentheses,
>> but didn't have any luck. Anything else I am missing?
>>
>> thanks much for your help!
>>
>> -Amit.
>>
>>
>> On Mon, Mar 3, 2014 at 6:56 PM, Amit Soni  wrote:
>>
>>> Thanks much Nikolas and Clint.
>>>
>>> @*Binh* - Below is my complete query:
>>>
>>> {
>>>   "from": 0,
>>>   "size": 15,
>>>   "query": {
>>> "function_score": {
>>>   "query": {
>>> "filtered": {
>>>   "query": {
>>> "simple_query_string": {
>>>   "query": "some query",
>>>   "fields": [
>>> "searchKeywords",
>>> "website",
>>> "name^2.0"
>>>   ],
>>>   "default_operator": "and"
>>> }
>>>   },
>>>   "filter": {
>>> "term": {
>>>   "region": "some region"
>>> }
>>>   }
>>> }
>>>   },
>>>   "script_score": {
>>> "script": "doc['boostValue'].empty || (doc['boostValue'].value
>>> <= 0) ? _score : doc['boostValue'].value * _score"
>>>   }
>>> }
>>>   },
>>>   "explain": false
>>> }
>>>
>>>
>>> -Amit.
>>>
>>>
>>> On Mon, Mar 3, 2014 at 1:14 PM, Nikolas Everett wrote:
>>>
 First, .empty is super slow right now:
 https://github.com/elasticsearch/elasticsearch/issues/5086
 Second, be paranoid about order of operations with MVEL.  I'd play
 around with parens until it worked.
 Third, multiply is the default boost mode so you can just leave out the
 _score which might make the MVEL simpler.


 On Mon, Mar 3, 2014 at 4:03 PM, Binh Ly  wrote:

> Are you using the function_score query in 1.0? If so, can you show
> your full query? Thanks!
>


>>>
>>>
>>
>



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Hui
Hi All,

After testing in another cluster, I found that the cluster can be connected 
to, but it is very slow.

At the moment, every normally ~50ms request takes 41,732ms to 85,984ms 
while the cluster is in yellow health and there are no unassigned shards.

It goes back to ~50ms after the problem node re-joins.

There are no exception logs on the master node. 

Thanks



Re: how to intellij

2014-03-13 Thread Benjamin Black
that did it! thanks, andrew!


On Thu, Mar 13, 2014 at 4:47 PM, Andrew Selden <
andrew.sel...@elasticsearch.com> wrote:

> Hi Ben,
>
> Once you have cloned the source, you should be able to open it as a
> project in Intellij by just pointing to the root directory.
>
> To run an ES server from Intellij you can create a run configuration with
> these settings:
>
> Main class: org.elasticsearch.bootstrap.Elasticsearch
>
> VM options: -Xms256m -Xmx1g -Xss256k -Djava.awt.headless=true
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=logs/heapdump.hprof
> -Delasticsearch -Des.foreground=yes -Djava.library.path=lib/sigar -ea
> -Des.config=/elasticsearch.yml
>  -Des.logger.level=DEBUG
>
> Working directory: I have this pointed to the top-level source dir. Not
> sure it matters.
>
> Environment: ES_TEST_LOCAL=true
>
> With that you should be able to hit the magical green arrow button to run
> it. You can set breakpoints in the code, go to a terminal and issue curl
> commands, and you'll be able to step through the code.
>
> - A
>
> On Mar 13, 2014, at 1:50 PM, Benjamin Black  wrote:
>
> would one of you wise sages offer up the magical incantations for working
> with ES source in intellij? specifically, what are the configuration steps
> from cloning the github repo through to debugging a running ES instance? i
> have had no luck with either following the README, random mailing list
> posts, or blogs or with banging my face on the keyboard like an angry
> simian. your assistance is appreciated.
>
>
> b
>



Re: Timeouts on Node Stats API?

2014-03-13 Thread Xiao Yu
After restarting the node I see logs like those in the following gist, which 
seem to suggest there may be some internal networking issue:
https://gist.github.com/xyu/9541662




Re: Timeouts on Node Stats API?

2014-03-13 Thread Xiao Yu

>
> Can you do a hot_threads while this is happening?
>

I took a couple samples from 2 nodes that were experiencing this issue:

https://gist.github.com/xyu/9541604

It seems like the problematic nodes are just doing normal search 
operations?
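For anyone reproducing this, the samples above come from the nodes hot threads API; a sketch, with the host assumed:

```shell
# Three samples of the busiest threads on every node
curl -s "localhost:9200/_nodes/hot_threads?threads=3"

# Or target a single problematic node by name or ID
curl -s "localhost:9200/_nodes/node-1/hot_threads"
```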



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Hui
Hi Echin,

Since the problem node's IP is defined in the client's ES connection via the 
Java API, I guess the client will still try to connect to that node; hence 
the warnings.

It should be fine for the client to keep working with the cluster. However, in 
my case, the Java client is not reachable and times out (over the HTTP 
protocol).

I will try to create a test cluster with the same settings to check whether 
the client works correctly in this condition.
Thanks.

On Thursday, March 13, 2014 11:54:53 PM UTC+8, echin1999 wrote:
>
> One more thing - I notice that functionally, the client is still able to 
> communicate with the remaining active node, so I guess this warning is just 
> a "warning". There must be some background thread that periodically looks for 
> the missing node, while the main Client instance can still communicate with 
> the active node. Would you be able to verify whether it's merely a warning for 
> you? If so, I might just not worry about this for now.
>
> On Thursday, March 13, 2014 5:15:36 AM UTC-4, Hui wrote:
>>
>> Hi Dome, 
>>
>> Do you mean the service of 10.1.4.196 is not open? Yes, the service 
>> should be stopped when it was rebooted.
>>
>> But the master node 10.1.4.197 has removed the problem node 10.1.4.196 
>> when it cannot ping the machine 10.1.4.196.
>>
>> The cluster should be fine after this operation. Do I understand it 
>> wrongly?
>>
>> Thanks
>>
>> On Thursday, March 13, 2014 4:48:17 PM UTC+8, Dome.C.Wei wrote:
>>>
>>> It must be that the service is not open.
>>>
On Thursday, March 13, 2014 2:10:22 PM UTC+8, Hui wrote:

 Hi Mark,

 Thanks for replying.

 The master (10.1.4.197) and other nodes can be reached while the 
 problem node(10.1.4.196) is not reachable.
 So, we can see the cluster status at that moment

  "status" : "yellow",
   "timed_out" : false,
   "unassigned_shards" : 0,


 On Thursday, March 13, 2014 2:03:44 PM UTC+8, Mark Walkom wrote:
>
> It looks like a networking issue, at least based on "No route to host" 
> in the error.
> Can you ping the master when this is happening, what about doing a 
> telnet test?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 13 March 2014 16:54, Hui  wrote:
>
>> Hi All,
>>
>>
>> This is the log for the case.
>>
>>
>> The node 10.1.4.196 is removed at 14:08 due to machine reboot, the 
>> client keeps trying to connect to the elasticsearch cluster but fails.
>>
>> Master Node : 
>> [2014-03-08 14:08:26,531][INFO ][cluster.service  ] 
>> [10.1.4.197:9202] removed 
>> {[10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]],}, 
>> reason: 
>> zen-disco-node_failed([10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]]),
>>  reason failed to ping, tried [3] times, each with maximum [30s] timeout
>>
>>
>> Client : 
>> 2014-03-08 14:15:36,184 WARN  org.elasticsearch.transport.netty - 
>> [Bulldozer] exception caught on transport layer [[id: 0x50dc218f]], 
>> closing connection
>> java.net.NoRouteToHostException: No route to host
>>
>>
>> (The cluster health at this moment is Yellow and there is no unassigned 
>> shard.)
>>
>>
>>
>>
>> The node is back at 14:25, the client can successfully connected to the 
>> cluster again.
>>
>> Client :
>>
>> 2014-03-08 14:25:20,597 WARN  org.elasticsearch.transport.netty - 
>> [Bulldozer] exception caught on transport layer [[id: 0xf24d85d7]], 
>> closing connection
>> java.net.NoRouteToHostException: No route to host
>>
>>
>> Master Node :
>>
>> [2014-03-08 14:25:57,984][INFO ][cluster.service  ] 
>> [10.1.4.197:9202] added 
>> {[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]],}, 
>> reason: zen-disco-receive(join from 
>> node[[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]]])
>>
>>
>> (The cluster health at this moment is Green.)
>>
>> In the above case, the client should be able to connect to the cluster 
>> even when a node is removed from the cluster.
>>
>>
>> For the client, the connection is created as followings : 
>>
>>
>> Settings settings = ImmutableSettings.settingsBuilder()
>>         .put("cluster.name", "clustername")
>>         .put("client.transport.sniff", true)
>>         .build();
>>
>> TransportClient client = new TransportClient(settings);
>>
>> client.addTransportAddress(new InetSocketTransportAddress(
>>         "10.1.4.195" /* hostname */, 9300 /* port */));
>>
>> client.addTransportAddress(new InetSocketTransportAddress(
>>
>

Re: fielddata breaker question

2014-03-13 Thread Lee Hinman
On 3/13/14, 1:37 AM, Dunaeth wrote:
> I tried to clear all caches to see if it could help but the fielddata
> breaker estimated size is still skyrocketing... If it's not cache
> issue, and it's linked with our data inserts, I can only think about
> insert process or percolation queries. Any idea ?
> 

Hi Dunaeth,

Can you add the lines:

logger:
  indices.fielddata.breaker: TRACE
  index.fielddata: TRACE
  common.breaker: TRACE

in logging.yml for your elasticsearch configuration and restart the
cluster? This will log information about the breaker estimation and
adjustment. If you can run some queries and attach the logs it would be
helpful in tracking down what's going on.

;; Lee
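As background on what those TRACE lines will show: the breaker keeps a running byte estimate and trips when an addition would push it past the configured limit. A toy sketch of that behavior (not Elasticsearch's actual implementation, which among other things applies an overhead multiplier to the estimate):

```java
public class BreakerSketch {
    private final long limitBytes;
    private long estimatedBytes;

    BreakerSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Returns false (the breaker "trips") when adding would exceed the
    // limit; otherwise records the new running estimate.
    boolean addEstimate(long bytes) {
        if (estimatedBytes + bytes > limitBytes) {
            return false;
        }
        estimatedBytes += bytes;
        return true;
    }

    long estimatedBytes() {
        return estimatedBytes;
    }

    public static void main(String[] args) {
        BreakerSketch breaker = new BreakerSketch(100);
        System.out.println(breaker.addEstimate(60)); // fits under the limit: true
        System.out.println(breaker.addEstimate(60)); // 60 + 60 > 100: false
        System.out.println(breaker.estimatedBytes()); // rejected add not counted: 60
    }
}
```

If the logged estimate keeps climbing without corresponding adjustment lines, that points at where the accounting is going wrong.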



Re: External routing [76] and document path routing [khabar] mismatch

2014-03-13 Thread Robin Boutros
Something to add:

When I index an item, I reference its parent by its id, not by its account 
name. Is that part of the problem? Can I use the account to set the item's 
parent when indexing it? And if so, how would Elasticsearch know that I'm 
using this field?

On Thursday, March 13, 2014 10:30:12 PM UTC-4, Robin Boutros wrote:
>
> Hey,
>
> I have a parent/child relationship between Item and Player.
>
> {
>   "item": {
>   "_parent": {
>   "type": "player"
>   },
>   "_routing": {
>   "required":true,
>   "path":"account"
>   },
>   "properties": {
> 
>   
> "account":{"type":"string","index":"not_analyzed","omit_norms":true,"index_options":"docs"}
>   }
>   }
> }
>
> {
>   "player": {
>   "_routing": {
>   "required":true,
>   "path":"account"
>   },
>   "properties": {
>   
> "account":{"type":"string","index":"not_analyzed","omit_norms":true,"index_options":"docs"}
>   
>
>   }
>   }
> }
>
> Ok, so with this, I'm hoping to have items indexed on the same shard as 
> their parents.
>
> But when I try to save an item, I get this error:
>
> *External routing [1] and document path routing [Frank1234r] mismatch*
>
> *Frank1234r *is the account name, and *1 *is the id of the player with 
> that name.
>
> What do I need to do to fix that? What is the "external routing" exactly? 
> Thanks!
>



External routing [76] and document path routing [khabar] mismatch

2014-03-13 Thread Robin Boutros
Hey,

I have a parent/child relationship between Item and Player.

{
  "item": {
  "_parent": {
  "type": "player"
  },
  "_routing": {
  "required":true,
  "path":"account"
  },
  "properties": {

  
"account":{"type":"string","index":"not_analyzed","omit_norms":true,"index_options":"docs"}
  }
  }
}

{
  "player": {
  "_routing": {
  "required":true,
  "path":"account"
  },
  "properties": {
  
"account":{"type":"string","index":"not_analyzed","omit_norms":true,"index_options":"docs"}
  
   
  }
  }
}

Ok, so with this, I'm hoping to have items indexed on the same shard as 
their parents.

But when I try to save an item, I get this error:

*External routing [1] and document path routing [Frank1234r] mismatch*

*Frank1234r *is the account name, and *1 *is the id of the player with that 
name.

What do I need to do to fix that? What is the "external routing" exactly? 
Thanks!
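The colocation requirement behind this error can be illustrated with a toy shard function. This is not Elasticsearch's actual routing hash, but the principle is the same: a document lands on shard hash(routing) mod number_of_shards, so a child must carry the same routing value as its parent (here, the account) to end up on the same shard:

```java
public class RoutingDemo {
    // Toy stand-in for Elasticsearch's routing hash; the real function
    // differs, but equal routing values map to equal shards either way.
    static int shard(String routing, int numShards) {
        return Math.floorMod(routing.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 5;
        int playerShard = shard("Frank1234r", numShards); // player routed by account
        int itemShard = shard("Frank1234r", numShards);   // item routed by account too
        // Same routing value, therefore same shard: this is what the
        // _routing path on "account" enforces for both mappings.
        System.out.println(playerShard == itemShard); // true
        // Routing the item by the parent id "1" instead may hash to a
        // different shard, which is why mixing the two values is rejected.
        System.out.println(shard("1", numShards));
    }
}
```

In other words, setting parent to the player's id while routing by account gives Elasticsearch two conflicting routing values for the same document.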



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Jos Kraaijeveld
There are no other processes running except for ES and the program which 
posts the updates. The memory constantly increases while the updater is 
running, but stays flat (and the memory is never released, no matter how 
much is used) whenever ES is idle.

On Thursday, March 13, 2014 5:32:43 PM UTC-7, Zachary Tong wrote:
>
> Also, are there other processes running which may be causing the problem? 
>  Does the behavior only happen when ES is running?
>
>
>
> On Thursday, March 13, 2014 8:31:18 PM UTC-4, Zachary Tong wrote:
>>
>> Cool, curious to see what happens.  As an aside, I would recommend 
>> downgrading to Java 1.7.0_u25.  There are known bugs in the most recent 
>> Oracle JVM versions which have not been resolved yet.  u25 is the most 
>> recent safe version.  I don't think that's your problem, but it's a good 
>> general consideration anyway.
>>
>> -Z
>>
>>
>>
>> On Thursday, March 13, 2014 8:23:34 PM UTC-4, Jos Kraaijeveld wrote:
>>>
>>> @Mark:
>>> The heap is set to 2GB, using mlockall. The problem occurs with both 
>>> OpenJDK7 and OracleJDK7, both the latest versions. I have one index, which 
>>> is very small:
>>> index: 
>>> {
>>> primary_size_in_bytes: 37710681
>>> size_in_bytes: 37710681
>>> }
>>>
>>> @Zachary Our systems are set up to alert when memory is about to run 
>>> out. We use Ganglia to monitor our systems and that represents the memory 
>>> as 'used', rather than 'cached'. I will try to just let it run until memory 
>>> runs out and report back after that though.
>>>
>>>
>>>
>>> On Thursday, March 13, 2014 5:17:20 PM UTC-7, Zachary Tong wrote:

 I believe you are just witnessing the OS caching files in memory. 
  Lucene (and therefore by extension Elasticsearch) uses a large number of 
 files to represent segments.  TTL + updates will cause even higher file 
 turnover than usual.

 The OS manages all of this caching and will reclaim it for other 
 processes when needed.  Are you experiencing problems, or just witnessing 
 memory usage?  I wouldn't be concerned unless there is an actual problem 
 that you are seeing.



 On Thursday, March 13, 2014 8:07:13 PM UTC-4, Jos Kraaijeveld wrote:
>
> Hey,
>
> I've run into an issue which is preventing me from moving forwards 
> with ES. I've got an application where I keep 'live' documents in 
> ElasticSearch. Each document is a combination from data from multiple 
> sources, which are merged together using doc_as_upsert. Each document has 
> a 
> TTL which is updated whenever new data comes in for a document, so 
> documents die whenever no data source has given information about it for 
> a 
> while. The amount of documents generally doesn't exceed 15,000 so it's a 
> fairly small data set.
>
> Whenever I leave this running, slowly but surely memory usage on the 
> box creeps up, seemingly unbounded until there is no more resident memory 
> left. The Java process nicely keeps within its set ES_MAX_HEAP bounds, 
> but 
> it seems the mapping from storage on disk to memory is ever-increasing, 
> even when the amount of 'live' documents goes to 0. 
>
> I was wondering if anyone has seen such a memory problem before and 
> whether there are ways to debug memory usage which is unaccounted for by 
> processes in 'top'.
>




Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Zachary Tong
Also, are there other processes running which may be causing the problem? 
 Does the behavior only happen when ES is running?



On Thursday, March 13, 2014 8:31:18 PM UTC-4, Zachary Tong wrote:
>
> Cool, curious to see what happens.  As an aside, I would recommend 
> downgrading to Java 1.7.0_u25.  There are known bugs in the most recent 
> Oracle JVM versions which have not been resolved yet.  u25 is the most 
> recent safe version.  I don't think that's your problem, but it's a good 
> general consideration anyway.
>
> -Z
>
>
>
> On Thursday, March 13, 2014 8:23:34 PM UTC-4, Jos Kraaijeveld wrote:
>>
>> @Mark:
>> The heap is set to 2GB, using mlockall. The problem occurs with both 
>> OpenJDK7 and OracleJDK7, both the latest versions. I have one index, which 
>> is very small:
>> index: 
>> {
>> primary_size_in_bytes: 37710681
>> size_in_bytes: 37710681
>> }
>>
>> @Zachary Our systems are set up to alert when memory is about to run out. 
>> We use Ganglia to monitor our systems and that represents the memory as 
>> 'used', rather than 'cached'. I will try to just let it run until memory 
>> runs out and report back after that though.
>>
>>
>>
>> On Thursday, March 13, 2014 5:17:20 PM UTC-7, Zachary Tong wrote:
>>>
>>> I believe you are just witnessing the OS caching files in memory. 
>>>  Lucene (and therefore by extension Elasticsearch) uses a large number of 
>>> files to represent segments.  TTL + updates will cause even higher file 
>>> turnover than usual.
>>>
>>> The OS manages all of this caching and will reclaim it for other 
>>> processes when needed.  Are you experiencing problems, or just witnessing 
>>> memory usage?  I wouldn't be concerned unless there is an actual problem 
>>> that you are seeing.
>>>
>>>
>>>
>>> On Thursday, March 13, 2014 8:07:13 PM UTC-4, Jos Kraaijeveld wrote:

 Hey,

 I've run into an issue which is preventing me from moving forwards with 
 ES. I've got an application where I keep 'live' documents in 
 ElasticSearch. 
 Each document is a combination from data from multiple sources, which are 
 merged together using doc_as_upsert. Each document has a TTL which is 
 updated whenever new data comes in for a document, so documents die 
 whenever no data source has given information about it for a while. The 
 amount of documents generally doesn't exceed 15,000 so it's a fairly small 
 data set.

 Whenever I leave this running, slowly but surely memory usage on the 
 box creeps up, seemingly unbounded until there is no more resident memory 
 left. The Java process nicely keeps within its set ES_MAX_HEAP bounds, but 
 it seems the mapping from storage on disk to memory is ever-increasing, 
 even when the amount of 'live' documents goes to 0. 

 I was wondering if anyone has seen such a memory problem before and 
 whether there are ways to debug memory usage which is unaccounted for by 
 processes in 'top'.

>>>



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Zachary Tong
Cool, curious to see what happens.  As an aside, I would recommend 
downgrading to Java 1.7.0_u25.  There are known bugs in the most recent 
Oracle JVM versions which have not been resolved yet.  u25 is the most 
recent safe version.  I don't think that's your problem, but it's a good 
general consideration anyway.

-Z



On Thursday, March 13, 2014 8:23:34 PM UTC-4, Jos Kraaijeveld wrote:
>
> @Mark:
> The heap is set to 2GB, using mlockall. The problem occurs with both 
> OpenJDK7 and OracleJDK7, both the latest versions. I have one index, which 
> is very small:
> index: 
> {
> primary_size_in_bytes: 37710681
> size_in_bytes: 37710681
> }
>
> @Zachary Our systems are set up to alert when memory is about to run out. 
> We use Ganglia to monitor our systems and that represents the memory as 
> 'used', rather than 'cached'. I will try to just let it run until memory 
> runs out and report back after that though.
>
>
>
> On Thursday, March 13, 2014 5:17:20 PM UTC-7, Zachary Tong wrote:
>>
>> I believe you are just witnessing the OS caching files in memory.  Lucene 
>> (and therefore by extension Elasticsearch) uses a large number of files to 
>> represent segments.  TTL + updates will cause even higher file turnover 
>> than usual.
>>
>> The OS manages all of this caching and will reclaim it for other 
>> processes when needed.  Are you experiencing problems, or just witnessing 
>> memory usage?  I wouldn't be concerned unless there is an actual problem 
>> that you are seeing.
>>
>>
>>
>> On Thursday, March 13, 2014 8:07:13 PM UTC-4, Jos Kraaijeveld wrote:
>>>
>>> Hey,
>>>
>>> I've run into an issue which is preventing me from moving forwards with 
>>> ES. I've got an application where I keep 'live' documents in ElasticSearch. 
>>> Each document is a combination from data from multiple sources, which are 
>>> merged together using doc_as_upsert. Each document has a TTL which is 
>>> updated whenever new data comes in for a document, so documents die 
>>> whenever no data source has given information about it for a while. The 
>>> amount of documents generally doesn't exceed 15,000 so it's a fairly small 
>>> data set.
>>>
>>> Whenever I leave this running, slowly but surely memory usage on the box 
>>> creeps up, seemingly unbounded until there is no more resident memory left. 
>>> The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it 
>>> seems the mapping from storage on disk to memory is ever-increasing, even 
>>> when the amount of 'live' documents goes to 0. 
>>>
>>> I was wondering if anyone has seen such a memory problem before and 
>>> whether there are ways to debug memory usage which is unaccounted for by 
>>> processes in 'top'.
>>>
>>



Re: Low priority queries or query throttling?

2014-03-13 Thread Zachary Tong
What's the nature of the queries?  There may be some optimizations that can 
be made.

How much memory is on the box total?

I would not recommend G1 GC.  It is promising but we still see bug reports 
where G1 just straight up crashes.  For now, the official ES recommendation 
is still CMS.  FWIW, G1 will use more CPU than CMS by definition, because 
of the way G1 operates (e.g. shorter pauses at the cost of more CPU).  That 
could partially explain your increased load.

There is currently no way to give a priority to queries, although I agree 
that would be very nice.  There are some tricks you can do to control where 
queries go using search preferences (see 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html).
For example, you could send all "slow" queries to a single node, and send 
all other queries to the rest of your cluster.  That would effectively 
bottleneck the slow queries, assuming the "slow node" has all the shards 
required to execute the query.

Similarly, you can use Allocation Awareness and Forced Zones to control 
which indices end up on which shards, etc.
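A sketch of the preference trick described above, assuming a hypothetical node id for the designated "slow" node (`_only_node` is one of the preference values from the linked docs; `node-xyz` is a placeholder for a real id from the nodes API):

```java
import java.net.URI;

public class SearchPreference {
    // Route "slow" searches to one designated node via the _only_node
    // search preference; everything else uses default shard routing.
    static URI searchUri(String host, String index, boolean slow, String slowNodeId) {
        String preference = slow ? "?preference=_only_node:" + slowNodeId : "";
        return URI.create("http://" + host + ":9200/" + index + "/_search" + preference);
    }

    public static void main(String[] args) {
        System.out.println(searchUri("localhost", "myindex", true, "node-xyz"));
        System.out.println(searchUri("localhost", "myindex", false, null));
    }
}
```

The same preference parameter works on the Java client via setPreference on a search request; the trade-off is that the pinned node becomes a deliberate bottleneck for the low-priority traffic.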




On Thursday, March 13, 2014 5:15:58 PM UTC-4, Peter Wright wrote:
>
> Hi,
> I am currently having trouble with fairly slow and intensive queries 
> causing excessive load on my elasticsearch cluster and I would like to 
> know people's opinions on ways to mitigate or prevent that excessive load.
>
> We attempt about 50 of these slow queries per second, and they take an 
> average of 300ms, which adds up to more than we can process, causing 
> excessive load and sometimes causing elasticsearch to become 
> non-responsive.
>
> The slow queries are all low priority, and we have other, high priority 
> queries running on the index. Slow queries could take 10 seconds to process 
> for all we care, and we'd rather have them fail than cause excessive load 
> on the cluster. 
>
> Is there a way to give these queries a lower priority and to force them to 
> use no more than a certain percentage of the cluster's resources? Or is it 
> possible to refuse certain types of queries if elasticsearch is under 
> excessive load?
>
> I am also curious if people have thoughts on what could improve the 
> throughput of these queries based on the information given below. I can 
> give more details about the structure of the queries themselves if 
> necessary.
>
>
>
> The cluster is made of two machines (both have 16 CPU cores and let 
> elasticsearch use 15G of memory) running ES 0.90.12 with the G1 garbage 
> collector. I began experiencing higher load with this setup than before 
> when I upgraded to 0.90.12 from 0.90.0 beta (which was using the CMS 
> collector with the default settings), however, several other changes were 
> made at the same time and it isn't yet fully clear whether either the 
> change of version or the change of GC is responsible, or whether it was 
> simply a coincidence. Some thoughts on that would be appreciated.
>
> The index has 5 shards and one replica (both machines each have a version 
> of all shards), is a couple gigs in size and contains a couple million 
> documents.
>
> Thanks!
>



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Jos Kraaijeveld
@Mark:
The heap is set to 2GB, using mlockall. The problem occurs with both 
OpenJDK7 and OracleJDK7, both the latest versions. I have one index, which 
is very small:
index: 
{
primary_size_in_bytes: 37710681
size_in_bytes: 37710681
}

@Zachary Our systems are set up to alert when memory is about to run out. 
We use Ganglia to monitor our systems and that represents the memory as 
'used', rather than 'cached'. I will try to just let it run until memory 
runs out and report back after that though.



On Thursday, March 13, 2014 5:17:20 PM UTC-7, Zachary Tong wrote:
>
> I believe you are just witnessing the OS caching files in memory.  Lucene 
> (and therefore by extension Elasticsearch) uses a large number of files to 
> represent segments.  TTL + updates will cause even higher file turnover 
> than usual.
>
> The OS manages all of this caching and will reclaim it for other processes 
> when needed.  Are you experiencing problems, or just witnessing memory 
> usage?  I wouldn't be concerned unless there is an actual problem that you 
> are seeing.
>
>
>
> On Thursday, March 13, 2014 8:07:13 PM UTC-4, Jos Kraaijeveld wrote:
>>
>> Hey,
>>
>> I've run into an issue which is preventing me from moving forwards with 
>> ES. I've got an application where I keep 'live' documents in ElasticSearch. 
>> Each document is a combination from data from multiple sources, which are 
>> merged together using doc_as_upsert. Each document has a TTL which is 
>> updated whenever new data comes in for a document, so documents die 
>> whenever no data source has given information about it for a while. The 
>> amount of documents generally doesn't exceed 15,000 so it's a fairly small 
>> data set.
>>
>> Whenever I leave this running, slowly but surely memory usage on the box 
>> creeps up, seemingly unbounded until there is no more resident memory left. 
>> The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it 
>> seems the mapping from storage on disk to memory is ever-increasing, even 
>> when the amount of 'live' documents goes to 0. 
>>
>> I was wondering if anyone has seen such a memory problem before and 
>> whether there are ways to debug memory usage which is unaccounted for by 
>> processes in 'top'.
>>
>



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Zachary Tong
I believe you are just witnessing the OS caching files in memory.  Lucene 
(and therefore by extension Elasticsearch) uses a large number of files to 
represent segments.  TTL + updates will cause even higher file turnover 
than usual.

The OS manages all of this caching and will reclaim it for other processes 
when needed.  Are you experiencing problems, or just witnessing memory 
usage?  I wouldn't be concerned unless there is an actual problem that you 
are seeing.
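One way to check this on Linux is to compare "used" memory against the page cache in /proc/meminfo: cached memory is reclaimable, so free-plus-cached is a better measure of headroom than MemFree alone. A small sketch of parsing that format (the sample values below are made up):

```java
public class MemInfo {
    // Extract a value in kB from /proc/meminfo-style text (Linux format:
    // "Key:        <number> kB" per line).
    static long kb(String meminfo, String key) {
        for (String line : meminfo.split("\n")) {
            if (line.startsWith(key + ":")) {
                return Long.parseLong(line.replaceAll("[^0-9]", ""));
            }
        }
        throw new IllegalArgumentException("missing key: " + key);
    }

    public static void main(String[] args) {
        // In practice, read the text from /proc/meminfo; this sample is
        // illustrative only.
        String sample = "MemTotal:       24675328 kB\n"
                      + "MemFree:         1048576 kB\n"
                      + "Cached:         18874368 kB\n";
        long reclaimable = kb(sample, "MemFree") + kb(sample, "Cached");
        System.out.println(reclaimable + " kB effectively available");
    }
}
```

If most of the "missing" memory shows up under Cached, the OS page cache (rather than a leak) is the explanation.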



On Thursday, March 13, 2014 8:07:13 PM UTC-4, Jos Kraaijeveld wrote:
>
> Hey,
>
> I've run into an issue which is preventing me from moving forwards with 
> ES. I've got an application where I keep 'live' documents in ElasticSearch. 
> Each document is a combination from data from multiple sources, which are 
> merged together using doc_as_upsert. Each document has a TTL which is 
> updated whenever new data comes in for a document, so documents die 
> whenever no data source has given information about it for a while. The 
> amount of documents generally doesn't exceed 15,000 so it's a fairly small 
> data set.
>
> Whenever I leave this running, slowly but surely memory usage on the box 
> creeps up, seemingly unbounded until there is no more resident memory left. 
> The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it 
> seems the mapping from storage on disk to memory is ever-increasing, even 
> when the amount of 'live' documents goes to 0. 
>
> I was wondering if anyone has seen such a memory problem before and 
> whether there are ways to debug memory usage which is unaccounted for by 
> processes in 'top'.
>



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Mark Walkom
How much heap, what java version, how big are your indexes?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 14 March 2014 11:11, Jos Kraaijeveld  wrote:

> I forgot to mention, I'm running ElasticSearch 1.0.1 on Ubuntu 12.04 with
> 24GB of available RAM.
>
>
> On Thursday, March 13, 2014 5:07:13 PM UTC-7, Jos Kraaijeveld wrote:
>>
>> Hey,
>>
>> I've run into an issue which is preventing me from moving forwards with
>> ES. I've got an application where I keep 'live' documents in ElasticSearch.
>> Each document is a combination from data from multiple sources, which are
>> merged together using doc_as_upsert. Each document has a TTL which is
>> updated whenever new data comes in for a document, so documents die
>> whenever no data source has given information about it for a while. The
>> amount of documents generally doesn't exceed 15,000 so it's a fairly small
>> data set.
>>
>> Whenever I leave this running, slowly but surely memory usage on the box
>> creeps up, seemingly unbounded until there is no more resident memory left.
>> The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it
>> seems the mapping from storage on disk to memory is ever-increasing, even
>> when the amount of 'live' documents goes to 0.
>>
>> I was wondering if anyone has seen such a memory problem before and
>> whether there are ways to debug memory usage which is unaccounted for by
>> processes in 'top'.
>>
>



Re: Constantly increasing memory outside of Java heap

2014-03-13 Thread Jos Kraaijeveld
I forgot to mention, I'm running ElasticSearch 1.0.1 on Ubuntu 12.04 with 
24GB of available RAM.

On Thursday, March 13, 2014 5:07:13 PM UTC-7, Jos Kraaijeveld wrote:
>
> Hey,
>
> I've run into an issue which is preventing me from moving forwards with 
> ES. I've got an application where I keep 'live' documents in ElasticSearch. 
> Each document is a combination from data from multiple sources, which are 
> merged together using doc_as_upsert. Each document has a TTL which is 
> updated whenever new data comes in for a document, so documents die 
> whenever no data source has given information about it for a while. The 
> amount of documents generally doesn't exceed 15,000 so it's a fairly small 
> data set.
>
> Whenever I leave this running, slowly but surely memory usage on the box 
> creeps up, seemingly unbounded until there is no more resident memory left. 
> The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it 
> seems the mapping from storage on disk to memory is ever-increasing, even 
> when the amount of 'live' documents goes to 0. 
>
> I was wondering if anyone has seen such a memory problem before and 
> whether there are ways to debug memory usage which is unaccounted for by 
> processes in 'top'.
>



Constantly increasing memory outside of Java heap

2014-03-13 Thread Jos Kraaijeveld
Hey,

I've run into an issue which is preventing me from moving forward with ES. 
I've got an application where I keep 'live' documents in Elasticsearch. 
Each document is a combination of data from multiple sources, which are 
merged together using doc_as_upsert. Each document has a TTL which is 
updated whenever new data comes in for a document, so documents die 
whenever no data source has given information about them for a while. The 
number of documents generally doesn't exceed 15,000, so it's a fairly small 
data set.

Whenever I leave this running, slowly but surely memory usage on the box 
creeps up, seemingly unbounded, until there is no resident memory left. 
The Java process nicely keeps within its set ES_MAX_HEAP bounds, but it 
seems the mapping from storage on disk to memory is ever-increasing, even 
when the number of 'live' documents goes to 0. 

I was wondering if anyone has seen such a memory problem before and whether 
there are ways to debug memory usage which is unaccounted for by processes 
in 'top'.
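For the last question, one way (on Linux) to see memory that `top` attributes to a process but that lives outside the Java heap is to sum the resident size of file-backed mappings, which is where mmap'd Lucene segment files show up. This is a diagnostic sketch of my own, not an Elasticsearch API; point it at the ES pid:

```python
def mapped_file_rss_kb(pid="self"):
    """Sum the Rss of file-backed mappings in /proc/<pid>/smaps (kB).

    mmap'd files (e.g. Lucene segments) count toward resident memory
    even though they live outside the Java heap.
    """
    total = 0
    in_file_mapping = False
    try:
        with open(f"/proc/{pid}/smaps") as f:
            for line in f:
                if not line.strip():
                    continue
                first = line.split()[0]
                if "-" in first and ":" not in first:
                    # Mapping header: addr range, perms, offset, dev, inode, [path]
                    parts = line.split()
                    in_file_mapping = len(parts) >= 6 and parts[-1].startswith("/")
                elif line.startswith("Rss:") and in_file_mapping:
                    total += int(line.split()[1])
    except FileNotFoundError:  # non-Linux box or no /proc
        pass
    return total

print(mapped_file_rss_kb())  # kB of file-backed resident memory for this process
```

Tracking this number over time alongside the JVM heap shows whether the growth is mmap'd index data (page cache the OS can usually reclaim) or genuine native leakage.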

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/68ac8858-9074-43f1-9ad4-666de8cba344%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how to intellij

2014-03-13 Thread Andrew Selden
Hi Ben,

Once you have cloned the source, you should be able to open it as a project in 
IntelliJ by just pointing it at the root directory. 

To run an ES server from IntelliJ you can create a run configuration with these 
settings:

Main class: org.elasticsearch.bootstrap.Elasticsearch

VM options: -Xms256m -Xmx1g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
-XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=logs/heapdump.hprof -Delasticsearch -Des.foreground=yes 
-Djava.library.path=lib/sigar -ea 
-Des.config=/elasticsearch.yml  
-Des.logger.level=DEBUG

Working directory: I have this pointed to the top-level source dir. Not sure it 
matters.

Environment: ES_TEST_LOCAL=true

With that you should be able to hit the magical green arrow button to run it. 
You can set breakpoints in the code, go to a terminal and issue curl commands, 
and you'll be able to step through the code. 

- A

On Mar 13, 2014, at 1:50 PM, Benjamin Black  wrote:

> would one of you wise sages offer up the magical incantations for working 
> with ES source in intellij? specifically, what are the configuration steps 
> from cloning the github repo through to debugging a running ES instance? i 
> have had no luck with either following the README, random mailing list posts, 
> or blogs or with banging my face on the keyboard like an angry simian. your 
> assistance is appreciated.
> 
> 
> b
> 



Re: ES-Hive

2014-03-13 Thread P lva
I have a simple query:

insert into table eslogs select * from eslogs_ext;

Hive and Elasticsearch are running on the same host. 

To execute the script I'm following the directions from this link:
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html

There are two Elasticsearch nodes, and they can recognize each other (as 
indicated by the startup process), but why would Hive not be able to pick 
them up? Can you explain what could have gone wrong?


On Thursday, March 13, 2014 4:28:14 PM UTC-5, Costin Leau wrote:
>
> What does your Hive script look like? Can you confirm the IP/address of 
> your Hive and Elasticsearch hosts? How are you 
> executing the script? 
> The error points to a problem in your network configuration. 
>
> Cheers, 
>
> P.S. Feel free to post a gist or whatever is convenient. 
>
> On 3/13/2014 10:38 PM, P lva wrote: 
> > Hi, I have a few weblogs in a Hive table that I'd like to visualize in 
> > Kibana. ES is on the same node as the Hive server. 
> > 
> > Followed directions from this page 
> > http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html 
> > 
> > I can create a table using the EsStorageHandler, but when I tried to 
> > ingest data into this table I got: 
> > 
> > Error: java.lang.RuntimeException: 
> > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> > processing row {***first row of my table**} 
> > [long stack trace snipped; it is quoted in full in the original "ES-Hive" message]

Correction in the variable name for ES Hive Documentation in es.json parameter

2014-03-13 Thread Deepak Subhramanian
Hi,
I was struggling to load a JSON document into ES from Hive, and later 
realised that there was a mistake in the documentation. 

CREATE EXTERNAL TABLE json (data STRING 
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = '...',
  'es.json.input' = 'yes' 
);

The correct parameter name is es.input.json instead of es.json.input, and the 
value is 'true' instead of 'yes':

CREATE EXTERNAL TABLE json (data STRING 
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = '...',
  'es.input.json' = 'true' 
);
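With es.input.json set to true, each row is expected to be a single string column holding one JSON document, which es-hadoop ships to ES as-is. A load sketch (the staging table and column names below are hypothetical):

```sql
-- Hypothetical staging table with one JSON document per row
INSERT OVERWRITE TABLE json
SELECT raw_doc FROM staging_json;
```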



Re: ES Error on logstash syslog input - Invalid date format

2014-03-13 Thread Chris Laplante
Adding a mutate on these messages on the LS side to drop the timestamp
field did the trick. This is sort of puzzling, though, since that field is
a stock LS field and worked in a similar case.

Eg.

Mar 12 16:54:14 worked
Mar 13 12:59:39 failed

Thanks,

-Chris



On Thu, Mar 13, 2014 at 1:33 PM, Binh Ly  wrote:

> You have 2 timestamp fields: @timestamp, and timestamp. Looks like the
> timestamp field is the one that cannot be parsed. I see this value in the
> first doc: "timestamp":"Mar 13 12:15:39". You either need to format this
> properly from the LS side, or use the right date format on the ES side.
>
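For the ES-side option mentioned above, the date format on the field can be widened to accept syslog-style timestamps. A mapping sketch only; the type and field names are placeholders, and note that classic syslog pads single-digit days with a space ("Mar  3"), hence the two patterns:

```json
{
  "mappings": {
    "syslog": {
      "properties": {
        "timestamp": {
          "type":   "date",
          "format": "MMM dd HH:mm:ss||MMM  d HH:mm:ss"
        }
      }
    }
  }
}
```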



Re: ES-Hive

2014-03-13 Thread Costin Leau
What does your Hive script look like? Can you confirm the IP/address of your 
Hive and Elasticsearch hosts? How are you executing the script?

The error points to a problem in your network configuration.

Cheers,

P.S. Feel free to post a gist or whatever is convenient.
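One frequent cause of the "Connection refused" in this trace (an assumption on my part, not something confirmed in the thread) is that es-hadoop's es.nodes setting defaults to localhost:9200, while Hive map tasks may run on machines with no local ES node. Pointing the table at an explicit host rules that out; the names below are placeholders:

```sql
CREATE EXTERNAL TABLE eslogs (line STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource' = 'weblogs/log',
  'es.nodes'    = 'es-host-1:9200'  -- placeholder host; defaults to localhost:9200
);
```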

On 3/13/2014 10:38 PM, P lva wrote:

Hi, I have a few weblogs in a Hive table that I'd like to visualize in Kibana.
ES is on the same node as the Hive server.

Followed directions from this page 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html

I can create a table using the EsStorageHandler, but when I tried to ingest 
data into this table I got:

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing
row {***first row of my table**}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {*** first row of
my table**}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: Out of nodes and retries; caught exception
 at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:652)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
 at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
 at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
 at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 ... 9 more
Caused by: java.io.IOException: Out of nodes and retries; caught exception
 at 
org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:81)
 at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:221)
 at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:205)
 at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:209)
 at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:103)
 at 
org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:85)
 at 
org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:60)
 at 
org.elasticsearch.hadoop.mr.EsOutputFormat$ESRecordWriter.init(EsOutputFormat.java:165)
 at 
org.elasticsearch.hadoop.hive.EsHiveOutputFormat$ESHiveRecordWriter.write(EsHiveOutputFormat.java:50)
 at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:638)
 ... 18 more
Caused by: java.net.ConnectException: Connection refused
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
 at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
 at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
 at java.net.Socket.connect(Socket.java:579)
 at java.net.Socket.connect(Socket.java:528)
 at java.net.Socket.(Socket.java:425)
 at java.net.Socket.(Socket.java:280)
 at
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
 at
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
 at 
org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
 at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
 at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
 at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
 at 
org.apache.commons.httpclient.HttpClient.executeMetho

Re: elasticsearch-py usage

2014-03-13 Thread Josh Harrison
It doesn't look like the elasticsearch-py API covers the river use case. 
When I've run into things like this I've always just run a manual curl 
request, or, if I need to do it from within a script, sent a basic request 
with requests, along the lines of:

requests.put("http://localhost:9200/_river/mydocs/_meta",
             data='{"type": "fs", "fs": {"url": "/tmp", "update_rate": 90, '
                  '"includes": "*.doc,*.pdf", "excludes": "resume"}}')

Not the most elegant approach, but it works!

On Thursday, March 13, 2014 1:57:55 PM UTC-7, Kent Tenney wrote:
>
> From the fsriver doc: 
>
> curl -XPUT 'localhost:9200/_river/mydocs/_meta' -d '{ 
>   "type": "fs", 
>   "fs": { 
>  "url": "/tmp", 
>  "update_rate": 90, 
>  "includes": "*.doc,*.pdf", 
>  "excludes": "resume" 
>} 
> }' 
>
> How does this translate to the Python API? 
>
> Thanks, 
> Kent 
>



Re: elasticsearch-py usage

2014-03-13 Thread Honza Král
Hello Kent,

you can always access the raw transport and send any request you wish
for the unsupported APIs:

from elasticsearch import Elasticsearch

es = Elasticsearch()
data, status = es.transport.perform_request(
    'PUT', '/_river/', body={'type': 'fs'})

Hope this helps,

Honza Kral

On Thu, Mar 13, 2014 at 9:57 PM, Kent Tenney  wrote:
> From the fsriver doc:
>
> curl -XPUT 'localhost:9200/_river/mydocs/_meta' -d '{
>   "type": "fs",
>   "fs": {
>  "url": "/tmp",
>  "update_rate": 90,
>  "includes": "*.doc,*.pdf",
>  "excludes": "resume"
>}
> }'
>
> How does this translate to the Python API?
>
> Thanks,
> Kent
>
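Putting the two answers together: the curl body from the question is just a Python dict, and the raw-transport call is then one line. A sketch; building the body is shown executed, while the actual PUT (which needs a live node) is left commented out:

```python
import json

# The fsriver _meta body from the original curl command
meta = {
    "type": "fs",
    "fs": {
        "url": "/tmp",
        "update_rate": 90,
        "includes": "*.doc,*.pdf",
        "excludes": "resume",
    },
}

body = json.dumps(meta)
print(body)

# With a running cluster one would then send it (not executed here):
# from elasticsearch import Elasticsearch
# es = Elasticsearch()
# es.transport.perform_request('PUT', '/_river/mydocs/_meta', body=meta)
```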



Low priority queries or query throttling?

2014-03-13 Thread Peter Wright
Hi,
I am currently having trouble with fairly slow, intensive queries causing 
excessive load on my Elasticsearch cluster, and I would like to know 
people's opinions on ways to mitigate or prevent that load.

We attempt about 50 of these slow queries per second, and they take an 
average of 300ms each, which adds up to more than we can process, causing 
excessive load and sometimes causing Elasticsearch to become unresponsive.

The slow queries are all low priority, and we have other, high priority 
queries running on the index. Slow queries could take 10 seconds to process 
for all we care, and we'd rather have them fail than cause excessive load 
on the cluster. 

Is there a way to give these queries a lower priority and to force them to 
use no more than a certain percentage of the cluster's resources? Or is it 
possible to refuse certain types of queries if elasticsearch is under 
excessive load?

I am also curious if people have thoughts on what could improve the 
throughput of these queries based on the information given below. I can 
give more details about the structure of the queries themselves if 
necessary.



The cluster is made of two machines (both have 16 CPU cores and let 
Elasticsearch use 15G of memory) running ES 0.90.12 with the G1 garbage 
collector. I began experiencing higher load with this setup after 
upgrading to 0.90.12 from 0.90.0 beta (which was using the CMS collector 
with the default settings); however, several other changes were made at 
the same time, and it isn't yet fully clear whether the change of version 
or the change of GC is responsible, or whether it was simply a 
coincidence. Some thoughts on that would be appreciated.

The index has 5 shards and one replica (both machines each have a version 
of all shards), is a couple gigs in size and contains a couple million 
documents.

Thanks!
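Elasticsearch 0.90 has no server-side notion of query priority, so one mitigation has to live on the client. A sketch of my own suggestion (not an ES feature) that caps concurrent low-priority searches and fails fast, rather than queueing load onto the cluster:

```python
import threading

class LowPriorityGate:
    """Admit at most max_concurrent low-priority queries; reject the rest."""

    def __init__(self, max_concurrent=8):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, query_fn, wait_seconds=0.0):
        # Fail fast: better to drop a low-priority query than overload ES.
        if not self._slots.acquire(timeout=wait_seconds):
            raise RuntimeError("cluster busy: low-priority query rejected")
        try:
            return query_fn()
        finally:
            self._slots.release()

gate = LowPriorityGate(max_concurrent=2)
print(gate.run(lambda: "ok"))  # wrap the real search call instead of a lambda
```

High-priority queries bypass the gate entirely; tuning max_concurrent below the point where 50 qps at ~300ms each (about 15 concurrent on average) saturates the cluster keeps headroom for them.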



elasticsearch-py usage

2014-03-13 Thread Kent Tenney
>From the fsriver doc:

curl -XPUT 'localhost:9200/_river/mydocs/_meta' -d '{
  "type": "fs",
  "fs": {
 "url": "/tmp",
 "update_rate": 90,
 "includes": "*.doc,*.pdf",
 "excludes": "resume"
   }
}'

How does this translate to the Python API?

Thanks,
Kent



Re: Elasticsearch index mapping in java

2014-03-13 Thread Nikita Tovstoles
fwiw, I fixed my issue below by using prepareCreate().setSource() rather 
than .setSettings(), with the index config in the following format:

{"settings"  : {
  "index": {
"number_of_shards"  : 1,
"number_of_replicas": 0,
"analysis"  : {
  "analyzer": {
"lowercase_keyword": {
  "tokenizer": "keyword",
  "filter"   : ["lowercase"]
}
  }
}
  }
}, "mappings": {
  "type1": {
"properties": {
  "name": {
"type" : "string",
"index": "not_analyzed"
  }
}
  }
}}



On Thursday, March 13, 2014 9:58:56 AM UTC-7, Nikita Tovstoles wrote:
>
> Hi, Kevin:
>
> The Create Index reference says mappings can be included in the index 
> settings JSON. Are you saying that's not supported by the Java client? 
> (Fwiw, I am seeing the same; just wanted to confirm.) 
> The following settings successfully create an index with predefined 
> mappings when using curl PUT /idx -d @settings.json, but not when fed to 
> the Java client like so:
>
>
> Settings settings = 
> ImmutableSettings.settingsBuilder().loadFromClasspath("es_admin_idx_settings.json").build();
> return 
> getClient().admin().indices().prepareCreate(idxName).setSettings(settings).execute();
>
> INFO metadata:114 - [Seth] [idx-elasticsearchserviceintegrationtest] 
> creating index, cause [api], shards [2]/[0], mappings [] //EMPTY MAPPINGS
>
> {
>   "index"   : {
> "number_of_shards"  : 1,
> "number_of_replicas": 0
>   },
>   "mappings": {
> "eligibility_criteria": {
>   "properties": {
> "name": {
>   "type"  : "string",
>   "fields": {
> "raw": {
>   "type" : "string",
>   "index": "not_analyzed"
> }
>   }
> }
>   }
> }
>   }
> }
>
>
> On Tuesday, February 4, 2014 5:00:30 PM UTC-8, Kevin Wang wrote:
>>
>> The index request is used to index document, you should use put mapping 
>> request.
>>
>> e,g,
>> PutMappingResponse response = 
>> client.admin().indices().preparePutMapping(INDEX).setType(INDEX_TYPE).setSource(source).get();
>>
>>
>> On Wednesday, February 5, 2014 1:27:41 AM UTC+11, Doru Sular wrote:
>>>
>>> Hi guys,
>>>
>>> I am trying to create an index with the following code:
>>> XContentBuilder source = XContentFactory.jsonBuilder().startObject()
>>> //
>>> .startObject("settings")
>>> .field("number_of_shards", 1)
>>> .endObject()// end settings
>>> .startObject("mappings")
>>> .startObject(INDEX_TYPE)//
>>> .startObject("properties")//
>>> .startObject("user")
>>> .field("type", "string") // start user
>>> .field("store", "yes")
>>> .field("index", "analyzed")//
>>> .endObject()// end user
>>> .startObject("postDate")//
>>> .field("type", "date")
>>> .field("store", "yes")
>>> .field("index", "analyzed")//
>>> .endObject()// end post date
>>> .startObject("message") //
>>> .field("type", "string")
>>> .field("store", "yes")
>>> .field("index", "not_analyzed")
>>> .endObject() // end user field
>>> .endObject() // end properties
>>> .endObject() // end index type
>>> .endObject() // end mappings
>>> .endObject(); // end the container object
>>>
>>> IndexResponse response = this.client.prepareIndex(INDEX,INDEX_TYPE
>>> ).setSource(source)
>>> .setType(INDEX_TYPE).execute()
>>> .actionGet();
>>>
>>>
>>> I want to have the "message" field not analyzed, because later I want to 
>>> use facets to obtain unique messages.
>>> Unfortunately my code seems to add just a document in index with the 
>>> following structure:
>>> {
>>>   "settings": {
>>> "number_of_shards": 1
>>>   },
>>>   "mappings": {
>>> "tweet": {
>>>   "properties": {
>>> "user": {
>>>   "type": "string",
>>>   "store": "yes",
>>>   "index": "analyzed"
>>> },
>>> "postDate": {
>>>   "type": "date",
>>>   "store": "yes",
>>>   "index": "analyzed"
>>> },
>>> "message": {
>>>   "type": "string",
>>>   "store": "yes",
>>>   "index": "not_analyzed"
>>> }
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Please help me spot the error; it seems the mappings are not created.
>>> Thank you very much,
>>> Doru
>>>
>>>


how to intellij

2014-03-13 Thread Benjamin Black
would one of you wise sages offer up the magical incantations for working 
with ES source in intellij? specifically, what are the configuration steps 
from cloning the github repo through to debugging a running ES instance? i 
have had no luck with either following the README, random mailing list 
posts, or blogs or with banging my face on the keyboard like an angry 
simian. your assistance is appreciated.


b



Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Added index.codec.bloom.load: false to the elasticsearch.yml; it doesn't
seem to have changed anything.

It is at 63% after 2 hours and a half up time.

Watching stuff on Bigdesk everything seems to be normal:

Memory:
Committed: 7.8gb
Used: 4.5gb



The used heap is going up and down normally, so the heap is being cleaned, 
no?

So it seems to be working as expected, but I can't find anything. Could it 
be Oracle Java; should I try using OpenJDK instead?!

Really thankful for you guys trying to help me



- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 7:23 PM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> There might be massive bloom cache loading for the Lucene codec. My
> suggestion is to disable it. Try start ES nodes with
>
> index:
>   codec:
> bloom:
>   load: false
>
> Bloom cache does not seem to fit perfectly into the diagnostics as you
> described, that is just from the exception you sent.
>
> Jörg
>
>
>
> On Thu, Mar 13, 2014 at 6:01 PM, Hicham Mallah wrote:
>
>> If I start elasticsearch from the bin folder not using the wrapper, I get
>> these exceptions after about 2 mins:
>>
>> Exception in thread "elasticsearch[Adam X][generic][T#5]"
>> java.lang.OutOfMemoryError: Java heap space
>> at
>> org.apache.lucene.util.fst.BytesStore.(BytesStore.java:62)
>> at org.apache.lucene.util.fst.FST.(FST.java:366)
>> at org.apache.lucene.util.fst.FST.(FST.java:301)
>> at
>> org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.(BlockTreeTermsReader.java:481)
>> at
>> org.apache.lucene.codecs.BlockTreeTermsReader.(BlockTreeTermsReader.java:175)
>> at
>> org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
>> at
>> org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsProducer.(BloomFilterPostingsFormat.java:131)
>> at
>> org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat.fieldsProducer(BloomFilterPostingsFormat.java:102)
>> at
>> org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat.fieldsProducer(Elasticsearch090PostingsFormat.java:79)
>> at
>> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.(PerFieldPostingsFormat.java:195)
>> at
>> org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
>> at
>> org.apache.lucene.index.SegmentCoreReaders.(SegmentCoreReaders.java:115)
>> at
>> org.apache.lucene.index.SegmentReader.(SegmentReader.java:95)
>> at
>> org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
>> at
>> org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:235)
>> at
>> org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:100)
>> at
>> org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:382)
>> at
>> org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:111)
>> at
>> org.apache.lucene.search.XSearcherManager.(XSearcherManager.java:94)
>> at
>> org.elasticsearch.index.engine.internal.InternalEngine.buildSearchManager(InternalEngine.java:1462)
>> at
>> org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:279)
>> at
>> org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:706)
>> at
>> org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:201)
>> at
>> org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:189)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah.hic...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 6:47 PM, Hicham Mallah 
>> wrote:
>>
>>> Hello again,
>>>
>>> setting bootstrap.mlockall to true seems to have slowed the memory
>>> growth, so instead of elasticsearch being killed after ~2 hours
>>> it will be killed after ~3 hours. What I find weird is why the process
>>> releases memory back to the OS once but never does it again. And why is
>>> it not abiding by the DIRECT_SIZE setting either?
>>>
>>> Thanks for the help
>>>
>>>
>>> - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah
>>> Software Developer
>>> mallah.hic...@gmail.com
>>> 00961 700 49 600
>>>
>>>
>>>
>>> On Thu, Mar 13, 2014 at 4:45 PM, Hicham Mallah 
>>> wrote:
>>>
 Jörg, the issue is that after the JVM gives memory back to the OS, it starts
 going up again, and never giv

ES-Hive

2014-03-13 Thread P lva
Hi, I have a few weblogs in a Hive table that I'd like to visualize in Kibana.
ES is on the same node as the Hive server. 

I followed the directions from this 
page 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html

I can create a table using the EsStorageHandler, but when I try to ingest 
data into this table I get:

Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {***first row of my table**} 
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
Error while processing row {*** first row of my table**}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: Out of nodes and retries; caught exception
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:652)
at 
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
at 
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
at 
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
... 9 more
Caused by: java.io.IOException: Out of nodes and retries; caught exception
at 
org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:81)
at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:221)
at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:205)
at 
org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:209)
at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:103)
at 
org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:85)
at 
org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:60)
at 
org.elasticsearch.hadoop.mr.EsOutputFormat$ESRecordWriter.init(EsOutputFormat.java:165)
at 
org.elasticsearch.hadoop.hive.EsHiveOutputFormat$ESHiveRecordWriter.write(EsHiveOutputFormat.java:50)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:638)
... 18 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.(Socket.java:425)
at java.net.Socket.(Socket.java:280)
at 
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
at 
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
at 
org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at 
org.elasticsearch.hadoop.rest.commonshttp.CommonsHttpTransport.execute(CommonsHttpTransport.java:160)
at 
org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:74)
... 27 more

I then changed the config network.host to the IP address of the server. Now 
when I run hive inse

Re: My ES stuck once a week with no reason?

2014-03-13 Thread Khasan Bold
On usual days I see this log all the time, and on only one box:

@4000532216d90c452aac [2014-03-13 
16:36:31,205][DEBUG][action.admin.cluster.stats] [Lloigoroth] failed to 
execute on node [Mo2-u0RSQT6qqbMjW1CWag]
@4000532216d90c45327c 
org.elasticsearch.transport.RemoteTransportException: [Ketch, 
Dan][inet[/172.22.4.23:9300]][cluster/stats/n]
@4000532216d90c453664 Caused by: 
org.elasticsearch.transport.ActionNotFoundTransportException: No handler 
for action [cluster/stats/n]
@4000532216d90c453a4c at 
org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:205)
@4000532216d90c45809c at 
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:108)
@4000532216d90c458484 at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
@4000532216d90c45886c at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
@4000532216d90c45980c at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
@4000532216d90c459bf4 at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
@4000532216d90c45af7c at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
@4000532216d90c45b364 at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
@4000532216d90c45b74c at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
@4000532216d90c45c304 at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
@4000532216d90c45c6ec at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
@4000532216d90c45cad4 at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
@4000532216d90c45da74 at 
org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
@4000532216d90c45da74 at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
@4000532216d90c45de5c at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
@4000532216d90c45f1e4 at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
@4000532216d90c45f5cc at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
@4000532216d90c45f5cc at 
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
@4000532216d90c45f9b4 at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
@4000532216d90c46056c at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
@4000532216d90c460954 at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
@4000532216d90c460d3c at 
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
@4000532216d90c460d3c at 
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
@4000532216d90c466714 at 
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
@4000532216d90c466afc at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
@4000532216d90c466afc at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
@4000532216d90c46826c at java.lang.Thread.run(Thread.java:724)

On Thursday, March 13, 2014 1:35:25 PM UTC-7, Khasan Bold wrote:
>
> I have a 4-box ES installation; the version I am using is 0.90.10, but it 
> fails about once a week. What I am getting is a 50X error in Kibana. When I 
> check the logs, one of the nodes is stuck. It is fine after a restart. 
> Memory looks fine on all of them. What else can I check?
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8a19ffd0-dd80-4334-9f57-7e3641688b52%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


My ES stuck once a week with no reason?

2014-03-13 Thread Khasan Bold
I have a 4-box ES installation; the version I am using is 0.90.10, but it 
fails about once a week. What I am getting is a 50X error in Kibana. When I 
check the logs, one of the nodes is stuck. It is fine after a restart. 
Memory looks fine on all of them. What else can I check?



Re: ES Error on logstash syslog input - Invalid date format

2014-03-13 Thread Binh Ly
You have two timestamp fields: @timestamp and timestamp. It looks like the 
timestamp field is the one that cannot be parsed. I see this value in the 
first doc: "timestamp":"Mar 13 12:15:39". You either need to format this 
properly on the LS side, or configure the right date format on the ES side.
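To illustrate the LS-side option, here is a minimal Python sketch of the normalization a date filter would perform (the year 2014 is an assumption, since syslog timestamps omit it):

```python
from datetime import datetime

# Syslog timestamps like "Mar 13 12:15:39" carry no year, which is why
# Elasticsearch's default [dateOptionalTime] format cannot parse them.
raw = "Mar 13 12:15:39"

# strptime fills in 1900 for the missing year; substitute the intended
# year before serializing to ISO 8601.
parsed = datetime.strptime(raw, "%b %d %H:%M:%S").replace(year=2014)
iso = parsed.strftime("%Y-%m-%dT%H:%M:%S")
print(iso)  # -> 2014-03-13T12:15:39
```

The ES-side alternative is to give the timestamp field a date mapping whose format matches, e.g. a Joda pattern along the lines of MMM dd HH:mm:ss.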



Re: Elasticsearch index mapping in java

2014-03-13 Thread Binh Ly
:) I was confused. I meant to reply to OP. Thanks!



Re: Elasticsearch index mapping in java

2014-03-13 Thread Nikita Tovstoles
Yep, that's what I used (see my prior post).


On Thu, Mar 13, 2014 at 1:08 PM, Binh Ly  wrote:

> prepareIndex() is to index a document. What you want is prepareCreate(). I
> have an example here (check the method createIndexFullMapping()):
>
>
> https://github.com/bly2k/es-java-examples/blob/master/admin/IndexAdminExample.java
>



Re: elasticsearch reroute mechanism

2014-03-13 Thread David Pilato
As you are new to all this, I'm wondering what you would like to achieve here, 
or why you think this is important for your use case.
I mean that by default Elasticsearch does all that rerouting for you when a node 
is added or removed, so you don't need to take care of it.

To answer: reroute does what the documentation describes: it can move a shard to 
a node, or allocate a shard which is not yet allocated.
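For reference, the reroute API accepts explicit commands in the request body. A sketch only; the index name, shard numbers, and node names below are hypothetical:

```json
{
  "commands": [
    {"move": {"index": "myindex", "shard": 0,
              "from_node": "node1", "to_node": "node2"}},
    {"allocate": {"index": "myindex", "shard": 1,
                  "node": "node3", "allow_primary": false}}
  ]
}
```

A body like this is POSTed to /_cluster/reroute; after executing the commands, the cluster applies its normal rebalancing.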


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 mars 2014 à 20:41:39, Furkan KAMACI (furkankam...@gmail.com) a écrit:

Hi;

I am new to Elasticsearch and not familiar with its terminology either. Could 
anybody explain how the Elasticsearch reroute mechanism 
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html)
works internally?

Thanks;
Furkan KAMACI



Re: Elasticsearch index mapping in java

2014-03-13 Thread Binh Ly
prepareIndex() is to index a document. What you want is prepareCreate(). I 
have an example here (check the method createIndexFullMapping()):

https://github.com/bly2k/es-java-examples/blob/master/admin/IndexAdminExample.java



Re: Timeouts on Node Stats API?

2014-03-13 Thread Binh Ly
Can you do a hot_threads while this is happening?

curl "localhost:9200/_nodes/hot_threads"



Re: Problem with configuring index template via file

2014-03-13 Thread Binh Ly
You won't see your template in the list API, but if you create a new index 
named logstash-, it should take effect properly unless it is 
overridden by another template with a higher order.
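For reference, here is a minimal sketch of such a template file; the template name, pattern, and settings are illustrative, not taken from the original post:

```json
{
  "my_logstash_template": {
    "template": "logstash-*",
    "order": 0,
    "settings": {
      "number_of_shards": 4
    },
    "mappings": {
      "_default_": {
        "_all": {"enabled": false}
      }
    }
  }
}
```

Dropped into config/templates/, a file like this applies to every newly created index whose name matches the template pattern.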



Re: Accessing non-stored fields

2014-03-13 Thread Binh Ly
Yes you'd need to store the content_type to get it back. The _source field 
in your case is actually nothing more than the base64 of your raw input 
document at the time you indexed it.
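A small Python sketch of that point (the payload below is invented for illustration): the _source is just the raw request bytes, base64 round-trips them losslessly, and anything not present in them and not stored cannot be fetched back.

```python
import base64
import json

# A raw attachment document as it might be sent to the index API;
# field names and values here are made up for illustration.
raw = json.dumps({"file": "aGVsbG8gd29ybGQ=", "title": "report.pdf"}).encode("utf-8")

# Elasticsearch keeps this payload verbatim as _source; encoding and
# decoding are lossless, so the original input is always recoverable...
encoded = base64.b64encode(raw)
assert base64.b64decode(encoded) == raw

# ...but a derived value such as content_type, unless explicitly stored,
# lives only in the index and cannot be read back from _source.
```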



elasticsearch reroute mechanism

2014-03-13 Thread Furkan KAMACI
Hi;

I am new to Elasticsearch and not familiar with its terminology either. Could 
anybody explain how the Elasticsearch reroute mechanism (
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html)
works internally?

Thanks;
Furkan KAMACI



ES Error on logstash syslog input - Invalid date format

2014-03-13 Thread Chris Laplante
I am having an intermittent problem with indexing some logstash syslog 
entries. It complains about an invalid date format. Oddly, it stopped 
working and then seemingly started working again on its own. I may have 
inadvertently changed something, but I have not been able to pin 
down what it is. The remote host is an rsyslog Ubuntu machine; I get the 
errors on both stock application syslog entries and entries from our app.

Below is a sample of the error I am getting:

[2014-03-13 12:15:55,641][DEBUG][action.bulk  ] [Power 
Princess] [logstash-2014.03.13][3] failed to execute bulk item (index) 
index {[logstash-2014.03.13][syslog][fwm304NgSEu8FRkTsh2EwQ], 
source[{"message":"Lab Manager Error: Error in undeploying. Contact the 
Administrator if the problem persists.  on ba-labmanager02.efi.internal 
when undeploying 
~calculus3~617436.i18~sutirtha~watkins","@version":"1","@timestamp":"2014-03-13T19:15:39.000Z","type":"syslog","host":"calculus-daemons-2.efi.internal","priority":156,"timestamp":"Mar
 
13 
12:15:39","logsource":"calculus-daemons-2","program":"vfi","pid":"15207","severity":4,"facility":19,"facility_label":"local3","severity_label":"Warning","tags":["Calculus","IDC"],"location":"IDC","automationserver":"labmanagerBA"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse 
[timestamp]
at 
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:418)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:616)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:469)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:515)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
at 
org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:371)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:400)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:153)
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
*Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed 
to parse date field [Mar 13 12:15:39], tried both date format 
[dateOptionalTime], and timestamp number with locale []*
at 
org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:582)
at 
org.elasticsearch.index.mapper.core.DateFieldMapper.innerParseCreateField(DateFieldMapper.java:510)
at 
org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:215)
at 
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:408)
... 12 more
Caused by: java.lang.IllegalArgumentException: Invalid format: "Mar 13 
12:15:39"
at 
org.elasticsearch.common.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:754)
at 
org.elasticsearch.index.mapper.core.DateFieldMapper.parseStringValue(DateFieldMapper.java:576)
... 15 more

[2014-03-13 12:15:56,115][DEBUG][action.bulk  ] [Power 
Princess] [logstash-2014.03.13][2] failed to execute bulk item (index) 
index {[logstash-2014.03.13][syslog][Pyb-uSDER0Cj7nuBQ7101Q], 
source[{"message":"Deleting ~calculus3~617436.i18~sutirtha~watkins on 
ba-labmanager02.efi.internal","@version":"1","@timestamp":"2014-03-13T19:15:39.000Z","type":"syslog","host":"calculus-daemons-2.efi.internal","priority":156,"timestamp":"Mar
 
13 
12:15:39","logsource":"calculus-daemons-2","program":"vfi","pid":"15207","severity":4,"facility":19,"facility_label":"local3","severity_label":"Warning","tags":["Calculus","IDC"],"location":"IDC","automationserver":"labmanagerBA"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse 
[timestamp]
at 
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:418)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:616)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:469)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:515)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
at 
org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:371)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(T

Re: Elasticsearch Startup issues.

2014-03-13 Thread P lva
It worked fine after manually setting all the environment variables.

I would say this though:
Server A: ES out of the box works from the home dir. 
Server B: ES out of the box works neither from /usr/lib/ nor from the 
home dir; the only way is to manually set env parameters. 
Both servers are built identically.



On Wednesday, March 12, 2014 2:57:57 PM UTC-5, Jörg Prante wrote:
>
> The elasticsearch script, when you call it with an absolute path under the 
> system dir /usr/lib, has problems traversing the path to find the parent 
> folder and set the variable ES_HOME.
>
> You should use an init script, like the ones provided by the distributions, 
> or the service wrapper, with a preconfigured environment, so that ES_HOME and 
> ES_CLASSPATH (i.e. the $ES_HOME/lib folder) can be used.
>
> Jörg
>
>
> On Wed, Mar 12, 2014 at 8:44 PM, P lva >wrote:
>
>> I didn't want Elasticsearch to be under a user's directory, so I have it in 
>> /usr/lib/elasticsearch. All the files are owned by the non-root user that 
>> is running the command 
>> bash /usr/lib/elasticsearch/bin/elasticsearch
>>
>>
>> On Wednesday, March 12, 2014 2:21:41 PM UTC-5, Jörg Prante wrote:
>>
>>> And do you start Elasticsearch by the same user you installed 
>>> Elasticsearch under?
>>>
>>> Jörg
>>>
>>>
>>> On Wed, Mar 12, 2014 at 8:19 PM, joerg...@gmail.com 
>>> wrote:
>>>
 How do you start Elasticsearch, by

 $ cd $ES_HOME
 $ ./bin/elasticsearch

 or

 $ cd $ES_HOME/bin
 $ ./elasticsearch

 or

 ...

 ?

 Jörg



 On Wed, Mar 12, 2014 at 8:12 PM, P lva  wrote:

> I have 
> RHEL 6 
> Java 1.7 
> default shell is bash. 
> I tried it on one server and it worked fine right out of the box, but when 
> I moved to the actual set of servers it failed. 
> It's got something to do with the environment for sure, but I can't figure 
> out what. 
>
>
> On Wednesday, March 12, 2014 5:20:56 AM UTC-5, Alexander Reelsen wrote:
>
>> Hey,
>>
>> can you be more verbose, how you are actually starting elasticsearch, 
>> which version you are using, what operating system you are running on, 
>> which java version you are running, what is your default shell? Did you 
>> modify anything from the standard installation?
>> Thanks.
>>
>>
>> --Alex
>>
>>
>> On Tue, Mar 11, 2014 at 11:38 PM, P lva  wrote:
>>
>>> I get this error at elasticsearch startup
>>>  
>>> Exception in thread "main" java.lang.NoClassDefFoundError: 
>>> org.apache.lucene.util.Version
>>>at org.elasticsearch.Version.(Version.java:42)
>>>at java.lang.Class.initializeClass(libgcj.so.10)
>>>at org.elasticsearch.bootstrap.Bootstrap.buildErrorMessage(Boot
>>> strap.java:252)
>>>at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:178)
>>>at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch
>>> .java:32)
>>>
>>> The jar containing this class is present in the lib directory and 
>>> echo $ES_CLASSPATH shows this jar. 
>>>
>>>
>>



Re: Rapidly Degrading Bulk Indexing Performance

2014-03-13 Thread Elliott Bradshaw
Thanks guys.

I've made some changes to my bulk indexing.  I'm now kicking off Java bulk 
loaders with 8 threads apiece on 3 of our 11 servers.  This initially did 
not help, so I went in and checked the hot_threads in ElasticHQ.  
Virtually all CPU was being allocated to building SpatialPrefixTrees!  I 
changed my geoshape resolution from 1KM to 10KM on the index and began 
reindexing.  I'm now hitting 125 million records/hour over the past 20 
minutes!  What's more, indexing speed has remained relatively constant over 
the load!
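For context, the resolution change mentioned above corresponds to the precision option on a geo_shape mapping; a sketch with a hypothetical field name:

```json
{
  "properties": {
    "location": {
      "type": "geo_shape",
      "tree": "quadtree",
      "precision": "10km"
    }
  }
}
```

A coarser precision means far fewer prefix-tree cells indexed per shape, which is consistent with the CPU drop described in the post.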

What doesn't make sense to me is that building SpatialPrefixTrees 
should be a roughly constant CPU cost over the course of the bulk index, yet 
performance was degrading dramatically as the index got bigger.  Has anyone 
else experienced this problem?  Where before it was taking 10-12 hours to 
index the data, it will now likely finish indexing within the hour.  That 
seems like an awfully big difference (though it might also have to do with 
the new Java loaders)...
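As a side note for anyone reproducing a loader like this, here is a minimal Python sketch of the NDJSON body the _bulk endpoint expects (the index and type names are hypothetical):

```python
import json

def bulk_payload(index, doc_type, docs):
    """Build an NDJSON body for the _bulk endpoint: one action-metadata
    line followed by one source line per document, newline-terminated."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

body = bulk_payload("records", "record", [{"id": 1}, {"id": 2}])
print(body.count("\n"))  # -> 4
```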

On Wednesday, March 12, 2014 5:35:31 PM UTC-4, Mark Walkom wrote:
>
> Use a plugin like Marvel or ElasticHQ.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 12 March 2014 23:29, Elliott Bradshaw 
> > wrote:
>
>> Thanks Binh, Mark.
>>
>> I'm using Oracle's Java 7 (1.7.0_51).  I will try to upgrade to 
>> Elasticsearch 1.0.1 if possible.
>>
>> It could definitely be a disk speed issue.  Unfortunately, we're working 
>> in a virtualized environment and cannot upgrade to SSD storage.
>>
>> What utility do you use for GC collection times and hot threads?
>>
>> Thanks!
>>
>> On Tuesday, March 11, 2014 6:13:51 PM UTC-4, Mark Walkom wrote:
>>>
>>> What java version are you using?
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 12 March 2014 00:34, Elliott Bradshaw  wrote:
>>>
 Hi All,

 We are currently attempting to optimize our configuration for a static 
 index of roughly 120 million records.  In time, this index will probably 
 be 
 much larger, but for now this is the working set.  We've been playing 
 around with Elasticsearch for several months now, and have made great 
 progress with performance tuning.  However, we still run into issues which 
 leave us scratching our heads.  One such issue is an unexpected indexing 
 speed drop as the index grows.

 We are working on an 11 node cluster.  Each node has 8 CPUs and 16G of 
 memory.  Heap size of each JVM is set to min/max of 8G.  Vm.swappiness has 
 been set to 0 on all of the systems, as they are being used solely for 
 Elasticsearch.  The Elasticsearch version is 0.90.7.  We are focusing on 
 loading a single index, and it has been initialized with 48 shards, with a 
 refresh interval of 120 seconds.  We're currently using Elasticsearch HQ 
 for real time monitoring of the system state, along with linux utils like 
 top, iotop and iftop.  Everything appears to be in order.

 Frequently we have to reindex the entire dataset as we are working in a 
 development environment and are still determining how best to structure 
 the 
 dataset.  We are indexing via a batch load script that fires off 10,000 
 record curl requests to the _bulk endpoint.  We partition the entire 
 dataset between three servers and run the batch load script simultaneously 
 on each one.

 At first, this appears to work great.  Initial indexing speeds are 
 roughly 50 million/hour, which would load the entire dataset in a little 
 over 2 hours.  However, once the index approaches 20 million records, 
 indexing performance drops significantly (down to roughly 10 
 million/hour).  As the index continues to grow, performance continues to 
 degrade, and I have seen it drop as low as less than 1 million records per 
 hour.  All in all, it takes nearly a day to index the entire dataset of 
 120 
 million records.

 I was hoping that the community might be able to offer some advice as 
 to what we might be doing wrong, or suggest other diagnostic approaches.  
 We're really trying to ratchet this system up to prepare it for production 
 mode, and are currently left scratching our heads.  Any thoughts, opinions, 
 or tips would be greatly appreciated.

 Thanks!

 -- 
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/98958587-eaf9-4451-84ee-78c38e7eab42%40googlegroups.com

Facets & multi-valued, numeric fields

2014-03-13 Thread Eric Jain
For sorting, elasticsearch lets me specify how I want to deal with fields 
that contain multiple numeric values, so I can have elasticsearch use e.g. 
the max value in each document.

Is there a similar option I can use when aggregating documents? For 
example, I might want to get the average of the max value in each document.

  http://stackoverflow.com/questions/22368807/facets-multi-valued-fields



Re: How to install Mapping attachment Plugin with debian install

2014-03-13 Thread ZenMaster80
Thanks - I figured it out as soon as I posted.
I found this gist, which explains the directory structure well.

https://gist.github.com/mystix/5460660

On Thursday, March 13, 2014 1:48:07 PM UTC-4, David Pilato wrote:
>
> It should be in /usr/share/elasticsearch/bin/
>
>
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet | @elasticsearchfr
>
>
> Le 13 mars 2014 à 17:19:49, ZenMaster80 (sabda...@gmail.com ) 
> a écrit:
>
> On my local machine, I do this: bin/plugin -install ... 
>
>  With a Debian installation, I am not sure where the "bin/plugin" folder is.
>  Does anyone know?
>
>



Re: How to install Mapping attachment Plugin with debian install

2014-03-13 Thread David Pilato
It should be in /usr/share/elasticsearch/bin/
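
In other words, with the Debian package the plugin script lives on a fixed path; a hypothetical invocation (the plugin coordinates below are illustrative, not taken from this thread):

```shell
#!/bin/sh
# Debian package layout: ES_HOME is /usr/share/elasticsearch.
ES_HOME=/usr/share/elasticsearch
# Compose the install command; on a real box, run it (often with sudo).
PLUGIN_CMD="$ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.9.0"
echo "$PLUGIN_CMD"
```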



-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 mars 2014 à 17:19:49, ZenMaster80 (sabdall...@gmail.com) a écrit:

On my local machine, I do this: bin/plugin -install ...

With a Debian installation, I am not sure where the "bin/plugin" folder is.
Does anyone know?



proper use of aggregation?

2014-03-13 Thread Tinou Bao
My instincts say this is not the proper use of aggregation, but I want to 
check w/ people who have actually used it. We want to bucket on a very high 
cardinality field and return **ALL** buckets (no size limit). For example, 
imagine documents representing people and their parents:

person - parent
===
john - cindy
james - cindy
tony - mark
tim - doug

I want to bucket by parent, so it'll be

cindy
   - john
   - james
mark
   - tony
doug
   - tim

This is a high cardinality field, so already it concerns me. I want all 
buckets (setting size to zero). So if I have 10,000 documents I have 5,000 
parent buckets and I want all 5,000 of these parent buckets. Essentially 
I'm trying to display by parent (group by parent). Moreover, I want to sort 
by the parent's age (so imagine the parent has an age in it). Or maybe I want 
to sort by the average person (child) age in each bucket. So w/ aggregation 
this seems possible:

bucket by parent, sort by average age of person, bucket by person (to get 
all people for a parent bucket), set size to zero.

But it feels very wrong to me, both in terms of the potential performance 
issues around unlimited, high cardinality buckets and the sorting of those 
buckets; and that aggregation/bucketing wasn't designed for this.

Any input/feedback would be appreciated.
-T
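
For reference, the shape described above can be written against the 1.x aggregations API roughly as follows (field names are illustrative; a terms `"size"` of 0 requests all buckets, which carries exactly the memory/performance caveats raised here):

```json
{
  "size": 0,
  "aggs": {
    "by_parent": {
      "terms": {
        "field": "parent",
        "size": 0,
        "order": { "avg_child_age": "desc" }
      },
      "aggs": {
        "avg_child_age": { "avg": { "field": "age" } },
        "children": { "terms": { "field": "person", "size": 0 } }
      }
    }
  }
}
```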



Re: elasticsearch memory usage

2014-03-13 Thread joergpra...@gmail.com
There might be massive bloom cache loading for the Lucene codec. My
suggestion is to disable it. Try starting the ES nodes with

index:
  codec:
    bloom:
      load: false

The bloom cache does not seem to fit perfectly with the diagnostics you
described; that is just going from the exception you sent.

Jörg



On Thu, Mar 13, 2014 at 6:01 PM, Hicham Mallah wrote:

> If I start elasticsearch from the bin folder not using the wrapper, I get
> these exceptions after about 2 mins:
>
> Exception in thread "elasticsearch[Adam X][generic][T#5]"
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
> at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
> at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
> at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:481)
> at org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:175)
> at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
> at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsProducer.<init>(BloomFilterPostingsFormat.java:131)
> at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat.fieldsProducer(BloomFilterPostingsFormat.java:102)
> at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat.fieldsProducer(Elasticsearch090PostingsFormat.java:79)
> at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
> at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
> at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:115)
> at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:95)
> at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
> at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:235)
> at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:100)
> at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:382)
> at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:111)
> at org.apache.lucene.search.XSearcherManager.<init>(XSearcherManager.java:94)
> at org.elasticsearch.index.engine.internal.InternalEngine.buildSearchManager(InternalEngine.java:1462)
> at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:279)
> at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:706)
> at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:201)
> at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:189)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 6:47 PM, Hicham Mallah wrote:
>
>> Hello again,
>>
>> setting bootstrap.mlockall to true seems to have made memory usage grow more
>> slowly: instead of elasticsearch being killed after ~2 hours,
>> it will be killed after ~3 hours. What I find weird is why the process
>> released memory back to the OS once but never does it again, and why it is
>> not abiding by the DIRECT_SIZE setting either.
>>
>> Thanks for the help
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah.hic...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 4:45 PM, Hicham Mallah 
>> wrote:
>>
>>> Jorg, the issue is that after the JVM gives back memory to the OS, it starts
>>> going up again, and never gives back memory till it's killed; currently
>>> memory usage is up to 66% and still going up. HEAP size is currently set to
>>> 8gb which is 1/4 the amount of memory I have. I tried it at 16, 12, now at
>>> 8 but still facing the issue, lowering it more will cause undesirable speed
>>> on the website. I'll try mlockall now, and see what happens, but looking at
>>> Bigdesk, only 18.6mb of swap is used.
>>>
>>> I'll let you know what happens with mlockall on.
>>>
>>> - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah
>>> Software Developer
>>> mallah.hic...@gmail.com
>>> 00961 700 49 600
>>>
>>>
>>>
>>> On Thu, Mar 13, 2014 at 4:38 PM, joergpra...@gmail.com <
>>> joergpra...@gmail.com> wrote:
>>>
 From the gist, it all looks very well. There is no reason for the OOM
 killer to kick in. Your syst

Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
If I start elasticsearch from the bin folder not using the wrapper, I get
these exceptions after about 2 mins:

Exception in thread "elasticsearch[Adam X][generic][T#5]"
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:481)
at org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:175)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat$BloomFilteredFieldsProducer.<init>(BloomFilterPostingsFormat.java:131)
at org.elasticsearch.index.codec.postingsformat.BloomFilterPostingsFormat.fieldsProducer(BloomFilterPostingsFormat.java:102)
at org.elasticsearch.index.codec.postingsformat.Elasticsearch090PostingsFormat.fieldsProducer(Elasticsearch090PostingsFormat.java:79)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:115)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:95)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:235)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:100)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:382)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:111)
at org.apache.lucene.search.XSearcherManager.<init>(XSearcherManager.java:94)
at org.elasticsearch.index.engine.internal.InternalEngine.buildSearchManager(InternalEngine.java:1462)
at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:279)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:706)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:201)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:189)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)


- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 6:47 PM, Hicham Mallah wrote:

> Hello again,
>
> setting bootstrap.mlockall to true seems to have made memory usage grow more
> slowly: instead of elasticsearch being killed after ~2 hours, it will
> be killed after ~3 hours. What I find weird is why the process released
> memory back to the OS once but never does it again, and why it is not abiding
> by the DIRECT_SIZE setting either.
>
> Thanks for the help
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 4:45 PM, Hicham Mallah wrote:
>
>> Jorg, the issue is that after the JVM gives back memory to the OS, it starts
>> going up again, and never gives back memory till it's killed; currently
>> memory usage is up to 66% and still going up. HEAP size is currently set to
>> 8gb which is 1/4 the amount of memory I have. I tried it at 16, 12, now at
>> 8 but still facing the issue, lowering it more will cause undesirable speed
>> on the website. I'll try mlockall now, and see what happens, but looking at
>> Bigdesk, only 18.6mb of swap is used.
>>
>> I'll let you know what happens with mlockall on.
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah.hic...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 4:38 PM, joergpra...@gmail.com <
>> joergpra...@gmail.com> wrote:
>>
>>> From the gist, it all looks very well. There is no reason for the OOM
>>> killer to kick in. Your system is idle and there is much room for
>>> everything.
>>>
>>> Just to quote you:
>>>
>>> "What's happening is that elasticsearch starts using memory till 50%
>>> then it goes back down to about 30% gradually then starts to go up again
>>> gradually and never goes back down."
>>>
>>> What you see is ES JVM process giving back memory to the OS, which is no
>>> reason to worry about in regard to process killing. It is just undesirable
>>> behaviour, and it is all a matter of correct configuration of the heap size.
>>>
>>> You should

Re: Elasticsearch index mapping in java

2014-03-13 Thread Nikita Tovstoles
Hi, Kevin:

The Create Index reference says mappings can be included in the index settings
JSON. Are you saying that's not supported by the Java client? (Fwiw, I am
seeing the same - just wanted to confirm.)
The following settings successfully create an index with predefined mappings
when using curl PUT /idx -d @settings.json, but not when fed to the Java client
like so:


Settings settings = 
ImmutableSettings.settingsBuilder().loadFromClasspath("es_admin_idx_settings.json").build();
return 
getClient().admin().indices().prepareCreate(idxName).setSettings(settings).execute();

INFO metadata:114 - [Seth] [idx-elasticsearchserviceintegrationtest] 
creating index, cause [api], shards [2]/[0], mappings [] //EMPTY MAPPINGS

{
  "index"   : {
"number_of_shards"  : 1,
"number_of_replicas": 0
  },
  "mappings": {
"eligibility_criteria": {
  "properties": {
"name": {
  "type"  : "string",
  "fields": {
"raw": {
  "type" : "string",
  "index": "not_analyzed"
}
  }
}
  }
}
  }
}


On Tuesday, February 4, 2014 5:00:30 PM UTC-8, Kevin Wang wrote:
>
> The index request is used to index document, you should use put mapping 
> request.
>
> e.g.
> PutMappingResponse response = 
> client.admin().indices().preparePutMapping(INDEX).setType(INDEX_TYPE).setSource(source).get();
>
>
> On Wednesday, February 5, 2014 1:27:41 AM UTC+11, Doru Sular wrote:
>>
>> Hi guys,
>>
>> I am trying to create an index with the following code:
>> XContentBuilder source = XContentFactory.jsonBuilder().startObject()
>> //
>> .startObject("settings")
>> .field("number_of_shards", 1)
>> .endObject()// end settings
>> .startObject("mappings")
>> .startObject(INDEX_TYPE)//
>> .startObject("properties")//
>> .startObject("user")
>> .field("type", "string") // start user
>> .field("store", "yes")
>> .field("index", "analyzed")//
>> .endObject()// end user
>> .startObject("postDate")//
>> .field("type", "date")
>> .field("store", "yes")
>> .field("index", "analyzed")//
>> .endObject()// end post date
>> .startObject("message") //
>> .field("type", "string")
>> .field("store", "yes")
>> .field("index", "not_analyzed")
>> .endObject() // end user field
>> .endObject() // end properties
>> .endObject() // end index type
>> .endObject() // end mappings
>> .endObject(); // end the container object
>>
>> IndexResponse response = this.client.prepareIndex(INDEX,INDEX_TYPE
>> ).setSource(source)
>> .setType(INDEX_TYPE).execute()
>> .actionGet();
>>
>>
>> I want to have the "message" field not analyzed, because later I want to 
>> use facets to obtain unique messages.
>> Unfortunately, my code seems to just add a document to the index with the 
>> following structure:
>> {
>>   "settings": {
>> "number_of_shards": 1
>>   },
>>   "mappings": {
>> "tweet": {
>>   "properties": {
>> "user": {
>>   "type": "string",
>>   "store": "yes",
>>   "index": "analyzed"
>> },
>> "postDate": {
>>   "type": "date",
>>   "store": "yes",
>>   "index": "analyzed"
>> },
>> "message": {
>>   "type": "string",
>>   "store": "yes",
>>   "index": "not_analyzed"
>> }
>>   }
>> }
>>   }
>> }
>>
>> Please help me spot the error; it seems the mappings are not created.
>> Thank you very much,
>> Doru
>>
>>



Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Hello again,

setting bootstrap.mlockall to true seems to have made memory usage grow more
slowly: instead of elasticsearch being killed after ~2 hours, it will
be killed after ~3 hours. What I find weird is why the process released
memory back to the OS once but never does it again, and why it is not abiding
by the DIRECT_SIZE setting either.

Thanks for the help


- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 4:45 PM, Hicham Mallah wrote:

> Jorg, the issue is that after the JVM gives back memory to the OS, it starts
> going up again, and never gives back memory till it's killed; currently
> memory usage is up to 66% and still going up. HEAP size is currently set to
> 8gb which is 1/4 the amount of memory I have. I tried it at 16, 12, now at
> 8 but still facing the issue, lowering it more will cause undesirable speed
> on the website. I'll try mlockall now, and see what happens, but looking at
> Bigdesk, only 18.6mb of swap is used.
>
> I'll let you know what happens with mlockall on.
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 4:38 PM, joergpra...@gmail.com <
> joergpra...@gmail.com> wrote:
>
>> From the gist, it all looks very well. There is no reason for the OOM
>> killer to kick in. Your system is idle and there is much room for
>> everything.
>>
>> Just to quote you:
>>
>> "What's happening is that elasticsearch starts using memory till 50% then
>> it goes back down to about 30% gradually then starts to go up again
>> gradually and never goes back down."
>>
>> What you see is ES JVM process giving back memory to the OS, which is no
>> reason to worry about in regard to process killing. It is just undesirable
>> behaviour, and it is all a matter of correct configuration of the heap size.
>>
>> You should check if your ES starts from service wrapper or from the bin
>> folder, and adjust the parameters for heap size. I recommend only to use
>> ES_HEAP_SIZE parameter. Set this to max. 50% RAM (as you did). But do not
>> use different values at other places, or use MIN or MAX. ES_HEAP_SIZE is
>> doing the right thing for you.
>>
>> With bootstrap mlockall, you can lock the ES JVM process into main
>> memory, this helps much regarding to performance and fast GC, as it reduces
>> swapping. You can test if this setting will invoke the OOM killer too, as
>> it increases the pressure on main memory (but, as said, there is plenty
>> room in your machine).
>>
>> Jörg
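
Jörg's advice above maps to two settings; a sketch (the 8g figure is just this thread's example, sized to at most half of RAM):

```yaml
# elasticsearch.yml — lock the JVM heap into RAM to reduce swapping
bootstrap.mlockall: true
```

Pair it with a single ES_HEAP_SIZE environment variable (e.g. ES_HEAP_SIZE=8g) rather than separate min/max values, and make sure the memlock ulimit is raised so the lock can actually succeed.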
>>
>>
>> On Thu, Mar 13, 2014 at 3:13 PM, Hicham Mallah 
>> wrote:
>>
>>> Hello Zachary,
>>>
>>> Thanks for your reply and the pointer to the settings.
>>>
>>> Here are the output of the commands you requested:
>>>
>>>
>>> curl -XGET "http://localhost:9200/_nodes/stats"
>>> curl -XGET "http://localhost:9200/_nodes"
>>>
>>> https://gist.github.com/codebird/9529114
>>>
>>>
>>> - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah
>>> Software Developer
>>> mallah.hic...@gmail.com
>>> 00961 700 49 600
>>>
>>>
>>>
>>> On Thu, Mar 13, 2014 at 3:57 PM, Zachary Tong wrote:
>>>
 Can you gist up the output of these two commands?

 curl -XGET "http://localhost:9200/_nodes/stats"

 curl -XGET "http://localhost:9200/_nodes"

 Those are my first-stop APIs for determining where memory is being
 allocated.


 By the way, these settings don't do anything anymore (they were
 deprecated and removed):

 index.cache.field.type: soft
 index.term_index_interval: 256
 index.term_index_divisor: 5

 index.cache.field.max_size: 1



 `max_size` was replaced with `indices.fielddata.cache.size` and accepts
 a value like "10gb" or "30%"

 And this is just bad settings in general (causes a lot of GC thrashing):

 index.cache.field.expire: 10m
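
The replacement Zachary mentions would look like this in elasticsearch.yml (the 30% figure is an illustrative value, not a recommendation from this thread):

```yaml
# Bounded field-data cache; replaces the removed index.cache.field.max_size
indices.fielddata.cache.size: 30%
```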




 On Thursday, March 13, 2014 8:42:54 AM UTC-4, Hicham Mallah wrote:

> Now the process went back down to 25% usage, from now on it will go
> back up, and won't stop going up.
>
> Sorry for spamming
>
>  - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah wrote:
>
>>  Here's the top after ~1 hour running:
>>
>>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>> 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah 
>> wrote:
>>
>>> Hello Jörg
>>>
>>> Thanks for the reply, our swap size is 2g. I don't know at what %
>>> the process is being kil

Re: Does Marvel degrade gracefully on ES 0.90.1?

2014-03-13 Thread Boaz Leskes
Hi Edward,

Sadly, Marvel is incompatible with 0.90.1 and will disable itself upon 
startup. 

Cheers,
Boaz

On Thursday, March 13, 2014 5:18:32 PM UTC+1, Edward Sargisson wrote:
>
> Hi all,
> Does Marvel work at all on 0.90.1 - even in a degraded fashion?
>
> I know that its minimum is 0.90.9. I have a possible performance-related 
> failure to chase down where Marvel might be very useful in finding it. 
> However, I don't want to change the conditions of the problem until we 
> understand it - and upgrading to 0.90.9 may just do that.
>
> Cheers,
> Edward
>



Testing docFieldLongs

2014-03-13 Thread redrubia
I have a function, in which I call docFieldLongs. In unit testing I need to 
create and override this function and also return ScriptDocValues.Longs.

@Test
public void testRunAsLongs() throws Exception {
script = new MaxiScoreScript(params){
@Override
protected ScriptDocValues.Longs docFieldLongs(String getter){
IndexWriter writer = new IndexWriter(new RAMDirectory(), 
new IndexWriterConfig(Lucene.VERSION, new   
StandardAnalyzer(Lucene.VERSION)).setMergePolicy(new 
LogByteSizeMergePolicy()));
Document d  = new Document();
d.add(new LongField("value", 102, Field.Store.NO));
d.add(new LongField("value" ,101, Field.Store.NO));
writer.addDocument(d);

IndexNumericFieldData indexFieldData = 
IndexFieldDataService.getForField("value");
AtomicNumericFieldData fieldData = 
indexFieldData.load(refreshReader());

AtomicNumericFieldData temp = fieldData;

// return (ScriptDocValues.Longs) temp.getScriptValues();
}

}


}


The main aim of this is to get the same type returned. However, I am having 
a few issues with this with regard to refreshReader.  I also feel this is 
a very long-winded way of doing things.

In summary, I simply want to override `docFieldLongs` to `return 
ScriptDocValues.Longs` which I have defined. In my function, I call 
docFieldLongs("docFieldItems").getValues();

I want this statement, when tested, to return a List of longs, being 
[101L, 102L].

Any help would be greatly appreciated.

Thanks



How to install Mapping attachment Plugin with debian install

2014-03-13 Thread ZenMaster80
On my local machine, I do this: bin/plugin -install ...

With a Debian installation, I am not sure where the "bin/plugin" folder is.
Does anyone know?



Does Marvel degrade gracefully on ES 0.90.1?

2014-03-13 Thread Edward Sargisson
Hi all,
Does Marvel work at all on 0.90.1 - even in a degraded fashion?

I know that its minimum is 0.90.9. I have a possible performance-related 
failure to chase down where Marvel might be very useful in finding it. 
However, I don't want to change the conditions of the problem until we 
understand it - and upgrading to 0.90.9 may just do that.

Cheers,
Edward



Re: Authentication again

2014-03-13 Thread kidkid
Hi Martin,
For your situation, I suggest:

1. Disable HTTP and run your ES nodes only on the internal network.
2. Build a middleware service that exposes a RESTful API so that other 
systems can communicate with it.
3. Your middleware listens and uses a client API (e.g. the Java client) to work 
with the ES cluster and return results.

Regards.


On Sunday, March 9, 2014 11:14:15 PM UTC+7, Martin Hátaš wrote:
>
> I know that authentication between nodes (native protocol) is not provided 
> natively by ES. 
> But I have 2 aditional questions:
>
> 1) Do you consider to put this functionality to the roadmap (in the near 
> future)?
> 2) Have anyone solved this issue by some homebrewed solution (e.g. by 
> wrapping or extending Node/Client class)?
>
> We really recognize ES as a brilliant piece of software for MANY reasons, but 
> we can't integrate it into our product/solutions, mainly due to the missing 
> authentication/encryption.
>
> Thanks
>
>
>
>



Re: [Ann] Elasticsearch Image Plugin 1.1.0 released

2014-03-13 Thread ZenMaster80
Great, I am interested in trying this.

On Thursday, March 13, 2014 7:09:38 AM UTC-4, Kevin Wang wrote:
>
> Hi All,
>
> I've released version 1.1.0 of Elasticsearch Image Plugin.
> The Image Plugin is a Content Based Image Retrieval Plugin for 
> Elasticsearch using LIRE (Lucene Image Retrieval). It allows users to index 
> images and search for similar images.
>
> Changes in 1.1.0:
>
>- Added limit in image query
>- Added plugin version in es-plugin.properties
>
>
> https://github.com/kzwang/elasticsearch-image
>
> Also I've created a demo website for this plugin (
> http://demo.elasticsearch-image.com/), it has 1,000,000 images (well, I 
> haven't finished indexing all the images yet, but it should be enough to demo 
> this plugin) from the MIRFLICKR-1M collection (http://press.liacs.nl/mirflickr)
>
>
> Thanks,
> Kevin
>



Mapping Attachment plugin Installation/Debian

2014-03-13 Thread ZenMaster80
I am having trouble finding out how to install the above plugin. I installed 
Elasticsearch with Debian.
Typically, on my local Linux machine I did "/bin/plugin ", but I am not sure 
where the "bin/plugin" goes with the Debian installation?

Thanks



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread echin1999
One more thing - I notice that, functionally, the client is still able to 
communicate with the remaining active node, so I guess this warning is just 
a "warning". There must be some background thread that periodically looks for 
the missing node, while the main Client instance can still communicate with 
the active node. Would you be able to verify whether it's merely a warning for 
you? If so, I might just not worry about this for now.

On Thursday, March 13, 2014 5:15:36 AM UTC-4, Hui wrote:
>
> Hi Dome, 
>
> Do you mean the service of 10.1.4.196 is not open? Yes, the service should 
> be stopped when it was rebooted.
>
> But the master node 10.1.4.197 has removed the problem node 10.1.4.196 
> when it cannot ping the machine 10.1.4.196.
>
> The cluster should be fine after this operation. Do I understand it 
> wrongly?
>
> Thanks
>
> On Thursday, March 13, 2014 4:48:17 PM UTC+8, Dome.C.Wei wrote:
>>
>> That must mean the service is not open.
>>
>> On Thursday, March 13, 2014 at 2:10:22 PM UTC+8, Hui wrote:
>>>
>>> Hi Mark,
>>>
>>> Thanks for replying.
>>>
>>> The master (10.1.4.197) and other nodes can be reached while the problem 
>>> node(10.1.4.196) is not reachable.
>>> So, we can see the cluster status at that moment
>>>
>>>  "status" : "yellow",
>>>   "timed_out" : false,
>>>   "unassigned_shards" : 0,
>>>
>>>
>>> On Thursday, March 13, 2014 2:03:44 PM UTC+8, Mark Walkom wrote:

 It looks like a networking issue, at least based on "No route to host" 
 in the error.
 Can you ping the master when this is happening, what about doing a 
 telnet test?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 13 March 2014 16:54, Hui  wrote:

> Hi All,
>
>
> This is the log for the case.
>
>
> The node 10.1.4.196 is removed at 14:08 due to a machine reboot; the client 
> keeps trying to connect to the elasticsearch cluster but fails.
>
> Master Node : 
> [2014-03-08 14:08:26,531][INFO ][cluster.service  ] 
> [10.1.4.197:9202] removed 
> {[10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]],}, 
> reason: 
> zen-disco-node_failed([10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]]),
>  reason failed to ping, tried [3] times, each with maximum [30s] timeout
>
>
> Client : 
> 2014-03-08 14:15:36,184 WARN  org.elasticsearch.transport.netty - 
> [Bulldozer] exception caught on transport layer [[id: 0x50dc218f]], 
> closing connection
> java.net.NoRouteToHostException: No route to host
>
>
> (The cluster health at this moment is Yellow and there is no unassigned 
> shard.)
>
>
>
>
> The node is back at 14:25, and the client can successfully connect to the 
> cluster again.
>
> Client :
>
> 2014-03-08 14:25:20,597 WARN  org.elasticsearch.transport.netty - 
> [Bulldozer] exception caught on transport layer [[id: 0xf24d85d7]], 
> closing connection
> java.net.NoRouteToHostException: No route to host
>
>
> Master Node :
>
> [2014-03-08 14:25:57,984][INFO ][cluster.service  ] 
> [10.1.4.197:9202] added 
> {[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]],}, 
> reason: zen-disco-receive(join from 
> node[[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]]])
>
>
> (The cluster health at this moment is Green.)
>
> In the above case, the client should be able to connect to the cluster 
> even if a node is removed from the cluster.
>
>
> For the client, the connection is created as follows: 
>
>
> Settings settings = ImmutableSettings.settingsBuilder()
> .put("cluster.name", "clustername")
>
> .put("client.transport.sniff", true)
>
>
> .build();
> 
>
> TransportClient client = new TransportClient(settings);
>
> client.addTransportAddress(new InetSocketTransportAddress(
> "10.1.4.195" /* hostname */, 9300 /* port */));
>
> client.addTransportAddress(new InetSocketTransportAddress(
>
> "10.1.4.196" /* hostname */, 9300 /* port */)); 
>  client.addTransportAddress(new InetSocketTransportAddress(
> "10.1.4.197" /* hostname */, 9300 /* port */));
>
> The master node is 10.1.4.197 while the node being removed is 
> 10.1.4.196.
>
> For the cluster setting, all settings are at the default except 
> discovery.zen.minimum_master_nodes, which is set to 3.
>
> Is there any problem for the above setting which cause this issue?
>
> Thanks.
>
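
A side note on the configuration quoted above: with three master-eligible nodes, the usual quorum is (3 / 2) + 1 = 2. A sketch of a safer setting (illustrative, not the poster's actual file):

```yaml
# elasticsearch.yml sketch: quorum for 3 master-eligible nodes is
# (3 / 2) + 1 = 2. A value of 3 means that losing any single node leaves
# too few nodes to satisfy the requirement, which can make the cluster
# unavailable whenever one node is down.
discovery.zen.minimum_master_nodes: 2
```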

Re: Node not joining cluster on boot

2014-03-13 Thread joergpra...@gmail.com
Enter

ip addr show

or

ifconfig

and check if MULTICAST is configured on the interface.

Jörg
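
To make the check concrete: the flags field printed by `ip -o link show <iface>` should contain MULTICAST. A minimal sketch (the flags value below is an illustrative example of the command's output, not taken from this thread):

```shell
# Inspect the flags field reported by `ip -o link show <iface>`,
# e.g. "<BROADCAST,MULTICAST,UP,LOWER_UP>". The value below is illustrative.
FLAGS="<BROADCAST,MULTICAST,UP,LOWER_UP>"
case "$FLAGS" in
  *MULTICAST*) echo "multicast enabled" ;;
  *)           echo "multicast NOT enabled" ;;
esac
```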


On Thu, Mar 13, 2014 at 4:29 PM, Guillaume Loetscher wrote:

> Definitely a multicast problem.
>
> I've decided to switch to unicast, and I manage to shutdown any nodes
> (elected master or not), and the remaining one is taking up the load
> perfectly.
>
> When the other node is getting back online, using the unicast discovery,
> there's no problem, the elected master discovered another master node, and
> add it in the cluster.
>
> I don't know what was failing in my (virtual) network configuration, but
> honestly, I cannot lose several more hours pinning down where my mockup
> failed for multicast.
>
> Le jeudi 13 mars 2014 15:23:06 UTC+1, Guillaume Loetscher a écrit :
>
>> @Xiao Yu : nope, it's not working also.
>>
>> @Clinton Gormley : Yes, just after the "no matching id" error, a telnet
>> from Node 1 to node 2 is possible, and I got a valid connection.
>>
>> All, please remember that after such an issue, if I manually stop the service
>> on Node 2, then restart it, it manages to reach the cluster without a
>> problem.
>>
>> I'm suspecting a "race condition" here, something like "Node 2 container
>> is booting so fast that the bridge is not ready to handle the multicast
>> packet, leading to a connection problem".
>>
>> Le jeudi 13 mars 2014 14:00:40 UTC+1, Xiao Yu a écrit :
>>>
>>> Total shot in the dark here but try taking the hashmark out of the node
>>> names and see if that helps?
>>>
>>> On Thursday, March 13, 2014 5:31:30 AM UTC-4, Guillaume Loetscher wrote:

 Sure

 Node # 1:
 root@es_node1:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml
 cluster.name: logstash
 node.name: "Node ES #1"
 node.master: true
 node.data: true
 index.number_of_shards: 2
 index.number_of_replicas: 1
 discovery.zen.ping.timeout: 10s

 Node #2 :
 root@es_node2:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml
 cluster.name: logstash
 node.name: "Node ES #2"
 node.master: true
 node.data: true
 index.number_of_shards: 2
 index.number_of_replicas: 1
 discovery.zen.ping.timeout: 10s





 Le jeudi 13 mars 2014 10:15:16 UTC+1, David Pilato a écrit :
>
> did you set the same cluster name on both nodes?
>
> --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 13 mars 2014 à 09:57:35, Guillaume Loetscher (ster...@gmail.com) a
> écrit:
>
> Hi,
>
> First, thanks for the answers and remarks.
>
> You are both right, the issue I'm currently facing leads to a
> "split-brain" situation, where Node #1 & Node #2 are both master, and 
> doing
> their own life on their side. I'll see to change my configuration and the
> number of node, in order to limit this situation (I already checked this
> article talking about split-brain in 
> ES
> ).
>
> However, this split-brain situation is the result of the problem with
> the discovery / broadcast, which is represented in the log of Node #2 
> here :
>  [2014-03-12 22:03:52,709][WARN ][discovery.zen.ping.multicast] [Node
> ES #2] received ping response ping_response{target [[Node ES
> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}],
> master [[Node ES #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300
> ]]{master=true}], cluster_name[logstash]} with no matching id [1]
>
> So, the connectivity between Node #1 (which is the first one online,
> and therefore master) and Node #2 is established, as the log on Node #2
> clearly said "received ping response", but with an "ID that didn't match".
>
> This is apparently why Node #2 didn't join the cluster on Node #1, and
> this is this specific issue I want to resolve.
>
> Thanks,
>
> Le jeudi 13 mars 2014 07:03:35 UTC+1, David Pilato a écrit :
>>
>> Bonjour :-)
>>
>> You should set min_master_nodes to 2. Although I'd recommend having 3
>> nodes instead of 2.
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> Le 12 mars 2014 à 23:58, Guillaume Loetscher  a
>> écrit :
>>
>>  Hi,
>>
>> I've begun to test Elasticsearch recently, on a little mockup I've
>> designed.
>>
>> Currently, I'm running two nodes on two LXC (v0.9) containers. Those
>> containers are linked using veth to a bridge declared on the host.
>>
>> When I start the first node, the cluster starts, but when I start the
>> second node a bit later, it seems to get some information from the other
>> node but it always ended with the same "no matchi

Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread echin1999
Hi,
That would be my assumption as well. By the way, I am getting the same 
warning that you are. Very similar scenario (2 nodes in a cluster 
- all works fine when everything is running; the warning appears on the client if 
one of the nodes is taken down).

I am using v0.90 - not sure if that matters.


On Thursday, March 13, 2014 5:15:36 AM UTC-4, Hui wrote:
>
> Hi Dome, 
>
> Do you mean the service of 10.1.4.196 is not open? Yes, the service should 
> be stopped when it was rebooted.
>
> But the master node 10.1.4.197 has removed the problem node 10.1.4.196 
> when it cannot ping the machine 10.1.4.196.
>
> The cluster should be fine after this operation. Do I understand it 
> wrongly?
>
> Thanks
>
> On Thursday, March 13, 2014 4:48:17 PM UTC+8, Dome.C.Wei wrote:
>>
>> That must mean the service is not open.
>>
>> On Thursday, March 13, 2014 at 2:10:22 PM UTC+8, Hui wrote:
>>>
>>> Hi Mark,
>>>
>>> Thanks for replying.
>>>
>>> The master (10.1.4.197) and other nodes can be reached while the problem 
>>> node(10.1.4.196) is not reachable.
>>> So, we can see the cluster status at that moment
>>>
>>>  "status" : "yellow",
>>>   "timed_out" : false,
>>>   "unassigned_shards" : 0,
>>>
>>>
>>> On Thursday, March 13, 2014 2:03:44 PM UTC+8, Mark Walkom wrote:

 It looks like a networking issue, at least based on "No route to host" 
 in the error.
 Can you ping the master when this is happening, what about doing a 
 telnet test?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 13 March 2014 16:54, Hui  wrote:

> Hi All,
>
>
> This is the log for the case.
>
>
> The node 10.1.4.196 is removed at 14:08 due to a machine reboot; the client 
> keeps trying to connect to the elasticsearch cluster but fails.
>
> Master Node : 
> [2014-03-08 14:08:26,531][INFO ][cluster.service  ] 
> [10.1.4.197:9202] removed 
> {[10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]],}, 
> reason: 
> zen-disco-node_failed([10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]]),
>  reason failed to ping, tried [3] times, each with maximum [30s] timeout
>
>
> Client : 
> 2014-03-08 14:15:36,184 WARN  org.elasticsearch.transport.netty - 
> [Bulldozer] exception caught on transport layer [[id: 0x50dc218f]], 
> closing connection
> java.net.NoRouteToHostException: No route to host
>
>
> (The cluster health at this moment is Yellow and there is no unassigned 
> shard.)
>
>
>
>
> The node is back at 14:25, and the client can successfully connect to the 
> cluster again.
>
> Client :
>
> 2014-03-08 14:25:20,597 WARN  org.elasticsearch.transport.netty - 
> [Bulldozer] exception caught on transport layer [[id: 0xf24d85d7]], 
> closing connection
> java.net.NoRouteToHostException: No route to host
>
>
> Master Node :
>
> [2014-03-08 14:25:57,984][INFO ][cluster.service  ] 
> [10.1.4.197:9202] added 
> {[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]],}, 
> reason: zen-disco-receive(join from 
> node[[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]]])
>
>
> (The cluster health at this moment is Green.)
>
> In the above case, the client should be able to connect to the cluster 
> even if a node is removed from the cluster.
>
>
> For the client, the connection is created as follows: 
>
>
> Settings settings = ImmutableSettings.settingsBuilder()
> .put("cluster.name", "clustername")
>
> .put("client.transport.sniff", true)
>
>
> .build();
> 
>
> TransportClient client = new TransportClient(settings);
>
> client.addTransportAddress(new InetSocketTransportAddress(
> "10.1.4.195" /* hostname */, 9300 /* port */));
>
> client.addTransportAddress(new InetSocketTransportAddress(
>
> "10.1.4.196" /* hostname */, 9300 /* port */)); 
>  client.addTransportAddress(new InetSocketTransportAddress(
> "10.1.4.197" /* hostname */, 9300 /* port */));
>
> The master node is 10.1.4.197 while the node being removed is 
> 10.1.4.196.
>
> For the cluster setting, all settings are at the default except 
> discovery.zen.minimum_master_nodes, which is set to 3.
>
> Is there any problem for the above setting which cause this issue?
>
> Thanks.
>

Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Jörg, the issue is that after the JVM gives back memory to the OS, usage
starts going up again and never comes back down until the process is killed;
currently memory usage is up to 66% and still climbing. Heap size is currently
set to 8 GB, which is 1/4 of the memory I have. I tried it at 16, 12, and now
8, but I still face the issue; lowering it more will slow the website
undesirably. I'll try mlockall now and see what happens, but looking at
Bigdesk only 18.6 MB of swap is used.

I'll let you know what happens with mlockall on.

- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 4:38 PM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> From the gist, it all looks fine. There is no reason for the OOM
> killer to kick in. Your system is idle and there is plenty of room for
> everything.
>
> Just to quote you:
>
> "What's happening is that elasticsearch starts using memory till 50% then
> it goes back down to about 30% gradually then starts to go up again
> gradually and never goes back down."
>
> What you see is ES JVM process giving back memory to the OS, which is no
> reason to worry about in regard to process killing. It is just undesirable
> behaviour, and it is all a matter of correct configuration of the heap size.
>
> You should check if your ES starts from service wrapper or from the bin
> folder, and adjust the parameters for heap size. I recommend only to use
> ES_HEAP_SIZE parameter. Set this to max. 50% RAM (as you did). But do not
> use different values at other places, or use MIN or MAX. ES_HEAP_SIZE is
> doing the right thing for you.
>
> With bootstrap mlockall, you can lock the ES JVM process into main memory,
> this helps much regarding to performance and fast GC, as it reduces
> swapping. You can test if this setting will invoke the OOM killer too, as
> it increases the pressure on main memory (but, as said, there is plenty
> room in your machine).
>
> Jörg
>
>
> On Thu, Mar 13, 2014 at 3:13 PM, Hicham Mallah wrote:
>
>> Hello Zachary,
>>
>> Thanks for your reply and the pointer to the settings.
>>
>> Here are the output of the commands you requested:
>>
>>
>> curl -XGET "http://localhost:9200/_nodes/stats";
>> curl -XGET "http://localhost:9200/_nodes";
>>
>> https://gist.github.com/codebird/9529114
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah.hic...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 3:57 PM, Zachary Tong wrote:
>>
>>> Can you gist up the output of these two commands?
>>>
>>> curl -XGET "http://localhost:9200/_nodes/stats";
>>>
>>> curl -XGET "http://localhost:9200/_nodes";
>>>
>>> Those are my first-stop APIs for determining where memory is being
>>> allocated.
>>>
>>>
>>> By the way, these settings don't do anything anymore (they were
>>> depreciated and removed):
>>>
>>> index.cache.field.type: soft
>>> index.term_index_interval: 256
>>> index.term_index_divisor: 5
>>>
>>> index.cache.field.max_size: 1
>>>
>>>
>>>
>>> `max_size` was replaced with `indices.fielddata.cache.size` and accepts
>>> a value like "10gb" or "30%"
>>>
>>> And this is just bad settings in general (causes a lot of GC thrashing):
>>>
>>> index.cache.field.expire: 10m
>>>
>>>
>>>
>>>
>>> On Thursday, March 13, 2014 8:42:54 AM UTC-4, Hicham Mallah wrote:
>>>
 Now the process went back down to 25% usage, from now on it will go
 back up, and won't stop going up.

 Sorry for spamming

  - - - - - - - - - -
 Sincerely:
 Hicham Mallah
 Software Developer
 mallah...@gmail.com
 00961 700 49 600



 On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah wrote:

>  Here's the top after ~1 hour running:
>
>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah wrote:
>
>> Hello Jörg
>>
>> Thanks for the reply, our swap size is 2g. I don't know at what % the
>> process is being killed as the first time it happened I wasn't around, 
>> and
>> then I never let that happen again as the website is online. After 2 
>> hours
>> of running, the memory for sure goes up to 60%; I am restarting each 
>> time
>> when it arrives at 70% (2h/2h30) when I am around and testing config
>> changes. When I am not around, I am setting a cron job to restart the
>> server every 2 hours. Server has apache and mysql running on it too.
>>
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 2:22 PM, joerg...@gm

Re: elasticsearch memory usage

2014-03-13 Thread joergpra...@gmail.com
From the gist, it all looks fine. There is no reason for the OOM
killer to kick in. Your system is idle and there is plenty of room for
everything.

Just to quote you:

"What's happening is that elasticsearch starts using memory till 50% then
it goes back down to about 30% gradually then starts to go up again
gradually and never goes back down."

What you see is ES JVM process giving back memory to the OS, which is no
reason to worry about in regard to process killing. It is just undesirable
behaviour, and it is all a matter of correct configuration of the heap size.

You should check if your ES starts from service wrapper or from the bin
folder, and adjust the parameters for heap size. I recommend only to use
ES_HEAP_SIZE parameter. Set this to max. 50% RAM (as you did). But do not
use different values at other places, or use MIN or MAX. ES_HEAP_SIZE is
doing the right thing for you.

With bootstrap mlockall, you can lock the ES JVM process into main memory,
this helps much regarding to performance and fast GC, as it reduces
swapping. You can test if this setting will invoke the OOM killer too, as
it increases the pressure on main memory (but, as said, there is plenty
room in your machine).

Jörg
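
A minimal sketch of the two settings described above (the heap value assumes the 32 GB machine from this thread and a start from the bin folder rather than the service wrapper):

```shell
# Set the heap in exactly one place; ES_HEAP_SIZE sets both -Xms and -Xmx.
export ES_HEAP_SIZE=16g   # at most ~50% of RAM; the rest stays with the OS
# and in elasticsearch.yml, to lock the JVM heap into main memory:
#   bootstrap.mlockall: true
```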


On Thu, Mar 13, 2014 at 3:13 PM, Hicham Mallah wrote:

> Hello Zachary,
>
> Thanks for your reply and the pointer to the settings.
>
> Here are the output of the commands you requested:
>
>
> curl -XGET "http://localhost:9200/_nodes/stats";
> curl -XGET "http://localhost:9200/_nodes";
>
> https://gist.github.com/codebird/9529114
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 3:57 PM, Zachary Tong wrote:
>
>> Can you gist up the output of these two commands?
>>
>> curl -XGET "http://localhost:9200/_nodes/stats";
>>
>> curl -XGET "http://localhost:9200/_nodes";
>>
>> Those are my first-stop APIs for determining where memory is being
>> allocated.
>>
>>
>> By the way, these settings don't do anything anymore (they were
>> depreciated and removed):
>>
>> index.cache.field.type: soft
>> index.term_index_interval: 256
>> index.term_index_divisor: 5
>>
>> index.cache.field.max_size: 1
>>
>>
>>
>> `max_size` was replaced with `indices.fielddata.cache.size` and accepts a
>> value like "10gb" or "30%"
>>
>> And this is just bad settings in general (causes a lot of GC thrashing):
>>
>> index.cache.field.expire: 10m
>>
>>
>>
>>
>> On Thursday, March 13, 2014 8:42:54 AM UTC-4, Hicham Mallah wrote:
>>
>>> Now the process went back down to 25% usage, from now on it will go back
>>> up, and won't stop going up.
>>>
>>> Sorry for spamming
>>>
>>>  - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah
>>> Software Developer
>>> mallah...@gmail.com
>>> 00961 700 49 600
>>>
>>>
>>>
>>> On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah wrote:
>>>
  Here's the top after ~1 hour running:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java


 - - - - - - - - - -
 Sincerely:
 Hicham Mallah
 Software Developer
 mallah...@gmail.com
 00961 700 49 600



 On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah wrote:

> Hello Jörg
>
> Thanks for the reply, our swap size is 2g. I don't know at what % the
> process is being killed as the first time it happened I wasn't around, and
> then I never let that happen again as the website is online. After 2 hours
> of running, the memory for sure goes up to 60%; I am restarting each 
> time
> when it arrives at 70% (2h/2h30) when I am around and testing config
> changes. When I am not around, I am setting a cron job to restart the
> server every 2 hours. Server has apache and mysql running on it too.
>
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 2:22 PM, joerg...@gmail.com <
> joerg...@gmail.com> wrote:
>
>> You wrote, the OOM killer killed the ES process. With 32g (and the
>> swap size), the process must be very big. much more than you configured.
>> Can you give more info about the live size of the process, after ~2 
>> hours?
>> Are there more application processes on the box?
>>
>> Jörg
>>
>>
>> On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah 
>> wrote:
>>
>>> Hello,
>>>
>>> I have been using elasticsearch on a ubuntu server for a year now,
>>> and everything was going great. I had an index of 150,000,000 entries of
>>> domain names, running small queries on it, just filtering by 1 term no
>>> sorting no wildcard nothing. Now we moved servers, I have now a CentOS 6
>>> server, 32GB ram and running elasticserach but now we have 2 indices, of
>>> about 150 million entr

Re: elasticsearch error docs

2014-03-13 Thread patrik . ragnarsson
On Monday, July 15, 2013 12:23:37 PM UTC+2, Jörg Prante wrote:
>
> I'm not sure how this error type listing can help you. You must catch 
> every exception on the bulk response. If you encounter one, you should 
> stop indexing no matter what happend. 
>

I can see how a list of all the error types can help.

It's not true that you have to stop indexing no matter what happened. Bulk 
requests with the create action return conflict errors, but that might be 
okay for you and you may want to ignore those errors. On other errors you might 
want to re-queue the request for some reason (e.g. your 
threadpool.bulk.queue_size was too low and you got something like 
"EsRejectedExecutionException[rejected execution (queue capacity 50)]").



Re: Node not joining cluster on boot

2014-03-13 Thread Guillaume Loetscher
@Xiao Yu : nope, it's not working also.

@Clinton Gormley : Yes, just after the "no matching id" error, a telnet 
from Node 1 to node 2 is possible, and I got a valid connection.

All, please remember that after such an issue, if I manually stop the service on 
Node 2, then restart it, it manages to reach the cluster without a 
problem.

I'm suspecting a "race condition" here, something like "Node 2 container is 
booting so fast that the bridge is not ready to handle the multicast 
packet, leading to a connection problem".

Le jeudi 13 mars 2014 14:00:40 UTC+1, Xiao Yu a écrit :
>
> Total shot in the dark here but try taking the hashmark out of the node 
> names and see if that helps?
>
> On Thursday, March 13, 2014 5:31:30 AM UTC-4, Guillaume Loetscher wrote:
>>
>> Sure
>>
>> Node # 1:
>> root@es_node1:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml 
>> cluster.name: logstash
>> node.name: "Node ES #1"
>> node.master: true
>> node.data: true
>> index.number_of_shards: 2
>> index.number_of_replicas: 1
>> discovery.zen.ping.timeout: 10s
>>
>> Node #2 :
>> root@es_node2:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml
>> cluster.name: logstash
>> node.name: "Node ES #2"
>> node.master: true
>> node.data: true
>> index.number_of_shards: 2
>> index.number_of_replicas: 1
>> discovery.zen.ping.timeout: 10s
>>
>>
>>
>>
>>
>> Le jeudi 13 mars 2014 10:15:16 UTC+1, David Pilato a écrit :
>>>
>>> did you set the same cluster name on both nodes?
>>>
>>> -- 
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>>> @dadoonet  | 
>>> @elasticsearchfr
>>>
>>>
>>> Le 13 mars 2014 à 09:57:35, Guillaume Loetscher (ster...@gmail.com) a 
>>> écrit:
>>>
>>> Hi,
>>>
>>> First, thanks for the answers and remarks.
>>>
>>> You are both right, the issue I'm currently facing leads to a 
>>> "split-brain" situation, where Node #1 & Node #2 are both master, and doing 
>>> their own life on their side. I'll see to change my configuration and the 
>>> number of node, in order to limit this situation (I already checked this 
>>> article talking about split-brain in 
>>> ES
>>> ).
>>>
>>> However, this split-brain situation is the result of the problem with 
>>> the discovery / broadcast, which is represented in the log of Node #2 here :
>>>  [2014-03-12 22:03:52,709][WARN ][discovery.zen.ping.multicast] [Node ES 
>>> #2] 
>>> received ping response ping_response{target [[Node ES 
>>> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}], 
>>> master [[Node ES 
>>> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}], 
>>> cluster_name[logstash]} with no matching id [1]
>>>  
>>> So, the connectivity between Node #1 (which is the first one online, and 
>>> therefore master) and Node #2 is established, as the log on Node #2 clearly 
>>> said "received ping response", but with an "ID that didn't match".
>>>
>>> This is apparently why Node #2 didn't join the cluster on Node #1, and 
>>> this is this specific issue I want to resolve.
>>>
>>> Thanks,
>>>
>>> Le jeudi 13 mars 2014 07:03:35 UTC+1, David Pilato a écrit : 

 Bonjour :-)

 You should set min_master_nodes to 2. Although I'd recommend having 3 
 nodes instead of 2.

 --
 David ;-) 
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
  
 Le 12 mars 2014 à 23:58, Guillaume Loetscher  a 
 écrit :

  Hi,

 I've begun to test Elasticsearch recently, on a little mockup I've 
 designed.

 Currently, I'm running two nodes on two LXC (v0.9) containers. Those 
 containers are linked using veth to a bridge declared on the host.

 When I start the first node, the cluster starts, but when I start the 
 second node a bit later, it seems to get some information from the other 
 node but it always ended with the same "no matching id" error.

 Here's what I'm doing :

 I start the LXC container of the first node :
  root@lada:~# date && lxc-start -n es_node1 -d
 mercredi 12 mars 2014, 22:59:39 (UTC+0100)
  


 I logon the node, check the log file :
  [2014-03-12 21:59:41,927][INFO ][node ] [Node ES #1] 
 version[0.90.12], pid[1129], build[26feed7/2014-02-25T15:38:23Z]
 [2014-03-12 21:59:41,928][INFO ][node ] [Node ES #1] 
 initializing ...
 [2014-03-12 21:59:41,944][INFO ][plugins  ] [Node ES #1] 
 loaded [], sites []
 [2014-03-12 21:59:47,262][INFO ][node ] [Node ES #1] 
 initialized
 [2014-03-12 21:59:47,263][INFO ][node ] [Node ES #1] 
 starting ...
 [2014-03-12 21:59:47,485][INFO ][transport] [Node ES #1] 
 bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/
 172.16.0.100:930

Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread Hari Prasad
Ok Thank you :)

On Thursday, 13 March 2014 19:35:52 UTC+5:30, David Pilato wrote:
>
> yes
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 13 mars 2014 à 15:00:10, Hari Prasad (iamha...@gmail.com ) 
> a écrit:
>
> Is this the case even if discovery.zen.ping.multicast.enabled is false?
>
> On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote: 
>>
>>  Yes. Just launch the new node and set its unicast values to other 
>> running nodes.
>> It will connect to the cluster and the cluster will add him as a new node.
>>
>> You don't have to modify existing settings, although you should do it to 
>> have updated settings in case of restart.
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>>  
>> Le 13 mars 2014 à 14:38, Hari Prasad  a écrit :
>>
>>  Hi 
>> I have an Elasticsearch cluster and am using unicast to discover 
>> nodes. Can I add nodes to the list dynamically without restarting the cluster? 
>> I tried to do this with prepareUpdateSettings but I got "ignoring 
>> transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
>> updateable".
>> Is there any other way to do this without restarting the cluster?
>>
>> I am not going for multicast because I don't want rogue nodes to join my 
>> cluster. I could go for it if I could, in any way, limit which nodes join the 
>> cluster, other than by the cluster name.
>> Are there any ways to do this?
>>
>> Thanks 
>> Hari
>>  --
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/127e4ba6-3fab-4fca-8595-e4506ef8f101%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>  
>  --
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/a5a2de96-f308-495a-8de4-07feca421759%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
>
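
For reference, the static unicast settings discussed in this thread look like this in elasticsearch.yml (addresses are illustrative). In this era of ES the host list is read at startup and, as the error quoted above shows, is not dynamically updatable through the settings API:

```yaml
# elasticsearch.yml sketch (addresses are illustrative)
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1:9300", "10.0.0.2:9300"]
```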



Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Hello Zachary,

Thanks for your reply and the pointer to the settings.

Here are the output of the commands you requested:

curl -XGET "http://localhost:9200/_nodes/stats"
curl -XGET "http://localhost:9200/_nodes"

https://gist.github.com/codebird/9529114


- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 3:57 PM, Zachary Tong wrote:

> Can you gist up the output of these two commands?
>
> curl -XGET "http://localhost:9200/_nodes/stats"
>
> curl -XGET "http://localhost:9200/_nodes"
>
> Those are my first-stop APIs for determining where memory is being
> allocated.
>
>
> By the way, these settings don't do anything anymore (they were
> deprecated and removed):
>
> index.cache.field.type: soft
> index.term_index_interval: 256
> index.term_index_divisor: 5
>
> index.cache.field.max_size: 1
>
>
>
> `max_size` was replaced with `indices.fielddata.cache.size` and accepts a
> value like "10gb" or "30%"
>
> And this setting is just bad in general (it causes a lot of GC thrashing):
>
> index.cache.field.expire: 10m
>
>
>
>
> On Thursday, March 13, 2014 8:42:54 AM UTC-4, Hicham Mallah wrote:
>
>> Now the process went back down to 25% usage, from now on it will go back
>> up, and won't stop going up.
>>
>> Sorry for spamming
>>
>>  - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah wrote:
>>
>>>  Here's the top after ~1 hour running:
>>>
>>>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>>> 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
>>>
>>>
>>> - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah
>>> Software Developer
>>> mallah...@gmail.com
>>> 00961 700 49 600
>>>
>>>
>>>
>>> On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah wrote:
>>>
 Hello Jörg

 Thanks for the reply; our swap size is 2g. I don't know at what % the
 process is being killed, as the first time it happened I wasn't around, and
 I never let it happen again since the website is online. After 2 hours of
 running, memory usage goes up to 60%; I restart each time it reaches 70%
 (2h/2h30) when I am around and testing config changes. When I am not
 around, I set a cron job to restart the server every 2 hours. The server
 also runs apache and mysql.



 - - - - - - - - - -
 Sincerely:
 Hicham Mallah
 Software Developer
 mallah...@gmail.com
 00961 700 49 600



 On Thu, Mar 13, 2014 at 2:22 PM, joerg...@gmail.com >>> > wrote:

> You wrote that the OOM killer killed the ES process. With 32g (and the
> swap size), the process must be very big, much more than you configured.
> Can you give more info about the live size of the process after ~2 hours?
> Are there more application processes on the box?
>
> Jörg
>
>
> On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah 
> wrote:
>
>> Hello,
>>
>> I have been using elasticsearch on an Ubuntu server for a year now,
>> and everything was going great. I had an index of 150,000,000 domain-name
>> entries, running small queries on it, just filtering by one term: no
>> sorting, no wildcards, nothing. We then moved servers; I now have a CentOS 6
>> server with 32GB RAM running elasticsearch, but now we have 2 indices of
>> about 150 million entries each, with 32 shards each, still running the same 
>> queries on them; nothing changed in the queries. But since we went online 
>> with the new server, I have to restart elasticsearch every 2 hours before 
>> the OOM killer kills it.
>>
>> What's happening is that elasticsearch uses memory up to 50%,
>> then goes back down to about 30% gradually, then starts to go up again
>> gradually and never comes back down.
>>
>> I have tried all the solutions I found on the net; I am a developer,
>> not a server admin.
>>
>> *I have these settings in my service wrapper configuration*
>>
>> set.default.ES_HOME=/home/elasticsearch
>> set.default.ES_HEAP_SIZE=8192
>> set.default.MAX_OPEN_FILES=65535
>> set.default.MAX_LOCKED_MEMORY=10240
>> set.default.CONF_DIR=/home/elasticsearch/conf
>> set.default.WORK_DIR=/home/elasticsearch/tmp
>> set.default.DIRECT_SIZE=4g
>>
>> # Java Additional Parameters
>> wrapper.java.additional.1=-Delasticsearch-service
>> wrapper.java.additional.2=-Des.path.home=%ES_HOME%
>> wrapper.java.additional.3=-Xss256k
>> wrapper.java.additional.4=-XX:+UseParNewGC
>> wrapper.java.additional.5=-XX:+UseConcMarkSweepGC
>> wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75
>> wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly
>> wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError

Someone used jquery Select2 library with Elasticsearch?

2014-03-13 Thread Maciej Egermeier
Hi,
Do you know if it is possible to use Select2 autocomplete library with 
elasticsearch indexes? It looks great and it would be nice to use 
elasticsearch instead of real system objects for this autocomplete.
Take a look at Select2 library home page:
http://ivaynberg.github.io/select2/

Regards,

Maciej



Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread Hari Prasad
Is this the case even if discovery.zen.ping.multicast.enabled is false?

On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote:
>
> Yes. Just launch the new node and set its unicast values to other running 
> nodes.
> It will connect to the cluster and the cluster will add it as a new node.
>
> You don't have to modify existing settings, although you should do so to 
> have updated settings in case of a restart.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 13 March 2014 at 14:38, Hari Prasad wrote:
>
> Hi
> I have an elasticsearch cluster and use unicast to discover 
> nodes. Can I add nodes to the list dynamically without restarting the cluster? 
> I tried to do this with prepareUpdateSettings but got "ignoring 
> transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
> updateable".
> Is there any other way to do this without restarting the cluster?
>
> I am not going for multicast because I don't want rogue nodes to join my 
> cluster. I could use it if I could somehow limit which nodes join the 
> cluster, other than by the cluster name.
> Are there any ways to do this?
>
> Thanks 
> Hari
>



Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread David Pilato
yes

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 13 March 2014 at 15:00:10, Hari Prasad (iamhari1...@gmail.com) wrote:

Is this the case even if discovery.zen.ping.multicast.enabled is false?

On Thursday, 13 March 2014 19:17:19 UTC+5:30, David Pilato wrote:
Yes. Just launch the new node and set its unicast values to other running nodes.
It will connect to the cluster and the cluster will add it as a new node.

You don't have to modify existing settings, although you should do so to have 
updated settings in case of a restart.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 13 March 2014 at 14:38, Hari Prasad wrote:

Hi
I have an elasticsearch cluster and use unicast to discover nodes. 
Can I add nodes to the list dynamically without restarting the cluster? 
I tried to do this with prepareUpdateSettings but got "ignoring transient 
setting [discovery.zen.ping.unicast.hosts], not dynamically updateable".
Is there any other way to do this without restarting the cluster?

I am not going for multicast because I don't want rogue nodes to join my 
cluster. I could use it if I could somehow limit which nodes join the 
cluster, other than by the cluster name.
Are there any ways to do this?

Thanks 
Hari



Re: elasticsearch memory usage

2014-03-13 Thread Zachary Tong
Can you gist up the output of these two commands?

curl -XGET "http://localhost:9200/_nodes/stats"

curl -XGET "http://localhost:9200/_nodes"

Those are my first-stop APIs for determining where memory is being 
allocated.


By the way, these settings don't do anything anymore (they were deprecated 
and removed):

index.cache.field.type: soft 
index.term_index_interval: 256 
index.term_index_divisor: 5 

index.cache.field.max_size: 1

 

`max_size` was replaced with `indices.fielddata.cache.size` and accepts a 
value like "10gb" or "30%"

And this setting is just bad in general (it causes a lot of GC thrashing):

index.cache.field.expire: 10m 
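As a sketch, the replacement setting lives in elasticsearch.yml; the 30% value below is purely illustrative, not a recommendation from this thread:

```yaml
# elasticsearch.yml -- replacement for the removed index.cache.field.max_size
# (the value here is illustrative only; tune it for your own heap)
indices.fielddata.cache.size: 30%
```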


 

On Thursday, March 13, 2014 8:42:54 AM UTC-4, Hicham Mallah wrote:
>
> Now the process went back down to 25% usage, from now on it will go back 
> up, and won't stop going up.
>
> Sorry for spamming
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah 
> Software Developer
> mallah...@gmail.com 
> 00961 700 49 600
>   
>
>
> On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah 
> 
> > wrote:
>
>> Here's the top after ~1 hour running:
>>
>>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>> 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
>>
>>  
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah 
>> Software Developer
>> mallah...@gmail.com 
>> 00961 700 49 600
>>   
>>
>>
>> On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah 
>> 
>> > wrote:
>>
>>> Hello Jörg
>>>
>>> Thanks for the reply; our swap size is 2g. I don't know at what % the 
>>> process is being killed, as the first time it happened I wasn't around, and 
>>> I never let it happen again since the website is online. After 2 hours of 
>>> running, memory usage goes up to 60%; I restart each time it reaches 70% 
>>> (2h/2h30) when I am around and testing config changes. When I am not 
>>> around, I set a cron job to restart the server every 2 hours. The server 
>>> also runs apache and mysql.
>>>
>>>
>>>
>>> - - - - - - - - - -
>>> Sincerely:
>>> Hicham Mallah 
>>> Software Developer
>>> mallah...@gmail.com 
>>> 00961 700 49 600
>>>   
>>>
>>>
>>> On Thu, Mar 13, 2014 at 2:22 PM, joerg...@gmail.com  <
>>> joerg...@gmail.com > wrote:
>>>
 You wrote that the OOM killer killed the ES process. With 32g (and the swap 
 size), the process must be very big, much more than you configured. Can you 
 give more info about the live size of the process after ~2 hours? Are 
 there more application processes on the box?

 Jörg


 On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah 
 
 > wrote:

> Hello, 
>
> I have been using elasticsearch on an Ubuntu server for a year now, and 
> everything was going great. I had an index of 150,000,000 domain-name 
> entries, running small queries on it, just filtering by one term: no sorting, 
> no wildcards, nothing. We then moved servers; I now have a CentOS 6 server 
> with 32GB RAM running elasticsearch, but now we have 2 indices of about 150 
> million entries each, with 32 shards each, still running the same queries on 
> them; nothing changed in the queries. But since we went online with the new 
> server, I have to restart elasticsearch every 2 hours before the OOM killer 
> kills it. 
>
> What's happening is that elasticsearch uses memory up to 50%, then 
> goes back down to about 30% gradually, then starts to go up again 
> gradually and never comes back down. 
>
> I have tried all the solutions I found on the net; I am a developer, 
> not a server admin. 
>
> *I have these settings in my service wrapper configuration*
>
> set.default.ES_HOME=/home/elasticsearch 
> set.default.ES_HEAP_SIZE=8192 
> set.default.MAX_OPEN_FILES=65535 
> set.default.MAX_LOCKED_MEMORY=10240 
> set.default.CONF_DIR=/home/elasticsearch/conf 
> set.default.WORK_DIR=/home/elasticsearch/tmp 
> set.default.DIRECT_SIZE=4g 
>
> # Java Additional Parameters 
> wrapper.java.additional.1=-Delasticsearch-service 
> wrapper.java.additional.2=-Des.path.home=%ES_HOME% 
> wrapper.java.additional.3=-Xss256k 
> wrapper.java.additional.4=-XX:+UseParNewGC 
> wrapper.java.additional.5=-XX:+UseConcMarkSweepGC 
> wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75 
> wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly 
> wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError 
> wrapper.java.additional.9=-Djava.awt.headless=true 
> wrapper.java.additional.10=-XX:MinHeapFreeRatio=40 
> wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70 
> wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75 
> wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly 
> wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g 
> # Initial Java Heap Size (in MB) 
> wrapp

Re: query filter is not working

2014-03-13 Thread David Pilato
The message field has been analyzed using the standard analyzer, which means 
your message content has been indexed in lowercase.
A term filter does not analyze your query.

"DEBUG" is different from "debug".

If you want to find your term in the inverted index, you have to either 
analyze your query (a match query, for example) or, in that case, lowercase 
your searched term.

curl -XPOST http://10.203.251.142:9200/log-2014.03.03/_search -d
'{
    "query": {
        "constant_score": {
            "filter": {
                "term": { "message": "debug" }
            }
        }
    }
}
'
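The mismatch can be sketched outside Elasticsearch. The function below is only a rough simulation of the standard analyzer's tokenize-and-lowercase behavior at index time, not the real Lucene implementation:

```python
import re

def standard_analyze(text):
    # Rough simulation of the standard analyzer: split on non-alphanumeric
    # characters and lowercase each token (illustrative only).
    return [token.lower() for token in re.split(r"[^0-9A-Za-z]+", text) if token]

indexed = standard_analyze("03-03-2014 18:39:35,025 DEBUG Entering SchedulerJob")

# A term filter does not analyze its input, so the exact-case term misses
# the lowercased terms in the inverted index:
print("DEBUG" in indexed)  # False
print("debug" in indexed)  # True
```

This is why the term filter only matches when the searched term is already lowercased, while a match query works with either case: the match query runs the same analysis on the query string first.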

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 13 March 2014 at 14:24:39, Subhadip Bagui (i.ba...@gmail.com) wrote:

Hi David,

I have done the steps you suggested. The string search is working now.

But for a filter I always have to pass strings in lowercase, whereas for a 
query text search I can give the proper string sequence as inserted in the doc 
(query shown below).

Maybe this is very basic and I'm doing something wrong. I'm a week old on 
elasticsearch and trying to understand the query DSL and text search. Please 
help me clear up the concept.


1) curl -XDELETE http://10.203.251.142:9200/log-2014.03.03

2) 
curl -XPUT http://10.203.251.142:9200/log-2014.03.03/ -d 
'{
    "settings": {
        "index": {
            "number_of_shards": 3,
            "number_of_replicas": 0,
            "index.cache.field.type": "soft",
            "index.refresh_interval": "30s",
            "index.store.compress.stored": true
        }
    },
    "mappings": {
        "apache-log": {
            "properties": {
                "message": {
                    "type": "string",
                    "fields": {
                        "actual": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                },
                "@version": {
                    "type": "long",
                    "index": "not_analyzed"
                },
                "@timestamp": {
                    "type": "date",
                    "format": "yyyy-MM-dd HH:mm:ss",
                    "index": "not_analyzed"
                },
                "type": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "host": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "path": {
                    "type": "string",
                    "norms": {
                        "enabled": false
                    },
                    "index": "not_analyzed"
                }
            }
        }
    }
}'

3) curl -XPUT http://10.203.251.142:9200/_bulk -d '
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "1"}}
{ "message": "03-03-2014 18:39:35,025 DEBUG 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-8]   
com.aricent.aricloud.monitoring.CloudController 121 - 
com.sun.jersey.core.spi.factory.ResponseImpl@1139f1b","@version": 
"1","@timestamp": "2014-03-03 18:39:35","type": "apache-access", "host": 
"cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "2"}}
{ "message": "\tat org.quartz.core.JobRunShell.run(JobRunShell.java:223)", 
"@version": "1", "@timestamp": "2014-03-03 18:39:36","type": "apache-access", 
"host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "3"}}
{ "message": "03-03-2014 18:39:35,030  INFO 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-8] 
com.amazonaws.http.HttpClientFactory 128 - Configuring Proxy. Proxy Host: 
10.203.193.227 Proxy Port: 80", "@version": "2", "@timestamp": "2014-03-03 
18:40:35", "type": "apache-access", "host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "4"}}
{ "message": "\tat 
org.apache.http.protocol.ImmutableHttpProcessor.process(ImmutableHttpProcessor.java:109)",
 "@version": "3", "@timestamp": "2014-03-03 18:43:35", "type": "apache-access", 
"host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "5"}}
{ "message": "03-03-2014 18:45:30,002 DEBUG 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-9] 
com.aricent.aricloud.monitoring.scheduler.SchedulerJob 22 - Entering 
SchedulerJob", "@version": "3", "@timestamp": "2014-03-03 18:45:35", "type": 
"apache-access", "host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
\n'

4) curl -XGET 'http://10.203.251.142:9200/

Re: Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread David Pilato
Yes. Just launch the new node and set its unicast values to other running nodes.
It will connect to the cluster and the cluster will add it as a new node.

You don't have to modify existing settings, although you should do so to have 
updated settings in case of a restart.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 13 March 2014 at 14:38, Hari Prasad wrote:

Hi
I have an elasticsearch cluster and use unicast to discover nodes. 
Can I add nodes to the list dynamically without restarting the cluster? 
I tried to do this with prepareUpdateSettings but got "ignoring transient 
setting [discovery.zen.ping.unicast.hosts], not dynamically updateable".
Is there any other way to do this without restarting the cluster?

I am not going for multicast because I don't want rogue nodes to join my 
cluster. I could use it if I could somehow limit which nodes join the 
cluster, other than by the cluster name.
Are there any ways to do this?

Thanks 
Hari


Re: function_score and elasticsearch-php

2014-03-13 Thread Zachary Tong
No problem, glad to help!  The syntax is definitely kinda gross, I'll try 
to write some docs on it soon to help others.

Let me know if you run into any more problems, and feel free to open a 
ticket at the Elasticsearch-PHP github repo, I keep a closer eye on tickets 
than the mailing list :)

-Z

On Thursday, March 13, 2014 9:37:42 AM UTC-4, Erdal Gunyar wrote:
>
> Hello Zachary,
>
> Thank you for your quick and working responses!
> I previously tried the double-array method and it didn't work; I must 
> have missed something at the time.
>
> And thanks also for the object method, didn't know :)
>
> Have a good day all,
>
> Erdal.
>
>
> On Wednesday, 12 March 2014 20:05:11 UTC+1, Zachary Tong wrote:
>>
>> For the record, this array syntax should work as well:
>>
>> $qry = array(
>> 'query' => array(
>> 'function_score' => array(
>> 'functions' => array(
>> array("script_score" => array('script' => 
>> "doc['boostfield'].value"))
>> ),
>> 'query' => array(
>>'query_string' => array('query' => 'MyQuery')
>> ),
>> 'score_mode' => 'multiply'
>> )
>> ));
>>
>> But the object notation tends to be safer because it can handle empty 
>> objects (for example, a random function without a seed is just 
>> `"random_score" : {}`, which will break the array notation.  Objects make 
>> sure that doesn't happen.
>>
>>
>> On Wednesday, March 12, 2014 2:47:00 PM UTC-4, Erdal Gunyar wrote:
>>>
>>> Hi everybody,
>>>
>>> Has anyone here successfully implemented function_score with 
>>> elasticsearch-php?
>>> Of course, without passing the whole body as a JSON string.
>>>
>>> I actually tried but it failed; it looks like it's impossible to pass 
>>> the "array + object" located in the "functions" part:
>>> "query": {
>>> "function_score": {
>>> "query": {  
>>> "query_string": {
>>> "query": "MyQuery",
>>> }
>>> },
>>> "functions": [{
>>> "script_score": { 
>>> "script": "doc['boostfield'].value"
>>> }
>>> }],
>>> "score_mode": "multiply"
>>> }
>>> },
>>>
>>> Any help will be appreciated! :)
>>>
>>> Thanks,
>>>
>>



Adding new nodes to cluster with unicast without restarting

2014-03-13 Thread Hari Prasad
Hi
I have an elasticsearch cluster and use unicast to discover 
nodes. Can I add nodes to the list dynamically without restarting the cluster? 
I tried to do this with prepareUpdateSettings but got "ignoring 
transient setting [discovery.zen.ping.unicast.hosts], not dynamically 
updateable".
Is there any other way to do this without restarting the cluster?

I am not going for multicast because I don't want rogue nodes to join my 
cluster. I could use it if I could somehow limit which nodes join the 
cluster, other than by the cluster name.
Are there any ways to do this?

Thanks 
Hari
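For reference, a static configuration along the lines discussed in this thread would go in each node's elasticsearch.yml; the host addresses below are placeholders, not values from this thread:

```yaml
# elasticsearch.yml -- example only; replace the hosts with your own nodes
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1:9300", "10.0.0.2:9300"]
```

Because discovery.zen.ping.unicast.hosts is a static, node-level setting, changing it requires restarting that node, which is exactly why the cluster-settings API reports it as "not dynamically updateable".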



Re: function_score and elasticsearch-php

2014-03-13 Thread Erdal Gunyar
Hello Zachary,

Thank you for your quick and working responses!
I previously tried the double-array method and it didn't work; I must 
have missed something at the time.

And thanks also for the object method, didn't know :)

Have a good day all,

Erdal.


On Wednesday, 12 March 2014 20:05:11 UTC+1, Zachary Tong wrote:
>
> For the record, this array syntax should work as well:
>
> $qry = array(
> 'query' => array(
> 'function_score' => array(
> 'functions' => array(
> array("script_score" => array('script' => 
> "doc['boostfield'].value"))
> ),
> 'query' => array(
>'query_string' => array('query' => 'MyQuery')
> ),
> 'score_mode' => 'multiply'
> )
> ));
>
> But the object notation tends to be safer because it can handle empty 
> objects (for example, a random function without a seed is just 
> `"random_score" : {}`, which will break the array notation.  Objects make 
> sure that doesn't happen.
>
>
> On Wednesday, March 12, 2014 2:47:00 PM UTC-4, Erdal Gunyar wrote:
>>
>> Hi everybody,
>>
>> Has anyone here successfully implemented function_score with 
>> elasticsearch-php?
>> Of course, without passing the whole body as a JSON string.
>>
>> I actually tried but it failed; it looks like it's impossible to pass 
>> the "array + object" located in the "functions" part:
>> "query": {
>> "function_score": {
>> "query": {  
>> "query_string": {
>> "query": "MyQuery",
>> }
>> },
>> "functions": [{
>> "script_score": { 
>> "script": "doc['boostfield'].value"
>> }
>> }],
>> "score_mode": "multiply"
>> }
>> },
>>
>> Any help will be appreciated! :)
>>
>> Thanks,
>>
>



Re: query filter is not working

2014-03-13 Thread Subhadip Bagui
Hi David,

I have done the steps you suggested. The string search is working now.

But for a filter I always have to pass strings in lowercase, whereas for a 
query text search I can give the proper string sequence as inserted in the 
doc (query shown below).

Maybe this is very basic and I'm doing something wrong. I'm a week old on 
elasticsearch and trying to understand the query DSL and text search. Please 
help me clear up the concept.


1) curl -XDELETE http://10.203.251.142:9200/log-2014.03.03

2) 
curl -XPUT http://10.203.251.142:9200/log-2014.03.03/ -d 
'{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 0,
"index.cache.field.type": "soft",
"index.refresh_interval": "30s",
"index.store.compress.stored": true
}
},
"mappings": {
"apache-log": {
"properties": {
"message": {
"type": "string",
"fields": {
"actual": {
"type": "string",
"index": "not_analyzed"
}
}
},
"@version": {
"type": "long",
"index": "not_analyzed"
},
"@timestamp": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss",
"index": "not_analyzed"
},
"type": {
"type": "string",
"index": "not_analyzed"
},
"host": {
"type": "string",
"index": "not_analyzed"
},
"path": {
"type": "string",
"norms": {
"enabled": false
},
"index": "not_analyzed"
}
}
}
}
}'

3) curl -XPUT http://10.203.251.142:9200/_bulk -d '
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "1"}}
{ "message": "03-03-2014 18:39:35,025 DEBUG 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-8]   
com.aricent.aricloud.monitoring.CloudController 121 - 
com.sun.jersey.core.spi.factory.ResponseImpl@1139f1b","@version": 
"1","@timestamp": "2014-03-03 18:39:35","type": "apache-access", "host": 
"cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "2"}}
{ "message": "\tat org.quartz.core.JobRunShell.run(JobRunShell.java:223)", 
"@version": "1", "@timestamp": "2014-03-03 18:39:36","type": 
"apache-access", "host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "3"}}
{ "message": "03-03-2014 18:39:35,030  INFO 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-8] 
com.amazonaws.http.HttpClientFactory 128 - Configuring Proxy. Proxy Host: 
10.203.193.227 Proxy Port: 80", "@version": "2", "@timestamp": "2014-03-03 
18:40:35", "type": "apache-access", "host": "cloudclient.aricent.com", 
"path": "/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "4"}}
{ "message": "\tat 
org.apache.http.protocol.ImmutableHttpProcessor.process(ImmutableHttpProcessor.java:109)",
 
"@version": "3", "@timestamp": "2014-03-03 18:43:35", "type": 
"apache-access", "host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
{ "index": {"_index": "log-2014.03.03", "_type": "apache-log", "_id": "5"}}
{ "message": "03-03-2014 18:45:30,002 DEBUG 
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-9] 
com.aricent.aricloud.monitoring.scheduler.SchedulerJob 22 - Entering 
SchedulerJob", "@version": "3", "@timestamp": "2014-03-03 18:45:35", 
"type": "apache-access", "host": "cloudclient.aricent.com", "path": 
"/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log" }
\n'

4) curl -XGET 'http://10.203.251.142:9200/log-2014.03.03/_refresh'


5) query ==>
curl -XPOST http://10.203.251.142:9200/log-2014.03.03/_search -d
'{
  "query": {
    "match": {
      "message": {
        "query": "Proxy Port",
        "type": "phrase"
      }
    }
  }
}'

query that returns no results -
curl -XPOST http://10.203.251.142:9200/log-2014.03.03/_search -d
'{
"query": {
"constant_score": {
"filter": {
"term": { "message": "DEBUG" }
}
}
}
}
'
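
A likely reason the second query returns nothing: with the default standard analyzer the "message" field is indexed as lowercased tokens, and a "term" filter is not analyzed at search time, so the term "DEBUG" never matches the stored token "debug". A sketch of the same filter with the term lowercased (assuming "message" uses the default analyzer):

```json
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "message": "debug" }
      }
    }
  }
}
```

Alternatively, a "match" query on "message" would analyze the input text the same way the field was analyzed at index time.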

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussio

Re: Node not joining cluster on boot

2014-03-13 Thread Xiao Yu
Total shot in the dark here but try taking the hashmark out of the node 
names and see if that helps?
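
For what it's worth, the split-brain guard David suggests below would look roughly like this in elasticsearch.yml (a sketch only; the node name is illustrative, and discovery.zen.minimum_master_nodes should be a quorum, floor(N / 2) + 1, of master-eligible nodes):

```yaml
# Sketch for each node's elasticsearch.yml -- names illustrative.
cluster.name: logstash
node.name: "Node-ES-1"          # no '#' in the name, per the suggestion above
node.master: true
node.data: true
discovery.zen.ping.timeout: 10s
# Quorum of master-eligible nodes: floor(N / 2) + 1. With 2 master-eligible
# nodes this is 2; a node that cannot see a quorum will not elect itself
# master, which prevents the split-brain described below (at the cost of
# availability when one node is down -- hence the usual advice to run 3 nodes).
discovery.zen.minimum_master_nodes: 2
```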

On Thursday, March 13, 2014 5:31:30 AM UTC-4, Guillaume Loetscher wrote:
>
> Sure
>
> Node # 1:
> root@es_node1:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml 
> cluster.name: logstash
> node.name: "Node ES #1"
> node.master: true
> node.data: true
> index.number_of_shards: 2
> index.number_of_replicas: 1
> discovery.zen.ping.timeout: 10s
>
> Node #2 :
> root@es_node2:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml
> cluster.name: logstash
> node.name: "Node ES #2"
> node.master: true
> node.data: true
> index.number_of_shards: 2
> index.number_of_replicas: 1
> discovery.zen.ping.timeout: 10s
>
>
>
>
>
> Le jeudi 13 mars 2014 10:15:16 UTC+1, David Pilato a écrit :
>>
>> did you set the same cluster name on both nodes?
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet  | 
>> @elasticsearchfr
>>
>>
>> Le 13 mars 2014 à 09:57:35, Guillaume Loetscher (ster...@gmail.com) a 
>> écrit:
>>
>> Hi,
>>
>> First, thanks for the answers and remarks.
>>
>> You are both right: the issue I'm currently facing leads to a 
>> "split-brain" situation, where Node #1 and Node #2 are both master, each 
>> living its own life on its side. I'll look at changing my configuration and 
>> the number of nodes in order to limit this situation (I already checked this 
>> article talking about split-brain in ES).
>>
>> However, this split-brain situation is the result of the problem with the 
>> discovery / broadcast, which is represented in the log of Node #2 here :
>>  [2014-03-12 22:03:52,709][WARN ][discovery.zen.ping.multicast] [Node ES #2] 
>> received ping response ping_response{target [[Node ES 
>> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}], 
>> master [[Node ES 
>> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}], 
>> cluster_name[logstash]} with no matching id [1]
>>  
>> So, the connectivity between Node #1 (which is the first one online, and 
>> therefore master) and Node #2 is established, as the log on Node #2 clearly 
>> says "received ping response", but with an ID that didn't match.
>>
>> This is apparently why Node #2 didn't join the cluster on Node #1, and 
>> it is this specific issue I want to resolve.
>>
>> Thanks,
>>
>> Le jeudi 13 mars 2014 07:03:35 UTC+1, David Pilato a écrit : 
>>>
>>> Bonjour :-)
>>>
>>> You should set min_master_nodes to 2. Although I'd recommend having 3 
>>> nodes instead of 2.
>>>
>>> --
>>> David ;-) 
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>  
>>> Le 12 mars 2014 à 23:58, Guillaume Loetscher  a 
>>> écrit :
>>>
>>>  Hi,
>>>
>>> I've begun to test Elasticsearch recently, on a little mockup I've 
>>> designed.
>>>
>>> Currently, I'm running two nodes on two LXC (v0.9) containers. Those 
>>> containers are linked using veth to a bridge declared on the host.
>>>
>>> When I start the first node, the cluster starts, but when I start the 
>>> second node a bit later, it seems to get some information from the other 
>>> node but always ends with the same "no matching id" error.
>>>
>>> Here's what I'm doing :
>>>
>>> I start the LXC container of the first node :
>>>  root@lada:~# date && lxc-start -n es_node1 -d
>>> mercredi 12 mars 2014, 22:59:39 (UTC+0100)
>>>  
>>>
>>>
>>> I log on to the node and check the log file :
>>>  [2014-03-12 21:59:41,927][INFO ][node ] [Node ES #1] 
>>> version[0.90.12], pid[1129], build[26feed7/2014-02-25T15:38:23Z]
>>> [2014-03-12 21:59:41,928][INFO ][node ] [Node ES #1] 
>>> initializing ...
>>> [2014-03-12 21:59:41,944][INFO ][plugins  ] [Node ES #1] 
>>> loaded [], sites []
>>> [2014-03-12 21:59:47,262][INFO ][node ] [Node ES #1] 
>>> initialized
>>> [2014-03-12 21:59:47,263][INFO ][node ] [Node ES #1] 
>>> starting ...
>>> [2014-03-12 21:59:47,485][INFO ][transport] [Node ES #1] 
>>> bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/
>>> 172.16.0.100:9300]}
>>> [2014-03-12 21:59:57,573][INFO ][cluster.service  ] [Node ES #1] 
>>> new_master [Node ES 
>>> #1][LbMQazWXR9uB6Q7R2xLxGQ][inet[/172.16.0.100:9300]]{master=true}, reason: 
>>> zen-disco-join (elected_as_master)
>>> [2014-03-12 21:59:57,657][INFO ][discovery] [Node ES #1] 
>>> logstash/LbMQazWXR9uB6Q7R2xLxGQ
>>> [2014-03-12 21:59:57,733][INFO ][http ] [Node ES #1] 
>>> bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/
>>> 172.16.0.100:9200]}
>>> [2014-03-12 21:59:57,735][INFO ][node ] [Node ES #1] 
>>> started
>>> [2014-03-12 21:59:59,569][INFO ][gateway  ] [Node ES #1] 
>>> recovered [2] indices into cl

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-13 Thread Nikolas Everett
Missed your last email.  Ignore my suggestion and use the raw field. :)


On Thu, Mar 13, 2014 at 8:50 AM, Nikolas Everett  wrote:

> If you plan to do this frequently then go with the raw field.  It'll be
> faster.
>
> If you want to fool around without changing any mappings then use a script
> filter to get the field from the _source.  It isn't efficient at all.  I'd
> suggest guarding it with a more efficient filter.  Like so:
>
>"query": {
>   "filtered": {
>  "filter": {
> "and": [
>{
>   "term": {
>  "title": "filter"
>   }
>},
>{
>   "script": {
>  "script": "_source['title'] == 'Filter'"
>   }
>}
> ]
>  }
>   }
>}
>
>
> Nik
>
>
> On Thu, Mar 13, 2014 at 5:17 AM, Clinton Gormley wrote:
>
>>
>> Appreciate that Clint. But I was asking whether I could do without having
>>> to modify mappings - see ref to another post seemingly alluding to that
>>
>>
>> That post refers to using the keyword_repeat token filter to index
>> stemmed and unstemmed tokens in the same positions. It won't work for your
>> use case for exactly the reasons that you gave before:
>>
>> then _analyze still does not return "bob jr" as one of tokens for text
>>> "bob jr" - guessing because Standard tokenizer splits "bob jr" into "bob"
>>> and "jr" and thus "keyword_repeat" never sees "bob jr".
>>>
>>> On the other hand if I use "keyword" tokenizer then "bob jr" is the sole
>>> token returned (which probably means use case #1 won't be addressable).
>>> Also not clear what purpose would keyword_repeat serve in this case.
>>>
>>
>> Why don't you like the idea of using multi-fields?  It solves your
>> problem correctly and easily.
>>
>>
>
>



Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-13 Thread Nikita Tovstoles
I originally thought that using multi-fields would require manual mapping
of the entire data model, and that keyword_repeat offered an
alternative not requiring mapping changes. After your comments and peeking at
the KeywordRepeatFilter source I see I was wrong on both counts. Thanks for
your help!
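
For reference, the multi-field ("raw" sub-field) mapping agreed on above looks roughly like this (a sketch; the type and field names are illustrative, syntax per the 1.x-style multi-field mapping):

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "string",
          "fields": {
            "raw": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}
```

Exact matches then go through a term filter on "title.raw", while full-text queries keep using the analyzed "title" field.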
On Mar 13, 2014 2:17 AM, "Clinton Gormley"  wrote:

>
> Appreciate that Clint. But I was asking whether I could do without having
>> to modify mappings - see ref to another post seemingly alluding to that
>
>
> That post refers to using the keyword_repeat token filter to index stemmed
> and unstemmed tokens in the same positions. It won't work for your use case
> for exactly the reasons that you gave before:
>
> then _analyze still does not return "bob jr" as one of tokens for text
>> "bob jr" - guessing because Standard tokenizer splits "bob jr" into "bob"
>> and "jr" and thus "keyword_repeat" never sees "bob jr".
>>
>> On the other hand if I use "keyword" tokenizer then "bob jr" is the sole
>> token returned (which probably means use case #1 won't be addressable).
>> Also not clear what purpose would keyword_repeat serve in this case.
>>
>
> Why don't you like the idea of using multi-fields?  It solves your problem
> correctly and easily.
>
>



Re: Timeouts on Node Stats API?

2014-03-13 Thread Xiao Yu
After restarting nodes I'm also getting a bunch of errors for calls to the 
index stats API *after* the node has come back up. Seems like there's some 
issue here where stats API calls fail, do not time out, and cause a 
backup of other calls until a thread pool is full?

Mar 13 12:22:31 esm1.global.search.sat.wordpress.com [2014-03-13 
12:22:31,063][DEBUG][action.admin.indices.stats] 
[esm1.global.search.sat.wordpress.com] [test-lang-analyzers-0][18], 
node[pnlbpsoZRlWbVfgIPSb9vg], [R], s[STARTED]: Failed to execute 
[org.elasticsearch.action.admin.indices.stats.IndicesStatsRequest@35193b8f]
Mar 13 12:22:31 esm1.global.search.sat.wordpress.com 
org.elasticsearch.transport.NodeDisconnectedException: 
[es4.global.search.sat.wordpress.com][inet[es4.global.search.sat.wordpress.com/76.74.248.144:9300]][indices/stats/s] disconnected

It looks like these requests get put into the "MANAGEMENT" thread pool, 
which we've left at its default configuration.

On Wednesday, March 12, 2014 5:13:35 PM UTC-4, Xiao Yu wrote:
>
> Hello,
>
> We have a cluster that's still running on 0.90.9 and it's recently 
> developed an interesting issue. Random (data) nodes within our cluster will 
> occasionally stop responding to the node stats API and we see errors like 
> the following in our cluster logs on the master node.
>
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com [2014-03-12 
> 20:53:17,945][DEBUG][action.admin.cluster.node.stats] [
> esm1.global.search.iad.wordpress.com] failed to execute on node 
> [CBIR6UWfSvqPIHOSgJ3c2Q]
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com 
> org.elasticsearch.transport.ReceiveTimeoutTransportException: 
> [es3.global.search.iad.wordpress.com][inet[66.155.9.130/66.155.9.130:9300]][cluster/nodes/stats/n] 
> request_id [12395955] timed out after [15001ms]
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com at 
> org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> Mar 12 20:53:17 esm1.global.search.iad.wordpress.com at 
> java.lang.Thread.run(Thread.java:724)
>
> While this is happening our cluster appears to function "normally", 
> queries made to the problematic node process and return normally and 
> judging by the eth traffic and load on the box it appears to even be 
> handling queries and even rebalancing shards. The only way to solve this 
> problem appears to be to reboot the node at which point the node 
> disconnects then rejoins the cluster and functions like any other node.
>
> Mar 12 21:02:45 esm1.global.search.iad.wordpress.com [2014-03-12 
> 21:02:45,991][INFO ][action.admin.cluster.node.shutdown] [
> esm1.global.search.iad.wordpress.com] [partial_cluster_shutdown]: 
> requested, shutting down [[CBIR6UWfSvqPIHOSgJ3c2Q]] in [1s]
> Mar 12 21:02:46 esm1.global.search.iad.wordpress.com [2014-03-12 
> 21:02:46,995][INFO ][action.admin.cluster.node.shutdown] [
> esm1.global.search.iad.wordpress.com] [partial_cluster_shutdown]: done 
> shutting down [[CBIR6UWfSvqPIHOSgJ3c2Q]]
> ...
> Mar 12 21:03:40 esm1.global.search.iad.wordpress.com 
> org.elasticsearch.transport.NodeDisconnectedException: 
> [es3.global.search.iad.wordpress.com][inet[66.155.9.130/66.155.9.130:9300]][indices/stats/s] disconnected
> ...
> Mar 12 21:03:41 esm1.global.search.iad.wordpress.com [2014-03-12 
> 21:03:41,045][INFO ][cluster.service  ] [
> esm1.global.search.iad.wordpress.com] removed {[
> es3.global.search.iad.wordpress.com][CBIR6UWfSvqPIHOSgJ3c2Q][inet[
> 66.155.9.130/66.155.9.130:9300]]{dc=iad,
>  
> parity=1, master=false},}, reason: zen-disco-node_left([
> es3.global.search.iad.wordpress.com][CBIR6UWfSvqPIHOSgJ3c2Q][inet[/66.155.9.130:9300]]{dc=iad,
>  
> parity=1, master=false})
> Mar 12 21:04:29 esm1.global.search.iad.wordpress.com [2014-03-12 
> 21:04:29,077][INFO ][cluster.service  ] [
> esm1.global.search.iad.wordpress.com] added {[
> es3.global.search.iad.wordpress.com][4Nj-a0SxRTuagLelneThQg][inet[/66.155.9.130:9300]]{dc=iad,
>  
> parity=1, master=false},}, reason: zen-disco-receive(join from node[[
> es3.global.search.iad.wordpress.com][4Nj-a0SxRTuagLelneThQg][inet[/66.155.9.130:9300]]{dc=iad,
>  
> parity=1, master=false}])
>
> All this sometimes causes shards to relocate or go into a recovery state 
> needlessly.
>
> Nothing appears in the problematic node's logs aside from some slow 
> queries similar to what's on all the other dat

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-13 Thread Nikolas Everett
If you plan to do this frequently then go with the raw field.  It'll be
faster.

If you want to fool around without changing any mappings then use a script
filter to get the field from the _source.  It isn't efficient at all.  I'd
suggest guarding it with a more efficient filter.  Like so:

   "query": {
      "filtered": {
         "filter": {
            "and": [
               {
                  "term": { "title": "filter" }
               },
               {
                  "script": {
                     "script": "_source['title'] == 'Filter'"
                  }
               }
            ]
         }
      }
   }


Nik


On Thu, Mar 13, 2014 at 5:17 AM, Clinton Gormley wrote:

>
> Appreciate that Clint. But I was asking whether I could do without having
>> to modify mappings - see ref to another post seemingly alluding to that
>
>
> That post refers to using the keyword_repeat token filter to index stemmed
> and unstemmed tokens in the same positions. It won't work for your use case
> for exactly the reasons that you gave before:
>
> then _analyze still does not return "bob jr" as one of tokens for text
>> "bob jr" - guessing because Standard tokenizer splits "bob jr" into "bob"
>> and "jr" and thus "keyword_repeat" never sees "bob jr".
>>
>> On the other hand if I use "keyword" tokenizer then "bob jr" is the sole
>> token returned (which probably means use case #1 won't be addressable).
>> Also not clear what purpose would keyword_repeat serve in this case.
>>
>
> Why don't you like the idea of using multi-fields?  It solves your problem
> correctly and easily.
>
>



Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Now the process went back down to 25% usage; from now on it will go back
up, and won't stop going up.

Sorry for spamming

- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 2:37 PM, Hicham Mallah wrote:

> Here's the top after ~1 hour running:
>
>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah wrote:
>
>> Hello Jörg
>>
>> Thanks for the reply, our swap size is 2g. I don't know at what % the
>> process is being killed, as the first time it happened I wasn't around, and
>> I never let it happen again since the website is online. After 2 hours
>> of running, the memory is for sure going up to 60%; I restart each time
>> it arrives at 70% (2h/2h30) when I am around and testing config
>> changes. When I am not around, I set a cron job to restart the
>> server every 2 hours. The server has Apache and MySQL running on it too.
>>
>>
>> - - - - - - - - - -
>> Sincerely:
>> Hicham Mallah
>> Software Developer
>> mallah.hic...@gmail.com
>> 00961 700 49 600
>>
>>
>>
>> On Thu, Mar 13, 2014 at 2:22 PM, joergpra...@gmail.com <
>> joergpra...@gmail.com> wrote:
>>
>>> You wrote that the OOM killer killed the ES process. With 32g (and the swap
>>> size), the process must be very big, much more than you configured. Can you
>>> give more info about the live size of the process after ~2 hours? Are
>>> there more application processes on the box?
>>>
>>> Jörg
>>>
>>>
>>> On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah >> > wrote:
>>>
 Hello,

 I have been using elasticsearch on a ubuntu server for a year now, and
 everything was going great. I had an index of 150,000,000 entries of domain
 names, running small queries on it, just filtering by 1 term no sorting no
 wildcard nothing. Now we moved servers, I have now a CentOS 6 server, 32GB
 ram and running elasticsearch but now we have 2 indices, of about 150
 million entries each 32 shards, still running the same queries on them
 nothing changed in the queries. But since we went online with the new
 server, I have to restart elasticsearch every 2 hours before OOM killer
 kills it.

 What's happening is that elasticsearch starts using memory till 50%
 then it goes back down to about 30% gradually then starts to go up again
 gradually and never goes back down.

 I have tried all the solutions I found on the net, I am a developer not
 a server admin.

 *I have these setting in my service wrapper configuration*

 set.default.ES_HOME=/home/elasticsearch
 set.default.ES_HEAP_SIZE=8192
 set.default.MAX_OPEN_FILES=65535
 set.default.MAX_LOCKED_MEMORY=10240
 set.default.CONF_DIR=/home/elasticsearch/conf
 set.default.WORK_DIR=/home/elasticsearch/tmp
 set.default.DIRECT_SIZE=4g

 # Java Additional Parameters
 wrapper.java.additional.1=-Delasticsearch-service
 wrapper.java.additional.2=-Des.path.home=%ES_HOME%
 wrapper.java.additional.3=-Xss256k
 wrapper.java.additional.4=-XX:+UseParNewGC
 wrapper.java.additional.5=-XX:+UseConcMarkSweepGC
 wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75
 wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly
 wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError
 wrapper.java.additional.9=-Djava.awt.headless=true
 wrapper.java.additional.10=-XX:MinHeapFreeRatio=40
 wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70
 wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75
 wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly
 wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g
 # Initial Java Heap Size (in MB)
 wrapper.java.initmemory=%ES_HEAP_SIZE%

 *And these in elasticsearch.yml*
 ES_MIN_MEM: 5g
 ES_MAX_MEM: 5g
 #index.store.type=mmapfs
 index.cache.field.type: soft
 index.cache.field.max_size: 1
 index.cache.field.expire: 10m
 index.term_index_interval: 256
 index.term_index_divisor: 5

 *java version: *
 java version "1.7.0_51"
 Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
 Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

 *Elasticsearch version*
  "version" : {
 "number" : "1.0.0",
 "build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
 "build_timestamp" : "2014-02-12T16:18:34Z",
 "build_snapshot" : false,
 "lucene_version" : "4.6"
   }

 Using elastica PHP


 I have tried playing with values up and down to try to make it work,
 but nothing is changing.

 Please any help would be highly apprec

Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Here's the top after ~1 hour running:

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
780 root  20   0  317g  14g 7.1g S 492.9 46.4 157:50.89 java
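
Two notes for readers on the numbers above and the config quoted earlier in the thread. First, the 317g VIRT figure is mostly mmap'd index files and reserved address space; RES (14g here) is what counts against physical RAM and what the OOM killer reacts to. Second, ES_MIN_MEM / ES_MAX_MEM are environment variables read by the launch scripts, not elasticsearch.yml settings, so they are almost certainly being ignored where they currently sit. A hedged sketch of the more conventional setup (values illustrative):

```yaml
# elasticsearch.yml -- sketch only. Heap size belongs in the environment
# (e.g. ES_HEAP_SIZE=8g for the service wrapper), not in this file, and
# should stay at or below roughly half of physical RAM.
bootstrap.mlockall: true            # lock the heap in RAM, avoid swapping
# Bound the field data cache instead of relying on soft references,
# which tend to let it grow until the collector is pressured:
indices.fielddata.cache.size: 30%
```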


- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 2:36 PM, Hicham Mallah wrote:

> Hello Jörg
>
> Thanks for the reply, our swap size is 2g. I don't know at what % the
> process is being killed, as the first time it happened I wasn't around, and
> I never let it happen again since the website is online. After 2 hours
> of running, the memory is for sure going up to 60%; I restart each time
> it arrives at 70% (2h/2h30) when I am around and testing config
> changes. When I am not around, I set a cron job to restart the
> server every 2 hours. The server has Apache and MySQL running on it too.
>
>
>
> - - - - - - - - - -
> Sincerely:
> Hicham Mallah
> Software Developer
> mallah.hic...@gmail.com
> 00961 700 49 600
>
>
>
> On Thu, Mar 13, 2014 at 2:22 PM, joergpra...@gmail.com <
> joergpra...@gmail.com> wrote:
>
>> You wrote that the OOM killer killed the ES process. With 32g (and the swap
>> size), the process must be very big, much more than you configured. Can you
>> give more info about the live size of the process after ~2 hours? Are
>> there more application processes on the box?
>>
>> Jörg
>>
>>
>> On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah 
>> wrote:
>>
>>> Hello,
>>>
>>> I have been using elasticsearch on a ubuntu server for a year now, and
>>> everything was going great. I had an index of 150,000,000 entries of domain
>>> names, running small queries on it, just filtering by 1 term no sorting no
>>> wildcard nothing. Now we moved servers, I have now a CentOS 6 server, 32GB
>>> ram and running elasticsearch but now we have 2 indices, of about 150
>>> million entries each 32 shards, still running the same queries on them
>>> nothing changed in the queries. But since we went online with the new
>>> server, I have to restart elasticsearch every 2 hours before OOM killer
>>> kills it.
>>>
>>> What's happening is that elasticsearch starts using memory till 50% then
>>> it goes back down to about 30% gradually then starts to go up again
>>> gradually and never goes back down.
>>>
>>> I have tried all the solutions I found on the net, I am a developer not
>>> a server admin.
>>>
>>> *I have these setting in my service wrapper configuration*
>>>
>>> set.default.ES_HOME=/home/elasticsearch
>>> set.default.ES_HEAP_SIZE=8192
>>> set.default.MAX_OPEN_FILES=65535
>>> set.default.MAX_LOCKED_MEMORY=10240
>>> set.default.CONF_DIR=/home/elasticsearch/conf
>>> set.default.WORK_DIR=/home/elasticsearch/tmp
>>> set.default.DIRECT_SIZE=4g
>>>
>>> # Java Additional Parameters
>>> wrapper.java.additional.1=-Delasticsearch-service
>>> wrapper.java.additional.2=-Des.path.home=%ES_HOME%
>>> wrapper.java.additional.3=-Xss256k
>>> wrapper.java.additional.4=-XX:+UseParNewGC
>>> wrapper.java.additional.5=-XX:+UseConcMarkSweepGC
>>> wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75
>>> wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly
>>> wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError
>>> wrapper.java.additional.9=-Djava.awt.headless=true
>>> wrapper.java.additional.10=-XX:MinHeapFreeRatio=40
>>> wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70
>>> wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75
>>> wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly
>>> wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g
>>> # Initial Java Heap Size (in MB)
>>> wrapper.java.initmemory=%ES_HEAP_SIZE%
>>>
>>> *And these in elasticsearch.yml*
>>> ES_MIN_MEM: 5g
>>> ES_MAX_MEM: 5g
>>> #index.store.type=mmapfs
>>> index.cache.field.type: soft
>>> index.cache.field.max_size: 1
>>> index.cache.field.expire: 10m
>>> index.term_index_interval: 256
>>> index.term_index_divisor: 5
>>>
>>> *java version: *
>>> java version "1.7.0_51"
>>> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
>>> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>>>
>>> *Elasticsearch version*
>>>  "version" : {
>>> "number" : "1.0.0",
>>> "build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
>>> "build_timestamp" : "2014-02-12T16:18:34Z",
>>> "build_snapshot" : false,
>>> "lucene_version" : "4.6"
>>>   }
>>>
>>> Using elastica PHP
>>>
>>>
>>> I have tried playing with values up and down to try to make it work, but
>>> nothing is changing.
>>>
>>> Please any help would be highly appreciated.
>>>

Re: elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Hello Jörg

Thanks for the reply, our swap size is 2g. I don't know at what % the
process is being killed, as the first time it happened I wasn't around, and
I never let it happen again since the website is online. After 2 hours
of running, the memory is for sure going up to 60%; I restart each time
it arrives at 70% (2h/2h30) when I am around and testing config
changes. When I am not around, I set a cron job to restart the
server every 2 hours. The server has Apache and MySQL running on it too.



- - - - - - - - - -
Sincerely:
Hicham Mallah
Software Developer
mallah.hic...@gmail.com
00961 700 49 600



On Thu, Mar 13, 2014 at 2:22 PM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> You wrote that the OOM killer killed the ES process. With 32g (and the swap
> size), the process must be very big, much more than you configured. Can you
> give more info about the live size of the process after ~2 hours? Are
> there more application processes on the box?
>
> Jörg
>
>
> On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah 
> wrote:
>
>> Hello,
>>
>> I have been using elasticsearch on a ubuntu server for a year now, and
>> everything was going great. I had an index of 150,000,000 entries of domain
>> names, running small queries on it, just filtering by 1 term no sorting no
>> wildcard nothing. Now we moved servers, I have now a CentOS 6 server, 32GB
>> ram and running elasticsearch but now we have 2 indices, of about 150
>> million entries each 32 shards, still running the same queries on them
>> nothing changed in the queries. But since we went online with the new
>> server, I have to restart elasticsearch every 2 hours before OOM killer
>> kills it.
>>
>> What's happening is that elasticsearch starts using memory till 50% then
>> it goes back down to about 30% gradually then starts to go up again
>> gradually and never goes back down.
>>
>> I have tried all the solutions I found on the net, I am a developer not a
>> server admin.
>>
>> *I have these setting in my service wrapper configuration*
>>
>> set.default.ES_HOME=/home/elasticsearch
>> set.default.ES_HEAP_SIZE=8192
>> set.default.MAX_OPEN_FILES=65535
>> set.default.MAX_LOCKED_MEMORY=10240
>> set.default.CONF_DIR=/home/elasticsearch/conf
>> set.default.WORK_DIR=/home/elasticsearch/tmp
>> set.default.DIRECT_SIZE=4g
>>
>> # Java Additional Parameters
>> wrapper.java.additional.1=-Delasticsearch-service
>> wrapper.java.additional.2=-Des.path.home=%ES_HOME%
>> wrapper.java.additional.3=-Xss256k
>> wrapper.java.additional.4=-XX:+UseParNewGC
>> wrapper.java.additional.5=-XX:+UseConcMarkSweepGC
>> wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75
>> wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly
>> wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError
>> wrapper.java.additional.9=-Djava.awt.headless=true
>> wrapper.java.additional.10=-XX:MinHeapFreeRatio=40
>> wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70
>> wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75
>> wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly
>> wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g
>> # Initial Java Heap Size (in MB)
>> wrapper.java.initmemory=%ES_HEAP_SIZE%
>>
>> *And these in elasticsearch.yml*
>> ES_MIN_MEM: 5g
>> ES_MAX_MEM: 5g
>> #index.store.type=mmapfs
>> index.cache.field.type: soft
>> index.cache.field.max_size: 1
>> index.cache.field.expire: 10m
>> index.term_index_interval: 256
>> index.term_index_divisor: 5
>>
>> *java version: *
>> java version "1.7.0_51"
>> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>>
>> *Elasticsearch version*
>>  "version" : {
>> "number" : "1.0.0",
>> "build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
>> "build_timestamp" : "2014-02-12T16:18:34Z",
>> "build_snapshot" : false,
>> "lucene_version" : "4.6"
>>   }
>>
>> Using elastica PHP
>>
>>
>> I have tried playing with values up and down to try to make it work, but
>> nothing is changing.
>>
>> Please any help would be highly appreciated.
>>

Re: elasticsearch memory usage

2014-03-13 Thread joergpra...@gmail.com
You wrote that the OOM killer killed the ES process. With 32g (and the swap
size), the process must be very large, much larger than you configured. Can you
give more info about the live size of the process after ~2 hours? Are there
more application processes on the box?

Jörg


On Thu, Mar 13, 2014 at 12:46 PM, Hicham Mallah wrote:

> Hello,
>
> I have been using Elasticsearch on an Ubuntu server for a year now, and
> everything was going great. I had an index of 150,000,000 entries of domain
> names, running small queries on it, just filtering by one term, no sorting,
> no wildcards, nothing. Now we have moved servers: I have a CentOS 6 server
> with 32GB of RAM running Elasticsearch, but now we have 2 indices of about
> 150 million entries each, 32 shards each, still running the same queries on
> them; nothing changed in the queries. But since we went online with the new
> server, I have to restart Elasticsearch every 2 hours before the OOM killer
> kills it.
>
> What's happening is that Elasticsearch starts using memory up to 50%, then
> it gradually goes back down to about 30%, then starts to climb again
> gradually and never goes back down.
>
> I have tried all the solutions I found on the net; I am a developer, not a
> server admin.
>
> *I have these setting in my service wrapper configuration*
>
> set.default.ES_HOME=/home/elasticsearch
> set.default.ES_HEAP_SIZE=8192
> set.default.MAX_OPEN_FILES=65535
> set.default.MAX_LOCKED_MEMORY=10240
> set.default.CONF_DIR=/home/elasticsearch/conf
> set.default.WORK_DIR=/home/elasticsearch/tmp
> set.default.DIRECT_SIZE=4g
>
> # Java Additional Parameters
> wrapper.java.additional.1=-Delasticsearch-service
> wrapper.java.additional.2=-Des.path.home=%ES_HOME%
> wrapper.java.additional.3=-Xss256k
> wrapper.java.additional.4=-XX:+UseParNewGC
> wrapper.java.additional.5=-XX:+UseConcMarkSweepGC
> wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75
> wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly
> wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError
> wrapper.java.additional.9=-Djava.awt.headless=true
> wrapper.java.additional.10=-XX:MinHeapFreeRatio=40
> wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70
> wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75
> wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly
> wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g
> # Initial Java Heap Size (in MB)
> wrapper.java.initmemory=%ES_HEAP_SIZE%
>
> *And these in elasticsearch.yml*
> ES_MIN_MEM: 5g
> ES_MAX_MEM: 5g
> #index.store.type=mmapfs
> index.cache.field.type: soft
> index.cache.field.max_size: 1
> index.cache.field.expire: 10m
> index.term_index_interval: 256
> index.term_index_divisor: 5
>
> *java version: *
> java version "1.7.0_51"
> Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
>
> *Elasticsearch version*
>  "version" : {
> "number" : "1.0.0",
> "build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32",
> "build_timestamp" : "2014-02-12T16:18:34Z",
> "build_snapshot" : false,
> "lucene_version" : "4.6"
>   }
>
> Using elastica PHP
>
>
> I have tried playing with values up and down to try to make it work, but
> nothing is changing.
>
> Please any help would be highly appreciated.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFcdFx98JugN7oDD0%3DBXMrY5v8-1LtBMdHeAXWJeho67Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch memory usage

2014-03-13 Thread Hicham Mallah
Hello, 

I have been using Elasticsearch on an Ubuntu server for a year now, and
everything was going great. I had an index of 150,000,000 entries of domain
names, running small queries on it, just filtering by one term, no sorting,
no wildcards, nothing. Now we have moved servers: I have a CentOS 6 server
with 32GB of RAM running Elasticsearch, but now we have 2 indices of about
150 million entries each, 32 shards each, still running the same queries on
them; nothing changed in the queries. But since we went online with the new
server, I have to restart Elasticsearch every 2 hours before the OOM killer
kills it.

What's happening is that Elasticsearch starts using memory up to 50%, then
it gradually goes back down to about 30%, then starts to climb again
gradually and never goes back down.

I have tried all the solutions I found on the net; I am a developer, not a
server admin.

*I have these setting in my service wrapper configuration*

set.default.ES_HOME=/home/elasticsearch 
set.default.ES_HEAP_SIZE=8192 
set.default.MAX_OPEN_FILES=65535 
set.default.MAX_LOCKED_MEMORY=10240 
set.default.CONF_DIR=/home/elasticsearch/conf 
set.default.WORK_DIR=/home/elasticsearch/tmp 
set.default.DIRECT_SIZE=4g 

# Java Additional Parameters 
wrapper.java.additional.1=-Delasticsearch-service 
wrapper.java.additional.2=-Des.path.home=%ES_HOME% 
wrapper.java.additional.3=-Xss256k 
wrapper.java.additional.4=-XX:+UseParNewGC 
wrapper.java.additional.5=-XX:+UseConcMarkSweepGC 
wrapper.java.additional.6=-XX:CMSInitiatingOccupancyFraction=75 
wrapper.java.additional.7=-XX:+UseCMSInitiatingOccupancyOnly 
wrapper.java.additional.8=-XX:+HeapDumpOnOutOfMemoryError 
wrapper.java.additional.9=-Djava.awt.headless=true 
wrapper.java.additional.10=-XX:MinHeapFreeRatio=40 
wrapper.java.additional.11=-XX:MaxHeapFreeRatio=70 
wrapper.java.additional.12=-XX:CMSInitiatingOccupancyFraction=75 
wrapper.java.additional.13=-XX:+UseCMSInitiatingOccupancyOnly 
wrapper.java.additional.15=-XX:MaxDirectMemorySize=4g 
# Initial Java Heap Size (in MB) 
wrapper.java.initmemory=%ES_HEAP_SIZE% 

*And these in elasticsearch.yml*
ES_MIN_MEM: 5g 
ES_MAX_MEM: 5g 
#index.store.type=mmapfs 
index.cache.field.type: soft 
index.cache.field.max_size: 1 
index.cache.field.expire: 10m 
index.term_index_interval: 256 
index.term_index_divisor: 5 

*java version: *
java version "1.7.0_51" 
Java(TM) SE Runtime Environment (build 1.7.0_51-b13) 
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode) 

*Elasticsearch version*
 "version" : { 
"number" : "1.0.0", 
"build_hash" : "a46900e9c72c0a623d71b54016357d5f94c8ea32", 
"build_timestamp" : "2014-02-12T16:18:34Z", 
"build_snapshot" : false, 
"lucene_version" : "4.6" 
  } 

Using elastica PHP 


I have tried playing with values up and down to try to make it work, but 
nothing is changing.   

Please any help would be highly appreciated. 
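For what it's worth, the settings above can be tallied against the 32 GB of
RAM. A minimal back-of-the-envelope sketch in Python; the thread count and
the JVM overhead figure are assumptions for illustration, not measured values:

```python
# Rough sanity check of the configured memory budget (illustrative only;
# the real resident size also includes page cache, Lucene mmaps, etc.).
GB = 1024  # work in MB

heap_mb = 8192                 # wrapper: set.default.ES_HEAP_SIZE=8192
direct_mb = 4 * GB             # wrapper: -XX:MaxDirectMemorySize=4g
stack_mb = 300 * 256 // 1024   # assumption: ~300 threads at -Xss256k
jvm_overhead_mb = 1024         # assumption: permgen, GC/JIT structures

total_mb = heap_mb + direct_mb + stack_mb + jvm_overhead_mb
ram_mb = 32 * GB

print("JVM budget ~%d MB of %d MB RAM" % (total_mb, ram_mb))
```

Under these assumptions the JVM alone accounts for roughly 13 GB; if the OOM
killer still fires on a 32 GB box, something outside this budget (other
processes, or how the kernel accounts the mmapped index files) is consuming
the rest, which is why the live process size Jörg asked about matters.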

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4059bf32-ae30-45fa-947c-98ef4540920a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Problem with configuring index template via file

2014-03-13 Thread Sergey Zemlyanoy
 

Dear all,

In order to overwrite some index settings I created a custom template:

{
  "logstash2" : {
    "order" : 1,
    "template" : "logstash-*",
    "settings" : {
      "index.number_of_replicas" : "0"
    }
  }
}

Template is placed in /etc/elasticsearch/config/templates/logstash2.json

[root@logstash elasticsearch]# ls -la /etc/elasticsearch/
total 56
drwxr-xr-x 3 root root 4096 Feb 26 13:26 .
drwxr-xr-x. 99 root root 12288 Feb 22 03:37 ..
drwxr-xr-x 3 root root 4096 Feb 26 11:27 config
-rw-r--r--  1 root root 12686 Feb 26 11:52 elasticsearch.yml
-rw-r--r--  1 root root 12662 Jan 20 14:24 elasticsearch.yml.rpmsave
-rw-r--r--  1 root root  1512 Jan 15 18:05 logging.yml

I have also left the elasticsearch.yml file in its default state, so all path
variables are commented out.
After a service restart I don't see any new index template. What am I doing
wrong? Is the template path wrong?

Thanks in advance

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b78efdfb-3449-432f-a23a-fdff3e3a0e1b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[Ann] Elasticsearch Image Plugin 1.1.0 released

2014-03-13 Thread Kevin Wang
Hi All,

I've released version 1.1.0 of the Elasticsearch Image Plugin.
The Image Plugin is a Content-Based Image Retrieval plugin for
Elasticsearch using LIRE (Lucene Image Retrieval). It allows users to index
images and search for similar images.

Changes in 1.1.0:

   - Added limit in image query
   - Added plugin version in es-plugin.properties
   

https://github.com/kzwang/elasticsearch-image

I've also created a demo website for this plugin
(http://demo.elasticsearch-image.com/). It has 1,000,000 images from the
MIRFLICKR-1M collection (http://press.liacs.nl/mirflickr); I haven't finished
indexing all of them yet, but it should be enough to demo the plugin.


Thanks,
Kevin

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/14c7ca2e-e6c0-4c68-bedd-02fd0c85db40%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: query filter is not working

2014-03-13 Thread Subhadip Bagui
Hi David,

I have followed the steps you suggested. The exact-string search is working
now.
But when I try the query below for string matching, it gives a null result.

This may be very basic and I may be doing something wrong. I'm a week into
Elasticsearch and still trying to understand the query DSL and text search.
Please help me clear up the concept.

create index
create mapping
create a doc
refresh
query
--
query==>
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : {
          "message.original" : "org.apache.http.protocol.immutablehttpprocessor.process"
        }
      }
    }
  }
}

mapping ==>
{
  "log-2014.03.03" : {
    "mappings" : {
      "apache-log" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date",
            "format" : "yyyy-MM-dd HH:mm:ss"
          },
          "@version" : {
            "type" : "long"
          },
          "host" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "message" : {
            "type" : "string",
            "fields" : {
              "actual" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          },
          "path" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "type" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}

doc ==>
{
  "_index" : "log-2014.03.03",
  "_type" : "apache-log",
  "_id" : "5",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "message" : "org.apache.http.protocol.ImmutableHttpProcessor.process",
    "@version" : "3",
    "@timestamp" : "2014-03-03 18:45:35",
    "type" : "apache-access",
    "host" : "cloudclient.aricent.com",
    "path" : "/opt/apache-tomcat-7.0.40/logs/aricloud/monitoring.log"
  }
}
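One thing worth checking: a term filter is never analyzed, while an analyzed
string field stores lowercased tokens (and note that the not_analyzed subfield
in the mapping above is named "actual", not "original"). A rough illustration
of the case mismatch in plain Python, mimicking only the lowercasing step of
an analyzer, not real Lucene analysis:

```python
stored_source = "org.apache.http.protocol.ImmutableHttpProcessor.process"

# Analyzed field: roughly speaking, what ends up in the index is lowercased.
indexed_token = stored_source.lower()

# A term filter compares its value verbatim against indexed tokens, so the
# original-case string no longer matches the analyzed field's token...
assert stored_source != indexed_token

# ...while the lowercased form does.
assert indexed_token == "org.apache.http.protocol.immutablehttpprocessor.process"

# A not_analyzed subfield keeps the original case, so an exact-case
# term filter works there instead.
print("indexed token:", indexed_token)
```

This is why the lowercased term in the query above can match the analyzed
field, while a term filter with the original casing (or against a subfield
name that doesn't exist in the mapping) returns nothing.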

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/74e0cc0e-25db-47cf-8524-cc1c151a7a76%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

