elasticsearch index creation and searching on japanese data

2015-02-25 Thread K.Samanth Kumar Reddy
Hi,

I have been using Elasticsearch for the last 6 months, and I am very happy 
with it so far.

But I have the following requirement:
I have Japanese data. I want to create an index on this Japanese data, and I 
also want to search the indexed Japanese data.

I did a lot of Google searching, but I have not found proper guidance or a link. 

Can anybody please help me? Thanks in advance!

Regards,
Samanth
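
For Japanese text the usual route is the analysis-kuromoji plugin. A minimal sketch of 
what that could look like, assuming the plugin is installed; the index, type and field 
names are hypothetical and nothing below comes from this thread:

# install the plugin first, picking the release that matches your ES version:
#   bin/plugin --install elasticsearch/elasticsearch-analysis-kuromoji/<version>
# then create an index whose "body" field is analyzed with the kuromoji analyzer
curl -XPUT 'http://localhost:9200/ja_docs' -d '{
  "mappings": {
    "doc": {
      "properties": {
        "body": { "type": "string", "analyzer": "kuromoji" }
      }
    }
  }
}'

# index a document and search it with an ordinary match query
curl -XPUT 'http://localhost:9200/ja_docs/doc/1' -d '{ "body": "東京は日本の首都です" }'
curl -XGET 'http://localhost:9200/ja_docs/_search' -d '{ "query": { "match": { "body": "東京" } } }'

The kuromoji analyzer segments Japanese into words at index and search time, which is 
what makes the match query above return the document.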



Re: Backup and Restore on different mapping

2015-02-25 Thread tao hiko
Thank you for the information.

On Thursday, February 26, 2015 at 2:34:45 PM UTC+7, David Pilato wrote:
>
> No. You need to reindex.
>
>
> David
>
> On 26 Feb 2015 at 08:10, tao hiko wrote:
>
> Hi,
>
> I backed up an index with the snapshot command (index_A), then created a new 
> index with a different mapping (index_B), and now I need to restore the data 
> from index_A into index_B.
>
> However, when I restored into index_B, its mapping was changed to the one from index_A.
>
> Is it possible to restore from another index and keep the mapping of the 
> destination index?
>
> If yes, how can I do it?
>
> Thank you,
> Hiko
>


Re: Backup and Restore on different mapping

2015-02-25 Thread David Pilato
No. You need to reindex.


David
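
In 1.x there is no built-in reindex API, so "you need to reindex" means re-feeding the 
documents through the normal indexing path. A rough sketch of the usual scan/scroll plus 
bulk approach, using the index names from the thread; everything else (sizes, timeouts, 
the batch file) is an assumption, and the scroll id must be copied from each previous response:

# 1. open a scan/scroll search over the restored source index
curl -XGET 'http://localhost:9200/index_A/_search?search_type=scan&scroll=5m&size=500' -d '{
  "query": { "match_all": {} }
}'

# 2. pull each batch of documents with the _scroll_id returned by the previous call
curl -XGET 'http://localhost:9200/_search/scroll?scroll=5m' -d '<scroll_id from previous response>'

# 3. rewrite each batch into bulk format and send it to the destination index;
#    because the documents go through normal indexing, index_B keeps its own mapping
curl -XPOST 'http://localhost:9200/index_B/_bulk' --data-binary @batch.json

Tools such as Logstash (elasticsearch input and output) or the bulk helpers in the 
official clients can wrap this loop for you.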

> On 26 Feb 2015 at 08:10, tao hiko wrote:
> 
> Hi,
> 
> I backed up an index with the snapshot command (index_A), then created a new 
> index with a different mapping (index_B), and now I need to restore the data 
> from index_A into index_B.
> 
> However, when I restored into index_B, its mapping was changed to the one from index_A.
> 
> Is it possible to restore from another index and keep the mapping of the 
> destination index?
> 
> If yes, how can I do it?
> 
> Thank you,
> Hiko


Re: how much memory is required (win7 64bit)?

2015-02-25 Thread jan99
Hi!
 
Now the service runs! My problem was that I was not starting the service correctly.
 
Yesterday I only called "service.bat".
 
Today: "service.bat install"!
 
The service gets installed, and I am able to start it via the services dialog.
 
Now I will move on to the next steps.
 
Regards, Jan :-)
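
In other words, the working sequence was roughly the following, run from an elevated 
command prompt (the install path is an assumption):

cd C:\elasticsearch\bin
service.bat install
service.bat start
rem service.bat manager opens a dialog where the service's JVM memory settings can be adjusted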
 
 
 

On Wednesday, 25 February 2015 at 14:43:50 UTC+1, jan99 wrote:

> Hi!
>  
> I want to use Elasticsearch for a company wiki, and my machine is Win7 64-bit 
> with 8GB of RAM.
>  
> Now I want to start it as a service. I stopped the running elasticsearch.bat 
> and called service.bat from the context menu with "Run as administrator".
>  
> I get the message that the command is unable to start because there is not 
> enough memory.
>  
> Is my machine too small, or is there a way to set a special parameter?
>  
> Regards, Jan
>  
>



Re: java.lang.OutOfMemoryError: Java heap space

2015-02-25 Thread Shohedul Hasan
I am using only one index with 5 shards. The problem is that I have tried with 
only one mail, which is 10MB, and it still causes the error. I created another 
mail which is 15MB, and that one works. Why do some particular documents 
cause the error while some bigger documents don't?

On Thursday, February 26, 2015 at 12:12:05 PM UTC+6, Mark Walkom wrote:
>
> Then you probably have too much data for your cluster.
>
> How many indices, how many shards per index, how many nodes, how many GB 
> of data is it, what ES and java version and release are you running.
>
> On 26 February 2015 at 16:09, Shohedul Hasan  > wrote:
>
>> Hi, I have 12GB Ram, also I tried 10GB heap space. but With higher Heap 
>> space the error is coming fast.
>>
>> On Tuesday, February 24, 2015 at 2:17:55 PM UTC+6, Mark Walkom wrote:
>>>
>>> How much RAM and heap have you assigned things?
>>>
>>> On 24 February 2015 at 15:14, Shohedul Hasan  
>>> wrote:
>>>
 Hi, i am new to elasticsearch. I have tried to store mails in 
 Elasticsearch. but for some mail which are big i am getting error, and my 
 code is crashing. I am getting the following error:

 [message_owned_v1][1] Index failed for [message#029f49b7deac406088b81
 ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea4_f1090e77-5f1e-444a-bfde-
 fb9ee4eb4c38]
 org.elasticsearch.index.engine.IndexFailedEngineException: 
 [message_owned_v1][1] Index failed for [message#029f49b7deac406088b81
 ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea4_f1090e77-5f1e-444a-bfde-
 fb9ee4eb4c38]
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:499)
  
 ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.index.shard.service.InternalIndexShard.ind
 ex(InternalIndexShard.java:409) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.action.index.TransportIndexAction.shardOpe
 rationOnPrimary(TransportIndexAction.java:195) 
 ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.action.support.replication.TransportShardR
 eplicationOperationAction$AsyncShardOperationAction.performOnPrimary(
 TransportShardReplicationOperationAction.java:522) 
 ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.action.support.replication.TransportShardR
 eplicationOperationAction$AsyncShardOperationAction$1.run(Tr
 ansportShardReplicationOperationAction.java:421) 
 ~[elasticsearch-1.3.2.jar:na]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  
 [na:1.8.0_31]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  
 [na:1.8.0_31]
 at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
 Caused by: java.lang.OutOfMemoryError: Java heap space



Re: packages.elasticsearch.org gone?

2015-02-25 Thread Nikolay Kolev
It's not working for me either since at least yesterday.

On Monday, February 23, 2015 at 1:04:07 AM UTC-8, Amos S wrote:
>
> Thanks everyone for testing. It appears to be up.
>
> On Monday, 23 February 2015 16:54:45 UTC+11, Amos S wrote:
>>
>> Hi,
>>
>> Today I tried to install a server using ElasticSearch's Ubuntu repo at 
>> http://packages.elasticsearch.org but get 404's.
>> Tried it multiple times.
>> Is there something wrong with this repository?
>> Is anyone aware of a good public mirror available?
>>
>> Thanks.
>>
>> --Amos
>>
>



Backup and Restore on different mapping

2015-02-25 Thread tao hiko
Hi,

I backed up an index with the snapshot command (index_A), then created a new 
index with a different mapping (index_B), and now I need to restore the data 
from index_A into index_B.

However, when I restored into index_B, its mapping was changed to the one from index_A.

Is it possible to restore from another index and keep the mapping of the 
destination index?

If yes, how can I do it?

Thank you,
Hiko



Re: ES vs. Lucene memory

2015-02-25 Thread Mark Walkom
Definitely sounds like FS cache to me.

On 26 February 2015 at 17:14, liu wei  wrote:

> Is there anyway to actually see how much memory is used by file system
> cache?
> A strange problem I see is, my machine has 32GB in total, i gave 16GB to
> ES. But very often the total OS memory usage will reach 100%, but task
> manager only shows around 12~13GB of the java exe. But when i kill the java
> process, memory usage dropped to only 20%. I can't explain why this is
> happening. Not sure if it's the file system cache.
>
> On Wednesday, November 26, 2014 at 12:51:11 PM UTC-8, Adrien Grand wrote:
>>
>> Indeed the behaviour is the same on Windows and Linux: memory that is not
>> used by processes is used by the operating system in order to cache the
>> hottest parts of the file system. The reason why the docs say that the rest
>> should be left to Lucene is that most disk accesses that elasticsearch
>> performs are done through Lucene.
>>
>> On Wed, Nov 26, 2014 at 8:44 PM, Nikolas Everett 
>> wrote:
>>
>>> I imagine all operating systems have some kind of disk caching.  I just
>>> happen to be used to linux.
>>>
>>>
>>> On Wed, Nov 26, 2014 at 2:42 PM, BradVido  wrote:
>>>
 I see, but I'm running on Windows. Is the behavior similar, or does
 this not exist on Windows?

 On Wednesday, November 26, 2014 1:01:02 PM UTC-6, Nikolas Everett wrote:
>
> Lucene runs in the same JVM as Elasticsearch but (by default) it mmaps
> files and then iterates over their content inteligently.  That means most
> of its actual storage is "off heap" (its a java buzz-phrase).  Anyway,
> Linux will serve reads from mmaped files from its page cache.  That is why
> you want to leave linux a whole bunch of unused memory.
>
> Nik
>
> On Wed, Nov 26, 2014 at 1:53 PM, BradVido  wrote:
>
>> I've read the recommendations for ES_HEAP_SIZE
>> 
>>  which
>> basically state to set -Xms and -Xmx to 50% physical RAM.
>> It says the rest should be left for Lucene to use (OS filesystem
>> caching).
>> But I'm confused on how Lucene uses that. Doesn't Lucene run in the
>> same JVM as ES? So they would share the same max heap setting of 50%.
>>
>> Can someone clear this up?
>>

>>>
>>>
>>
>>
>>
>> --
>> Adrien Grand
>>

Re: Getting MasterNotDiscoveredException multiple times

2015-02-25 Thread prachicsa
I am using unicast discovery. 
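
For reference, a unicast setup is usually pinned down with settings like these in 
elasticsearch.yml on every node; the host names and the assumption of three 
master-eligible nodes below are illustrative, not taken from this thread:

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-master-1", "es-master-2", "es-master-3"]
# quorum of master-eligible nodes: (master_eligible / 2) + 1, here assuming 3 of them
discovery.zen.minimum_master_nodes: 2

If only one node is master-eligible, losing contact with that node leaves the others 
unable to elect a master, which matches the MasterNotDiscoveredException quoted below.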

On Thursday, February 26, 2015 at 11:40:36 AM UTC+5:30, Mark Walkom wrote:
>
> Are you using multicast or unicast discovery?
>
> On 26 February 2015 at 16:03, > wrote:
>
>> Hi,
>>
>> I am getting the following exceptions multiple times. And that makes me 
>> restart all ES nodes. 
>>
>> Once I do that, everything comes back to normal. 
>>
>> org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [1m]
>> at 
>> org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(TransportMasterNodeOperationAction.java:180)
>> at 
>> org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:492)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:724)
>>
>>
>> Is there some configurations issue or so? There are total 5 nodes, out of 
>> which for node, I have marked master flag to true. 
>>


Re: Strange issue after upgrading from 1.1.0 to 1.4.1 ES Version

2015-02-25 Thread Sean Clemmer
+1

I have a similar story: After around six months using the v1.3.x series, I
upgraded from v1.3.4 to v1.4.4. I've had monitoring and metrics in place
for a while now, and compared to baseline I'm seeing occasional periods
where nodes appear to drop out of and usually back into the cluster (with
NodeNotConnectedException). During these periods the cluster status is red
on-and-off for maybe 15-30 minutes with anywhere from 30 minutes to 2 hours
in-between. The issue is worse in larger clusters with larger shard counts
(thousands, tens of thousands).

Resource utilization is still good. The number of shards (and the amount of
data) is essentially constant. I'm confident the upgrade was the only
change; I have strict controls on the clusters.
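
One thing that may be worth checking while the flapping happens is whether cluster-state 
updates are queueing up on the elected master; clusters with thousands of shards put far 
more load on it. A sketch, assuming the default HTTP port on any node:

# pending cluster-state update tasks on the elected master
curl 'http://localhost:9200/_cluster/pending_tasks?pretty'

# quick per-node view of heap and load
curl 'http://localhost:9200/_cat/nodes?v'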

On Wed, Feb 25, 2015 at 5:18 PM, sagarl  wrote:

> Hi,
>
> We recently upgraded one of our ES Clusters from ES Version 1.1.0 to 1.4.1.
>
> We have dedicated master-data-search deployment in AWS. Cluster settings
> are same for all the clusters.
>
> Strangely, only in one cluster; we are seeing that nodes are constantly
> failing to connect to Master node and rejoining back.
> *It happens all the time, even during idle period (when there are no read
> or writes)*.
>
> We keep on seeing following exception in the logs
> org.elasticsearch.transport.NodeNotConnectedException
>
> Because of this, Cluster has slowed down considerably.
>
> We use kopf plugin for monitoring and it keeps popping up message -
> "Loading cluster information is talking too long"
>
> There is not much data on individual nodes; almost 80% disk is free. CPU
> and Heap are doing fine.
>
> Only difference between this cluster and other cluster is the number of
> indices and shards. Other clusters have shards in hundreds and indices in
> double digit. While this cluster has around 5000 shards and close to 250
> indices.
>
> But we are not sure, if number of shards or indices can cause reconnection
> issues between nodes.
>
> Not sure if it's really related to 1.4.1 or something else. But in that
> case, other clusters should have been affected too.
>
> Any help will be appreciated !
>
> Thanks,
>


Re: ES vs. Lucene memory

2015-02-25 Thread liu wei
Is there any way to actually see how much memory is used by the file system 
cache?
A strange problem I see is that my machine has 32GB in total and I gave 16GB to 
ES, but very often the total OS memory usage reaches 100%, while Task Manager 
only shows around 12~13GB for the java exe. When I kill the java process, 
memory usage drops to only 20%. I can't explain why this is happening. Not 
sure if it's the file system cache.
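
One way to separate the JVM heap from what the OS reports is the node stats API, which 
breaks out heap usage and OS memory per node (a sketch, assuming the default port):

curl 'http://localhost:9200/_nodes/stats/jvm,os?pretty'

The gap between the java.exe working set in Task Manager and 100% physical memory usage 
is typically file cache (the Windows standby list), which is given back as soon as 
another process needs the memory.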

On Wednesday, November 26, 2014 at 12:51:11 PM UTC-8, Adrien Grand wrote:
>
> Indeed the behaviour is the same on Windows and Linux: memory that is not 
> used by processes is used by the operating system in order to cache the 
> hottest parts of the file system. The reason why the docs say that the rest 
> should be left to Lucene is that most disk accesses that elasticsearch 
> performs are done through Lucene.
>
> On Wed, Nov 26, 2014 at 8:44 PM, Nikolas Everett  > wrote:
>
>> I imagine all operating systems have some kind of disk caching.  I just 
>> happen to be used to linux.
>>
>>
>> On Wed, Nov 26, 2014 at 2:42 PM, BradVido > > wrote:
>>
>>> I see, but I'm running on Windows. Is the behavior similar, or does this 
>>> not exist on Windows?
>>>
>>> On Wednesday, November 26, 2014 1:01:02 PM UTC-6, Nikolas Everett wrote:

 Lucene runs in the same JVM as Elasticsearch but (by default) it mmaps 
 files and then iterates over their content inteligently.  That means most 
 of its actual storage is "off heap" (its a java buzz-phrase).  Anyway, 
 Linux will serve reads from mmaped files from its page cache.  That is why 
 you want to leave linux a whole bunch of unused memory.

 Nik

 On Wed, Nov 26, 2014 at 1:53 PM, BradVido  wrote:

> I've read the recommendations for ES_HEAP_SIZE 
> 
>  which 
> basically state to set -Xms and -Xmx to 50% physical RAM.
> It says the rest should be left for Lucene to use (OS filesystem 
> caching). 
> But I'm confused on how Lucene uses that. Doesn't Lucene run in the 
> same JVM as ES? So they would share the same max heap setting of 50%.
>
> Can someone clear this up?
>
>>
>
>
>
> -- 
> Adrien Grand
>  



Re: java.lang.OutOfMemoryError: Java heap space

2015-02-25 Thread Mark Walkom
Then you probably have too much data for your cluster.

How many indices, how many shards per index, how many nodes, and how many GB
of data is it? Which ES and Java version and release are you running?
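
Most of those numbers can be pulled straight from the cluster; a sketch, assuming the 
default HTTP port on any node:

# one line per index: health, shard counts, doc count, store size
curl 'http://localhost:9200/_cat/indices?v'

# per-node heap and memory usage
curl 'http://localhost:9200/_cat/nodes?v'

# node info, including the JVM each node runs on
curl 'http://localhost:9200/_nodes/jvm?pretty'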

On 26 February 2015 at 16:09, Shohedul Hasan  wrote:

> Hi, I have 12GB Ram, also I tried 10GB heap space. but With higher Heap
> space the error is coming fast.
>
> On Tuesday, February 24, 2015 at 2:17:55 PM UTC+6, Mark Walkom wrote:
>>
>> How much RAM and heap have you assigned things?
>>
>> On 24 February 2015 at 15:14, Shohedul Hasan 
>> wrote:
>>
>>> Hi, i am new to elasticsearch. I have tried to store mails in
>>> Elasticsearch. but for some mail which are big i am getting error, and my
>>> code is crashing. I am getting the following error:
>>>
>>> [message_owned_v1][1] Index failed for [message#029f49b7deac406088b81
>>> ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea4_f1090e77-5f1e-444a-bfde-
>>> fb9ee4eb4c38]
>>> org.elasticsearch.index.engine.IndexFailedEngineException:
>>> [message_owned_v1][1] Index failed for [message#029f49b7deac406088b81
>>> ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea4_f1090e77-5f1e-444a-bfde-
>>> fb9ee4eb4c38]
>>> at 
>>> org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:499)
>>> ~[elasticsearch-1.3.2.jar:na]
>>> at org.elasticsearch.index.shard.service.InternalIndexShard.ind
>>> ex(InternalIndexShard.java:409) ~[elasticsearch-1.3.2.jar:na]
>>> at org.elasticsearch.action.index.TransportIndexAction.shardOpe
>>> rationOnPrimary(TransportIndexAction.java:195)
>>> ~[elasticsearch-1.3.2.jar:na]
>>> at org.elasticsearch.action.support.replication.TransportShardR
>>> eplicationOperationAction$AsyncShardOperationAction.performOnPrimary(
>>> TransportShardReplicationOperationAction.java:522)
>>> ~[elasticsearch-1.3.2.jar:na]
>>> at org.elasticsearch.action.support.replication.TransportShardR
>>> eplicationOperationAction$AsyncShardOperationAction$1.run(Tr
>>> ansportShardReplicationOperationAction.java:421)
>>> ~[elasticsearch-1.3.2.jar:na]
>>> at 
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> [na:1.8.0_31]
>>> at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> [na:1.8.0_31]
>>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>


Re: Getting MasterNotDiscoveredException multiple times

2015-02-25 Thread Mark Walkom
Are you using multicast or unicast discovery?

On 26 February 2015 at 16:03,  wrote:

> Hi,
>
> I am getting the following exceptions multiple times. And that makes me
> restart all ES nodes.
>
> Once I do that, everything comes back to normal.
>
> org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [1m]
> at
> org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(TransportMasterNodeOperationAction.java:180)
> at
> org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:492)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
>
>
> Is there some configurations issue or so? There are total 5 nodes, out of
> which for node, I have marked master flag to true.
>


Re: Suddenly, my SearchRequestBuilder's setSource doesn't work!

2015-02-25 Thread tonylxc
Yeah, I did what you said and it works.

However, my query is more complex than the example I posted.

The real query is something like this. It has been working very well (i.e., 
through the setSource() method) since my previous commit about 3 months ago, 
but it just stopped working today.

Did Elasticsearch change something in the latest version?

{
  "size": "100",
  "query": {
"indices": {
  "no_match_query": "none",
  "indices": [
"some_index"
  ],
  "query": {
"nested": {
  "path": "NE",
  "query": {
"filtered": {
  "filter": {
"bool": {
  "must": [
{
  "term": {
"NE.CODE": "CODE_1"
  }
},
{
  "script": {
"script": "str = doc[fieldName].value; if (str == 
null) return false; else return str.toInteger() < fieldValue;",
"params": {
  "fieldName": "NE.VALUE",
  "fieldValue": 90
}
  }
},
{
  "range": {
"NE.DAYS": {
  "from": 98,
  "to": 104
}
  }
}
  ]
}
  }
}
  }
}
  }
}
  },
  "post_filter": {
"query": {
  "nested": {
"path": "NE",
"query": {
  "filtered": {
"filter": {
  "bool": {
"must": [
  {
"term": {
  "NE.": "CODE_2"
}
  },
  {
"range": {
  "NE.VALUE": {
"from": 70,
"to": 104
  }
}
  },
  {
"term": {
  "NE.LOCATION": "SOME_LOCATION"
}
  }
]
  }
}
  }
}
  }
}
  }
}



On Wednesday, February 25, 2015 at 9:20:31 PM UTC-8, David Pilato wrote:
>
> Try remove the "query" level.
>
> String query = "{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"";
>
> HTH
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 26 Feb 2015 at 06:09, tonylxc wrote:
>
> I'm using the latest Elasticsearch Java API 1.4.4. And I find this issue 
> annoying me really badly.
>
> Normally, if you want to do a query using JSON string, you could do this:
>
>  Client client = ...
>
>
>  String query = 
> "{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";
>
>
>  SearchRequestBuilder request = client
>  .prepareSearch("patient_centric_1")
>  .setTypes("patient_data")
>  .setSource(query);
>
>
>  logger.debug(request.toString());
>
>
>  SearchResponse response = request.execute().actionGet();
>
>
> And then parse the response.
>
> But it doesn't work for me. The response I expected is get some data with 
> _id=100, but the response I get is all the data. And the debug output 
> prints "{ }".
>
> To be precisely, it worked yesterday with version 1.4.2, but not today. 
> And even I downgraded to 1.4.2, it doesn't work!!
>
> But if I use the QueryBuilders to do the same query, it works. And the 
> debug output prints the equivalent JSON query string.
>
> Client client = ...
>
>  String query = 
> "{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";
>
>  SearchRequestBuilder request = client
>  .prepareSearch("patient_centric_1")
>  .setTypes("patient_data")
>  .setQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), 
> FilterBuilders.termFilter("_id", "100")));
>
> logger.debug(request.toString());
>
> SearchResponse response = request.execute().actionGet();
>
>
>
> Does anyone know why it happens?
>

Re: Suddenly, my SearchRequestBuilder's setSource doesn't work!

2015-02-25 Thread David Pilato
Try removing the "query" level.

String query = "{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"";

HTH
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 26 Feb 2015 at 06:09, tonylxc wrote:
> 
> I'm using the latest Elasticsearch Java API 1.4.4. And I find this issue 
> annoying me really badly.
> 
> Normally, if you want to do a query using JSON string, you could do this:
> 
>  Client client = ...
> 
> 
>  String query = 
> "{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";
> 
> 
>  SearchRequestBuilder request = client
>  .prepareSearch("patient_centric_1")
>  .setTypes("patient_data")
>  .setSource(query);
> 
> 
>  logger.debug(request.toString());
> 
> 
>  SearchResponse response = request.execute().actionGet();
> 
> 
> And then parse the response.
> 
> But it doesn't work for me. The response I expected is get some data with 
> _id=100, but the response I get is all the data. And the debug output prints 
> "{ }".
> 
> To be precisely, it worked yesterday with version 1.4.2, but not today. And 
> even I downgraded to 1.4.2, it doesn't work!!
> 
> But if I use the QueryBuilders to do the same query, it works. And the debug 
> output prints the equivalent JSON query string.
> 
> Client client = ...
> 
>  String query = 
> "{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";
> 
>  SearchRequestBuilder request = client
>  .prepareSearch("patient_centric_1")
>  .setTypes("patient_data")
>  .setQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), 
> FilterBuilders.termFilter("_id", "100")));
> 
> logger.debug(request.toString());
> 
> SearchResponse response = request.execute().actionGet();
> 
> 
> 
> Does anyone know why it happens?
> 


Re: java.lang.OutOfMemoryError: Java heap space

2015-02-25 Thread Shohedul Hasan
Hi, I have 12GB of RAM, and I also tried a 10GB heap. But with a higher heap 
the error shows up even faster.

On Tuesday, February 24, 2015 at 2:17:55 PM UTC+6, Mark Walkom wrote:
>
> How much RAM and heap have you assigned things?
>
> On 24 February 2015 at 15:14, Shohedul Hasan  > wrote:
>
>> Hi, i am new to elasticsearch. I have tried to store mails in 
>> Elasticsearch. but for some mail which are big i am getting error, and my 
>> code is crashing. I am getting the following error:
>>
>> [message_owned_v1][1] Index failed for [message#
>> 029f49b7deac406088b81ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea400
>> 00_f1090e77-5f1e-444a-bfde-fb9ee4eb4c38]
>> org.elasticsearch.index.engine.IndexFailedEngineException: 
>> [message_owned_v1][1] Index failed for [message#
>> 029f49b7deac406088b81ddeb6a4c0df_ff8081814bb4af0e014bb4af0ea400
>> 00_f1090e77-5f1e-444a-bfde-fb9ee4eb4c38]
>> at 
>> org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:499)
>>  
>> ~[elasticsearch-1.3.2.jar:na]
>> at org.elasticsearch.index.shard.service.InternalIndexShard.
>> index(InternalIndexShard.java:409) ~[elasticsearch-1.3.2.jar:na]
>> at org.elasticsearch.action.index.TransportIndexAction.
>> shardOperationOnPrimary(TransportIndexAction.java:195) 
>> ~[elasticsearch-1.3.2.jar:na]
>> at org.elasticsearch.action.support.replication.
>> TransportShardReplicationOperationAction$AsyncShardOperationAction.
>> performOnPrimary(TransportShardReplicationOperationAction.java:522) 
>> ~[elasticsearch-1.3.2.jar:na]
>> at org.elasticsearch.action.support.replication.
>> TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(
>> TransportShardReplicationOperationAction.java:421) 
>> ~[elasticsearch-1.3.2.jar:na]
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>  
>> [na:1.8.0_31]
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>  
>> [na:1.8.0_31]
>> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>


Suddenly, my SearchRequestBuilder's setSource doesn't work!

2015-02-25 Thread tonylxc
I'm using the latest Elasticsearch Java API, 1.4.4, and this issue is 
annoying me really badly.

Normally, if you want to do a query using JSON string, you could do this:

 Client client = ...


 String query = 
"{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";


 SearchRequestBuilder request = client
 .prepareSearch("patient_centric_1")
 .setTypes("patient_data")
 .setSource(query);


 logger.debug(request.toString());


 SearchResponse response = request.execute().actionGet();


And then parse the response.

But it doesn't work for me. The response I expected is the data with 
_id=100, but the response I get is all the data, and the debug output 
prints "{ }".

To be precise, it worked yesterday with version 1.4.2, but not today. Even 
after I downgraded back to 1.4.2, it doesn't work!

But if I use the QueryBuilders to do the same query, it works. And the 
debug output prints the equivalent JSON query string.

Client client = ...

 String query = 
"{\"query\":{\"filtered\":{\"filter\":{\"term\":{\"_id\":\"100\"}";

 SearchRequestBuilder request = client
 .prepareSearch("patient_centric_1")
 .setTypes("patient_data")
 .setQuery(QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), 
FilterBuilders.termFilter("_id", "100")));

logger.debug(request.toString());

SearchResponse response = request.execute().actionGet();



Does anyone know why it happens?



Getting MasterNotDiscoveredException multiple times

2015-02-25 Thread prachicsa
Hi,

I am getting the following exception multiple times, and that forces me to 
restart all ES nodes. 

Once I do that, everything comes back to normal. 

org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [1m]
at 
org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(TransportMasterNodeOperationAction.java:180)
at 
org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:492)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)


Is there some configuration issue or similar? There are 5 nodes in total, out of 
which for node, I have marked the master flag as true. 



Re: Elasticsearch Load Test goes OutOfMemory

2015-02-25 Thread Malaka Gallage
Hi Mark,

I run one random query out of four defined queries at a time.

   1. Getting the average of a field over some time period.
   2. Getting the max of a field over some time period.
   3. Getting the min of a field over some time period.
   4. Getting a percentile of a field over some time period.

Note that only one of these queries runs each second.

Thanks
Malaka 
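
As a rough illustration, query 1 can be expressed as a filtered search with an avg 
aggregation. The index, type and field names below come from the sample document further 
down; which field plays the timestamp role, and the numeric bounds, are assumptions:

curl -XGET 'http://localhost:9200/my_index/my_type/_search' -d '{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "range": { "long_1": { "gte": 96186289000000, "lte": 96186290000000 } }
      }
    }
  },
  "aggs": {
    "avg_over_period": { "avg": { "field": "long_3" } }
  }
}'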

On Thursday, February 26, 2015 at 6:26:51 AM UTC+5:30, Mark Walkom wrote:
>
> What sort of queries are you running?
>
> On 25 February 2015 at 22:08, Malaka Gallage  > wrote:
>
>> Hi Mark,
>>
>> Yes I'm using bulk API for the tests. Usually OOM error happens when the 
>> cluster has around 30 million records. Is there anyway to tune the ES 
>> cluster to perform better?
>>
>> Thanks
>> Malaka
>>
>> On Wednesday, February 25, 2015 at 12:25:52 PM UTC+5:30, Mark Walkom 
>> wrote:
>>>
>>> If you are getting queue capacity rejections then you are over working 
>>> your cluster. Are you using the bulk API for your tests?
>>> How much data is in your cluster when you get OOM?
>>>
>>> On 25 February 2015 at 16:28, Malaka Gallage  wrote:
>>>
 Hi all,

 I need some help here. I started a load test for Elasticsearch before 
 using that in production environment. I have three EC2 instances that are 
 configured in following manner which creates a Elasticsearch cluster.

 All three machines has the following same hardware configurations.

 32GB RAM
 160GB SSD hard disk
 8 core CPU

 *Machine 01*
 Elasticsearch server (16GB heap)
 Elasticsearch Java client (Who generates a continues load and report to 
 ES - 4GB heap)


 *Machine 02*
 Elasticsearch server (16GB heap)
 Elasticsearch Java client (Who generates a continues load and report to 
 ES - 4GB heap)


 *Machine 03*
 Elasticsearch server (16GB heap)
 Elasticsearch Java client (Who queries from ES continuously - 1GB heap)


 Note that the two clients together generates around 20K records per 
 second and report them as bulks with average size of 25. The other client 
 queries only one query per second. My document has the following format.

 {
 "_index": "my_index",
 "_type": "my_type",
 "_id": "7334236299916134105",
 "_score": 3.607,
 "_source": {
"long_1": 96186289301793,
"long_2": 7334236299916134000,
"string_1": "random_string",
"long_3": 96186289301793,
"string_2": "random_string",
"string_3": "random_string",
"string_4": "random_string",
"string_5": "random_string",
"long_4": 5457314198948537000
   }
 }

 The problem is, after few minutes, Elasticsearch reports errors in the 
 logs like this.

 [2015-02-24 08:03:58,070][ERROR][marvel.agent.exporter] [Gateway] 
 create failure (index:[.marvel-2015.02.24] type: [cluster_stats]): 
 RemoteTransportException[[Marvel 
 Girl][inet[/10.167.199.140:9300]][bulk/shard]]; 
 nested: EsRejectedExecutionException[rejected execution (queue 
 capacity 50) on org.elasticsearch.action.support.replication.
 TransportShardReplicationOperationAction$AsyncShardOperationAction$1@
 76dbf01];

 [2015-02-25 04:23:36,459][ERROR][marvel.agent.exporter] [Wildside] 
 create failure (index:[.marvel-2015.02.25] type: [index_stats]): 
 UnavailableShardsException[[.marvel-2015.02.25][0] [2] shardIt, [0] 
 active : Timeout waiting for [1m], request: org.elasticsearch.action.bulk.
 BulkShardRequest@2e7693b7]

 Note that this error happens for different indices and different types.

 Again after few minutes, Elasticsearch clients get 
 NoNodeAvailableException. I hope that is because Elasticsearch cluster 
 malfunctioning due to above errors. But eventually the clients get 
 "java.lang.OutOfMemoryError: GC overhead limit exceeded" error.

 I did some profiling and found out that increasing 
 the org.elasticsearch.action.index.IndexRequest instances is the cause 
 for this OutOfMemory error. I tried even with "index.store.type: memory" 
 and it seems still the Elasticsearch cluster cannot build the indices to 
 the required rate.

 Please point out any tuning parameters or any method to get rid of 
 these issues. Or please explain a different way to report and query this 
 amount of load.


 Thanks
 Malaka


Integrating custom directive into Kibana

2015-02-25 Thread Mohit Garg


I am seeking some code-design-related information. I have created a Kibana-like 
dashboard with Splunk as the backend, but it is very application-specific in 
terms of D3-based visualizations with highly interactive maps.

I found Kibana after writing my front end. I am impressed with Kibana's 
modular, well-organized UI dashboard, which I wish my web app also had. I want 
to work out a middle way in which I can add my directives to Kibana, but I did 
not find any documentation for development with Kibana. Can someone share 
their experience of extending Kibana with new directives? Are there any 
guidelines for this kind of integration?



script to do a ES rolling upgrade.

2015-02-25 Thread Kevin Burton
It seems like there should be a script out there (I'm lazy) that someone 
has already written to do a per-node restart after an upgrade.

It should block until the cluster is green again so you can move on to the 
next host.

I was going to write something in Ansible, but it seems easier to just 
find a single rolling-restart script that can do it... 
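
For reference, the per-node body of such a script usually looks something like this 
(service name and timeouts are assumptions; a sketch, not a finished script):

# stop shard reallocation before taking the node down
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'

# upgrade the package and restart this node
sudo service elasticsearch restart

# re-enable allocation once the node has rejoined
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'

# block until the cluster is green before moving on to the next host
curl 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=30m'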



Re: Word count score

2015-02-25 Thread Chris Pall
You can use multiple terms in a should clause:

https://www.found.no/play/gist/905a2a067dd79593e8f5

Ignore the comment about Stack Overflow in the saved gist; I learned about this 
way of sharing Elasticsearch queries through the example given there. Sorry 
about that.

There may be simpler ways, but this way works pretty effectively.
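
One way to make the score literally equal the number of matched words is one 
constant_score clause per word with the coord factor disabled; whether that is exactly 
what the linked gist does is an assumption, and the index name below is hypothetical:

curl -XGET 'http://localhost:9200/my_index/_search' -d '{
  "query": {
    "bool": {
      "disable_coord": true,
      "should": [
        { "constant_score": { "boost": 1, "filter": { "term": { "title": "word1" } } } },
        { "constant_score": { "boost": 1, "filter": { "term": { "title": "word2" } } } },
        { "constant_score": { "boost": 1, "filter": { "term": { "title": "word3" } } } }
      ]
    }
  }
}'

Each matching clause contributes exactly 1 to the score, so a title containing word1 and 
word3 scores 2.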


On Wednesday, February 25, 2015 at 11:33:55 AM UTC-5, Christophe Rosko 
wrote:
>
> Hi !
>
>
> I'd like to calculate the score of a query as the number of matched 
> words, as Sphinx's "word_count" does.
> For example, if the query is "word1 word2 word3",
> then for each document, if the title contains word1, its score will be 1; 
> if it contains word1 and word2, the score will be 2; and the score will be 3 
> if it contains all 3 words.
>
>
> How could I do that ?
>
> Thanks for your help !
>



elasticsearch heavy garbage collection

2015-02-25 Thread chris85lang
Hello,

We use an Elasticsearch cluster with two nodes together with Logstash and 
Kibana. When we receive a large volume of logs (>200 per second), 
Elasticsearch becomes unusably slow. 

We see a lot of garbage collection going on:
[2015-02-25 18:27:02,494][WARN ][monitor.jvm] [server1] [gc][old][564][30] 
duration [12.4s], collections [1]/[13s], total [12.4s]/[3.5m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[323.4mb]/[1.1gb]}{[survivor] [65.5mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:14,098][INFO ][monitor.jvm] [server1] [gc][old][566][31] 
duration [9.2s], collections [1]/[9.8s], total [9.2s]/[3.7m], memory 
[9.7gb]->[8.7gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[237.4mb]/[1.1gb]}{[survivor] [65.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:27,717][WARN ][monitor.jvm] [server1] [gc][old][568][32] 
duration [12s], collections [1]/[12.6s], total [12s]/[3.9m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[275.2mb]/[1.1gb]}{[survivor] [73.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:39,518][INFO ][monitor.jvm] [server1] [gc][old][570][33] 
duration [9.5s], collections [1]/[9.9s], total [9.5s]/[4.1m], memory 
[9.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[341mb]/[1.1gb]}{[survivor] [99.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:49,549][INFO ][monitor.jvm] [server1] [gc][old][571][34] 
duration [9s], collections [1]/[10s], total [9s]/[4.2m], memory 
[8.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[341mb]->[364.5mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:59,113][INFO ][monitor.jvm] [server1] [gc][old][572][35] 
duration [8.5s], collections [1]/[9.5s], total [8.5s]/[4.4m], memory 
[8.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[364.5mb]->[379.8mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:08,713][INFO ][monitor.jvm] [server1] [gc][old][574][36] 
duration [8.5s], collections [1]/[8.5s], total [8.5s]/[4.5m], memory 
[9.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[420.4mb]/[1.1gb]}{[survivor] [146.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:18,486][INFO ][monitor.jvm] [server1] [gc][old][576][37] 
duration [8.3s], collections [1]/[8.7s], total [8.3s]/[4.6m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[365.2mb]/[1.1gb]}{[survivor] [66.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:28,169][INFO ][monitor.jvm] [server1] [gc][old][578][38] 
duration [8.3s], collections [1]/[8.6s], total [8.3s]/[4.8m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[389mb]/[1.1gb]}{[survivor] [88.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:38,022][INFO ][monitor.jvm] [server1] [gc][old][580][39] 
duration [8.6s], collections [1]/[8.8s], total [8.6s]/[4.9m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[387.2mb]/[1.1gb]}{[survivor] [79mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:48,061][INFO ][monitor.jvm] [server1] [gc][old][582][40] 
duration [8.2s], collections [1]/[9s], total [8.2s]/[5.1m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[325.4mb]/[1.1gb]}{[survivor] [14.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}


In htop we see that it is mainly one core that is busy all the time:




Any ideas?

Cheers,
Chris
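
When the old generation stays pinned at its maximum like this, it is worth checking how 
much of the heap is held by field data, which Kibana-style facets and aggregations over 
analyzed fields tend to accumulate. A sketch; both cat endpoints exist in recent 1.x releases:

# field data memory per field and node
curl 'http://localhost:9200/_cat/fielddata?v'

# heap usage per node
curl 'http://localhost:9200/_cat/nodes?v'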



Re: ElasticSearch synchronization with OrientDB

2015-02-25 Thread Mark Walkom
Official clients are listed here - http://www.elasticsearch.org/guide/
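
For completeness, the jdbc-river option discussed further down is registered with a 
document like the following. This is only a sketch: the OrientDB JDBC URL, credentials 
and SQL are assumptions, and rivers are being deprecated, as Mark notes in the quoted reply:

curl -XPUT 'http://localhost:9200/_river/orientdb_river/_meta' -d '{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:orient:remote:localhost/mydb",
    "user": "admin",
    "password": "admin",
    "sql": "SELECT FROM V"
  }
}'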

On 26 February 2015 at 12:09, Michalis Michaelidis 
wrote:

> Thank you,
>
> I have that in mind. What  do you mean with official clients? I don't
> think it is a good idea to hit both orientdb and elasticsearch when I am
> inserting something for example..
>
> On Tuesday, 24 February 2015 at 3:16:20 PM UTC-5, Mark Walkom wrote:
>>
>> You can also DIY and leverage the official clients.
>>
>> Be aware that in the long run that rivers are being deprecated.
>>
>> On 25 February 2015 at 06:25, Michalis Michaelidis 
>> wrote:
>>
>>> Hello,
>>>
>>> I would like some guidelines about how to approach ElasticSearch
>>> synchronization with OrientDB:
>>>
>>> Doing some search I have found those approaches:
>>>
>>> 1) Dedicated River plugin - Like this one: https://github.com/sksamuel/
>>> elasticsearch-river-neo4j
>>>
>>> In this google group discussion (https://groups.google.com/d/
>>> msg/orient-database/YAesdS2qAYc/yCp7v9pF6tcJ) someone said that it
>>> could be done using hooks api of OrientDB that is more efficient than just
>>> pooling. Any thoughts on that?
>>>
>>> 2) JDBC River plugin ( https://github.com/jprante/
>>> elasticsearch-river-jdbc) - I could just use this one since OrientDB is
>>> providing a JDBC driver. Could there be compatibility problems?
>>>
>>> A person suggested Single point of failure problems with river plugins
>>> and I read that river plugins could be deprecated (?) in the future (
>>> http://stackoverflow.com/questions/22237111/preferred-
>>> method-of-indexing-bulk-data-into-elasticsearch). I don't know if SPOF
>>> is actually a reality as I see river plugins used for many types of
>>> resources that seem very decoupled.
>>>
>>> 3) Someone in twitter suggested embedding Elastic Search in OrientDB as
>>> it was done with Lucene https://github.com/orientechnologies/orientdb-
>>> lucene. Could that have scalability problems for either OrientDB or
>>> Elastic Search. I guess this cannot take the full advantage of Elastic
>>> Search and it is using that for querying only..
>>>
>>> Please guide me if I need to implement myself something or I could use
>>> existing tools and what are the tradeoffs of the previous or other
>>> approaches. I could have the support from Orient Technologies if I need it.
>>>
>>>
>>> Thank you,
>>>
>>> Michail
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/66a17298-5c32-46a4-80a0-0a1ccc5fb465%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e0b55ff9-0c3c-4473-b701-26de2ed3b85c%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9zDaL-_dmQVOxEjojnkB_6VHOW_pWO3mK6i-XCoLscyQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Strange issue after upgrading from 1.1.0 to 1.4.1 ES Version

2015-02-25 Thread sagarl
Hi,

We recently upgraded one of our ES Clusters from ES Version 1.1.0 to 1.4.1.

We have dedicated master-data-search deployment in AWS. Cluster settings 
are same for all the clusters.

Strangely, only in one cluster we are seeing that nodes are constantly 
failing to connect to the master node and then rejoining.
*It happens all the time, even during idle periods (when there are no reads 
or writes).*

We keep on seeing following exception in the logs 
org.elasticsearch.transport.NodeNotConnectedException

Because of this, Cluster has slowed down considerably. 

We use the kopf plugin for monitoring and it keeps popping up the message - 
"Loading cluster information is taking too long"

There is not much data on individual nodes; almost 80% disk is free. CPU 
and Heap are doing fine. 

The only difference between this cluster and the other clusters is the number of 
indices and shards. The other clusters have shards in the hundreds and indices in 
the double digits, while this cluster has around 5000 shards and close to 250 
indices.

But we are not sure, if number of shards or indices can cause reconnection 
issues between nodes.

Not sure if it's really related to 1.4.1 or something else. But in that 
case, other clusters should have been affected too.

Any help will be appreciated !

Thanks,

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/12e9b4ab-7600-4d22-a347-c03edaebe4f3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ElasticSearch synchronization with OrientDB

2015-02-25 Thread Michalis Michaelidis
Thank you,

I have that in mind. What do you mean by official clients? I don't think 
it is a good idea to hit both OrientDB and Elasticsearch when I am 
inserting something, for example.

Τη Τρίτη, 24 Φεβρουαρίου 2015 - 3:16:20 μ.μ. UTC-5, ο χρήστης Mark Walkom 
έγραψε:
>
> You can also DIY and leverage the official clients.
>
> Be aware that in the long run that rivers are being deprecated.
>
> On 25 February 2015 at 06:25, Michalis Michaelidis  > wrote:
>
>> Hello,
>>
>> I would like some guidelines about how to approach ElasticSearch 
>> synchronization with OrientDB:
>>
>> Doing some search I have found those approaches:
>>
>> 1) Dedicated River plugin - Like this one: 
>> https://github.com/sksamuel/elasticsearch-river-neo4j 
>>
>> In this google group discussion (
>> https://groups.google.com/d/msg/orient-database/YAesdS2qAYc/yCp7v9pF6tcJ) 
>> someone said that it could be done using hooks api of OrientDB that is more 
>> efficient than just pooling. Any thoughts on that? 
>>
>> 2) JDBC River plugin ( 
>> https://github.com/jprante/elasticsearch-river-jdbc) - I could just use 
>> this one since OrientDB is providing a JDBC driver. Could there be 
>> compatibility problems?
>>
>> A person suggested Single point of failure problems with river plugins 
>> and I read that river plugins could be deprecated (?) in the future (
>> http://stackoverflow.com/questions/22237111/preferred-method-of-indexing-bulk-data-into-elasticsearch).
>>  
>> I don't know if SPOF is actually a reality as I see river plugins used for 
>> many types of resources that seem very decoupled.
>>
>> 3) Someone in twitter suggested embedding Elastic Search in OrientDB as 
>> it was done with Lucene 
>> https://github.com/orientechnologies/orientdb-lucene. Could that have 
>> scalability problems for either OrientDB or Elastic Search. I guess this 
>> cannot take the full advantage of Elastic Search and it is using that for 
>> querying only..
>>
>> Please guide me if I need to implement myself something or I could use 
>> existing tools and what are the tradeoffs of the previous or other 
>> approaches. I could have the support from Orient Technologies if I need it.
>>
>>
>> Thank you,
>>
>> Michail
>>
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/66a17298-5c32-46a4-80a0-0a1ccc5fb465%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e0b55ff9-0c3c-4473-b701-26de2ed3b85c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Consistent query score of path_hierarchy tokenized field

2015-02-25 Thread ziggy
My goal is to be able to search ns1.example.com and get results for 
ns1.example.com, ns2.example.com, subdomain.example.com, and example.com.

I indexed the field with a path_hierarchy tokenizer with token order 
reversed and the lowercase token filter for the domain names.

I have tried to limit the matches with min_score, and the _score is not 
consistent from one search to another.

I think what I need is a score based on the parts of the path that match. A 
query for ns1.example.com must score ns1.example.com as 3, example.com as 
2, ns2.example.com as 2, example2.com as 1, etc.

My idea of how to achieve this goal is to use a match query and then 
rescore the search with a script.
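
For reference, this is the shape of query I am experimenting with (index, type 
and field names are just placeholders for my real mapping, and the three term 
clauses mirror the tokens the reversed path_hierarchy tokenizer emits for 
"ns1.example.com"):

# sketch only - one constant_score clause per path part of the search input
curl -XPOST 'localhost:9200/domains/record/_search' -d '{
  "query": {
    "bool": {
      "disable_coord": true,
      "should": [
        { "constant_score": { "filter": { "term": { "domain": "ns1.example.com" } } } },
        { "constant_score": { "filter": { "term": { "domain": "example.com" } } } },
        { "constant_score": { "filter": { "term": { "domain": "com" } } } }
      ]
    }
  }
}'

With this the ordering follows the number of matching path parts and is no 
longer affected by IDF, but the absolute _score is still scaled by query 
normalization, so a min_score cutoff would have to be calibrated rather than 
treated as an exact match count.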

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/afde9f8b-587e-4bf2-98c9-dbf728475bdd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Sporadic node disconnected issues

2015-02-25 Thread Mark Walkom
You may find it's GC related, so check your logs on the nodes.
Take a look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
for some timeout options around discovery.
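
For example, the zen fault-detection knobs look like this in elasticsearch.yml 
(the values here are only illustrative, defaults are noted in the comments, and 
they need to be set on every node):

discovery.zen.fd.ping_interval: 5s   # how often nodes ping each other (default 1s)
discovery.zen.fd.ping_timeout: 60s   # how long to wait for a ping reply (default 30s)
discovery.zen.fd.ping_retries: 6     # failed pings before a node is dropped (default 3)

Keep in mind that raising these only papers over long GC pauses, it doesn't fix 
them.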

On 25 February 2015 at 16:45, Darshat Shah  wrote:

> Hi
> I have an ES cluster with 27 nodes (3 master, 24 data). At times I see a
> burst of nodes leaving and rejoining within couple of minutes. Each node
> has 16GB allocated for the JVM heap and are not close to touching those
> limits. There are no memory issues, and there is no search/index
> operations going on when this occurred. But there are quite a few
> nodedisconnected messages that suddenly appear on the master. It doesn’t
> seem to happen all the time but in bursts.
>
>
>
> During this time, on the master, I see NodeDisconnectedException for a
> node. On that node, I see messages that say “master left (reason =
> transport disconnected)”. I don't think its split-brain though with the
> number of messages in the logs its hard to figure out. Also min number of
> master setting is set to 2. The outcome is that it causes a whole lot of
> shards to shift around.
>
>
> I'd like to involve our network specialists to troubleshoot
> connectivity but not sure what to ask them to look for. In what scenarios
> does ElasticSearch reports node disconnected? Should they be looking at TCP
> connectivity, run some ping tests, etc.?
>
> Also are there timeout values that can be configured so we can reduce
> false positives for node disconnected events?
>
>
> Thanks
>
> Darshat
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e7ad5de3-0e9b-4496-9c96-5162b784bac1%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X8GY035smZmFognwxzaaOzWkSgR0aw%3DqAoudC2a0OQ%2BUg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch Load Test goes OutOfMemory

2015-02-25 Thread Mark Walkom
What sort of queries are you running?

On 25 February 2015 at 22:08, Malaka Gallage  wrote:

> Hi Mark,
>
> Yes I'm using bulk API for the tests. Usually OOM error happens when the
> cluster has around 30 million records. Is there anyway to tune the ES
> cluster to perform better?
>
> Thanks
> Malaka
>
> On Wednesday, February 25, 2015 at 12:25:52 PM UTC+5:30, Mark Walkom wrote:
>>
>> If you are getting queue capacity rejections then you are over working
>> your cluster. Are you using the bulk API for your tests?
>> How much data is in your cluster when you get OOM?
>>
>> On 25 February 2015 at 16:28, Malaka Gallage  wrote:
>>
>>> Hi all,
>>>
>>> I need some help here. I started a load test for Elasticsearch before
>>> using that in production environment. I have three EC2 instances that are
>>> configured in following manner which creates a Elasticsearch cluster.
>>>
>>> All three machines has the following same hardware configurations.
>>>
>>> 32GB RAM
>>> 160GB SSD hard disk
>>> 8 core CPU
>>>
>>> *Machine 01*
>>> Elasticsearch server (16GB heap)
>>> Elasticsearch Java client (Who generates a continues load and report to
>>> ES - 4GB heap)
>>>
>>>
>>> *Machine 02*
>>> Elasticsearch server (16GB heap)
>>> Elasticsearch Java client (Who generates a continues load and report to
>>> ES - 4GB heap)
>>>
>>>
>>> *Machine 03*
>>> Elasticsearch server (16GB heap)
>>> Elasticsearch Java client (Who queries from ES continuously - 1GB heap)
>>>
>>>
>>> Note that the two clients together generates around 20K records per
>>> second and report them as bulks with average size of 25. The other client
>>> queries only one query per second. My document has the following format.
>>>
>>> {
>>> "_index": "my_index",
>>> "_type": "my_type",
>>> "_id": "7334236299916134105",
>>> "_score": 3.607,
>>> "_source": {
>>>"long_1": 96186289301793,
>>>"long_2": 7334236299916134000,
>>>"string_1": "random_string",
>>>"long_3": 96186289301793,
>>>"string_2": "random_string",
>>>"string_3": "random_string",
>>>"string_4": "random_string",
>>>"string_5": "random_string",
>>>"long_4": 5457314198948537000
>>>   }
>>> }
>>>
>>> The problem is, after few minutes, Elasticsearch reports errors in the
>>> logs like this.
>>>
>>> [2015-02-24 08:03:58,070][ERROR][marvel.agent.exporter] [Gateway]
>>> create failure (index:[.marvel-2015.02.24] type: [cluster_stats]):
>>> RemoteTransportException[[Marvel 
>>> Girl][inet[/10.167.199.140:9300]][bulk/shard]];
>>> nested: EsRejectedExecutionException[rejected execution (queue capacity
>>> 50) on org.elasticsearch.action.support.replication.
>>> TransportShardReplicationOperationAction$AsyncShardOperationAction$1@
>>> 76dbf01];
>>>
>>> [2015-02-25 04:23:36,459][ERROR][marvel.agent.exporter] [Wildside]
>>> create failure (index:[.marvel-2015.02.25] type: [index_stats]):
>>> UnavailableShardsException[[.marvel-2015.02.25][0] [2] shardIt, [0]
>>> active : Timeout waiting for [1m], request: org.elasticsearch.action.bulk.
>>> BulkShardRequest@2e7693b7]
>>>
>>> Note that this error happens for different indices and different types.
>>>
>>> Again after few minutes, Elasticsearch clients get
>>> NoNodeAvailableException. I hope that is because Elasticsearch cluster
>>> malfunctioning due to above errors. But eventually the clients get
>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded" error.
>>>
>>> I did some profiling and found out that increasing
>>> the org.elasticsearch.action.index.IndexRequest instances is the cause
>>> for this OutOfMemory error. I tried even with "index.store.type: memory"
>>> and it seems still the Elasticsearch cluster cannot build the indices to
>>> the required rate.
>>>
>>> Please point out any tuning parameters or any method to get rid of these
>>> issues. Or please explain a different way to report and query this amount
>>> of load.
>>>
>>>
>>> Thanks
>>> Malaka
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/35a29ca5-02f6-4fe9-8600-2cdb91c519cf%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> 

Re: EC2 cluster storage question

2015-02-25 Thread Chris Pall
Supposedly striping EBS volumes with RAID0 helps to mitigate some I/O issues. You 
could go that route and avoid having to refresh, assuming the performance 
would be acceptable.
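
Roughly like this (device names and mount point are just an example):

# stripe two EBS volumes into one RAID0 array and put the ES data path on it
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdf /dev/xvdg
sudo mkfs.ext4 /dev/md0
sudo mount /dev/md0 /var/lib/elasticsearch

You still give up some raw speed compared to instance-store SSD, but the data 
survives a stop/start of the instance.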


On Tuesday, February 24, 2015 at 10:27:16 AM UTC-5, Paul Sanwald wrote:
>
> More detail below, but the the crux of my question is: What's the best way 
> to spin up/down "on demand" an ES cluster on EC2 that uses ephemeral local 
> storage? Essentially, I want to run the cluster during the week and spin 
> down over the weekend. Other than brute force snapshot/restore, is there 
> any more creative way to do this, like mirroring local storage to EBS or 
> similar?
>
> Some more background:
> We run multiple ES clusters on ec2 (we use opsworks for deployment 
> automation). We started out several years back using EBS because we didn't 
> know any better, and have switched over to using SSD based local storage. 
> The performance improvements have been unbelievable.
>
> Obviously, using ephemeral local storage comes at a cost: we use 
> replication, take frequent snapshots, and store all source data to mitigate 
> the risk of data loss. the other thing that local storage means is that our 
> cluster essentially needs to be up and running 24/7, which I think is a 
> fairly normal.
>
> I'm investigating some ways to save on cost for a large-ish cluster, and 
> one of the things is that we don't need it to necessarily run 24/7; 
> specifically, we want to turn the cluster off over the weekend. That said, 
> restoring terabytes from snapshot doesn't seem like a very efficient way to 
> do this, so I want to consider options, and was hoping the community could 
> help me in identifying options that I am missing.
>
> thanks in advance for any thoughts you may have.
>
> --paul
>
> *Important Notice:*  The information contained in or attached to this 
> email message is confidential and proprietary information of RedOwl 
> Analytics, Inc., and by opening this email or any attachment the recipient 
> agrees to keep such information strictly confidential and not to use or 
> disclose the information other than as expressly authorized by RedOwl 
> Analytics, Inc.  If you are not the intended recipient, please be aware 
> that any use, printing, copying, disclosure, dissemination, or the taking 
> of any act in reliance on this communication or the information contained 
> herein is strictly prohibited. If you think that you have received this 
> email message in error, please delete it and notify the sender.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6513247e-5c4e-4fd6-9486-ba8b2245c575%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Fwd: FW: Java heap memory is low but process memory is still high

2015-02-25 Thread liu wei
Hi everyone,



Just wondering if anyone has encountered similar things before I dig more
into Java memory management. We found that the machines are constantly at high
memory usage although the “reported” memory usage from Elasticsearch is
low. A couple of points:

1.   Java Heap Memory(top left chart) drops under 2G, which I guess is
mainly because of GC on old pool (bottom right chart)

2.   FieldData usage is low (top right)

3.   OS memory never drops (bottom left chart). When I login to the
machines, the Java process is still taking 12~13GB memory.

Anyone had similar problems before?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAFzuQNQpLUmvehdgSbgj934cwvH-YtyTcvMtTgG9vj%3D9nUW%2BKg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Using a percolator for a huge blacklist

2015-02-25 Thread alexandre . klein
Hello,

I am trying out and testing different usages of the percolator.

I was wondering: I am currently storing in Elasticsearch a lot of 
information from Twitter. And there is a lot of... inappropriate content.

Is the percolator a good fit for storing a query containing a list of violent 
expressions or domains to match?

It could be a huge list of words (10K entries, at least); I don't think a 
match query with OR clauses would fit this usage...
Do you have an idea of how I could implement it? Or is it a bad idea?
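
To make the question concrete, this is roughly what I picture (index, type and 
field names are only placeholders), one stored query holding the whole list:

# register the blacklist as a single percolator query
curl -XPUT 'localhost:9200/tweets/.percolator/blacklist' -d '{
  "query": {
    "terms": { "text": ["badword1", "badword2", "badword3"] }
  }
}'

# before indexing a tweet, ask which registered queries it matches
curl -XGET 'localhost:9200/tweets/tweet/_percolate' -d '{
  "doc": { "text": "some incoming tweet text" }
}'

but I don't know how a terms query with ~10K entries behaves inside the 
percolator, hence the question.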

Alex

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9132c2b7-0bd8-416f-a1d6-368314acf7ad%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: EC2 cluster storage question

2015-02-25 Thread Norberto Meijome
Yes, of course EBS all the time would help for storage, but it can't
compete with local ssd in speed.
 On 25/02/2015 9:31 pm, "Mark Walkom"  wrote:

> Fair point. The rsync option could work, but then why not just use EBS and
> then shut the nodes down to save the rsync work?
> Tagging nodes probably won't help in this instance.
>
> Basically if you want to shut everything down you need to go through
> recovery, and depending on how long that takes it may not be worth the
> cost. This is something you need to test.
>
> On 25 February 2015 at 18:14, Norberto Meijome  wrote:
>
>> OP points out he is using ephemeral storage...hence shutdown will destroy
>> the data...but it can be rsynced to EBS as part of the shutdown
>> process...and then repeat in reverse when starting things up again...
>>
>> Though I guess you could let ES take care of it by tagging nodes
>> accordingly and updating the index settings .(hope it makes sense...)
>> On 25/02/2015 4:58 pm, "Mark Walkom"  wrote:
>>
>>> Why not just shut the cluster down, disable allocation first and then
>>> just gracefully power things off?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_D15Aq62TzhbTN8kWKDPGpsuoYP2e2RJta9N5_tu4_ZA%40mail.gmail.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CACj2-4Jafq4Fqf2GOsdK5OCcmdk3AtW3B2%3DjJYHTgCjyUzOQWg%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X-oqv%3DFHF3%3DoULiWy_rJBf4PSi3AjgbDE_BtBwLP9Xt_w%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CACj2-4K2q1h3roWtrMxAGfxZoUGCBZfq5RH52Mh_UPkxSEzTzg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


From ES aggregation result to a List of Maps (Java)

2015-02-25 Thread Sven Jörns
Hi,

what is the best way to convert an ES Aggregation result to a List of Maps 
in Java using Java API?

I want to display the results in a pivot table, or as a first step just 
print the rows to the console.

Thanks
Sven

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5c172b60-602c-44d5-9ae7-e57949b01bf1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Attribute awareness during recovery?

2015-02-25 Thread Mark Walkom
Define closer though, someone might use zone awareness on a rack or room
level, these are just abstracted concepts/tags after all.

On 25 February 2015 at 23:35, Neil Andrassy (The Filter) <
neil.andra...@thefilter.com> wrote:

> That seems like a shame given the immutability of the underlying blocks.
> Sure, the primary needs to identify the specific set of blocks to be
> replicated but I don't see why the block data itself couldn't be pulled
> from a "closer" replica if it exists there?
>
> On 25 February 2015 at 06:56, Mark Walkom  wrote:
>
>> Recovery always needs to come from the primary, otherwise you cannot be
>> certain your dataset is valid.
>>
>> On 24 February 2015 at 21:18, Neil Andrassy 
>> wrote:
>>
>>> Hi,
>>>
>>> Does anybody know if there is a way, when a node fails and shards need
>>> to be recovered, of configuring ElasticSearch to prefer to recover from a
>>> node sharing the same awareness attributes (e.g. same rack, same zone etc.
>>> a bit like the "automatic preference when searching / getingedit" reference
>>> in the user guide). We're typically seeing a lot of traffic between "zones"
>>> when a failure occurs and I wondered why this was the case / if it was
>>> avoidable. Maybe I'm missing something and recovery always needs to
>>> replicate from the active primary?
>>>
>>> Thanks in advance for any guidance or information,
>>>
>>> N
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/dfa05e57-15a8-4509-a45f-62db4d94867a%40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "elasticsearch" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/elasticsearch/txa_4eosq0k/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ihD_AkLgoF5FjAJbju9dnCAwAYhoZbns-gf%2BWYXWQMQ%40mail.gmail.com
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Neil Andrassy  |  CTO  |  The Filter
> phone  |  +44 (0)1225 588 004
> skype | andrassynp
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CABpTWLObLq3tJuE48p0OfdwoXB6KoAHZvKbi6x%2B98SjuPfrP7A%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ymUMvG0VXH29_q-_LN0fM0cV1hgxpvVSZEAS10XAg1Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Some install bugs - probably there is solution, not yet found by me

2015-02-25 Thread Mark Walkom
What is wrong with the ones included in the deb/rpms?

On 26 February 2015 at 01:47, Peter Takac  wrote:

> Hi, bellow is service script, it has some small bugs but works fine for me:
>
>
> #!/bin/sh
> # ElastiSearch daemon start/stop script.
> # Comments to support chkconfig on RedHat Linux
> # chkconfig: 2345 64 36
> #title : elasticsearch
> #description   : This script is used such a damon/service start/stop
> script.
> #author: Peter TAKAC / AssmodeuS
> #date  : 20150210
> #version   : 0.15
> #usage : service elasticsearch
> #notes : service start/stop script, also check the conf path. if
> proces with the same path runs, it will not start it
> #bash_version  : 3.2.51(1)
> #
>
> ### BEGIN INIT INFO
> # Provides: ElasticSearch
> # Required-Start: $network $named
> # Required-Stop: $network $named
> # Should-Start: ypbind nscd ldap ntpd xntpd
> # Default-Start: 2 3 4 5
> # Default-Stop: 0 1 6
> # Short-Description: This service manages the ElasticSearch daemon
> # Description: Elasticsearch is a very scalable, schema-free and
> high-performance search solution supporting multi-tenancy and near realtime
> search.
> ### END INIT INFO
>
> # Basic variables and settings
> es_home="/SIP/elasticsearch/es-int-inst"
> es_bindir="$es_home/bin"
> es_exec="elasticsearch"
> es_prog="$es_bindir/$es_exec"
> es_pidpath="/var/run/elasticsearch"
> es_pidfile="$es_pidpath/${es_exec}.pid"
> es_lockdir="/var/lock/subsys"
> es_lockfile="$es_lockdir/$es_exec"
> es_user=elasticsearch
> es_group=elasticsearch
> # Heap size defaults to 256m min, 1g max
> # Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
> #ES_HEAP_SIZE=2g
> # Heap new generation
> #ES_HEAP_NEWSIZE=
> # max direct memory
> #ES_DIRECT_SIZE=
> # Additional Java OPTS
> #ES_JAVA_OPTS=
> # Maximum number of open files
> MAX_OPEN_FILES=65535
> # Maximum amount of locked memory
> #MAX_LOCKED_MEMORY=
>
> #More application related settings
> # Elasticsearch log directory
> es_logdir="$es_home/logs/$es_exec"
> # Elasticsearch data directory
> es_datadir="/SIP/elasticsearch/es-int-data"
> # Elasticsearch work directory
> es_workdir="$es_home/work"
> # Elasticsearch configuration directory
> #es_confdir=/etc/$es_exec
> es_confdir="$es_home/config"
> # Elasticsearch configuration file (elasticsearch.yml)
> es_confile="$es_confdir/elasticsearch.yml"
> # Maximum number of VMA (Virtual Memory Areas) a process can own
> MAX_MAP_COUNT=262144
> es_daemon="$es_bindir/$es_exec"
> es_daemon_opts="-d -p $es_pidfile -Des.default.path.home=$es_home
> -Des.default.path.logs=$es_logdir -Des.default.path.data=$es_datadir
> -Des.default.path.work=$es_workdir -Des.default.path.conf=$es_confdir"
>
> PATH=$PATH:$es_home
> export PATH
> export JAVA_HOME
> export ES_HEAP_SIZE
> export ES_HEAP_NEWSIZE
> export ES_DIRECT_SIZE
> export ES_JAVA_OPTS
>
> # Default value, in seconds, after which the script should timeout waiting
> # for server start.
> # 0 means do not wait at all
> # Negative numbers mean to wait indefinitely
> service_startup_timeout=900
>
> #
> # Use LSB init script functions for printing messages, if possible
> #
> lsb_functions="/lib/lsb/init-functions"
> if test -f $lsb_functions ; then
>   . $lsb_functions
> else
>   log_success_msg()
>   {
> echo " SUCCESS! $@"
>   }
>   log_failure_msg()
>   {
> echo " ERROR! $@"
>   }
> fi
>
> mode=$1 # start or stop
>
> [ $# -ge 1 ] && shift
>
> case `echo "testing\c"`,`echo -n testing` in
>   *c*,-n*) echo_n= echo_c= ;;
>   *c*,*)   echo_n=-n echo_c= ;;
>   *) echo_n= echo_c='\c' ;;
> esac
>
> wait_for_pid () {
>   verb="$1"# created | removed
>   pid="$2"# process ID of the program operating
> on the pid-file
>   pid_file_path="$3" # path to the PID file.
>
>   i=0
>   avoid_race_condition="by checking again"
>
>   while test $i -ne $service_startup_timeout ; do
>
> case "$verb" in
>   'created')
> # wait for a PID-file to pop into existence.
> test -s "$pid_file_path" && i='' && break
>   ;;
>   'removed')
> # wait for this PID-file to disappear
> test ! -s "$pid_file_path" && i='' && break
>   ;;
>   *)
> echo "wait_for_pid () usage: wait_for_pid created|removed pid
> pid_file_path"
> exit 1
>   ;;
> esac
>
> # if server is not running, then pid-file will never be updated
> if test -n "$pid"; then
>   if kill -0 "$pid" 2>/dev/null; then
> : # the server still runs
>   else
> # The server may have exited between the last pid-file check and
> now.
> if test -n "$avoid_race_condition"; then
>   avoid_race_condition=""
>   continue# Check again.
> fi
> # there is nothing that will affect the file.
> log_failure_msg "The server quit without updating PID file
> ($pid_file_path)."
> return 1# not waiting any more.
>

Re: Easy ELK Stack setup

2015-02-25 Thread Mark Walkom
If you don't want to waste time then automation is the key.

On 26 February 2015 at 02:43, Thomas Güttler  wrote:

> Hi,
>
> I want to setup an ELK stack without wasting time. That's why I ask here
> before starting.
>
> My environment is simple: all traffic comes in from localhost. There is
> only one server for the ELK setup.
>
> But there will be several ELK stacks running in the future. But again each
> traffic will come in only from localhost.
> The systems will run isolated.
>
> I see these solutions:
>
>   - take a docker container
>
>   - do it by hand (RPM install)
>
>   - use Chef/Puppet. But up to now we don't use any of those tools.
>
>   - any other idea?
>
> What do you think?
>
>
> Regards,
>   Thomas Güttler
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/b82b581c-cb25-47f3-83f2-7f6877c21ec4%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X8A8gbymoyEhZ%3DDQrC3-fgS0zxY03gafv7b3LdHFSuT%2BA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Kibana4 Issue: How to adjust interval value in the case of date histogram aggregation

2015-02-25 Thread cong yue
Hi
 I found that in Kibana4 the interval value of a date histogram aggregation 
cannot be adjusted as in Kibana3. Now my visualization object JSON looks like
--
visState
{
  "aggs": [
    {
      "id": "1",
      "params": {
        "field": "cacheCode"
      },
      "schema": "metric",
      "type": "avg"
    },
    {
      "id": "2",
      "params": {
        "extended_bounds": {},
        "field": "accessTime",
        "interval": "10minutes",
        "min_doc_count": 1
      },
      "schema": "segment",
      "type": "date_histogram"
    }
  ],
  "listeners": {},
  "params": {
    "addLegend": true,
    "addTooltip": true,
    "defaultYExtents": false,
    "shareYAxis": true
  },
  "type": "line"
}


kibanaSavedObjectMeta.searchSourceJSON
{
  "query": {
    "query_string": {
      "analyze_wildcard": true,
      "query": "*"
    }
  },
  "filter": []
}

--
I want to filter my query like
---
# Cache hit ratio for timeline base
GET /ats/_search
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "range": {
          "accessTime": { "gte": "now-1d/d" }
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "accessTimes": {
      "date_histogram": {
        "field": "accessTime",
        "interval": "10m"
      },
      "aggs": {
        "hit_ratio": {
          "avg": { "field": "cacheCode" }
        }
      }
    }
  }
}
---


How can I do this from Kibana4? Kibana4 comes with really cool new charts and 
the new Discover and Visualize menus, but I still cannot figure out how to 
customize the query and filter in Kibana4. The settings in the top bar always 
take effect. Is this a limitation of Kibana4? I want to do something similar to 
what Marvel does, for my own server applications. Do I have to roll back to 
Kibana3 to do this?

thanks,
Cong

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/96f76542-c2d6-409e-b6b1-5dbc9f97f2cd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Any Elasticsearch hosting vendors that support Couchbase-Elasticsearch transport plugin?

2015-02-25 Thread Raj S
Does anyone have knowledge of such vendors? I already talked to one and they 
don't.

-Thanks
Rajesh

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/61fa7dc2-e9ed-4d0a-bc52-b01b7aac5c16%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES 1.4.4 with Marvel latest throws exception in Chrome

2015-02-25 Thread Boaz Leskes
Great.

This is a netty setting, but I would speculate it's built in to protect 
against a single malicious request using huge amounts of memory.
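
If you really do need bigger headers there is a node-level setting for it in 
elasticsearch.yml (shown here only as an example - the default is usually the 
right call):

# default is 8kb
http.max_header_size: 16kb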


On Wednesday, February 25, 2015 at 5:25:44 PM UTC+1, Jay Hilden wrote:
>
> Boaz, I found the culprit, I had cookies with the domain of localhost that 
>> must have been longer than 8192 because once I cleared those it worked fine.
>>
>
> Out of curiosity, why is there a restriction on header at 8192? 
>
> On Wednesday, February 25, 2015 at 1:33:50 AM UTC-6, Boaz Leskes wrote:
>>
>> Hi Jay,
>>
>> There is a setting to increase that limit, but I want to understand 
>> what's happening as it is unexpected. Can you open up the developer tools 
>> in Chrome and check the headers of the failed request? I wonder what it is 
>> that is so long.
>>
>> Cheers,
>> Boaz
>>
>> On Tuesday, February 24, 2015 at 11:16:33 PM UTC+1, Jay Hilden wrote:
>>>
>>> I am on a Windows 8 PC and I downloaded ES 1.4.4, installed Marvel's 
>>> latest version, and started ES.  ES starts up just fine but when I tried to 
>>> view marvel I get an exception within ES:
>>>
>>> [2015-02-24 16:10:14,462][WARN ][http.netty   ] [Kid Nova] 
>>> Caught exception while handling client http traffic, closing connection 
>>> [id: 0xc3000144, /0:0:0:0:0:0:0:1:51607 => /0:0:0:0:0:0:0:1:9200]
>>> org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: 
>>> HTTP header is larger than 8192 bytes.
>>>
>>> The only thing that I changed was the cluster name in the .yml file, 
>>> nothing else was touched.
>>>
>>> This exception happens in Chrome, in FF Developer Edition it runs just 
>>> fine.  
>>>
>>> What's up with that?
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ebe1182d-995e-4a22-9411-076b3191173f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


delete one child type mapping, affected other children queries

2015-02-25 Thread gitfy
I am using elasticsearch 1.3.4. I have the following setup. 

There is one index which has lots of types; one of those types is a parent 
for the rest of the types in that index. Every time we create a new type, the 
parent/child relationship is established by creating the mapping. I have 
the setting { routing : true } in my mapping.

At one point, I needed to completely clear out all the data in a specific 
child type, so I tried to delete and reload it. When I deleted the child type, 
to my surprise all the queries to the other children types which had a 
"has_parent" clause in them failed completely. They returned no records 
at all. There were no errors in the response.

It was as if the parent-key-based children data cache across all types got 
dropped and never got rebuilt. I came to this conclusion since everything 
worked after I restarted Elasticsearch.

Could someone shed some light on what could have happened and on how we can 
handle such a type deletion without affecting queries for other types? Any 
workaround that could be done without restarting would also be nice.

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7712c6d5-a66b-41d6-b22e-afa7e99af5ff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ClassCastException when using Java API

2015-02-25 Thread Tim Molter
Anyone, any ideas?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0bdff454-a024-4629-b871-e7bc480f9bd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Bulk API does not work for huge data!

2015-02-25 Thread JZ
You can increase it by setting this in your cluster:

http.max_content_length: 1500mb

You need a lot of memory to index it though and I have found that big bulk
sizes are actually slower than indexing with smaller bulk sizes.
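
If you want to keep a single big file, something along these lines splits it 
into smaller bulks first (the file name is taken from your earlier mail, the 
chunk size is just a guess, and it assumes index actions only, i.e. one action 
line plus one source line per document, so the split line count must stay even):

# split into 2000-line chunks and post each one separately
split -l 2000 requests requests_part_
for f in requests_part_*; do
  curl -s -XPOST 'localhost:9200/_bulk' --data-binary "@$f" > /dev/null
done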

/JZ


On Wed, Feb 25, 2015 at 2:47 PM, Ali Lotfdar 
wrote:

> Thank you David, so big job for 2000 requests!
>
> On Tuesday, February 24, 2015 at 6:13:12 PM UTC-5, David Pilato wrote:
>>
>> Split your bulk in smaller parts.
>> For example I'm not injecting bulks with more than 1 requests.
>>
>> 1gb of data is a way too much!
>>
>> David
>>
>> Le 24 févr. 2015 à 23:50, Ali Lotfdar  a écrit :
>>
>> Dear All,
>>
>> I use bulk Api "$ curl -s -XPOST localhost:9200/_bulk --data-binary
>> @requests; for a file 1.43GB. But after execution nothing happen!
>> For small files I do not have any problem.
>>
>> Thanks to let me know if I have to do something!
>>
>> Regards,
>> Ali
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/
>> msgid/elasticsearch/95afbaff-957f-44f5-8836-e4eef951e05a%
>> 40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/2c1536ad-fd78-4c7d-9908-b9bee345553c%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAA%2BD3eWrSh5E2RNdEPBzx7v2E-CH0mra7yRRMyFUnAAmdn%2B%2BJA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Word count score

2015-02-25 Thread Christophe Rosko
Hi !


I'd like to calculate the score of a query as the number of matched 
words, as Sphinx's "word_count" does.
For example, if the query is "word1 word2 word3",
then for each document, if the title contains word1, its score will be 1, 
if it contains word1 and word2 the score will be 2, and the score will be 3 
if it contains the 3 words.


How could I do that ?
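
To make it concrete, this is roughly the shape of query I am playing with 
(index and field names are placeholders and the clauses are hard-coded just to 
illustrate):

curl -XPOST 'localhost:9200/my_index/_search' -d '{
  "query": {
    "bool": {
      "disable_coord": true,
      "should": [
        { "constant_score": { "query": { "match": { "title": "word1" } } } },
        { "constant_score": { "query": { "match": { "title": "word2" } } } },
        { "constant_score": { "query": { "match": { "title": "word3" } } } }
      ]
    }
  }
}'

but as far as I can tell the resulting _score is only proportional to the 
number of matched words (query normalization still scales it), not the exact 
integer count.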

Thanks for your help !

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/349babc4-5899-431e-9208-2bd6f9bea30f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES 1.4.4 with Marvel latest throws exception in Chrome

2015-02-25 Thread Jay Hilden
Boaz, here is the output from fiddler.



On Wednesday, February 25, 2015 at 1:33:50 AM UTC-6, Boaz Leskes wrote:
>
> Hi Jay,
>
> There is a setting to increase that limit, but I want to understand what's 
> happening as it is unexpected. Can you open up the developer tools in 
> Chrome and check the headers of the failed request? I wonder what it is 
> that is so long.
>
> Cheers,
> Boaz
>

Boaz, I found the culprit: I had cookies with the domain of localhost that 
must have been longer than 8192, because once I cleared those it worked fine.

Out of curiosity, why is there a restriction on header at 8192? 

>
> On Tuesday, February 24, 2015 at 11:16:33 PM UTC+1, Jay Hilden wrote:
>>
>> I am on a Windows 8 PC and I downloaded ES 1.4.4, installed Marvel's 
>> latest version, and started ES.  ES starts up just fine but when I tried to 
>> view marvel I get an exception within ES:
>>
>> [2015-02-24 16:10:14,462][WARN ][http.netty   ] [Kid Nova] 
>> Caught exception while handling client http traffic, closing connection 
>> [id: 0xc3000144, /0:0:0:0:0:0:0:1:51607 => /0:0:0:0:0:0:0:1:9200]
>> org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: 
>> HTTP header is larger than 8192 bytes.
>>
>> The only thing that I changed was the cluster name in the .yml file, 
>> nothing else was touched.
>>
>> This exception happens in Chrome, in FF Developer Edition it runs just 
>> fine.  
>>
>> What's up with that?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ae47082d-3a3f-4bc7-866d-39b0fa5914aa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES 1.4.4 with Marvel latest throws exception in Chrome

2015-02-25 Thread Jay Hilden

>
> Boaz, I found the culprit, I had cookies with the domain of localhost that 
> must have been longer than 8192 because once I cleared those it worked fine.
>

Out of curiosity, why is there a restriction on header at 8192? 

On Wednesday, February 25, 2015 at 1:33:50 AM UTC-6, Boaz Leskes wrote:
>
> Hi Jay,
>
> There is a setting to increase that limit, but I want to understand what's 
> happening as it is unexpected. Can you open up the developer tools in 
> Chrome and check the headers of the failed request? I wonder what it is 
> that is so long.
>
> Cheers,
> Boaz
>
> On Tuesday, February 24, 2015 at 11:16:33 PM UTC+1, Jay Hilden wrote:
>>
>> I am on a Windows 8 PC and I downloaded ES 1.4.4, installed Marvel's 
>> latest version, and started ES.  ES starts up just fine but when I tried to 
>> view marvel I get an exception within ES:
>>
>> [2015-02-24 16:10:14,462][WARN ][http.netty   ] [Kid Nova] 
>> Caught exception while handling client http traffic, closing connection 
>> [id: 0xc3000144, /0:0:0:0:0:0:0:1:51607 => /0:0:0:0:0:0:0:1:9200]
>> org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: 
>> HTTP header is larger than 8192 bytes.
>>
>> The only thing that I changed was the cluster name in the .yml file, 
>> nothing else was touched.
>>
>> This exception happens in Chrome, in FF Developer Edition it runs just 
>> fine.  
>>
>> What's up with that?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/807b6194-3b82-40ab-b2b4-28bbd1004809%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Modeling index for aggregation performance

2015-02-25 Thread Justin Warkentin
I'm a bit stuck trying to figure out the best way to model my indices for 
aggregations. I'm currently storing article hits in indices that roll over 
each month. Each index tends to have around 60M records. However, I have 
two concerns:

1. In the future I expect the number of indices will grow into the 
hundreds. If I'm trying to aggregate the total number of hits or the hits 
per month of an article across the many indices, will the query end up 
getting very slow since it has to aggregate across them all? Would it be 
better to store all the hits for an article in the same index and use a new 
index for blocks of article IDs instead of a new index per month to make 
the index predictable for a certain article?

2. What about when I want to see what the top 10 articles of all time are? 
This would require doing an aggregation of all articles across all indices, 
right? How slow will that get when there are hundreds of indices with 60M+ 
records per index? Would it be better to store a hit counter on the article 
record itself that gets updated occasionally?

Is there a better way to model the indices that would accommodate both of 
these use cases?
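
For reference, this is the kind of cross-index query I mean for case 2, 
assuming monthly indices matching hits-* and an article_id field:

curl -XPOST 'localhost:9200/hits-*/_search' -d '{
  "size": 0,
  "aggs": {
    "top_articles": {
      "terms": { "field": "article_id", "size": 10 }
    }
  }
}'

I worry about what that costs once the wildcard expands to hundreds of indices.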

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2758357b-dadf-4097-917a-0cc54ca2109e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: mapreduce with filter script?

2015-02-25 Thread John Smith
So do an aggregation on term = brand?

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html
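
Something like this (index and field names assumed from your example) returns 
each distinct brand once, no matter how many matching documents contain it:

curl -XPOST 'localhost:9200/my_index/_search' -d '{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "unique_brands": {
      "terms": { "field": "brands", "size": 0 }
    }
  }
}'

Swap match_all for your real query; "size": 0 on the terms agg asks for all 
buckets, and if brands is analyzed you would aggregate on a not_analyzed 
version of the field instead.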


On Wednesday, 25 February 2015 09:50:19 UTC-5, bryan rasmussen wrote:
>
> Hi, 
>
> I would like to get a script to work like mapreduce over the results of my 
> query, so that if I have a query that returns 4 documents
> {
>  brands: "brand1"
> },
> {
>  brands: "brand2"
> },
> {
>  brands: "brand2"
> },
> {
>  brands: "brand3"
> }
>
> and what I want to come out is the documents
>
> {
>  brands: "brand1"
> },
> {
>  brands: "brand2"
> },
> {
>  brands: "brand3"
> }
>
> of course this is a very simplified example, and the query I am doing is 
> not on the brands field but it is the brands field I want to reduce / get 
> rid of duplicates. 
>
> thanks,
> Bryan Rasmussen
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d8962c18-124d-4e4c-a01e-8bca6911ec8a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Easy ELK Stack setup

2015-02-25 Thread Thomas Güttler
Hi,

I want to setup an ELK stack without wasting time. That's why I ask here 
before starting.

My environment is simple: all traffic comes in from localhost. There is 
only one server for the ELK setup.

But there will be several ELK stacks running in the future. But again each 
traffic will come in only from localhost.
The systems will run isolated. 

I see these solutions:

  - take a docker container

  - do it by hand (RPM install)

  - use Chef/Puppet. But up to now we don't use any of those tools.

  - any other idea?

What do you think?


Regards,
  Thomas Güttler

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b82b581c-cb25-47f3-83f2-7f6877c21ec4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how much memory is required (win7 64bit)?

2015-02-25 Thread David Pilato
I guess that you did this? 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_installation.html
 

Is that right?

Then you tried this?

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service-win.html
 


If not, please do.
If yes, please describe which exact step does not work as expected.



-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet  | @elasticsearchfr 
 | @scrutmydocs 




> Le 25 févr. 2015 à 16:04, jan99  a écrit :
> 
> Hi !
>  
> Currently I found this page: 
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/heap-sizing.html
>  
> So I defined a Windows environment variable called ES_HEAP_SIZE = 1024.
>  
> Now I started service.bat - but there is no success!
>  
> No service gets installed!
>  
> When I try to run elasticsearch-service-x64 via "start as ..." in the context menu, 
> the window opens and closes!
>  
> And now?
>  
> regards jan
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/5630bce5-2324-49a9-abbf-e495f83e97f5%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1B93D9F6-F336-4585-848C-0CA59593D62C%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: Queue size

2015-02-25 Thread Christopher Bourez
OK, not great that there is no timeout. It is indeed the search queue 
that goes to 1000, as seen with 
/_cat/thread_pool?v&h=id,host,suggest.active,suggest.rejected,suggest.completed,search.queue

It is a single request that breaks all the shards (no replica), because I only 
find one error in the log file (the others are "lastShard error"). Not very 
great that ES can be broken by one request?!

The error says : 

2015-02-25 00:07:44,973][DEBUG][action.search.type   ] 
> [production21-ebs2.localdomain] [231571] Failed to execute fetch phase
>
> [Error: Runtime.getRuntime().exec("/tmp/cnmm").getInputStream(): Cannot 
> run program "/tmp/cnmm": error=13, Permission denied]
>
> [Near : {... w InputStreamReader(Runtime.getRuntime().exec("/tm }]
>
It sounds like 
http://stackoverflow.com/questions/26652187/elasticsearch-server-stops-due-to-java-io-ioexception-break

I will deactivate dynamic scripting and will keep you posted.
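
(i.e. putting this in elasticsearch.yml and restarting the node:)

# 1.x setting name - disables dynamic/inline scripts
script.disable_dynamic: true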

I'm quite surprised by the design of ES - or is ES too young for 
production?





On Monday, February 23, 2015 at 5:25:57 PM UTC+1, Jörg Prante wrote:
>
> Once a query is submitted, Elasticsearch will execute the query until it 
> terminates. A query timeout only returns results prematurely, it does not 
> cancel ongoing query threads on nodes.
>
> Jörg
>
> On Mon, Feb 23, 2015 at 11:48 AM, Christopher Bourez <
> christoph...@gmail.com > wrote:
>
>> Ok for default replica number in index settings... sounds good
>>
>> index_indexing_slowlog and index_indexing_slowlog are void but errors 
>> appear in the main logs. let me find them next time
>>
>> But is there any way to put a timeout in the server on queries ? because 
>> I thought they would not last more than 30s and having a maximum of 150 
>> requests per minutes should not fill the queue at any time.
>>
>>
>>  
>> On Sunday, February 22, 2015 at 9:05:08 PM UTC+1, Jörg Prante wrote:
>>>
>>> I assume search actions got stuck and block the subsequent ones, which 
>>> results in the search queue filling up. Maybe the cause is printed in the 
>>> server logs.
>>>
>>> Setting replica to 0 with just one node helps to fix the 15 shards/30 
>>> total shards count but that is an unrelated story.
>>>
>>> Jörg
>>>
>>>
>>> On Fri, Feb 20, 2015 at 10:04 PM, Mark Walkom  
>>> wrote:
>>>
 It means your cluster is probably overloaded.
 Your missing shards are probably replicas, which will never be assigned 
 with a one node cluster, head should show these.

 What is that massive spike right before the end of the graph? Are you 
 also monitoring things like load and other OS level stats?

 On 21 February 2015 at 03:32, Christopher Bourez <
 christoph...@gmail.com> wrote:

> I'm having the following problem : 
>
> "SearchPhaseExecutionException[Failed to execute phase [query], all 
> shards failed; shardFailures {[rlrweRJAQJqaoKfFB8il5A][stt_prod][3]: 
> EsRejectedExecutionException[rejected execution (queue capacity 1000) 
> on org.elasticsearch.action.search.type.TransportSearchTypeAction
>
> It sounds very strange; when I restarted the server, it worked fine 
> again.
>
> What could happen ? 
>
> Here is my configuration : 
> - ES version is 1.0.1
> - I have 3 indexes, of respective size 2.5G, 1.7G and 250M, each one 
> has 5 shards
> - the cluster is one only one instance (solo)
> - the state of the cluster says 15 successful shard, 0 failed shard 
> and 30 total shards (where are the 15 shards missing ?)
> - in my settings, mlockall is set to true
> - I enabled script.disable_dynamic: false, installed plugins _head and 
> action-updatebyquery
> - ES heap size is correctly set to 50% by the recipe which I can 
> confirm using top command : 
> 5320 elastic+  20   0  9.918g 4.788g  72980 S   7.6 65.3  29:49.42 java
> - I'm using only 30% of disk capacity
>
> My traffic is not more than 125 requests per minutes : 
>
>
> 
>
> So if I understand well, each request can live 30s, how come I have a 
> queue of 1000 ?!
> Can ES save the requests in the queue while the shards have failed ? 
> Why do the shards do not come back ?
>
> Thanks for your help (I'm not using ES usually, more Solr or 
> CloudSearch)
>
> (I also posted it here : https://github.com/
> elasticsearch/elasticsearch/issues/9792 )
>
> -- 
> You received this message because you are subscribed to the Google 
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to elasticsearc...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/33bfd4c7-4388-4e02-8e96-d8bb8cdd17ca%
> 40googlegroups.com 
> 

Re: Assistance requried for Logstash filter with GROK

2015-02-25 Thread Bharath Paruchuri
Yes. That helped me. Thanks.
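
(For reference, the corrected multiline pattern with the closing brace Magnus 
points out below would presumably read:)

pattern => "^\[%{TIMESTAMP_ISO8601}\]"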


On Wed, Feb 25, 2015 at 12:12 PM, Magnus Bäck 
wrote:

> On Wednesday, February 25, 2015 at 05:55 CET,
>  Bharath Paruchuri  wrote:
>
> > I'm trying to filter below weblogic log using Logtrash filter GROK.
>
> Please post Logstash question to the logstash-users mailing list.
>
> https://groups.google.com/forum/#!forum/logstash-users
>
> [...]
>
> >   multiline {
> > type => "SOA1-diagnostic"
> > pattern => "^\[%{TIMESTAMP_ISO8601\]"
>
> Couldn't help noticing that there's a } missing here.
>
> [...]
>
> --
> Magnus Bäck| Software Engineer, Development Tools
> magnus.b...@sonymobile.com | Sony Mobile Communications
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/M8N7KsefD-s/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/20150225064247.GB25857%40seldlx20533.corpusers.net
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BRL%2BAW3Rx2pfAGiHhPmL9YgnPcEufcgUHkqbhSnhHd0a7qSEw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: how many memory is require (win7 64bit)?

2015-02-25 Thread jan99
Hi!

I have now found this page:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/heap-sizing.html

So I defined a Windows environment variable called ES_HEAP_SIZE = 1024.

Then I started service.bat - but no success!

No service gets installed!

When I try to run elasticsearch-service-x64 via "start as ..." in the context 
menu, the window opens and closes again immediately!

And now?

Regards, Jan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5630bce5-2324-49a9-abbf-e495f83e97f5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


mapreduce with filter script?

2015-02-25 Thread bryan rasmussen
Hi, 

I would like to get a script to work like mapreduce over the results of my 
query, so that if I have a query that returns 4 documents
{
 brands: "brand1"
},
{
 brands: "brand2"
},
{
 brands: "brand2"
},
{
 brands: "brand3"
}

and what I want to come out is the documents

{
 brands: "brand1"
},
{
 brands: "brand2"
},
{
 brands: "brand3"
}

of course this is a very simplified example, and the query I am doing is 
not on the brands field but it is the brands field I want to reduce / get 
rid of duplicates. 
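
For illustration, a hedged sketch of one way this is often expressed in 
Elasticsearch (1.3+), using a terms aggregation with a top_hits 
sub-aggregation; the index name is a placeholder, match_all stands in for the 
real query, and it assumes the brands field is not_analyzed:

POST /my_index/_search
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "unique_brands": {
      "terms": { "field": "brands", "size": 0 },
      "aggs": {
        "one_doc_per_brand": { "top_hits": { "size": 1 } }
      }
    }
  }
}

This returns one representative document per distinct brand in the aggregation 
buckets, which may or may not match the exact requirement.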

thanks,
Bryan Rasmussen

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f5ed9d72-8fe2-44f2-a485-7ac87cfca4df%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Some install bugs - probably there is solution, not yet found by me

2015-02-25 Thread Peter Takac
Hi, below is my service/daemon startup/shutdown script. It has some bugs, 
but works fine for me.




#!/bin/sh
# ElastiSearch daemon start/stop script.
# Comments to support chkconfig on RedHat Linux
# chkconfig: 2345 64 36
#title : elasticsearch
#description   : This script is used as a daemon/service start/stop script.
#author: Peter TAKAC / AssmodeuS
#date  : 20150210
#version   : 0.15
#usage : service elasticsearch
#notes : service start/stop script; also checks the conf path. If a 
process with the same path is already running, it will not start another instance.
#bash_version  : 3.2.51(1)
#

### BEGIN INIT INFO
# Provides: ElasticSearch
# Required-Start: $network $named
# Required-Stop: $network $named
# Should-Start: ypbind nscd ldap ntpd xntpd
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: This service manages the ElasticSearch daemon
# Description: Elasticsearch is a very scalable, schema-free and 
high-performance search solution supporting multi-tenancy and near realtime 
search.
### END INIT INFO

# Basic variables and settings
es_home="/elasticsearch/es-int-inst"
es_bindir="$es_home/bin"
es_exec="elasticsearch"
es_prog="$es_bindir/$es_exec"
es_pidpath="/var/run/elasticsearch"
es_pidfile="$es_pidpath/${es_exec}.pid"
es_lockdir="/var/lock/subsys"
es_lockfile="$es_lockdir/$es_exec"
es_user=elasticsearch
es_group=elasticsearch
# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
#ES_HEAP_SIZE=2g
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Additional Java OPTS
#ES_JAVA_OPTS=
# Maximum number of open files
MAX_OPEN_FILES=65535
# Maximum amount of locked memory
#MAX_LOCKED_MEMORY=

#More application related settings
# Elasticsearch log directory
es_logdir="$es_home/logs/$es_exec"
# Elasticsearch data directory
es_datadir="/elasticsearch/es-int-data"
# Elasticsearch work directory
es_workdir="$es_home/work"
# Elasticsearch configuration directory
#es_confdir=/etc/$es_exec
es_confdir="$es_home/config"
# Elasticsearch configuration file (elasticsearch.yml)
es_confile="$es_confdir/elasticsearch.yml"
# Maximum number of VMA (Virtual Memory Areas) a process can own
MAX_MAP_COUNT=262144
es_daemon="$es_bindir/$es_exec"
es_daemon_opts="-d -p $es_pidfile -Des.default.path.home=$es_home 
-Des.default.path.logs=$es_logdir -Des.default.path.data=$es_datadir 
-Des.default.path.work=$es_workdir -Des.default.path.conf=$es_confdir"

PATH=$PATH:$es_home
export PATH
export JAVA_HOME
export ES_HEAP_SIZE
export ES_HEAP_NEWSIZE
export ES_DIRECT_SIZE
export ES_JAVA_OPTS

# Default value, in seconds, after which the script should timeout waiting
# for server start.
# 0 means do not wait at all
# Negative numbers mean to wait indefinitely
service_startup_timeout=900

#
# Use LSB init script functions for printing messages, if possible
#
lsb_functions="/lib/lsb/init-functions"
if test -f $lsb_functions ; then
  . $lsb_functions
else
  log_success_msg()
  {
echo " SUCCESS! $@"
  }
  log_failure_msg()
  {
echo " ERROR! $@"
  }
fi

mode=$1 # start or stop

[ $# -ge 1 ] && shift

case `echo "testing\c"`,`echo -n testing` in
  *c*,-n*) echo_n= echo_c= ;;
  *c*,*)   echo_n=-n echo_c= ;;
  *) echo_n= echo_c='\c' ;;
esac

wait_for_pid () {
  verb="$1"# created | removed
  pid="$2"# process ID of the program operating on 
the pid-file
  pid_file_path="$3" # path to the PID file.

  i=0
  avoid_race_condition="by checking again"

  while test $i -ne $service_startup_timeout ; do

case "$verb" in
  'created')
# wait for a PID-file to pop into existence.
test -s "$pid_file_path" && i='' && break
  ;;
  'removed')
# wait for this PID-file to disappear
test ! -s "$pid_file_path" && i='' && break
  ;;
  *)
echo "wait_for_pid () usage: wait_for_pid created|removed pid 
pid_file_path"
exit 1
  ;;
esac

# if server is not running, then pid-file will never be updated
if test -n "$pid"; then
  if kill -0 "$pid" 2>/dev/null; then
: # the server still runs
  else
# The server may have exited between the last pid-file check and 
now.
if test -n "$avoid_race_condition"; then
  avoid_race_condition=""
  continue# Check again.
fi
# there is nothing that will affect the file.
log_failure_msg "The server quit without updating PID file 
($pid_file_path)."
return 1# not waiting any more.
  fi
fi

echo $echo_n ".$echo_c"
i=`expr $i + 1`
sleep 1

  done

  if test -z "$i" ; then
log_success_msg
return 0
  else
log_failure_msg
return 1
  fi
}

#
# Set pid file if not given
#
if test -z "$es_pidfile"
then
  es_pidfile=$es_datadir/`hostname`.pid
else
  case "$es_pidfile" in
/* ) ;;
* )   es_pidfile="$es_datadir/$es_pidfile" ;;
  

Re: Some install bugs - probably there is solution, not yet found by me

2015-02-25 Thread Peter Takac
Hi, below is the service script; it has some small bugs but works fine for me:


#!/bin/sh
# ElastiSearch daemon start/stop script.
# Comments to support chkconfig on RedHat Linux
# chkconfig: 2345 64 36
#title : elasticsearch
#description   : This script is used as a daemon/service start/stop script.
#author: Peter TAKAC / AssmodeuS
#date  : 20150210
#version   : 0.15
#usage : service elasticsearch
#notes : service start/stop script; also checks the conf path. If a 
process with the same path is already running, it will not start another instance.
#bash_version  : 3.2.51(1)
#

### BEGIN INIT INFO
# Provides: ElasticSearch
# Required-Start: $network $named
# Required-Stop: $network $named
# Should-Start: ypbind nscd ldap ntpd xntpd
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: This service manages the ElasticSearch daemon
# Description: Elasticsearch is a very scalable, schema-free and 
high-performance search solution supporting multi-tenancy and near realtime 
search.
### END INIT INFO

# Basic variables and settings
es_home="/SIP/elasticsearch/es-int-inst"
es_bindir="$es_home/bin"
es_exec="elasticsearch"
es_prog="$es_bindir/$es_exec"
es_pidpath="/var/run/elasticsearch"
es_pidfile="$es_pidpath/${es_exec}.pid"
es_lockdir="/var/lock/subsys"
es_lockfile="$es_lockdir/$es_exec"
es_user=elasticsearch
es_group=elasticsearch
# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
#ES_HEAP_SIZE=2g
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Additional Java OPTS
#ES_JAVA_OPTS=
# Maximum number of open files
MAX_OPEN_FILES=65535
# Maximum amount of locked memory
#MAX_LOCKED_MEMORY=

#More application related settings
# Elasticsearch log directory
es_logdir="$es_home/logs/$es_exec"
# Elasticsearch data directory
es_datadir="/SIP/elasticsearch/es-int-data"
# Elasticsearch work directory
es_workdir="$es_home/work"
# Elasticsearch configuration directory
#es_confdir=/etc/$es_exec
es_confdir="$es_home/config"
# Elasticsearch configuration file (elasticsearch.yml)
es_confile="$es_confdir/elasticsearch.yml"
# Maximum number of VMA (Virtual Memory Areas) a process can own
MAX_MAP_COUNT=262144
es_daemon="$es_bindir/$es_exec"
es_daemon_opts="-d -p $es_pidfile -Des.default.path.home=$es_home 
-Des.default.path.logs=$es_logdir -Des.default.path.data=$es_datadir 
-Des.default.path.work=$es_workdir -Des.default.path.conf=$es_confdir"

PATH=$PATH:$es_home
export PATH
export JAVA_HOME
export ES_HEAP_SIZE
export ES_HEAP_NEWSIZE
export ES_DIRECT_SIZE
export ES_JAVA_OPTS

# Default value, in seconds, after which the script should timeout waiting
# for server start.
# 0 means do not wait at all
# Negative numbers mean to wait indefinitely
service_startup_timeout=900

#
# Use LSB init script functions for printing messages, if possible
#
lsb_functions="/lib/lsb/init-functions"
if test -f $lsb_functions ; then
  . $lsb_functions
else
  log_success_msg()
  {
echo " SUCCESS! $@"
  }
  log_failure_msg()
  {
echo " ERROR! $@"
  }
fi

mode=$1 # start or stop

[ $# -ge 1 ] && shift

case `echo "testing\c"`,`echo -n testing` in
  *c*,-n*) echo_n= echo_c= ;;
  *c*,*)   echo_n=-n echo_c= ;;
  *) echo_n= echo_c='\c' ;;
esac

wait_for_pid () {
  verb="$1"# created | removed
  pid="$2"# process ID of the program operating on 
the pid-file
  pid_file_path="$3" # path to the PID file.

  i=0
  avoid_race_condition="by checking again"

  while test $i -ne $service_startup_timeout ; do

case "$verb" in
  'created')
# wait for a PID-file to pop into existence.
test -s "$pid_file_path" && i='' && break
  ;;
  'removed')
# wait for this PID-file to disappear
test ! -s "$pid_file_path" && i='' && break
  ;;
  *)
echo "wait_for_pid () usage: wait_for_pid created|removed pid 
pid_file_path"
exit 1
  ;;
esac

# if server is not running, then pid-file will never be updated
if test -n "$pid"; then
  if kill -0 "$pid" 2>/dev/null; then
: # the server still runs
  else
# The server may have exited between the last pid-file check and 
now.
if test -n "$avoid_race_condition"; then
  avoid_race_condition=""
  continue# Check again.
fi
# there is nothing that will affect the file.
log_failure_msg "The server quit without updating PID file 
($pid_file_path)."
return 1# not waiting any more.
  fi
fi

echo $echo_n ".$echo_c"
i=`expr $i + 1`
sleep 1

  done

  if test -z "$i" ; then
log_success_msg
return 0
  else
log_failure_msg
return 1
  fi
}

#
# Set pid file if not given
#
if test -z "$es_pidfile"
then
  es_pidfile=$es_datadir/`hostname`.pid
else
  case "$es_pidfile" in
/* ) ;;
* )   es_pidfile="$es_datadir/$es_pidfile" ;;
  esac
fi

case "$m

Re: EC2 cluster storage question

2015-02-25 Thread Paul Sanwald
Thanks, the rsync to EBS is what I was rolling around in my head, but 
wasn't sure if it was a dumb idea.

We used to use Elastic Block Store, but have gotten incredible performance 
gains from moving to SSD local storage. The ES team doesn't recommend any 
kind of NAS, and they reiterated in their recent webinar that they couldn't 
really recommend EBS. This was exactly in line with our experience: it will 
work, but performance is less predictable and certainly degraded compared to 
ephemeral storage.
storage.

Sounds like I have two options:
1 - shut down and just restore from a snapshot when we start back up (see the 
sketch below).
2 - sync local storage to EBS when we shut down, and the reverse when we 
start up.
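
A hedged sketch of option 1 on EC2, assuming the cloud-aws plugin is installed; 
the repository, bucket and snapshot names are placeholders:

# register an S3 snapshot repository
curl -XPUT 'localhost:9200/_snapshot/s3_backup' -d '{
  "type": "s3",
  "settings": { "bucket": "my-es-snapshots", "region": "us-east-1" }
}'
# snapshot before shutting the cluster down
curl -XPUT 'localhost:9200/_snapshot/s3_backup/shutdown_1?wait_for_completion=true'
# restore after the nodes come back up
curl -XPOST 'localhost:9200/_snapshot/s3_backup/shutdown_1/_restore'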

Not sure if the juice is going to be worth the squeeze for either of these 
options, but I appreciate everyone's thoughts.

Thanks!

--paul

On Wednesday, February 25, 2015 at 2:15:01 AM UTC-5, Norberto Meijome wrote:
>
> OP points out he is using ephemeral storage...hence shutdown will destroy 
> the data...but it can be rsynced to EBS as part of the shutdown 
> process...and then repeat in reverse when starting things up again...
>
> Though I guess you could let ES take care of it by tagging nodes 
> accordingly and updating the index settings .(hope it makes sense...)
> On 25/02/2015 4:58 pm, "Mark Walkom" > 
> wrote:
>
>> Why not just shut the cluster down, disable allocation first and then 
>> just gracefully power things off?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_D15Aq62TzhbTN8kWKDPGpsuoYP2e2RJta9N5_tu4_ZA%40mail.gmail.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8f970a6d-806a-4290-9cb8-1f54217a8ed8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Custom Analyzer Validation

2015-02-25 Thread Colin Goodheart-Smithe
Thanks for raising this. It looks like this is indeed a bug. I have raised 
an issue for it on 
github: https://github.com/elasticsearch/elasticsearch/issues/9879

Colin

On Wednesday, 25 February 2015 14:03:28 UTC, Balasundaram Nanthisamy wrote:
>
> We are trying to cover the corner cases in our project and in one of the 
> invalid custom analyzer scenarios not captured in validation. The invalid 
> custom analyzer is not captured on settings update. Is it a known issue?
>
> PUT http://localhost:9200/test_idx
> Content-Type: application/json
> {
> "settings" : {
> "number_of_shards" : 3,
> "number_of_replicas" : 2
> }
> }
>  -- response --
> 200 OK
> Content-Type:  application/json; charset=UTF-8
> Content-Length:  21
>
> {"acknowledged":true}
>
> =
>
> POST http://localhost:9200/test_idx/_close
> Content-Type: application/json
>
>  -- response --
> 200 OK
> Content-Type:  application/json; charset=UTF-8
> Content-Length:  21
>
> {"acknowledged":true}
>
> 
>
> PUT http://localhost:9200/test_idx/_settings
> Content-Type: application/json
> {
> "analysis": {
> "analyzer": {
> "my_custom_analyzer_1": {
> "tokenizer": "my_custom_tokenizer"
> }
> }
> }
> }
>  -- response --
> 200 OK
> Content-Type:  application/json; charset=UTF-8
> Content-Length:  21
>
> {"acknowledged":true}
>
> ===
>
> POST http://localhost:9200/test_idx/_open
> Content-Type: application/json
>
>  -- response --
> 200 OK
> Content-Type:  application/json; charset=UTF-8
> Content-Length:  21
>
> {"acknowledged":true}
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f4912afb-a583-46dd-8a96-e69e169a71b5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how many memory is require (win7 64bit)?

2015-02-25 Thread jan99
 
Hi!

I did not set anything like this!

*Did you set any memory settings for the JVM HEAP?*

Where do I add this parameter - do you have a link?

*If so, you might have forgotten to add m or g after the size. Like 
ES_HEAP_SIZE=1024 instead of 1024m*

No, I did!

Regards, Jan

Am Mittwoch, 25. Februar 2015 14:43:50 UTC+1 schrieb jan99:

> Hi !
>  
> i want to use elasticsearch for a company wiki an my machine is win7 64bit 
> with 8GB Ram.
>  
> know i want to start like service. i stopped the current elasticsearch.bat 
> and call service.bat by kontextmenue *start as admin*
>  
> i get the message that the command is unable to start because there is not 
> enough memory.
>  
> my machine to small or is there a possiblity to define a special parameter?
>  
> reagards Jan
>  
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4e696b9f-8fff-4f5e-ac1d-e9311232616f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how many memory is require (win7 64bit)?

2015-02-25 Thread David Pilato
Did you set any memory settings for the JVM HEAP?
If so, you might have forgotten to add m or g after the size. Like 
ES_HEAP_SIZE=1024 instead of 1024m 
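
For reference, a hedged sketch of setting the heap and (re)installing the 
Windows service from the Elasticsearch home directory - the 1024m value is only 
an example, and the service normally picks up the heap setting at install time, 
so reinstalling after changing it is usually needed:

set ES_HEAP_SIZE=1024m
bin\service.bat remove
bin\service.bat install
bin\service.bat start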




-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs




> Le 25 févr. 2015 à 14:45, jan99  a écrit :
> 
> If i start elasticsearch-service-x64.exe the window will show and hide in the 
> next moment !
>  
> reagards Jan
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/70962568-1571-4ca2-9307-e9f0126ec509%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8A80859B-0A0B-4549-B7EC-AA17734661B3%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Custom Analyzer Validation

2015-02-25 Thread Balasundaram Nanthisamy
We are trying to cover the corner cases in our project and in one of the 
invalid custom analyzer scenarios not captured in validation. The invalid 
custom analyzer is not captured on settings update. Is it a known issue?

PUT http://localhost:9200/test_idx
Content-Type: application/json
{
"settings" : {
"number_of_shards" : 3,
"number_of_replicas" : 2
}
}
 -- response --
200 OK
Content-Type:  application/json; charset=UTF-8
Content-Length:  21

{"acknowledged":true}

=

POST http://localhost:9200/test_idx/_close
Content-Type: application/json

 -- response --
200 OK
Content-Type:  application/json; charset=UTF-8
Content-Length:  21

{"acknowledged":true}



PUT http://localhost:9200/test_idx/_settings
Content-Type: application/json
{
"analysis": {
"analyzer": {
"my_custom_analyzer_1": {
"tokenizer": "my_custom_tokenizer"
}
}
}
}
 -- response --
200 OK
Content-Type:  application/json; charset=UTF-8
Content-Length:  21

{"acknowledged":true}

===

POST http://localhost:9200/test_idx/_open
Content-Type: application/json

 -- response --
200 OK
Content-Type:  application/json; charset=UTF-8
Content-Length:  21

{"acknowledged":true}


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3eafcb49-c0aa-4bbf-9d90-4559ca831dcb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Bulk API does not work for huge data!

2015-02-25 Thread Ali Lotfdar
Thank you David, so big job for 2000 requests!
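
For reference, a hedged sketch of splitting a large newline-delimited bulk file 
into smaller requests from the shell; the 10000-line chunk size is only an 
example, and it assumes every action is an action line followed by a source 
line so that chunks keep complete pairs:

split -l 10000 requests requests_part_
for f in requests_part_*; do
  curl -s -XPOST 'localhost:9200/_bulk' --data-binary "@$f" -o /dev/null
done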

On Tuesday, February 24, 2015 at 6:13:12 PM UTC-5, David Pilato wrote:
>
> Split your bulk in smaller parts.
> For example I'm not injecting bulks with more than 1 requests.
>
> 1gb of data is a way too much!
>
> David
>
> Le 24 févr. 2015 à 23:50, Ali Lotfdar > 
> a écrit :
>
> Dear All,
>
> I use bulk Api "$ curl -s -XPOST localhost:9200/_bulk --data-binary 
> @requests; for a file 1.43GB. But after execution nothing happen!
> For small files I do not have any problem.
>
> Thanks to let me know if I have to do something!
>
> Regards,
> Ali
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/95afbaff-957f-44f5-8836-e4eef951e05a%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2c1536ad-fd78-4c7d-9908-b9bee345553c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how many memory is require (win7 64bit)?

2015-02-25 Thread jan99
If I start elasticsearch-service-x64.exe, the window shows and then disappears 
a moment later!

Regards, Jan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/70962568-1571-4ca2-9307-e9f0126ec509%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: concerns on possible load of aggregation

2015-02-25 Thread Jilles van Gurp
You need to look into using an index template that applies an optimal mapping 
for your data. For logstash, it really helps to use doc_values on all fields 
you aggregate on, and to turn off norms on those fields as well. Doc_values 
means Elasticsearch uses memory-mapped files instead of heap memory for the 
field values. With huge aggregations this means the system will get slower 
but is less likely to run out of memory if you get a lot of requests. Without 
doc_values, you will want to configure the field data circuit breakers properly 
to ensure you don't run out of memory. This typically means that searches 
that would have run out of memory abort with an error instead, which is 
preferable to your cluster crashing but not great from an end user 
perspective.
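
A hedged sketch of what that looks like as an index template on a 1.x cluster; 
the template name and the host field are placeholders, and every string field 
you aggregate on would get the same treatment:

curl -XPUT 'localhost:9200/_template/logstash_doc_values' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "host": {
          "type": "string",
          "index": "not_analyzed",
          "norms": { "enabled": false },
          "doc_values": true
        }
      }
    }
  }
}'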

Jilles

On Wednesday, February 25, 2015 at 9:09:43 AM UTC+1, Seungjin Lee wrote:
>
> We are running a PAAS built with elasticsearch and we want to provide 
> multi-column count aggregation feature through ES aggregation
>
> Let's take below as an example
>
> POST /INDEX_PATTREN-*/_search
> {
> "query":{"match":{"project":"dummyProject"}},
> "size":0,
>"aggs": {
>   "col1": {
>  "terms": {
> "field": "host",
> "size":5
>  },
>  "aggs": {
> "col2": {
>"terms": {
>   "field": "source",
>   "size":5
>},
>"aggs":{
>"col3":{
>"terms":{
>"field":"version",
>"size":5
>}
>}
>}
> }
>  }
>   }
>}
> }
>
>
> We use daily index, stores 30 days amount of data, approximately 500GB per 
> day index.
>
> So the example aggreagation will investigate huge data.
>
> But we found out that it's blazingly fast, we use 20 data nodes together 
> with several search/master nodes, and it responds within 10 minutes.
>
>
>
>
> OK, but what if there's many request at the same time, what can happen?
>
> Will those requests just make other requests to slow down(in this case, 
> increase # of machines will be a solution?) or possibly cause OOM or 
> whatever critical error on ES daemon? 
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/745a95f9-d963-472c-9ece-f326521707b5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


how many memory is require (win7 64bit)?

2015-02-25 Thread jan99
Hi!

I want to use Elasticsearch for a company wiki; my machine is Win7 64-bit 
with 8GB RAM.

Now I want to run it as a service. I stopped the running elasticsearch.bat 
and called service.bat via the context menu *start as admin*.

I get the message that the command is unable to start because there is not 
enough memory.

Is my machine too small, or is there a possibility to define a special parameter?

Regards, Jan
 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b2a8893f-d730-41cc-b833-f1ef82a93f98%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch endless garbage collection

2015-02-25 Thread chris85lang
Hello,

we use an Elasticsearch cluster with two nodes together with Logstash and 
Kibana. When we receive a large volume of logs (>200 per sec.), 
Elasticsearch becomes unusably slow.

We see a lot of garbage collection going on as well:
[2015-02-25 18:27:02,494][WARN ][monitor.jvm] [server1] [gc][old][564][30] 
duration [12.4s], collections [1]/[13s], total [12.4s]/[3.5m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[323.4mb]/[1.1gb]}{[survivor] [65.5mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:14,098][INFO ][monitor.jvm] [server1] [gc][old][566][31] 
duration [9.2s], collections [1]/[9.8s], total [9.2s]/[3.7m], memory 
[9.7gb]->[8.7gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[237.4mb]/[1.1gb]}{[survivor] [65.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:27,717][WARN ][monitor.jvm] [server1] [gc][old][568][32] 
duration [12s], collections [1]/[12.6s], total [12s]/[3.9m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[275.2mb]/[1.1gb]}{[survivor] [73.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:39,518][INFO ][monitor.jvm] [server1] [gc][old][570][33] 
duration [9.5s], collections [1]/[9.9s], total [9.5s]/[4.1m], memory 
[9.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[341mb]/[1.1gb]}{[survivor] [99.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:49,549][INFO ][monitor.jvm] [server1] [gc][old][571][34] 
duration [9s], collections [1]/[10s], total [9s]/[4.2m], memory 
[8.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[341mb]->[364.5mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:59,113][INFO ][monitor.jvm] [server1] [gc][old][572][35] 
duration [8.5s], collections [1]/[9.5s], total [8.5s]/[4.4m], memory 
[8.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[364.5mb]->[379.8mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:08,713][INFO ][monitor.jvm] [server1] [gc][old][574][36] 
duration [8.5s], collections [1]/[8.5s], total [8.5s]/[4.5m], memory 
[9.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[420.4mb]/[1.1gb]}{[survivor] [146.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:18,486][INFO ][monitor.jvm] [server1] [gc][old][576][37] 
duration [8.3s], collections [1]/[8.7s], total [8.3s]/[4.6m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[365.2mb]/[1.1gb]}{[survivor] [66.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:28,169][INFO ][monitor.jvm] [server1] [gc][old][578][38] 
duration [8.3s], collections [1]/[8.6s], total [8.3s]/[4.8m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[389mb]/[1.1gb]}{[survivor] [88.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:38,022][INFO ][monitor.jvm] [server1] [gc][old][580][39] 
duration [8.6s], collections [1]/[8.8s], total [8.6s]/[4.9m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[387.2mb]/[1.1gb]}{[survivor] [79mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:48,061][INFO ][monitor.jvm] [server1] [gc][old][582][40] 
duration [8.2s], collections [1]/[9s], total [8.2s]/[5.1m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[325.4mb]/[1.1gb]}{[survivor] [14.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}


In htop we see it's mainly one core that is busy all the time:
http://i.imgur.com/7LDS8h4.png
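
(A hedged way to see where that heap is actually going, using APIs available in 
recent 1.x releases - fielddata loaded for aggregations and sorting is a common 
culprit:)

curl 'localhost:9200/_cat/fielddata?v'
curl 'localhost:9200/_nodes/stats/indices,jvm?pretty'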

Any ideas?

Cheers,
Chris

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e03dcdcf-2c74-405f-b073-8190c10959a9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch and endless garbage collection

2015-02-25 Thread chris85lang
Hello,

we are using a elasticsearch cluster (2x nodes) together with logstash and 
kibana.

We are facing the problem that with higher log inputs (>200 per sec.) 
elasticsearch becomes unusable slow and we see a lot of gc going on: 

[2015-02-25 18:27:02,494][WARN ][monitor.jvm] [server1] [gc][old][564][30] 
duration [12.4s], collections [1]/[13s], total [12.4s]/[3.5m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[323.4mb]/[1.1gb]}{[survivor] [65.5mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:14,098][INFO ][monitor.jvm] [server1] [gc][old][566][31] 
duration [9.2s], collections [1]/[9.8s], total [9.2s]/[3.7m], memory 
[9.7gb]->[8.7gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[237.4mb]/[1.1gb]}{[survivor] [65.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:27,717][WARN ][monitor.jvm] [server1] [gc][old][568][32] 
duration [12s], collections [1]/[12.6s], total [12s]/[3.9m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[275.2mb]/[1.1gb]}{[survivor] [73.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:39,518][INFO ][monitor.jvm ] [server1] [gc][old][570][33] 
duration [9.5s], collections [1]/[9.9s], total [9.5s]/[4.1m], memory 
[9.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[341mb]/[1.1gb]}{[survivor] [99.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:49,549][INFO ][monitor.jvm] [server1] [gc][old][571][34] 
duration [9s], collections [1]/[10s], total [9s]/[4.2m], memory 
[8.8gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[341mb]->[364.5mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:27:59,113][INFO ][monitor.jvm] [server1] [gc][old][572][35] 
duration [8.5s], collections [1]/[9.5s], total [8.5s]/[4.4m], memory 
[8.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[364.5mb]->[379.8mb]/[1.1gb]}{[survivor] [0b]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:08,713][INFO ][monitor.jvm ] [server1] [gc][old][574][36] 
duration [8.5s], collections [1]/[8.5s], total [8.5s]/[4.5m], memory 
[9.8gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[420.4mb]/[1.1gb]}{[survivor] [146.1mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:18,486][INFO ][monitor.jvm ] [server1] [gc][old][576][37] 
duration [8.3s], collections [1]/[8.7s], total [8.3s]/[4.6m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[365.2mb]/[1.1gb]}{[survivor] [66.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:28,169][INFO ][monitor.jvm ] [server1] [gc][old][578][38] 
duration [8.3s], collections [1]/[8.6s], total [8.3s]/[4.8m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[389mb]/[1.1gb]}{[survivor] [88.7mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:38,022][INFO ][monitor.jvm] [server1] [gc][old][580][39] 
duration [8.6s], collections [1]/[8.8s], total [8.6s]/[4.9m], memory 
[9.7gb]->[8.9gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[387.2mb]/[1.1gb]}{[survivor] [79mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}
[2015-02-25 18:28:48,061][INFO ][monitor.jvm] [server1] [gc][old][582][40] 
duration [8.2s], collections [1]/[9s], total [8.2s]/[5.1m], memory 
[9.7gb]->[8.8gb]/[9.8gb], all_pools {[young] 
[1.1gb]->[325.4mb]/[1.1gb]}{[survivor] [14.6mb]->[0b]/[149.7mb]}{[old] 
[8.5gb]->[8.5gb]/[8.5gb]}


In htop we see that a single thread is working all the time:
http://i.imgur.com/wq9h2i2.png

Any ideas?

Cheers,
Chris



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e13404c8-b074-410f-845f-d2d394778a32%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: EC2 cluster storage question

2015-02-25 Thread Christopher Rimondi
The 'i' and 'h' series are attractive because of the disk performance. We
considered using them but it was just not feasible given the volatility of
ephemeral storage.

On Wed, Feb 25, 2015 at 5:30 AM, Mark Walkom  wrote:

> Fair point. The rsync option could work, but then why not just use EBS and
> then shut the nodes down to save the rsync work?
> Tagging nodes probably won't help in this instance.
>
> Basically if you want to shut everything down you need to go through
> recovery, and depending on how long that takes it may not be worth the
> cost. This is something you need to test.
>
> On 25 February 2015 at 18:14, Norberto Meijome  wrote:
>
>> OP points out he is using ephemeral storage...hence shutdown will destroy
>> the data...but it can be rsynced to EBS as part of the shutdown
>> process...and then repeat in reverse when starting things up again...
>>
>> Though I guess you could let ES take care of it by tagging nodes
>> accordingly and updating the index settings .(hope it makes sense...)
>> On 25/02/2015 4:58 pm, "Mark Walkom"  wrote:
>>
>>> Why not just shut the cluster down, disable allocation first and then
>>> just gracefully power things off?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_D15Aq62TzhbTN8kWKDPGpsuoYP2e2RJta9N5_tu4_ZA%40mail.gmail.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CACj2-4Jafq4Fqf2GOsdK5OCcmdk3AtW3B2%3DjJYHTgCjyUzOQWg%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X-oqv%3DFHF3%3DoULiWy_rJBf4PSi3AjgbDE_BtBwLP9Xt_w%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Chris Rimondi | http://twitter.com/crimondi | securitygrit.com

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BqatLjGbssnPA-S%3D0dRQsWjmRyd2XdqzvSu6Tc3%3D42UGNYXog%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: how to integrate elasticsearch and the curl-command into windows7 os

2015-02-25 Thread David Pilato
If you really want to use CURL on Windows, just Google it. I ended up removing 
it.

That said, you should use Marvel / Sense (which is free for development), other 
site plugins, or a REST plugin for your browser.

HTH 
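
(For reference, a hedged sketch of installing Marvel, which bundles the Sense 
console, on a 1.4.x node - run from the Elasticsearch home directory, then open 
http://localhost:9200/_plugin/marvel/sense/ in a browser:)

bin\plugin.bat -i elasticsearch/marvel/latest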

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> Le 25 févr. 2015 à 12:54, jan99  a écrit :
> 
> hi !
>  
> now i have installed Elasticsearch 1.4.4 for my company wiki and started the 
> batch-File and the cmd-window with Curl-Command in cmd windows in the desktop.
>  
> normally a user is unlock on the machine - but how to integrate this two 
> process in a windows system - to start elasticsearch i found  
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service-win.html.
>  - but how to integrate the curl-command?
>  
> regards Jan
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/7b7b9e21-ec9d-42f4-b707-cdc1313452d9%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/39554D16-CC96-432C-B395-F9EE9D20CBCA%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: Attribute awareness during recovery?

2015-02-25 Thread Neil Andrassy (The Filter)
That seems like a shame given the immutability of the underlying blocks.
Sure, the primary needs to identify the specific set of blocks to be
replicated but I don't see why the block data itself couldn't be pulled
from a "closer" replica if it exists there?

On 25 February 2015 at 06:56, Mark Walkom  wrote:

> Recovery always needs to come from the primary, otherwise you cannot be
> certain your dataset is valid.
>
> On 24 February 2015 at 21:18, Neil Andrassy 
> wrote:
>
>> Hi,
>>
>> Does anybody know if there is a way, when a node fails and shards need to
>> be recovered, of configuring ElasticSearch to prefer to recover from a node
>> sharing the same awareness attributes (e.g. same rack, same zone etc. a bit
>> like the "automatic preference when searching / getingedit" reference in
>> the user guide). We're typically seeing a lot of traffic between "zones"
>> when a failure occurs and I wondered why this was the case / if it was
>> avoidable. Maybe I'm missing something and recovery always needs to
>> replicate from the active primary?
>>
>> Thanks in advance for any guidance or information,
>>
>> N
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/dfa05e57-15a8-4509-a45f-62db4d94867a%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/txa_4eosq0k/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ihD_AkLgoF5FjAJbju9dnCAwAYhoZbns-gf%2BWYXWQMQ%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Neil Andrassy  |  CTO  |  The Filter
phone  |  +44 (0)1225 588 004
skype | andrassynp

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CABpTWLObLq3tJuE48p0OfdwoXB6KoAHZvKbi6x%2B98SjuPfrP7A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


how to integrate elasticsearch and the curl-command into windows7 os

2015-02-25 Thread jan99
Hi!

I have now installed Elasticsearch 1.4.4 for my company wiki, and on the 
desktop I start the batch file and a cmd window with the curl command.

Normally a user session is unlocked on the machine - but how do I integrate 
these two processes into Windows? To start Elasticsearch as a service I found 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service-win.html 
- but how do I integrate the curl command?

Regards, Jan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7b7b9e21-ec9d-42f4-b707-cdc1313452d9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch Load Test goes OutOfMemory

2015-02-25 Thread Malaka Gallage
Hi Mark,

Yes, I'm using the bulk API for the tests. Usually the OOM error happens when 
the cluster has around 30 million records. Is there any way to tune the ES 
cluster to perform better?
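
(For reference, the "queue capacity 50" rejections in the quoted logs come from 
the bulk thread pool queue; a hedged elasticsearch.yml sketch of the related 
knob - the value is purely illustrative, and raising it only hides back-pressure 
if the cluster is genuinely overloaded:)

# default bulk queue size is 50 on 1.x
threadpool.bulk.queue_size: 100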

Thanks
Malaka

On Wednesday, February 25, 2015 at 12:25:52 PM UTC+5:30, Mark Walkom wrote:
>
> If you are getting queue capacity rejections then you are over working 
> your cluster. Are you using the bulk API for your tests?
> How much data is in your cluster when you get OOM?
>
> On 25 February 2015 at 16:28, Malaka Gallage  > wrote:
>
>> Hi all,
>>
>> I need some help here. I started a load test for Elasticsearch before 
>> using that in production environment. I have three EC2 instances that are 
>> configured in following manner which creates a Elasticsearch cluster.
>>
>> All three machines has the following same hardware configurations.
>>
>> 32GB RAM
>> 160GB SSD hard disk
>> 8 core CPU
>>
>> *Machine 01*
>> Elasticsearch server (16GB heap)
>> Elasticsearch Java client (Who generates a continues load and report to 
>> ES - 4GB heap)
>>
>>
>> *Machine 02*
>> Elasticsearch server (16GB heap)
>> Elasticsearch Java client (Who generates a continues load and report to 
>> ES - 4GB heap)
>>
>>
>> *Machine 03*
>> Elasticsearch server (16GB heap)
>> Elasticsearch Java client (Who queries from ES continuously - 1GB heap)
>>
>>
>> Note that the two clients together generates around 20K records per 
>> second and report them as bulks with average size of 25. The other client 
>> queries only one query per second. My document has the following format.
>>
>> {
>> "_index": "my_index",
>> "_type": "my_type",
>> "_id": "7334236299916134105",
>> "_score": 3.607,
>> "_source": {
>>"long_1": 96186289301793,
>>"long_2": 7334236299916134000,
>>"string_1": "random_string",
>>"long_3": 96186289301793,
>>"string_2": "random_string",
>>"string_3": "random_string",
>>"string_4": "random_string",
>>"string_5": "random_string",
>>"long_4": 5457314198948537000
>>   }
>> }
>>
>> The problem is, after few minutes, Elasticsearch reports errors in the 
>> logs like this.
>>
>> [2015-02-24 08:03:58,070][ERROR][marvel.agent.exporter] [Gateway] 
>> create failure (index:[.marvel-2015.02.24] type: [cluster_stats]): 
>> RemoteTransportException[[Marvel 
>> Girl][inet[/10.167.199.140:9300]][bulk/shard]]; nested: 
>> EsRejectedExecutionException[rejected execution (queue capacity 50) on 
>> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@76dbf01];
>>
>> [2015-02-25 04:23:36,459][ERROR][marvel.agent.exporter] [Wildside] 
>> create failure (index:[.marvel-2015.02.25] type: [index_stats]): 
>> UnavailableShardsException[[.marvel-2015.02.25][0] [2] shardIt, [0] active 
>> : Timeout waiting for [1m], request: 
>> org.elasticsearch.action.bulk.BulkShardRequest@2e7693b7]
>>
>> Note that this error happens for different indices and different types.
>>
>> Again after few minutes, Elasticsearch clients get 
>> NoNodeAvailableException. I hope that is because Elasticsearch cluster 
>> malfunctioning due to above errors. But eventually the clients get 
>> "java.lang.OutOfMemoryError: GC overhead limit exceeded" error.
>>
>> I did some profiling and found out that increasing 
>> the org.elasticsearch.action.index.IndexRequest instances is the cause for 
>> this OutOfMemory error. I tried even with "index.store.type: memory" and it 
>> seems still the Elasticsearch cluster cannot build the indices to the 
>> required rate.
>>
>> Please point out any tuning parameters or any method to get rid of these 
>> issues. Or please explain a different way to report and query this amount 
>> of load.
>>
>>
>> Thanks
>> Malaka
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/35a29ca5-02f6-4fe9-8600-2cdb91c519cf%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a78a83f3-090d-4265-8d8c-1fe0d669e70b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


how to use the term query in url search

2015-02-25 Thread Vahith
Hi,

I am trying to use the term query in the URL without using the API. How can I 
do that? Kindly suggest.
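
(For illustration, a hedged example of the URI search syntax; the index and 
field names are placeholders, and note that the q parameter goes through the 
query_string parser rather than a strict term query, so exact-match behaviour 
depends on the field being not_analyzed:)

curl 'localhost:9200/my_index/_search?q=status:active&pretty'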

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/06be6823-a482-4036-8d50-57f6b24c89b7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: EC2 cluster storage question

2015-02-25 Thread Mark Walkom
Fair point. The rsync option could work, but then why not just use EBS and
then shut the nodes down to save the rsync work?
Tagging nodes probably won't help in this instance.

Basically if you want to shut everything down you need to go through
recovery, and depending on how long that takes it may not be worth the
cost. This is something you need to test.
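
(For reference, a hedged sketch of the "disable allocation, then power off" step 
from earlier in the thread, using the 1.x cluster settings API; set it back to 
"all" once the nodes are up again:)

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'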

On 25 February 2015 at 18:14, Norberto Meijome  wrote:

> OP points out he is using ephemeral storage...hence shutdown will destroy
> the data...but it can be rsynced to EBS as part of the shutdown
> process...and then repeat in reverse when starting things up again...
>
> Though I guess you could let ES take care of it by tagging nodes
> accordingly and updating the index settings .(hope it makes sense...)
> On 25/02/2015 4:58 pm, "Mark Walkom"  wrote:
>
>> Why not just shut the cluster down, disable allocation first and then
>> just gracefully power things off?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_D15Aq62TzhbTN8kWKDPGpsuoYP2e2RJta9N5_tu4_ZA%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CACj2-4Jafq4Fqf2GOsdK5OCcmdk3AtW3B2%3DjJYHTgCjyUzOQWg%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X-oqv%3DFHF3%3DoULiWy_rJBf4PSi3AjgbDE_BtBwLP9Xt_w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES java.lang.OutOfMemoryError during Indexing?

2015-02-25 Thread Mark Walkom
14 thousand shards is a lot! It's definitely going to be contributing to
your problem.

How much data is that in your cluster, how many indices?
What version of ES, what java version and release?
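
(A hedged way to pull those numbers straight from the cluster, using the cat 
APIs:)

curl 'localhost:9200/_cat/indices?v'
curl 'localhost:9200/_cat/shards?v' | wc -l
curl 'localhost:9200/_cat/nodes?v&h=name,version,jdk,heap.percent'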

On 25 February 2015 at 20:34,  wrote:

> Hi all,
> My cluster consists of 3 master nodes and 6 data nodes. Each node has 4
> vCPU, 16gb RAM (8gb for ES_HEAP). Each data node has 1TB storage (4 x
> 250gb, each connected to a different storage device). The main load for the
> cluster is about 3K to 4K events per second (about 150 million events per
> day, each event is about 200 to 300 bytes)  from some network devices. I
> use redis for caching and logstash for formatting. The  caching ,formatting
> and indexing to ES are done on other machines (8 vCPU 32gb RAM). The
> cluster works fine, well not all the time. Sometime, I'll get
> OutOfMemoryError in one of the data nodes, and then it ruins everything.
> Well, I know I can disable heap dump on OutOfMemoryError, but still, I'd
> like to know why these errors occur  in the first place. Because I don't
> think the "speed" is the key point. The ES cluster has been working
> perfectly for 2 month, with the load about 150 million per day. But last
> time OutOfMemoryError happened, the load was only 80M per day (estimated).
> And even if there was a burst of events, there are 2 redis servers with
> 32gb RAM for caching.
> My cluster generates 4 index per day, each index has 12 shards. The
> cluster started collecting log event about a year ago, the main load
> started around 3 month ago and there are around 14K shards in the cluster.
> Each index has serveral types, but the main load goes into one specific
> index/type. Mappings for each index/type are pre-defined. Some are pretty
> complicated, but the mapping for main load is rather simple. I did notice
> that on a regular day, the heap usage on data nodes are quite high (around
> 60~90). So my questions/concers are:
> 1  If I disable heap dump, will this data node act as normal as it could
> be?
> Currently, during the heap dump, the data node will not be available
> and will disconnect from the cluster. This will cause shard relocation and
> trigger a series of OutOfMemoryError on other data nodes.
> 2 What causes these OutOf MemoryError?
> Too many shards? Too many segments? Shard/Segment too big? I can
> reconfigure my logstash so that indices are created every month/week rather
> than every day. But if I want to store the logs by day with a reasonable
> shard per index setting and overall index count.  What kind of hardware
> (heap size) do I need?
> 3 Is there any setting to be tuned?
>
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3f85f8cb-acc9-49c8-a153-de17576ba98b%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X-RgMSH2wwHaYaNiBH-N6axkkuiZ5V4jFpk0OJhAs4Skw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Combining Filtered Query with Query String and Exact Match

2015-02-25 Thread Jim Cumming
Hi all, I'm having some trouble bending the query API to my requirements on
this one and was wondering if someone could point me in the right
direction.

We want to search some website content using the query string query. This
needs to be filtered by various terms. However, we need to boost a result
if there is an exact match on the tags field. The tags field is an array of
values, so this would be an exact match on one of the items in the array.

I started with this query 

{
  "from": 0,
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "fields": [
            "title^2",
            "body",
            "file",
            "summary",
            "tags^10"
          ],
          "query": "Research Institute",
          "use_dis_max": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "type": [
                  "file",
                  "page",
                  "link",
                  "folder",
                  "calendar"
                ]
              }
            }
          ]
        }
      }
    }
  }
}

But this also boosts items tagged with just 'Research' as well as 'Research
Institute', and ends up promoting some fairly irrelevant results.

I thought of using a bool query with a filtered query element and a match
query with a boost value, but ES won't execute that. Does anyone have any
suggestions?
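
One direction that may work is nesting it the other way round: keep the
filtered query, but wrap the query_string in a bool query and add a should
clause with a boosted term query against a not_analyzed sub-field of tags.
This is only a sketch, and it assumes you add such a sub-field (called
tags.raw below) to your mapping via a multi-field:

{
  "from": 0,
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": {
            "query_string": {
              "fields": ["title^2", "body", "file", "summary", "tags^10"],
              "query": "Research Institute",
              "use_dis_max": true
            }
          },
          "should": {
            "term": {
              "tags.raw": {
                "value": "Research Institute",
                "boost": 10
              }
            }
          }
        }
      },
      "filter": {
        "terms": {
          "type": ["file", "page", "link", "folder", "calendar"]
        }
      }
    }
  }
}

Because the must clause already has to match, the should clause only
contributes to scoring, so documents without the exact tag are unaffected
apart from their ranking.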

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ae72db3d-b2c2-4651-ab5f-f9930c8796ef%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Truncate response

2015-02-25 Thread James
Hi,

I am just wondering if there is an efficient and straightforward way to
tell elasticsearch to truncate the response it sends. For example, one of
my fields is a "description". Is it possible to have it send only the first
200 characters of the description in the search response (while still
keeping the full description in the index to search against)?

Any help would be appreciated, as searches on the subject turned up nothing.
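
As far as I know there is no built-in truncate option, so it comes down to
either indexing a separate short summary field at index time, or cutting
the value down at query time with script_fields. A rough sketch of the
latter (assumes ES 1.4 with dynamic Groovy scripting enabled; my_index and
description are placeholders):

POST /my_index/_search
{
  "_source": false,
  "query": { "match": { "description": "search terms here" } },
  "script_fields": {
    "short_description": {
      "script": "_source.description == null ? null : _source.description.take(200)"
    }
  }
}

Indexing a dedicated summary field is usually cheaper at query time, since
the script has to load the full _source for every hit anyway.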

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/24d86390-5810-4e9e-b68a-efbe6c8b3751%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: very slow performance with version upgradation in Elasticsearch

2015-02-25 Thread Mark Walkom
If you can gist/pastebin/etc your config, query, and document samples, and
provide some details around your cluster size and architecture, as well as
where you are seeing the slowness, it will make the troubleshooting process
a lot easier.
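
In the meantime, the hot threads and stats APIs usually give a first hint
of where the time is going (a sketch, assuming a node on localhost:9200):

# what the busiest threads are doing right now
curl 'localhost:9200/_nodes/hot_threads'
# heap, GC, cache and threadpool stats per node
curl 'localhost:9200/_nodes/stats?pretty'
# queued and rejected operations per threadpool
curl 'localhost:9200/_cat/thread_pool?v'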

On 25 February 2015 at 18:15, Narinder Kaur  wrote:

> Can you please ask me some specific question on settings,data and queries.
> So that I can explain.
>
> On Wednesday, 25 February 2015 11:22:58 UTC+5:30, Mark Walkom wrote:
>>
>> That depends on what your settings, your data and your queries are like.
>>
>> On 24 February 2015 at 22:00, Narinder Kaur  wrote:
>>
>>> Hi All,
>>>
>>>  we have recently upgraded our production server from version 1.1.0 to
>>> 1.4.3. We actually imported the data from old cluster to new cluster with
>>> minor data and mapping removals. But as compared to previous version, the
>>> performance has been decreased. Many requests returning the time out in
>>> response. The application has broken, as we are using it as a source of
>>> data reads.
>>>
>>> Our cluster is having 3 nodes of configurations of 8 core, 60GB RAM. Is
>>> there any default settings changes that can cause such side effects. I am
>>> facing issues like,
>>> search requests are very slow
>>> data retrieval after saving is not instant.
>>>
>>> Please let me know the area to work on. All help or guidance will be
>>> appreciated.
>>>
>>> Thanks
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/f5733673-3aa1-4e5d-9831-a4c29eda65c0%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e53e0f5f-c3d7-47e5-9493-a188406b1c99%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X8xA1qeh%2ByiZsbFN5dQjfDoMwUew0wqudvi03bETVkXqw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


ES java.lang.OutOfMemoryError during Indexing?

2015-02-25 Thread zjxian . ptv
Hi all,
My cluster consists of 3 master nodes and 6 data nodes. Each node has 4
vCPU and 16GB RAM (8GB for ES_HEAP). Each data node has 1TB of storage (4 x
250GB, each connected to a different storage device). The main load on the
cluster is about 3K to 4K events per second (about 150 million events per
day, each event 200 to 300 bytes) from some network devices. I use redis
for caching and logstash for formatting. The caching, formatting and
indexing into ES are done on other machines (8 vCPU, 32GB RAM). The cluster
works fine, though not all the time. Sometimes I'll get an OutOfMemoryError
on one of the data nodes, and then it ruins everything. Well, I know I can
disable the heap dump on OutOfMemoryError, but still, I'd like to know why
these errors occur in the first place, because I don't think the "speed" is
the key point. The ES cluster had been working perfectly for 2 months with
a load of about 150 million events per day, but the last time an
OutOfMemoryError happened the load was only 80M per day (estimated). And
even if there were a burst of events, there are 2 redis servers with 32GB
RAM for caching.
My cluster generates 4 indices per day, each with 12 shards. The cluster
started collecting log events about a year ago, the main load started
around 3 months ago, and there are around 14K shards in the cluster. Each
index has several types, but the main load goes into one specific
index/type. Mappings for each index/type are pre-defined. Some are pretty
complicated, but the mapping for the main load is rather simple. I did
notice that on a regular day the heap usage on the data nodes is quite high
(around 60~90%). So my questions/concerns are:
1  If I disable the heap dump, will the data node behave as normally as it
could?
Currently, during the heap dump, the data node becomes unavailable and
disconnects from the cluster. This causes shard relocation and triggers a
series of OutOfMemoryErrors on other data nodes.
2  What causes these OutOfMemoryErrors?
Too many shards? Too many segments? Shards/segments too big? I can
reconfigure my logstash so that indices are created every month/week rather
than every day. But if I want to store the logs by day with a reasonable
shards-per-index setting and overall index count, what kind of hardware
(heap size) do I need?
3  Is there any setting to be tuned?
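
One knob that is cheap to turn for points 2 and 3 is the number of shards
per new index; with 4 daily indices at 12 shards each (plus replicas), a
year adds up quickly. An index template can cap future daily indices, for
example (the template name and the logstash-* pattern below are
placeholders for your own naming scheme):

curl -XPUT 'localhost:9200/_template/daily_logs' -d '{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'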



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3f85f8cb-acc9-49c8-a153-de17576ba98b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Leaving out information from the response

2015-02-25 Thread James
Thank you

On 25 February 2015 at 08:17, Thomas Matthijs  wrote:

>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
>
> On Wed, Feb 25, 2015 at 9:13 AM, James  wrote:
>
>> Hi,
>>
>> I want to have certain data in my elasticsearch index but I don't want it
>> to be returned with a query. At the moment it seems to return every bit of
>> data I have for each index and then I use my PHP application to hide it. Is
>> it possible to select what fields elasticsearch returns in it's response to
>> my PHP application.
>>
>> For example for each time:
>>
>> Name
>> Location
>> Description
>> Keywords
>> Unique ID
>> Create date
>>
>> I just want to have in the response from elasticsearch:
>>
>> Name
>> Location
>> Description
>>
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/089c53f5-0aa4-48b5-acb4-df4d6ccfee13%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/EurLxszzenc/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CABY_-Z4x%3DgOq1EbtvsLLRgQn1Ad8Zd5QzpubZaL5KA98p7J2Xw%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPng%3D3eNd3Gw8OdJpWPBF4tvSwWAKcuYasqCVmi7vdgYvLf0ow%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Leaving out information from the response

2015-02-25 Thread Thomas Matthijs
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
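
Concretely, with the field names from your example, that would be
something like (my_index is a placeholder):

POST /my_index/_search
{
  "fields": ["Name", "Location", "Description"],
  "query": { "match_all": {} }
}

Source filtering ("_source": ["Name", "Location", "Description"]) works as
well, if you prefer the values to come back grouped under _source.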

On Wed, Feb 25, 2015 at 9:13 AM, James  wrote:

> Hi,
>
> I want to have certain data in my elasticsearch index but I don't want it
> to be returned with a query. At the moment it seems to return every bit of
> data I have for each index and then I use my PHP application to hide it. Is
> it possible to select what fields elasticsearch returns in it's response to
> my PHP application.
>
> For example for each time:
>
> Name
> Location
> Description
> Keywords
> Unique ID
> Create date
>
> I just want to have in the response from elasticsearch:
>
> Name
> Location
> Description
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/089c53f5-0aa4-48b5-acb4-df4d6ccfee13%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CABY_-Z4x%3DgOq1EbtvsLLRgQn1Ad8Zd5QzpubZaL5KA98p7J2Xw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Leaving out information from the response

2015-02-25 Thread James
Hi,

I want to have certain data in my elasticsearch index, but I don't want it
to be returned with a query. At the moment it seems to return every bit of
data I have for each index, and then I use my PHP application to hide it.
Is it possible to select which fields elasticsearch returns in its response
to my PHP application?

For example, for each item:

Name
Location
Description
Keywords
Unique ID
Create date

I just want to have in the response from elasticsearch:

Name
Location
Description


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/089c53f5-0aa4-48b5-acb4-df4d6ccfee13%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


concerns on possible load of aggregation

2015-02-25 Thread Seungjin Lee
We are running a PaaS built with elasticsearch and we want to provide a
multi-column count aggregation feature through ES aggregations.

Let's take the query below as an example:

POST /INDEX_PATTREN-*/_search
{
  "query": { "match": { "project": "dummyProject" } },
  "size": 0,
  "aggs": {
    "col1": {
      "terms": { "field": "host", "size": 5 },
      "aggs": {
        "col2": {
          "terms": { "field": "source", "size": 5 },
          "aggs": {
            "col3": {
              "terms": { "field": "version", "size": 5 }
            }
          }
        }
      }
    }
  }
}


We use daily indices and keep 30 days of data; each daily index is
approximately 500GB.

So the example aggregation has to investigate a huge amount of data.

But we found out that it's blazingly fast: we use 20 data nodes together
with several search/master nodes, and it responds within 10 minutes.

OK, but what if there are many requests at the same time, what can happen?

Will those requests just slow other requests down (in which case, would
increasing the number of machines be a solution?), or could they cause OOM
or some other critical error on the ES daemon?
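
Concurrent heavy aggregations mostly contend for the search threadpool, so
they tend to slow each other down and, past the queue size, get rejected
rather than crash the node; the more likely OOM risk is usually fielddata
loaded for the terms fields (unless they use doc_values). The 1.4 circuit
breakers are the main guard against that, and they are dynamically
settable, for example (the limits below are only illustrative, not
recommendations):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "indices.breaker.fielddata.limit": "60%",
    "indices.breaker.request.limit": "40%"
  }
}'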

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAL3_U43m1UuZbAHPwSNzQHC-xpBxGsr%2B%3DGNt-GUeMCueoyTP0w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.