Re: elasticsearch java.lang.ArrayIndexOutOfBoundsException: 1

2014-12-29 Thread David Pilato
I don't know this plugin, but are you sure you can provide a shell script?
It sounds like Groovy is trying to execute it...

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 30 Dec 2014, at 04:57, Vinay H M wrote:
> 
> 
> 
>> On Tuesday, December 30, 2014 9:23:58 AM UTC+5:30, Vinay H M wrote:
>> Hi All
>> 
>> I hit the following error while running Elasticsearch. Please, can someone help me solve it?
>> 
>> 
>> [2014-12-30 
>> 09:16:22,389][ERROR][org.agileworks.elasticsearch.river.csv.CSVRiver] 
>> [Aliyah Bishop] [csv][my_csv_river] Error has occured during processing file 
>> 'PDUserDeviceDataTable.csv.processing' , skipping line: 
>> '[249573";"875";"testaasim";"00:12:F3:1B:A5:68";"2";"1344";"0";"29.7";"58.3";"1419835852";"20.0";"30.0";"40.0";"50.0";"500";"500";"12.9226205";"77.5605173]'
>>  and continue in processing
>> java.lang.ArrayIndexOutOfBoundsException: 1
>>  at 
>> org.codehaus.groovy.runtime.BytecodeInterface8.objectArrayGet(BytecodeInterface8.java:360)
>>  at 
>> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.processDataLine(OpenCSVFileProcessor.groovy:72)
>>  at 
>> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.this$2$processDataLine(OpenCSVFileProcessor.groovy)
>>  at 
>> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor$this$2$processDataLine.callCurrent(Unknown
>>  Source)
>>  at 
>> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.process(OpenCSVFileProcessor.groovy:49)
>>  at 
>> org.agileworks.elasticsearch.river.csv.CSVConnector.processAllFiles(CSVConnector.groovy:47)
>>  at 
>> org.agileworks.elasticsearch.river.csv.CSVConnector.run(CSVConnector.groovy:20)
>>  at java.lang.Thread.run(Thread.java:745)
> 
> 
> The command I am using to create the index:
> 
> curl -XPUT localhost:9200/_river/my_csv_river/_meta -d '
> {
> "type" : "csv",
> "csv_file" : {
> "folder" : "/home/paqs/Downloads/kibana/dec",
> "filename_pattern" : ".*\\.csv$",
> "poll":"1m",
> "fields" : [
>"Sno",
>"userld",
>"userName",
>"deviceld",
>"deviceCurrentMode",
>"co2Level",
>"dustLevel",
>"temperature",
>"relativeHumidity",
>"timeStamp",
>"tempLow",
>"tempHigh",
>"rhLow",
>"rhHigh",
>"dust",
>"pollution",
>"latitude",
>"longitude"
> ],
> "first_line_is_header" : "false",
> "field_separator" : ",",
> "escape_character" : "\\",
> "quote_character" : "\"",
> "field_id" : "id",
> "field_timestamp" : "imported_at",
> "concurrent_requests" : "1",
> "charset" : "UTF-8",
> "script_before_file": 
> "/home/paqs/Downloads/kibana/dec/before_file.sh",
> "script_after_file": "/home/paqs/Downloads/kibana/dec/after_file.sh",
> "script_before_all": "/home/paqs/Downloads/kibana/dec/before_all.sh",
> "script_after_all": "/home/paqs/Downloads/kibana/dec/after_all.sh"
> },
> "index" : {
> "index" : "decdevicedata",
> "type" : "alert",
> "bulk_size" : 1000,
> "bulk_threshold" : 10
> }
> }'
> 
> 
> The curl command I am using to create the mapping:
> 
> Create a mapping
> #
> curl -XPUT http://localhost:9200/decdevicedata -d '
> {
> "settings" : {
> "number_of_shards" : 1
> },
> "mappings" : {
> "alert" : {
> "properties" : {
> "Sno": {"type" : "integer"},
> "co2Level" : {"type" : "integer"},
> "deviceCurrentMode" : {"type" : "integer"},
> "deviceld"  : {"type" : "string"},
> "dust"  : {"type" : "integer"},
> "dustLevel" : {"type" : "integer"},
> "latitude": {"type" : "integer"},
> "longitude": {"type" : "integer"},
> "pollution" : {"type" : "integer"},
> "relativeHumidity" : {"type" : "float"},
> "rhLow": {"type" : "float"},
> "rhHigh": {"type" : "float"},
> "temperature": {"type" : "float"},
> "tempLow": {"type" : "float"},
> "tempHigh": {"type" : "float"},
> "timeStamp" : {"type" : "date", "ignore_malformed" : true, 
> "format" : "dateOptionalTime"},
> "userld" : {"type" : "integer"},
> "userName" : {"type" : "string", "index" : "not_analyzed"}
> 
> }
> }
> }
> }'
> 
> 
> 
>  
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/2bb5ffb1-3213-4055-8f58-467721e1ed5d%40googlegroups.

Re: GIS Query not working on ES as expected

2014-12-29 Thread Andy Bo
Thanks Peter. The issue at this point is resolved. I will upload about half a
million polygons and test further. Thanks for your help.
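For reference, the mismatch reported in this thread can be sanity-checked offline by comparing the bounding box of the request envelope with that of the returned polygon (coordinates taken from the quoted messages below). A minimal sketch in Python:

```python
# Envelope from the request: [[-118.58, 35.32], [-118.68, 35.30]].
env_lons = (-118.68, -118.58)
env_lats = (35.30, 35.32)

# Vertices of the first returned polygon (from the quoted response).
poly = [(-117.7656797176, 35.2420721325), (-117.766565557, 35.2429646794),
        (-117.7675000712, 35.2429532681), (-117.7661768866, 35.2415486409),
        (-117.7661640858, 35.2415341862), (-117.765523769, 35.2419167046)]
poly_lons = [lon for lon, _ in poly]
poly_lats = [lat for _, lat in poly]

def ranges_overlap(a_min, a_max, b_min, b_max):
    # Two closed intervals overlap iff each starts before the other ends.
    return a_min <= b_max and b_min <= a_max

intersects = (ranges_overlap(min(env_lons), max(env_lons),
                             min(poly_lons), max(poly_lons)) and
              ranges_overlap(min(env_lats), max(env_lats),
                             min(poly_lats), max(poly_lats)))
print(intersects)
```

The envelope ends almost a full degree of longitude west of the polygon, so an "intersects" relation should indeed not have matched this hit.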

On Sun, Dec 28, 2014 at 10:29 AM, Peter Johnson 
wrote:

> I think I figured out what's going on, see the github issue.
>
> -P
>
>
> On Sunday, 28 December 2014 05:21:07 UTC, AndyGIS wrote:
>>
>> Filed https://github.com/elasticsearch/elasticsearch/issues/9079
>> FYI!
>>
>>
>> On Saturday, December 27, 2014 9:07:50 PM UTC-8, AndyGIS wrote:
>>>
>>> What I have is a simple test case for the feature (this is not a corner
>>> case). Is this feature expected to work in some other coordinate system?
>>> Is there any workaround to overcome this bug?  Thanks!
>>> Andy
>>>
>>> On Saturday, December 27, 2014 6:40:44 AM UTC-8, Peter Johnson wrote:

 Hey Andy,

 I tried to fix this but you're right, it seems to be a bug. I've
 attached a full bug report which you might want to add to a new github
 issue.

 https://gist.github.com/anonymous/c250d602d1e7fe6d3655
 https://gist.github.com/missinglink/6e96f06e9e6032aa6416

 I also tried using 'geohash' instead of 'quadtree' with the same result.

 -P


 On Friday, 26 December 2014 22:51:56 UTC, AndyGIS wrote:
>
> Hi,
> I am new to GIS w/ ES.
>
> Below is my request/response. I am puzzled as to why I get a polygon
> returned when it does not intersect with the envelope defined in the
> request. What am I missing? I would appreciate a quick response. Thanks,
> Andy
>
> *Request*:
>
> *_search*
> {"size": 51000,"query": { "geo_shape": { "POLYS": {"shape": {"type":
> "envelope", "coordinates": [*[-118.58, 35.32],[-118.68, 35.30]*]},
> "relation": "intersects"
>
> *Response*:
>
>   "took" : 1172,
>   "timed_out" : false,
>   "_shards" : {
> "total" : 5,
> "successful" : 5,
> "failed" : 0
>   },
>   "hits" : {
> "total" : 94420,
> "max_score" : 1.0,
> "hits" : [ {
>   "_index" : "XXX",
>   "_type" : "XXX_data",
>   "_id" : "1234",
>   "_score" : 1.0,
>   
> "_source":{"POLYS":{"type":"Polygon","coordinates":*[[[-117.7656797176,35.2420721325],[-117.766565557,35.2429646794],[-117.7675000712,35.2429532681],[-117.7661768866,35.2415486409],[-117.7661640858,35.2415341862],[-117.765523769,35.2419167046],[-117.7656797176,35.2420721325]*]]}, "STATE": "XX", "ID": 1234
> ,"COUNTY_NAME": "YY"}]}
> }, {
>   "_index" : "XXX",
>   "_type" : "XXX_data",
>   "_id" : "1235",
>   "_score" : 1.0,
> 
> 
>
>
>
> _mapping?pretty'
>
> {
>   "XXX" : {
> "mappings" : {
>   "XXX_data" : {
> "properties" : {
>   "COUNTY_NAME" : {
> "type" : "string"
>   },
>   "ID" : {
> "type" : "long"
>   },
>   "POLYS" : {
> "type" : "geo_shape",
> "tree" : "quadtree",
> "tree_levels" : 26
>   },
>   "STATE" : {
> "type" : "string"
>   }
> }
>   }
> }
>   }
> }
>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/AqVrhQ7UiG8/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/0b3a1d4f-7ad5-4163-947d-992a33b71ec7%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CADuvwZUUDq7CFgSONUaJOgvr6DNsr7zs8%3Dq0GJFC%2BxznBhr%2BTw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Preventing stop-of-the-world garbage collection

2014-12-29 Thread Michal Taborsky
Hi Christopher, thanks.

Field and filter caches are not the problem, I think; they occupy only a
minority of the memory. The garbage collection in fact frees up a lot of
memory, so I think the problem is that the standard GC, which is supposed to
run continuously, cannot keep up. I will give G1 a try, though I have seen
in several places that it's not recommended because it's not considered
stable enough.

Michal
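For anyone triaging similar behaviour: the GC lines quoted below can be decoded mechanically to quantify the pressure. In this case the heap was about 96% full when collection started and roughly 14.6 GB was reclaimed, which points at allocation churn rather than cache retention. A small sketch (log format taken from the original post):

```python
import re

# One of the [gc][young] lines from the original post, abridged to the
# memory clause that matters here.
gc_line = ("[cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], "
           "collections [3]/[2m], total [1.6m]/[17.6h], memory "
           "[21.1gb]->[6.5gb]/[22gb]")

m = re.search(r"memory \[([\d.]+)gb\]->\[([\d.]+)gb\]/\[([\d.]+)gb\]", gc_line)
before, after, total = (float(g) for g in m.groups())

occupancy_before = before / total  # how full the heap was when GC started
reclaimed = before - after         # how much of the heap was garbage
print(occupancy_before, reclaimed)
```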

On Tuesday, 30 December 2014 at 01:55:57 UTC+1, Chris Rimondi wrote:
>
> +1 for using G1GC. In addition I would suggest not trying to fine tune GC 
> settings. If you have stop the world old GCs taking 20+ seconds you have a 
> more fundamental issue at play. I speak from experience on that. We had 
> similar issues and no amount of JVM/GC tuning could mask the fact we simply 
> didn't have enough memory. 
>
> If you aren't already doing so look at the amount of heap used by the 
> filter and field caches. Are you capping them? If you aren't expensive 
> queries could saturate your entire heap. Along the same line keep tabs on 
> your evictions. ES provides granular metrics so you can look at both filter 
> and field cache evictions. 
>
> On Mon, Dec 29, 2014 at 8:03 AM, joerg...@gmail.com  <
> joerg...@gmail.com > wrote:
>
>> You said, very complex documents and queries, and 22 GB heap. Without 
>> knowing more about your queries and filters, it is hard to comment.  There 
>> is default query/filter caching in some cases.
>>
>> Jörg
>>
>> On Mon, Dec 29, 2014 at 1:55 PM, Michal Taborsky > > wrote:
>>
>>> Hi Jörg, thanks for your reply.
>>>
>>> What do you mean if we have setup caching? We do not have any special 
>>> caching configuration, we use the defaults. How do you suggest we 
>>> reconfigure ES? That is what I am trying to find out.
>>>
>>> All best,
>>> Michal
>>>
>>>
>>> On Monday, 29 December 2014 at 12:06:43 UTC+1, Jörg Prante wrote:

 You could use G1 GC for nicer behavior regarding application stop 
 times, but before tinkering with GC, it would be better to check if you 
 have set up caching, and if it is possible to clear caches or reconfigure 
 ES.

 Jörg


 On Mon, Dec 29, 2014 at 10:36 AM, Michal Taborsky >>> > wrote:

> Hello everyone,
>
> we are using ES as a backend of an online service and occasionally, we 
> are hit by a big garbage collection, which stops the node completely and 
> causes all sorts of problems. The nodes have plenty of memory I think. 
> During the GC it looks like this. 
>
> [cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], 
> collections [3]/[2m], total [1.6m]/[17.6h], memory 
> [21.1gb]->[6.5gb]/[22gb], all_pools {[young] 
> [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
> [cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections 
> [1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], 
> all_pools 
> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>
> This might happen once a day, usually during a period of heavy 
> indexing, sometimes it doesn't. We tried decreasing the heap size, but it 
> does not have that much of an effect. It makes the GC take a bit less 
> time, 
> but makes it happen a bit more often. 
>
> The data is actually fairly small in size, about 30G in total, but 
> very complex documents and queries. This is a 5-node cluster, the nodes 
> have 32G RAM with 22G assigned to ES heap.
>
> I know the manual says we should not touch the JVM GC settings but I 
> feel we might have to. Does anyone have any idea how to prevent these 
> garbage collections from ever happening?
>
> Thanks,
> Michal
>
> -- 
> You received this message because you are subscribed to the Google 
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to elasticsearc...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/29125088-8c43-4d97-b77b-71819fa11d09%
> 40googlegroups.com 
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com .
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/c84f17a2-0351-4473-aef3-5e4f08fc3c90%40googlegroups.com
>>>  
>>> 

how to enter multi valued geo_point attribute in elastic search

2014-12-29 Thread Abhimanyu Nagrath
Hi, 
I want to know how to enter a multi-valued geo_point attribute in 
Elasticsearch. At present I am using 
[{"lat":-40.2345,"lon":-30.2345},{"lat":-25.5678,"lon":-23.6789}]. I am able 
to enter the data, but I am not able to see it in the Elasticsearch UI, and I 
am also not able to query it. So please tell me whether I am doing anything 
wrong, or whether there is another way.
Thank You
Abhimanyu
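For reference, the array form above is valid for a multi-valued geo_point, provided the field is mapped as geo_point before any documents are indexed (dynamic mapping may otherwise guess an object type). A sketch of the document and mapping shape, using a hypothetical field name `locations`:

```python
import json

# Hypothetical field name "locations". The field must be mapped as
# geo_point *before* indexing; an array of lat/lon objects then indexes
# as multiple points on one document, and geo queries match if any of
# the points qualifies.
mapping = {"properties": {"locations": {"type": "geo_point"}}}

doc = {"locations": [
    {"lat": -40.2345, "lon": -30.2345},
    {"lat": -25.5678, "lon": -23.6789},
]}

body = json.dumps(doc)  # the payload you would PUT/POST to the index
```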

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cc41fe0f-ec5c-4b1e-a608-b23cfb73ee60%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Visualize stats of MS SQL tables using ElasticSearch

2014-12-29 Thread Ashutosh Parab
What I am doing is loading my MS SQL database into Elasticsearch. I want to 
perform different types of aggregations/statistical correlations on the rows 
of those tables. So I wanted to know whether there is any tool to visualize 
such data. 
Is there a tutorial that demonstrates how this can be done?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/53a2bbe3-8a69-4432-883e-60baaf8a35f9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Visualize statistics of MS Sql tables using ElasticSearch

2014-12-29 Thread Ashutosh Parab
What I am doing is that loading the my 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9554b6f0-78d4-4e0f-b690-d353f642be58%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch java.lang.ArrayIndexOutOfBoundsException: 1

2014-12-29 Thread Vinay H M


On Tuesday, December 30, 2014 9:23:58 AM UTC+5:30, Vinay H M wrote:
>
> Hi All
>
> I hit the following error while running Elasticsearch. Please, can someone help me solve it?
>
>
> [2014-12-30 
> 09:16:22,389][ERROR][org.agileworks.elasticsearch.river.csv.CSVRiver] 
> [Aliyah Bishop] [csv][my_csv_river] Error has occured during processing 
> file 'PDUserDeviceDataTable.csv.processing' , skipping line: 
> '[249573";"875";"testaasim";"00:12:F3:1B:A5:68";"2";"1344";"0";"29.7";"58.3";"1419835852";"20.0";"30.0";"40.0";"50.0";"500";"500";"12.9226205";"77.5605173]'
>  
> and continue in processing
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.codehaus.groovy.runtime.BytecodeInterface8.objectArrayGet(BytecodeInterface8.java:360)
> at 
> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.processDataLine(OpenCSVFileProcessor.groovy:72)
> at 
> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.this$2$processDataLine(OpenCSVFileProcessor.groovy)
> at 
> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor$this$2$processDataLine.callCurrent(Unknown
>  
> Source)
> at 
> org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.process(OpenCSVFileProcessor.groovy:49)
> at 
> org.agileworks.elasticsearch.river.csv.CSVConnector.processAllFiles(CSVConnector.groovy:47)
> at 
> org.agileworks.elasticsearch.river.csv.CSVConnector.run(CSVConnector.groovy:20)
> at java.lang.Thread.run(Thread.java:745)
>


The command I am using to create the index:

curl -XPUT localhost:9200/_river/my_csv_river/_meta -d '
{
"type" : "csv",
"csv_file" : {
"folder" : "/home/paqs/Downloads/kibana/dec",
"filename_pattern" : ".*\\.csv$",
"poll":"1m",
"fields" : [
   "Sno",
   "userld",
   "userName",
   "deviceld",
   "deviceCurrentMode",
   "co2Level",
   "dustLevel",
   "temperature",
   "relativeHumidity",
   "timeStamp",
   "tempLow",
   "tempHigh",
   "rhLow",
   "rhHigh",
   "dust",
   "pollution",
   "latitude",
   "longitude"
],
"first_line_is_header" : "false",
"field_separator" : ",",
"escape_character" : "\\",
"quote_character" : "\"",
"field_id" : "id",
"field_timestamp" : "imported_at",
"concurrent_requests" : "1",
"charset" : "UTF-8",
"script_before_file": 
"/home/paqs/Downloads/kibana/dec/before_file.sh",
"script_after_file": 
"/home/paqs/Downloads/kibana/dec/after_file.sh",
"script_before_all": 
"/home/paqs/Downloads/kibana/dec/before_all.sh",
"script_after_all": "/home/paqs/Downloads/kibana/dec/after_all.sh"
},
"index" : {
"index" : "decdevicedata",
"type" : "alert",
"bulk_size" : 1000,
"bulk_threshold" : 10
}
}'
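One thing worth checking, judging from the skipped line quoted above: that line is delimited with ';', while this configuration sets "field_separator" : ",". With a comma delimiter the whole line parses as a single field, so any access to field index 1 fails with exactly the ArrayIndexOutOfBoundsException: 1 in the stack trace. A stand-alone sketch (Python; the line is reconstructed with assumed outer quotes, since the logger printed it inside [...]):

```python
import csv
import io

# The skipped line from the log, reconstructed as it would appear in the
# CSV file (outer quotes assumed).
line = ('"249573";"875";"testaasim";"00:12:F3:1B:A5:68";"2";"1344";"0";'
        '"29.7";"58.3";"1419835852";"20.0";"30.0";"40.0";"50.0";"500";'
        '"500";"12.9226205";"77.5605173"')

# With the configured separator ',' the whole line is one field, so the
# river's access to field index 1 is out of bounds.
as_comma = next(csv.reader(io.StringIO(line), delimiter=','))

# With ';' it splits into the 18 fields the "fields" list expects.
as_semi = next(csv.reader(io.StringIO(line), delimiter=';'))

print(len(as_comma), len(as_semi))
```

If that matches the actual file, setting "field_separator" : ";" in the river configuration should let the line parse into the expected 18 fields.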


The curl command I am using to create the mapping:

Create a mapping
#
curl -XPUT http://localhost:9200/decdevicedata -d '
{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"alert" : {
"properties" : {
"Sno": {"type" : "integer"},
"co2Level" : {"type" : "integer"},
"deviceCurrentMode" : {"type" : "integer"},
"deviceld"  : {"type" : "string"},
"dust"  : {"type" : "integer"},
"dustLevel" : {"type" : "integer"},
"latitude": {"type" : "integer"},
"longitude": {"type" : "integer"},
"pollution" : {"type" : "integer"},
"relativeHumidity" : {"type" : "float"},
"rhLow": {"type" : "float"},
"rhHigh": {"type" : "float"},
"temperature": {"type" : "float"},
"tempLow": {"type" : "float"},
"tempHigh": {"type" : "float"},
"timeStamp" : {"type" : "date", "ignore_malformed" : true, 
"format" : "dateOptionalTime"},
"userld" : {"type" : "integer"},
"userName" : {"type" : "string", "index" : "not_analyzed"}

}
}
}
}'



 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2bb5ffb1-3213-4055-8f58-467721e1ed5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch java.lang.ArrayIndexOutOfBoundsException: 1

2014-12-29 Thread Vinay H M
Hi All

I hit the following error while running Elasticsearch. Please, can someone help me solve it?


[2014-12-30 
09:16:22,389][ERROR][org.agileworks.elasticsearch.river.csv.CSVRiver] 
[Aliyah Bishop] [csv][my_csv_river] Error has occured during processing 
file 'PDUserDeviceDataTable.csv.processing' , skipping line: 
'[249573";"875";"testaasim";"00:12:F3:1B:A5:68";"2";"1344";"0";"29.7";"58.3";"1419835852";"20.0";"30.0";"40.0";"50.0";"500";"500";"12.9226205";"77.5605173]'
 
and continue in processing
java.lang.ArrayIndexOutOfBoundsException: 1
at 
org.codehaus.groovy.runtime.BytecodeInterface8.objectArrayGet(BytecodeInterface8.java:360)
at 
org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.processDataLine(OpenCSVFileProcessor.groovy:72)
at 
org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.this$2$processDataLine(OpenCSVFileProcessor.groovy)
at 
org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor$this$2$processDataLine.callCurrent(Unknown
 
Source)
at 
org.agileworks.elasticsearch.river.csv.OpenCSVFileProcessor.process(OpenCSVFileProcessor.groovy:49)
at 
org.agileworks.elasticsearch.river.csv.CSVConnector.processAllFiles(CSVConnector.groovy:47)
at 
org.agileworks.elasticsearch.river.csv.CSVConnector.run(CSVConnector.groovy:20)
at java.lang.Thread.run(Thread.java:745)

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/dd2f00ef-d99a-48a6-9051-ebd974c39243%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [hadoop] ElasticsearchIllegalArgumentException

2014-12-29 Thread CAI Longqi
You're right. After carefully reviewing my mapping, I found that some of my 
page.revision.comment values have a 'deleted' attribute and some do not. It 
seems that Elasticsearch requires all documents in the same type to share the 
same field structure. Thanks :-)
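The failure mode can be shown in miniature: once the first document has fixed page.revision.comment as a string in the mapping, a later document carrying an object with a 'deleted' attribute is rejected with "unknown property [deleted]". A sketch (the field path is from the error message; the document values are hypothetical):

```python
# Two shapes for the same field: once a plain string, once an object
# carrying a 'deleted' attribute. Elasticsearch maps the field from the
# first document it sees and rejects the incompatible shape afterwards.
doc_a = {"page": {"revision": {"comment": "some edit summary"}}}  # string
doc_b = {"page": {"revision": {"comment": {"deleted": True}}}}    # object

type_a = type(doc_a["page"]["revision"]["comment"])
type_b = type(doc_b["page"]["revision"]["comment"])
```

Normalizing the source data (for example always emitting an object, or flattening 'deleted' into a sibling field) before indexing avoids the conflict.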

On Monday, 29 December 2014 at 20:20:27 UTC+8, Costin Leau wrote:
>
> It looks like you have a mapping problem in Elasticsearch. Typically 
> this occurs when you try incompatible/multiple value types on the same 
> key (for example setting a string to a number field or an object to a 
> string field - this looks like the case, etc...). 
> The field in question looks to be `page.revision.comment` - check your 
> mappings (do you depend on the automatic mapping or define one 
> yourself?). 
>
> On 12/25/14, CAI Longqi > wrote: 
> > Finally when reducing, it prompts me with such error: 
> > 
> > 14/12/25 00:35:45 INFO mapreduce.Job:  map 100% reduce 96% 
> > 
> > 14/12/25 00:35:55 INFO mapreduce.Job: Task Id : 
> > attempt_1417849082194_0158_r_00_2, Status : FAILED 
> > 
> > Error: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found 
> > unrecoverable error [Bad Request(400) - [MapperPa 
> > 
> > rsingException[failed to parse [page.revision.comment]]; nested: 
> > ElasticsearchIllegalArgumentException[unknown prope 
> > 
> > rty [deleted]]; ]]; Bailing out.. 
> > 
> > at 
> > 
> org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:199)
>  
>
> > 
> > at 
> > org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:165) 
> > 
> > at 
> > 
> org.elasticsearch.hadoop.rest.RestRepository.sendBatch(RestRepository.java:170)
>  
>
> > 
> > at 
> > 
> org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:160)
>  
>
> > 
> > at 
> > 
> org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
>  
>
> > 
> > at 
> > 
> org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:159)
>  
>
> > 
> > at 
> > 
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558)
>  
>
> > 
> > at 
> > 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>  
>
> > 
> > at 
> > 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
>  
>
> > 
> > at org.apache.hadoop.mapreduce.Reducer.reduce(Reducer.java:150) 
> > 
> > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171) 
> > 
> > at 
> > org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627) 
> > 
> > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389) 
> > 
> > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) 
> > 
> > at java.security.AccessController.doPrivileged(Native Method) 
> > 
> > at javax.security.auth.Subject.doAs(Subject.java:415) 
> > 
> > at 
> > 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>  
>
> > 
> > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) 
> > 
> > What does that mean? Any suggestions? 
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups 
> > "elasticsearch" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an 
> > email to elasticsearc...@googlegroups.com . 
> > To view this discussion on the web visit 
> > 
> https://groups.google.com/d/msgid/elasticsearch/af659de5-5274-425d-9901-8941ad7fcfb2%40googlegroups.com.
>  
>
> > For more options, visit https://groups.google.com/d/optout. 
> > 
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/51d1ac6e-4fdd-4fed-a291-170d6d890423%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Preventing stop-of-the-world garbage collection

2014-12-29 Thread Christopher Rimondi
+1 for using G1GC. In addition I would suggest not trying to fine tune GC
settings. If you have stop the world old GCs taking 20+ seconds you have a
more fundamental issue at play. I speak from experience on that. We had
similar issues and no amount of JVM/GC tuning could mask the fact we simply
didn't have enough memory.

If you aren't already doing so look at the amount of heap used by the
filter and field caches. Are you capping them? If you aren't expensive
queries could saturate your entire heap. Along the same line keep tabs on
your evictions. ES provides granular metrics so you can look at both filter
and field cache evictions.

On Mon, Dec 29, 2014 at 8:03 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> You said, very complex documents and queries, and 22 GB heap. Without
> knowing more about your queries and filters, it is hard to comment.  There
> is default query/filter caching in some cases.
>
> Jörg
>
> On Mon, Dec 29, 2014 at 1:55 PM, Michal Taborsky <
> michal.tabor...@gmail.com> wrote:
>
>> Hi Jörg, thanks for your reply.
>>
>> What do you mean if we have setup caching? We do not have any special
>> caching configuration, we use the defaults. How do you suggest we
>> reconfigure ES? That is what I am trying to find out.
>>
>> All best,
>> Michal
>>
>>
>> On Monday, 29 December 2014 at 12:06:43 UTC+1, Jörg Prante wrote:
>>>
>>> You could use G1 GC for nicer behavior regarding application stop times,
>>> but before tinkering with GC, it would be better to check if you have set
>>> up caching, and if it is possible to clear caches or reconfigure ES.
>>>
>>> Jörg
>>>
>>>
>>> On Mon, Dec 29, 2014 at 10:36 AM, Michal Taborsky 
>>> wrote:
>>>
 Hello everyone,

 we are using ES as a backend of an online service and occasionally, we
 are hit by a big garbage collection, which stops the node completely and
 causes all sorts of problems. The nodes have plenty of memory I think.
 During the GC it looks like this.

 [cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m],
 collections [3]/[2m], total [1.6m]/[17.6h], memory
 [21.1gb]->[6.5gb]/[22gb], all_pools {[young] 
 [478.6mb]->[224.7mb]/[599mb]}{[survivor]
 [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
 [cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections
 [1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], all_pools
 {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor]
 [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}

 This might happen once a day, usually during a period of heavy
 indexing, sometimes it doesn't. We tried decreasing the heap size, but it
 does not have that much of an effect. It makes the GC take a bit less time,
 but makes it happen a bit more often.

 The data is actually fairly small in size, about 30G in total, but very
 complex documents and queries. This is a 5-node cluster, the nodes have 32G
 RAM with 22G assigned to ES heap.

 I know the manual says we should not touch the JVM GC settings but I
 feel we might have to. Does anyone have any idea how to prevent these
 garbage collections from ever happening?

 Thanks,
 Michal

 --
 You received this message because you are subscribed to the Google
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/29125088-8c43-4d97-b77b-71819fa11d09%
 40googlegroups.com
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/c84f17a2-0351-4473-aef3-5e4f08fc3c90%40googlegroups.com
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoG%3D_meGtEDEUv7uCMZjLaibB28Mg4uBbWju1fSnqOHqGQ%40mail.gmail.com
> 

Re: Does elasticsearch support minimum score in highlight query?

2014-12-29 Thread Nikolas Everett
No, it doesn't. Highlighting is way weirder to implement than it probably
should be, so concepts like score don't map over too well. Highlighters do
weigh segments, but that weight isn't the same beast as a document score;
it's much more heuristic. None of them support a minimum-weight cutoff.

You could implement one, but it'd take some tuning to get it to make sense.
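One crude client-side workaround, sketched below, is to post-filter the returned highlight fragments by how many query-term occurrences they contain. This is a hypothetical helper, not an Elasticsearch API, and a blunt stand-in for a real per-fragment score:

```python
def filter_fragments(fragments, terms, min_hits=2):
    # Keep only fragments containing at least `min_hits` occurrences of
    # any query term -- a blunt stand-in for a per-fragment score cutoff.
    kept = []
    for frag in fragments:
        low = frag.lower()
        if sum(low.count(t.lower()) for t in terms) >= min_hits:
            kept.append(frag)
    return kept

fragments = [
    "elasticsearch highlights this, and elasticsearch scores it high",
    "an unrelated fragment that mentions elasticsearch once",
]
good = filter_fragments(fragments, ["elasticsearch"])
```

Tuning `min_hits` (or swapping in a proper term-weighting function) per query is the part that, as noted above, takes some work to make sensible.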

Nik

On Dec 29, 2014 5:49 PM, "Yang Liu"  wrote:
>
> Hi, guys.
> I know that Elasticsearch can apply a minimum score when we set up the query.
> I just want to know whether it also supports a minimum score for the
highlighting query.
> After running some highlighting queries against ES, I have realized that it
highlights segments imprecisely.
> Is there a minimum score I can set for the highlight query to make the
highlighting functionality more accurate?
>
> --
> You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/be477505-a3e0-4b7e-bc75-ba4266ae3841%40googlegroups.com
.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPmjWd0J4a%3DVBvfySETN4-pQ4WDFiEFycARcGF%2BTzNVavQV6pg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Does elasticsearch support minimum score in highlight query?

2014-12-29 Thread Yang Liu
Hi, guys.
I know that Elasticsearch can apply a minimum score when we set up the query.
I just want to know whether it also supports a minimum score for the 
highlighting query.
After running some highlighting queries against ES, I have realized that it 
highlights segments imprecisely.
Is there a minimum score I can set for the highlight query to make the 
highlighting functionality more accurate?



Re: [IMPORTANT] Issues using Perl API client installation

2014-12-29 Thread Vilas Reddy
Thanks Jörg for your prompt reply.

Can you please elaborate on how to do this? As I mentioned, I am using 
Cygwin.

Regards,
Vilas



Re: Elasticsearch : Need advice on architectural design of my cluster

2014-12-29 Thread Mark Walkom
Ideally you want to keep different types in different indexes.
And you definitely don't want everything in one massive index as that won't
scale well.

On 28 December 2014 at 22:41, Mandeep Gulati 
wrote:

> I am quite new to elasticsearch. I need to build a search system using the
> data from MongoDB. So, here is a high level overview of my application:
>
>- There are different users belonging to different organizations
>- A User can upload multiple datasets. Each dataset is stored as a
>single document in MongoDB. However, each dataset contains an array of
>nodes which contain the data we are interested in.
>- User can load one dataset at a time to his workspace and view the
>entire data for that particular dataset. But at a time, one user can view
>only one dataset. So, datasets are independent from each other and we never
>need to have any aggregation on multiple datasets.
>- User can perform a search in a dataset which is loaded in his/her
>workspace. Search should return the matching elements from the nodes array
>of the dataset
>
> For illustration, here is a single doc in MongoDB datasets collection
>
> {
>   "_id": ObjectId()
>   "setName": "dummy_set",
>   "nodes": [
> {
>   "id": ObjectId(),
>   "label": "some text",
>   "content" : "more text"
> },
> . . .
>   ]
> }
>
> For this, the design that I have thought about is:
>
>- There will be one index in my cluster
>- Each single dataset will be stored in a separate type in the index.
>Name of the type will be the ObjectId of the dataset in mongoDB
>- Each element in the nodes array of dataset will become a single
>document in the corresponding type in elasticsearch.
>- I will use custom routing to make sure a single dataset resides on
>one shard only. For that, I will be using the type name (ObjectId of
>dataset from MongoDB) as my routing key. I assume, I will have to store it
>with each document in elasticsearch?
>
> Now I need to know if I am heading in the right direction. Does the
> solution look scalable, or is there something terribly wrong with the design?
> I would love to hear some suggestions on how to improve it.
>



Re: javascript es client - should es clients be pooled?

2014-12-29 Thread phil swenson
thanks jack, let me know how your testing goes!

On Fri, Dec 26, 2014 at 11:57 AM, Jack Park 
wrote:

> On my platform, there is one and only one es client, through which all
> requests go. It seems to work well, though "well" is defined as having just
> a few people using it at one time. We plan to start stress testing the
> system soon.
>
> Cheers
> Jack
>
> On Fri, Dec 26, 2014 at 9:55 AM, phil swenson 
> wrote:
>
>> no answer, so let me ask a different way.  how are most of the es
>> javascript client apps managing the instance of the client?
>>
>> do you have one es client per app (singleton) that all requests go
>> through?  Do you create and destroy clients for every request?  Do you use
>> a es client pool using something like
>> https://github.com/coopernurse/node-pool ?
>>
>> Thanks for any comments!
>>
>> phil
>>
>> On Wed, Dec 24, 2014 at 10:34 AM, Phil Swenson 
>> wrote:
>>
>>> I'm writing a node/es app using the es javascript api
>>>
>>> Is there any reason to use pooling for all the javascript clients?  Or
>>> should I just use one client for the app?
>>>
>>> Thanks,
>>> phil
>>>
>>
>



Re: [IMPORTANT] Issues using Perl API client installation

2014-12-29 Thread joergpra...@gmail.com
You must set the environment variables CC, CPP, and CXX to the proper gcc
programs before the ES Perl client build calls configure/make.
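A sketch of that suggestion for Cygwin: export the compiler variables before CPAN runs configure/make. The paths are the usual Cygwin defaults and may differ on your installation.

```shell
# Point the build at Cygwin's gcc toolchain before CPAN runs
# configure/make. Paths are typical Cygwin defaults (assumption).
export CC=/usr/bin/gcc
export CPP=/usr/bin/cpp
export CXX=/usr/bin/g++
# then, inside the cpan shell:
#   install Search::Elasticsearch
echo "CC=$CC CPP=$CPP CXX=$CXX"
```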

Jörg

On Mon, Dec 29, 2014 at 10:26 PM, Vilas Reddy  wrote:

> Hi,
>
> I am trying to use Perl API for retrieving data from Elasticsearch.
> I am using Elasticsearch in windows cygwin.
>
> I need help with installing perl api and using it. I tried the following:
>
> *1. Installed cpan in cygwin and tried installing using cpan
> Search::Elasticsearch. I get the following error*:
>
> cpan[3]> install Search::Elasticsearch
> Going to read '/home/VILASP/.cpan/Metadata'
>   Database was generated on Fri, 26 Dec 2014 11:53:14 GMT
> Running install for module 'Search::Elasticsearch'
> Running make for D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
> Checksum for
> /home/VILASP/.cpan/sources/authors/id/D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
> ok
> Scanning cache /home/VILASP/.cpan/build for sizes
>
> DONE
> sh: -d: invalid option
> Usage:  sh [GNU long option] [option] ...
> sh [GNU long option] [option] script-file ...
> GNU long options:
> --debug
> --debugger
> --dump-po-strings
> --dump-strings
> --help
> --init-file
> --login
> --noediting
> --noprofile
> --norc
> --posix
> --protected
> --rcfile
> --restricted
> --verbose
> --version
> --wordexp
> Shell options:
> -irsD or -c command or -O shopt_option  (invocation only)
> -abefhkmnptuvxBCHP or -o option
> Uncompressed
> /home/VILASP/.cpan/sources/authors/id/D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
> successfully
> Using Tar:/usr/bin/tar xf "Search-Elasticsearch-1.16.tar":
> Untarred Search-Elasticsearch-1.16.tar successfully
> Package contains both files[Search-Elasticsearch-1.16.tar] and
> directories[Search-Elasticsearch-1.16]; not recognized as a perl package,
> giving up
>   Package contains both files[Search-Elasticsearch-1.16.tar] and
> directories[Search-Elasticsearch-1.16]; not recognized as a perl package,
> giving up, won't make
> Running make test
>   Make had some problems, won't test
> Running make install
>   Make had some problems, won't install
> Could not read metadata file. Falling back to other methods to determine
> prerequisites
> Failed during this command:
>  DRTECH/Search-Elasticsearch-1.16.tar.gz  : writemakefile NO --
> Package contains both files[Search-Elasticsearch-1.16.tar] and
> directories[Search-Elasticsearch-1.16]; not recognized as a perl package,
> giving up
>
> *Is there any manual way of installing the perl-api client?*
>
> *2. Installed Elasticsearch-Perl-master as a plugin in elasticsearch. Is
> it correct? What is the use of this Perl-Master?*
>
> *I have been stuck for a few days trying to install the Perl client API. Need
> urgent help.*
>
> Thanks,
> Vilas
>



[IMPORTANT] Issues using Perl API client installation

2014-12-29 Thread Vilas Reddy
Hi,

I am trying to use Perl API for retrieving data from Elasticsearch.
I am using Elasticsearch in windows cygwin.

I need help with installing perl api and using it. I tried the following:

*1. Installed cpan in cygwin and tried installing using cpan 
Search::Elasticsearch. I get the following error*:

cpan[3]> install Search::Elasticsearch
Going to read '/home/VILASP/.cpan/Metadata'
  Database was generated on Fri, 26 Dec 2014 11:53:14 GMT
Running install for module 'Search::Elasticsearch'
Running make for D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
Checksum for 
/home/VILASP/.cpan/sources/authors/id/D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
 
ok
Scanning cache /home/VILASP/.cpan/build for sizes
DONE
sh: -d: invalid option
Usage:  sh [GNU long option] [option] ...
sh [GNU long option] [option] script-file ...
GNU long options:
--debug
--debugger
--dump-po-strings
--dump-strings
--help
--init-file
--login
--noediting
--noprofile
--norc
--posix
--protected
--rcfile
--restricted
--verbose
--version
--wordexp
Shell options:
-irsD or -c command or -O shopt_option  (invocation only)
-abefhkmnptuvxBCHP or -o option
Uncompressed 
/home/VILASP/.cpan/sources/authors/id/D/DR/DRTECH/Search-Elasticsearch-1.16.tar.gz
 
successfully
Using Tar:/usr/bin/tar xf "Search-Elasticsearch-1.16.tar":
Untarred Search-Elasticsearch-1.16.tar successfully
Package contains both files[Search-Elasticsearch-1.16.tar] and 
directories[Search-Elasticsearch-1.16]; not recognized as a perl package, 
giving up
  Package contains both files[Search-Elasticsearch-1.16.tar] and 
directories[Search-Elasticsearch-1.16]; not recognized as a perl package, 
giving up, won't make
Running make test
  Make had some problems, won't test
Running make install
  Make had some problems, won't install
Could not read metadata file. Falling back to other methods to determine 
prerequisites
Failed during this command:
 DRTECH/Search-Elasticsearch-1.16.tar.gz  : writemakefile NO -- Package 
contains both files[Search-Elasticsearch-1.16.tar] and 
directories[Search-Elasticsearch-1.16]; not recognized as a perl package, 
giving up

*Is there any manual way of installing the perl-api client?*

*2. Installed Elasticsearch-Perl-master as a plugin in elasticsearch. Is it 
correct? What is the use of this Perl-Master?*

*I have been stuck for a few days trying to install the Perl client API. Need 
urgent help.*

Thanks,
Vilas



How to compare data based on different dates Using Kibana3

2014-12-29 Thread Ramakrishna N
Hi,

I have a question regarding the ElasticSearch and Kibana.

My goal is to compare data from two different dates by hour/min/sec.

All my data is indexed, and everything is in a single type.

Regards,
Ramakrishna Namburu



Re: Aging strategy described in the ES website's "Retiring Data" document isn't working for me

2014-12-29 Thread Mark Walkom
It doesn't matter where the primaries or replicas live; they are the same
thing.

If you only want to query the second node then send your queries to it and
use local preference -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html
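A sketch of that advice: send the search to the second node directly and ask it to prefer local shard copies. The host and index names are hypothetical.

```shell
# Query the chosen node directly; preference=_local makes it favor
# shard copies it holds itself. "node2" and the index are hypothetical.
SEARCH_URL="http://node2:9200/logs-2014.12.29/_search?preference=_local"
# Against a live cluster this would be submitted as:
echo "curl -XGET '$SEARCH_URL' -d '{ \"query\": { \"match_all\": {} } }'"
```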

On 29 December 2014 at 02:08, Steve Johnson  wrote:

> Thanks Mark for pointing out the first part about not needing the
> close/open.  I must have picked those up while fooling around trying to
> affect the correct behavior.  I originally followed the post I mentioned,
> which did not have the close/open.
>
> For the second point, I want a replica of each shard.  The issue is
> where the active vs backup shard lives.  Notice the shard boxes in my image
> with darker outlines vs those without.  The issue is illustrated in the
> first two indexes by which of each pair of shards is outlined.
>
> Thanks for your input!
>



Re: Cascading cluster failure

2014-12-29 Thread Kris Davey
Mark,

We have 18 data nodes across 9 physical servers, each has 128GB of RAM and 
40 Cores. All nodes are currently datanodes and all are able to be master 
nodes. We are getting ready to change this however and put in 3 dedicated 
master nodes.

-Kris

On Thursday, December 25, 2014 5:01:44 PM UTC-7, Mark Walkom wrote:
>
> 3 million requests a second!
>
> Can you provide some details on your cluster, ie node type count?
>
> On 26 December 2014 at 05:14, vineeth mohan  > wrote:
>
>> Hi , 
>>
>> That could be a good reason.
>> But then it won't happen unless you change the threadpool settings for 
>> index.
>> If there is more load than it can process, it will go to the queue.
>> And if the queue (by default 20 per node) also goes full, the requests get 
>> rejected.
>> Can you see whether there are any rejected index requests in the process?
>>
>> But again, that won't matter if you haven't changed the threadpool settings.
>>
>> Thanks
>>Vineeth
>>
>> On Thu, Dec 25, 2014 at 11:39 PM, Abhishek > > wrote:
>>
>>> Thanks Vineeth. I was thinking about that, but these merge errors were 
>>> also happening before the outage, and the elasticsearch process was never 
>>> dead. I was also wondering: at the time of the outage we had about 3 million 
>>> requests per second. Could the sheer number of requests have made the network 
>>> layer go crazy? Because everything recovered in 7 minutes. 
>>>
>>> Sent from my iPhone
>>>
>>> On Dec 25, 2014, at 11:12 PM, vineeth mohan >> > wrote:
>>>
>>> Hello Abhishek , 
>>>
>>> Can you try to correlate merge operation of shards and this time of 
>>> cascading failures ?
>>> I feel there is a correlation between both.
>>> If so , we can do some optimization on that side.
>>>
>>> Thanks
>>>   Vineeth
>>>
>>> On Thu, Dec 25, 2014 at 8:53 AM, Abhishek Andhavarapu <
>>> abhis...@gmail.com > wrote:
>>>
 Mark,

 Thanks for reading. Our heap sizes are kept below 32 GB so the JVM can 
 use compressed pointers. We ideally double our cluster every year; the 
 number of shards is planned for future growth and for the way documents 
 are spread across all the nodes in the cluster.

 Thanks,
 Abhishek

 On Thursday, December 25, 2014 2:05:22 AM UTC+5:30, Mark Walkom wrote:
>
> That's a pretty big number of shards, why is it so high?
> The recommendation there is one shard per node, so you should (ideally) 
> have closer to 6600 shards.
>
> On 25 December 2014 at 07:07, Pat Wright  wrote:
>
>> Mark,  
>>
>> I work on the cluster as well so i can answer the size/makeup.  
>> Data: 580GB
>> Shards: 10K
>> Indices: 347 
>> ES version: 1.3.2
>>
>> Not sure the Java version.  
>>
>> Thanks for getting back!
>>
>> pat
>>
>>
>> On Wednesday, December 24, 2014 12:04:03 PM UTC-7, Mark Walkom wrote:
>>>
>>> You should drop your heap to 31GB, over that and you lose some 
>>> performance and actual heap stack due to uncompressed pointers.
>>>
>>> it looks like a node, or nodes, dropped out due to GC. How much 
>>> data, how many indexes do you have? What ES and java versions?
>>>
>>>
>>> On 24 December 2014 at 22:29, Abhishek  wrote:
>>>
 Thanks for reading vineeth. That was my initial thought but I 
 couldn't find any old gc during the outage. Each es node has 32 gigs. 
 Each 
 box has 128gigs split between 2 es nodes(32G each)  and file system 
 cache 
 (64G).

 On Wed, Dec 24, 2014 at 4:49 PM, vineeth mohan <
 vm.vine...@gmail.com> wrote:

> Hi , 
>
> What is the memory for each of these machines ?
> Also see if there is any correlation between garbage collection 
> and the time this anomaly happens. 
> Chances are that the stop the world time might block the ping for 
> sometime and the cluster might feel some nodes are gone.
>
> Thanks
>   Vineeth
>
> On Wed, Dec 24, 2014 at 4:23 PM, Abhishek Andhavarapu <
> abhis...@gmail.com> wrote:
>
>> Hi all,
>>
>> We recently had a cascading cluster failure. From 16:35 to 16:42 
>> the cluster went red and recovered it self. I can't seem to find any 
>> obvious logs around this time. 
>>
>> The cluster has about 19 nodes. 9 physical boxes running two 
>> instances of elasticsearch. And one vm as balancer for indexing.  
>> The CPU 
>> is normal and memory usage is below 75%
>>
>>
>> 
>>
>> Heap during the outage
>>
>>
>>
>> 
>>
>>

Re: Startup issues with ES 1.3.5

2014-12-29 Thread Chris Moore
Master and data are separate nodes. The problem node (master) never leaves 
the cluster (there are no messages in the logs of the other nodes and 
/_cat/health reports it is still there). It will respond to requests that 
don't require checking with other nodes for any data (so _cat/health is 
fine but /_search is not). Detaching jstack does not fix that behavior.

On Friday, December 26, 2014 10:55:52 AM UTC-5, Gurvinder Singh wrote:
>
> Do you have the master and data nodes separate, or are they running in the 
> same ES node process? Another thing: after jstack, does the process become 
> responsive again, or does it still remain out of the cluster? 
>
> On 12/26/2014 04:43 PM, Chris Moore wrote: 
> > I tried your configuration suggestions, but the behavior was no 
> > different. I have attached the jstack output from the troubled node 
> > (master). It didn't appear to indicate anything of note, but I have 
> > attached it. 
> > 
> > On Thursday, December 25, 2014 8:33:20 AM UTC-5, Gurvinder Singh wrote: 
> > 
> > We might have faced a similar problem with ES 1.3.6. The reason we 
> > found might be concurrent merges. These settings have helped us 
> > in fixing the issue. 
> > merge: 
> > policy: 
> >   max_merge_at_once: 5 
> >   reclaim_deletes_weight: 4.0 
> >   segments_per_tier: 5 
> > indices: 
> >   store: 
> > throttle: 
> >   max_bytes_per_sec: 40mb # as we have few SATA disk for storage 
> >   type: merge 
> > 
> > you can check your hanged process by attaching jstack to it as 
> > 
> > jstack -F  
> > 
> > Also once you detach the jstack process become responding again and 
> > joins cluster.  Although it should not happen at all as if disk is 
> the 
> > limitation ES should not stop responding. 
> > 
> > - Gurvinder 
> > On 12/24/2014 08:00 PM, Mark Walkom wrote: 
> > > Ok a few things that don't make sense to me; 
> > > 
> > > 1. 10 indexes of only ~220Kb? Are you sure of this?
> > > 2. If so, why not just one index?
> > > 3. Is baseball_data.json the data for an entire index? If not, can you clarify?
> > > 4. What Java version are you on?
> > > 5. What monitoring were you using?
> > > 6. Can you delete all your data, switch monitoring on, start reindexing,
> > > and then watch what happens? Marvel would be ideal for this. 
> > > 
> > > What you are seeing is really, really weird. That is a high shard 
> > > count however given the dataset is small I wouldn't think it'd 
> > > cause problems (but I could be wrong). 
> > > 
> > > On 25 December 2014 at 02:27, Chris Moore  >  
> > > > wrote: 
> > > 
> > > Attached is the script we've been using to load the data and the 
> > > dataset. This is the mapping and a sample document 
> > > 
> > > { "baseball_1" : { "mappings" : { "team" : { "properties" : {
> > >   "L" : { "type" : "integer", "store" : true },
> > >   "W" : { "type" : "integer", "store" : true },
> > >   "name" : { "type" : "string", "store" : true },
> > >   "teamID" : { "type" : "string", "store" : true },
> > >   "yearID" : { "type" : "string", "store" : true } } } } } } 
> > > 
> > > {"yearID":"1871", "teamID":"PH1", "W":"21", "L":"7", 
> > > "name":"Philadelphia Athletics"} 
> > > 
> > > On Wednesday, December 24, 2014 10:22:00 AM UTC-5, Chris Moore 
> > > wrote: 
> > > 
> > > We tried many different test setups yesterday. The first setup we 
> > > tried was: 
> > > 
> > > 1 master, 2 data nodes; 38 indices; 10 shards per index; 1 replica per 
> > > index; 760 total shards (380 primary). Each index had 
> > > 2,745 documents and was 218.9kb in size (according to the 
> > > _cat/indices API). 
> > > 
> > > We realize that 10 shards per index with only 2 nodes is not a 
> good 
> > > idea, so we changed that and reran the tests. 
> > > 
> > > We changed shards per index to the default of 5 and put 100 
> indices 
> > > on the 2 boxes and ran into the same issue. It was the same 
> > > dataset, so all other size information is correct. 
> > > 
> > > After that, we turned off one of the data nodes, set replicas to 0 
> > > and shards per index to 1. With the same dataset, I loaded ~440 
> > > indices and ran into the timeout issues with the Master and Data 
> > > nodes just idling. 
> > > 
> > > This is just a test dataset that we came up with to quickly test 
> > > our issues that contains no confidential information. Once we 
> > > figure out the issues affecting this test dataset, we'll try 
> things 
> > > with our real dataset. 
> > > 
> > > 
> > > All of this works fine on ES 1.1.2, but not on 1.3.x (1.3.5 is our 
> > > current test version). We have also tried our real setup on 1.4.1 
> > > to no avail.

Re: ElasticSearch Hash Function

2014-12-29 Thread joergpra...@gmail.com
For ES 1.4, see

https://github.com/elasticsearch/elasticsearch/blob/1.4/src/main/java/org/elasticsearch/cluster/routing/operation/plain/PlainOperationRouting.java#L265

and the hash function is

https://github.com/elasticsearch/elasticsearch/blob/1.4/src/main/java/org/elasticsearch/cluster/routing/operation/hash/djb/DjbHashFunction.java

Note, the hash function will change in ES 2.0.

You cannot check from outside which hash function ES uses; it is for internal
use only.

Jörg


On Mon, Dec 29, 2014 at 12:48 PM, Costya Regev  wrote:

> Hi,
>
>
> I would like to predict the shard routing of a set of keys as a
> function of the number of shards, without defining the actual index.
> Elasticsearch Version 1.4.2
>
> How can i check which hash function is used for routing ? and how can i
> check it inside my java program ?
>
>
>
>
>
>
> Thanks,
> Costya
>



Re: ElasticSearch roadmap?

2014-12-29 Thread Nikolas Everett
Your best bet is to look at github issues and pull requests tagged for the
next release.

Elasticsearch the company has a roadmap for elasticsearch the open source
project but it isn't public.

Nik
On Dec 29, 2014 6:57 AM, "PrasathRajan"  wrote:

> Hi All,
>
>   Does ElasticSearch upcomming release have any road map?
>
> Thanks
> Prasath Rajan
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/ElasticSearch-roadmap-tp199069p4068263.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>



Re: Date histogram doesn't contain value field in ES version:1.4.2

2014-12-29 Thread Yatish Teddla
Got it, Adrien. Thanks.

On Monday, 29 December 2014 19:09:30 UTC+5:30, Adrien Grand wrote:
>
> Hi,
>
> Aggregations don't support `value_field`; use a sub-aggregation 
> instead. This might be useful: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-migrating-to-aggs.html#_value_field_2
>
> On Mon, Dec 29, 2014 at 1:06 PM, Yatish Teddla  > wrote:
>
>> Hi Admin.
>> Currently iam working on ES version 1.0.1 . In the date histogram facets 
>> supports "value_field" that helps to do bucketing on a different key. Now 
>> we are planning to upgrade ES version to 1.4.2, but in that date histogram 
>> facets i didn't find any documentation regarding this "value_field".
>> I need date bucketing with a different key.
>>
>> Can you please help me to find the documentation for this or am i missing 
>> any thing?
>>
>> Ref urls:
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-date-histogram-facet.html
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
>>
>> Any help is much appreciated!
>>
>> Thanks,
>> Yatish.
>>
>
>
>
> -- 
> Adrien Grand
>  



Re: Date histogram doesn't contain value field in ES version:1.4.2

2014-12-29 Thread Adrien Grand
Hi,

Aggregations don't support `value_field`; use a sub-aggregation
instead. This might be useful:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-migrating-to-aggs.html#_value_field_2
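A sketch of that migration: the old facet's `value_field` becomes a sub-aggregation nested under the `date_histogram`. The field names (`timestamp`, `price`) and the index name are hypothetical.

```shell
# Old facet "value_field" behavior expressed as a sub-aggregation:
# bucket by day, then sum a second field inside each bucket.
# "timestamp", "price", and "my_index" are hypothetical names.
AGGS='{
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "timestamp", "interval": "day" },
      "aggs": { "total": { "sum": { "field": "price" } } }
    }
  }
}'
# Against a live 1.4.x cluster this would be submitted as:
echo "curl -XPOST 'localhost:9200/my_index/_search?search_type=count' -d '$AGGS'"
```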

On Mon, Dec 29, 2014 at 1:06 PM, Yatish Teddla  wrote:

> Hi Admin.
> Currently I am working on ES version 1.0.1. The date histogram facet
> supports "value_field", which helps to do bucketing on a different key. Now
> we are planning to upgrade to ES version 1.4.2, but in its date histogram
> documentation I didn't find anything regarding "value_field".
> I need date bucketing with a different key.
>
> Can you please help me to find the documentation for this or am i missing
> any thing?
>
> Ref urls:
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-date-histogram-facet.html
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
>
> Any help is much appreciated!
>
> Thanks,
> Yatish.
>



-- 
Adrien Grand



Re: Preventing stop-of-the-world garbage collection

2014-12-29 Thread joergpra...@gmail.com
You mentioned very complex documents and queries, and a 22 GB heap. Without
knowing more about your queries and filters, it is hard to comment. There
is default query/filter caching in some cases.
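
If caching is the suspect, the 1.x clear-cache API is a cheap experiment; the
endpoints below are from memory, so confirm them against your version's
reference before relying on them:

```
POST /_cache/clear            # clear caches on all indices
POST /myindex/_cache/clear    # or on one (hypothetical) index
```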

Jörg

On Mon, Dec 29, 2014 at 1:55 PM, Michal Taborsky 
wrote:

> Hi Jörg, thanks for your reply.
>
> What do you mean if we have setup caching? We do not have any special
> caching configuration, we use the defaults. How do you suggest we
> reconfigure ES? That is what I am trying to find out.
>
> All best,
> Michal
>
>
> On Monday, December 29, 2014 at 12:06:43 UTC+1, Jörg Prante wrote:
>>
>> You could use G1 GC for nicer behavior regarding application stop times,
>> but before tinkering with GC, it would be better to check if you have set
>> up caching, and if it is possible to clear caches or reconfigure ES.
>>
>> Jörg
>>
>>
>> On Mon, Dec 29, 2014 at 10:36 AM, Michal Taborsky 
>> wrote:
>>
>>> Hello everyone,
>>>
>>> we are using ES as a backend of an online service and occasionally, we
>>> are hit by a big garbage collection, which stops the node completely and
>>> causes all sorts of problems. The nodes have plenty of memory I think.
>>> During the GC it looks like this.
>>>
>>> [cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], collections
>>> [3]/[2m], total [1.6m]/[17.6h], memory [21.1gb]->[6.5gb]/[22gb], all_pools
>>> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor]
>>> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>>> [cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections
>>> [1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], all_pools
>>> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor]
>>> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>>>
>>> This might happen once a day, usually during a period of heavy indexing,
>>> sometimes it doesn't. We tried decreasing the heap size, but it does not
>>> have that much of an effect. It makes the GC take a bit less time, but
>>> makes it happen a bit more often.
>>>
>>> The data is actually fairly small in size, about 30G in total, but very
>>> complex documents and queries. This is a 5-node cluster, the nodes have 32G
>>> RAM with 22G assigned to ES heap.
>>>
>>> I know the manual says we should not touch the JVM GC settings but I
>>> feel we might have to. Does anyone have any idea how to prevent these
>>> garbage collections from ever happening?
>>>
>>> Thanks,
>>> Michal
>>>
>>
>



Re: Preventing stop-of-the-world garbage collection

2014-12-29 Thread Michal Taborsky
Hi Jörg, thanks for your reply.

What do you mean if we have setup caching? We do not have any special 
caching configuration, we use the defaults. How do you suggest we 
reconfigure ES? That is what I am trying to find out.

All best,
Michal


On Monday, December 29, 2014 at 12:06:43 UTC+1, Jörg Prante wrote:
>
> You could use G1 GC for nicer behavior regarding application stop times, 
> but before tinkering with GC, it would be better to check if you have set 
> up caching, and if it is possible to clear caches or reconfigure ES.
>
> Jörg
>
>
> On Mon, Dec 29, 2014 at 10:36 AM, Michal Taborsky  > wrote:
>
>> Hello everyone,
>>
>> we are using ES as a backend of an online service and occasionally, we 
>> are hit by a big garbage collection, which stops the node completely and 
>> causes all sorts of problems. The nodes have plenty of memory I think. 
>> During the GC it looks like this. 
>>
>> [cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], collections 
>> [3]/[2m], total [1.6m]/[17.6h], memory [21.1gb]->[6.5gb]/[22gb], all_pools 
>> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
>> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>> [cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections 
>> [1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], all_pools 
>> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
>> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>>
>> This might happen once a day, usually during a period of heavy indexing, 
>> sometimes it doesn't. We tried decreasing the heap size, but it does not 
>> have that much of an effect. It makes the GC take a bit less time, but 
>> makes it happen a bit more often. 
>>
>> The data is actually fairly small in size, about 30G in total, but very 
>> complex documents and queries. This is a 5-node cluster, the nodes have 32G 
>> RAM with 22G assigned to ES heap.
>>
>> I know the manual says we should not touch the JVM GC settings but I feel 
>> we might have to. Does anyone have any idea how to prevent these garbage 
>> collections from ever happening?
>>
>> Thanks,
>> Michal
>>
>>
>
>



Re: Elasticsearch Spark EsHadoopNoNodesLeftException in cluster Mode

2014-12-29 Thread Costin Leau
Check the node status and see whether it behaves normally while data is
being loaded. If the load is too high and the node is not properly
configured, it could keep rejecting data, or a GC might be triggered,
causing es-hadoop to fail the job.
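
One common mitigation is to shrink and retry es-hadoop's bulk requests so a
loaded node gets a chance to recover. These `es.batch.*` settings exist in
es-hadoop, though the defaults noted are from memory and worth checking in its
configuration reference:

```scala
sparkConf.set("es.batch.size.entries", "500")     // docs per bulk request (default ~1000)
sparkConf.set("es.batch.size.bytes", "1mb")       // size cap per bulk request
sparkConf.set("es.batch.write.retry.count", "5")  // retries on rejected bulk items (default 3)
sparkConf.set("es.batch.write.retry.wait", "30s") // wait between retries (default 10s)
```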

On 12/23/14, Rahul Kumar  wrote:
> Hi,
>
> I am trying to index data through Apache Spark to Elasticsearch. Apache
> Spark is used for data enrichment, and then I am using *saveToEs* to index
> data.
>
> In local mode my code works fine, but when I run it in cluster mode
> with 1 master and 2 slaves, it indexes around 1700 documents and then shows
> the following error:
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 9
>> in stage 1711.0 failed 4 times, most recent failure: Lost task 9.3 in
>> stage
>> 1711.0 (TID 27379, ip-1-169-15-116.ec2.internal):
>> org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection
>> error (check network and/or proxy settings)- all nodes failed; tried
>> [[4.212.11.16:9200]]
>
>
>
> my configurations are following
>
> sparkConf.set("es.index.auto.create", "true")
> sparkConf.set("es.nodes","1.1.1.1:9200") //my elasticserver ip
> sparkConf.set("spark.eventLog.enabled","true")
> sparkConf.set("es.nodes.discovery", "false")
>
>
>
> finalData.foreach(row => {
>   sc.makeRDD(Seq(row)).saveToEs("spark71/docs")
>   println(row)
> })
>
>



Re: [hadoop] ElasticsearchIllegalArgumentException

2014-12-29 Thread Costin Leau
It looks like you have a mapping problem in Elasticsearch. Typically
this occurs when you send incompatible value types for the same key (for
example, setting a string on a number field, or an object on a string
field, which looks like the case here).
The field in question looks to be `page.revision.comment` - check your
mappings (do you depend on the automatic mapping or define one
yourself?).

On 12/25/14, CAI Longqi  wrote:
> Finally, when reducing, it fails with the following error:
>
> 14/12/25 00:35:45 INFO mapreduce.Job:  map 100% reduce 96%
>
> 14/12/25 00:35:55 INFO mapreduce.Job: Task Id :
> attempt_1417849082194_0158_r_00_2, Status : FAILED
>
> Error: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found
> unrecoverable error [Bad Request(400) -
> [MapperParsingException[failed to parse [page.revision.comment]]; nested:
> ElasticsearchIllegalArgumentException[unknown property [deleted]]; ]];
> Bailing out..
>
> at
> org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:199)
>
> at
> org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:165)
>
> at
> org.elasticsearch.hadoop.rest.RestRepository.sendBatch(RestRepository.java:170)
>
> at
> org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:160)
>
> at
> org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
>
> at
> org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:159)
>
> at
> org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558)
>
> at
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>
> at
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
>
> at org.apache.hadoop.mapreduce.Reducer.reduce(Reducer.java:150)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:415)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>
> What does that mean? Any suggestions?
>
>



Re: EsBolt - Schema Definition

2014-12-29 Thread Costin Leau
EsBolt 'passes' the information to Elasticsearch. To define how the
data is indexed, define your mapping in Elasticsearch directly.
Without any mapping, Elasticsearch will try to automatically detect
and map your data, which may or may not match your expectations.
For example see [1]
For example see [1]

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/2.1.Beta/mapping.html
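
For the geo case specifically, that means creating the index with a
`geo_point` mapping before the topology writes to it; a sketch with
hypothetical index, type, and field names:

```json
{
  "mappings": {
    "docs": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}
```

PUT this body when creating the index (e.g.
`curl -XPUT localhost:9200/myindex -d @mapping.json`), and have the Kafka JSON
emit `location` as `"lat,lon"` or `{"lat": ..., "lon": ...}`.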

On 12/29/14, Abhishek Patel  wrote:
> I am using EsBolt, which receives a stream from a Kafka bolt. The Kafka bolt
> emits JSON which is given to EsBolt. Now, my JSON string may contain geo
> data, so I want it to be indexed geo-spatially. How can I define my
> indexing schema for EsBolt?
>
> Thanks
>
>



Date histogram doesn't contain value field in ES version:1.4.2

2014-12-29 Thread Yatish Teddla
Hi Admin.
Currently I am working on ES version 1.0.1. The date histogram facet 
supports "value_field", which helps to do bucketing on a different key. Now 
we are planning to upgrade to ES version 1.4.2, but in the date histogram 
aggregation docs I didn't find any documentation regarding "value_field".
I need date bucketing with a different key.

Can you please help me find the documentation for this, or am I missing 
anything?

Ref urls:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-date-histogram-facet.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html

Any help is much appreciated!

Thanks,
Yatish.



Re: ElasticSearch roadmap?

2014-12-29 Thread PrasathRajan
Hi All,

  Does the upcoming Elasticsearch release have a roadmap?

Thanks
Prasath Rajan



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/ElasticSearch-roadmap-tp199069p4068263.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



EsBolt - Schema Definition

2014-12-29 Thread Abhishek Patel
I am using EsBolt, which receives a stream from a Kafka bolt. The Kafka bolt 
emits JSON which is given to EsBolt. Now, my JSON string may contain geo 
data, so I want it to be indexed geo-spatially. How can I define my 
indexing schema for EsBolt?

Thanks 



ElasticSearch Hash Function

2014-12-29 Thread Costya Regev
Hi,


I would like to predict the shard routing of a set of keys as a function
of the number of shards, without defining the actual index.
Elasticsearch version: 1.4.2

How can I check which hash function is used for routing, and how can I
check it inside my Java program?

Thanks,
Costya
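
To the best of my knowledge, Elasticsearch releases before 2.0 (including
1.4.2) default to the djb2 hash over the routing value (`DjbHashFunction` in
the `org.elasticsearch.cluster.routing.operation.hash.djb` package), with the
shard chosen by a non-negative modulo of the shard count; verify against the
source of your exact version before relying on this. A small Python sketch of
that scheme:

```python
def djb2_hash(routing: str) -> int:
    """djb2: h = h * 33 + char, seeded with 5381, reinterpreted as a
    signed 32-bit int at the end (mirroring Java's (int) cast)."""
    h = 5381
    for ch in routing:
        # Truncating each step to 32 bits is equivalent (mod 2**32)
        # to Java's long accumulation followed by the final int cast.
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_for(routing: str, num_shards: int) -> int:
    # Elasticsearch keeps the modulo non-negative; Python's %
    # already behaves that way for a positive divisor.
    return djb2_hash(routing) % num_shards


print(shard_for("user42", 5))  # a shard id in [0, 5)
```

In Java you could port this directly, or instantiate the hash class from the
Elasticsearch jar; treat the class name above as something to confirm in your
version's source.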



Re: Get count of documents through aggregation.

2014-12-29 Thread myname
I am building an SQL API for Elasticsearch; I want all SQL functions 
(count, avg, sum) to be returned as aggregation results.
I have already succeeded in doing this query using the _index field; it seems 
to be working.

On Monday, December 29, 2014 at 12:45:07 PM UTC+2, Adrien Grand wrote:
>
> Can you elaborate a bit more on what you would like to do? Depending on 
> your use-case, you can already use the total hit count returned by the 
> search API, the count API or aggregations.
>
> On Sat, Dec 27, 2014 at 9:53 PM, myname > 
> wrote:
>
>> Is it possible to get count of documents using aggregations?
>> I tried using the value count aggregation,
>> but it seems it only works if I provide the "field" parameter, counting 
>> only documents containing this field.
>> I tried using the same aggregation with "_id" as field with no success.
>> Any ideas?
>>
>>
>>
>
>
>
> -- 
> Adrien Grand
>  



Re: Preventing stop-of-the-world garbage collection

2014-12-29 Thread joergpra...@gmail.com
You could use G1 GC for nicer behavior regarding application stop times,
but before tinkering with GC, it would be better to check if you have set
up caching, and if it is possible to clear caches or reconfigure ES.
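
As a concrete sketch of the G1 option: on ES 1.x the default CMS flags live in
`bin/elasticsearch.in.sh`, and conflicting collector flags will keep the JVM
from starting, so G1 must replace CMS rather than be appended. Paths and flag
placement are assumptions to check against your install:

```sh
# Sketch: edit bin/elasticsearch.in.sh (path varies by packaging).
# Remove or comment the default CMS flags, e.g.:
#   JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
# and enable G1 with a pause-time goal instead:
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
JAVA_OPTS="$JAVA_OPTS -XX:MaxGCPauseMillis=200"
```

Test under realistic indexing load before rolling out; G1 trades some
throughput for shorter pauses.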

Jörg


On Mon, Dec 29, 2014 at 10:36 AM, Michal Taborsky  wrote:

> Hello everyone,
>
> we are using ES as a backend of an online service and occasionally, we are
> hit by a big garbage collection, which stops the node completely and causes
> all sorts of problems. The nodes have plenty of memory I think. During the
> GC it looks like this.
>
> [cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], collections
> [3]/[2m], total [1.6m]/[17.6h], memory [21.1gb]->[6.5gb]/[22gb], all_pools
> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor]
> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
> [cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections
> [1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], all_pools
> {[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor]
> [74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
>
> This might happen once a day, usually during a period of heavy indexing,
> sometimes it doesn't. We tried decreasing the heap size, but it does not
> have that much of an effect. It makes the GC take a bit less time, but
> makes it happen a bit more often.
>
> The data is actually fairly small in size, about 30G in total, but very
> complex documents and queries. This is a 5-node cluster, the nodes have 32G
> RAM with 22G assigned to ES heap.
>
> I know the manual says we should not touch the JVM GC settings but I feel
> we might have to. Does anyone have any idea how to prevent these garbage
> collections from ever happening?
>
> Thanks,
> Michal
>
>



Re: Get count of documents through aggregation.

2014-12-29 Thread Adrien Grand
Can you elaborate a bit more on what you would like to do? Depending on
your use-case, you can already use the total hit count returned by the
search API, the count API or aggregations.
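
As a sketch of the first option, the plain document count needs no aggregation
at all: a search with `size` 0 returns it in `hits.total`. And since a
`value_count` over `_index` (a field every document has) works as an
aggregation, both can be combined in one request:

```json
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "doc_count": { "value_count": { "field": "_index" } }
  }
}
```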

On Sat, Dec 27, 2014 at 9:53 PM, myname  wrote:

> Is it possible to get count of documents using aggregations?
> I tried using the value count aggregation,
> but it seems it only works if I provide the "field" parameter, counting
> only documents containing this field.
> I tried using the same aggregation with "_id" as field with no success.
> Any ideas?
>
>
>



-- 
Adrien Grand



Preventing stop-of-the-world garbage collection

2014-12-29 Thread Michal Taborsky
Hello everyone,

we are using ES as a backend of an online service and occasionally, we are 
hit by a big garbage collection, which stops the node completely and causes 
all sorts of problems. The nodes have plenty of memory I think. During the 
GC it looks like this. 

[cz-dc-v-313] [gc][young][2270193][2282693] duration [1.6m], collections 
[3]/[2m], total [1.6m]/[17.6h], memory [21.1gb]->[6.5gb]/[22gb], all_pools 
{[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
[74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}
[cz-dc-v-313] [gc][old][2270193][2344] duration [24.1s], collections 
[1]/[2m], total [24.1s]/[6.1m], memory [21.1gb]->[6.5gb]/[22gb], all_pools 
{[young] [478.6mb]->[224.7mb]/[599mb]}{[survivor] 
[74.8mb]->[0b]/[74.8mb]}{[old] [20.6gb]->[6.3gb]/[21.3gb]}

This might happen once a day, usually during a period of heavy indexing, 
sometimes it doesn't. We tried decreasing the heap size, but it does not 
have that much of an effect. It makes the GC take a bit less time, but 
makes it happen a bit more often. 

The data is actually fairly small in size, about 30G in total, but very 
complex documents and queries. This is a 5-node cluster, the nodes have 32G 
RAM with 22G assigned to ES heap.

I know the manual says we should not touch the JVM GC settings but I feel 
we might have to. Does anyone have any idea how to prevent these garbage 
collections from ever happening?

Thanks,
Michal
