Re: Elasticsearch puppet module's problem

2015-04-21 Thread Sergey Zemlyanoy
Hi,

So any advice on what I should pass to configure the service? It seems the 
configs are removed by the module itself, which is odd.



Re: No hits if fields are not stored

2015-04-21 Thread Zaid Amir
This is an index request that I have taken out of my _bulk request. Hope it 
helps:

curl -XPOST http://localhost:9200/_bulk -d '{ 
"index" : { 
"_index" : "files", 
"_type" : "rawfiles", 
"_id" : "130741557032361573_equilibrating"
}
}
{
"content_br" : [],
"content_da" : [],
"content_de" : [],
"content_en" : ["monism lasagnes beginner prepositional masterworks prince 
diluvium suggesting maharishi raceways quibble debauches virtuosity spurt 
narrater squads entranced iron latents beguiler delimitating banders 
creosoting stained macular protested russet whists cooling hoc 
resuscitation yukking pea oglers puffins covertly divulgement listless 
evince levitate crowded dandyism jeu prewashes blackened queenlier 
effeminate prorater remarriage nails sunk landowner"],
"content_es" : [],
"content_ukw" : [],
"fileid" : "130741557032361573_equilibrating",
"userid" : "123456",
"filesize" : "102633",
"filename" : "130741557032361573_equilibrating.txt",
"extension" : "txt",
"modificationdate" : "2015-04-22T08:55:03.1969749+03:00",
"timestamp" : "2015-04-22T05:55:03.5139749Z",
"category" : 0
}'
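
One note on the request above: the _bulk endpoint expects newline-delimited JSON, 
with each action line and each source document on exactly one line, plus a 
trailing newline; the pretty-printed form above is for readability only. A 
minimal sketch of the same request in the strict form (content abbreviated):

curl -XPOST 'http://localhost:9200/_bulk' --data-binary @- <<'EOF'
{"index":{"_index":"files","_type":"rawfiles","_id":"130741557032361573_equilibrating"}}
{"content_en":["monism lasagnes beginner ..."],"fileid":"130741557032361573_equilibrating","userid":"123456","category":0}
EOF

Note --data-binary rather than -d: -d strips newlines, which breaks the bulk format.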



On Tuesday, April 21, 2015 at 5:08:56 PM UTC+3, David Pilato wrote:
>
> I’m not saying that you need to send all your data, but send at least one 
> document which is supposed to match.
> Then we can play with your script and try to fix it.
>
>
>
> -- 
> *David Pilato* - Developer | Evangelist 
> *elastic.co*
> @dadoonet | @elasticsearchfr | @scrutmydocs
>
> On 21 Apr 2015, at 14:33, Zaid Amir wrote:
>
> If by data you mean the indexing calls, then I'm afraid they are too big 
> to be of any relevance. Also, I'm not sure what this would help with, since I have no 
> issues with creating, mapping or indexing data. As I said, once I change my 
> fields' 'store' property to false, my queries stop returning hits; setting 
> the 'store' property back to true makes it work again. Both are done with 
> the _source field enabled.
>
> On Tuesday, April 21, 2015 at 3:10:56 PM UTC+3, David Pilato wrote:
>>
>> A full script is close to what you sent.
>>
>> The data is just missing here.
>> Also, could you use a Gist to post it?
>>
>>
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 21 Apr 2015, at 13:54, Zaid Amir wrote:
>>
>> Sorry, not sure what you mean by a recreation script. But here is how my 
>> index is created and mapped along with the search query
>>
>> #Create an Index
>> curl -XPUT 'localhost:9200/files'
>>  
>> #Create index mapping enabling _source and disabling _all fields
>> #Without the store property set to true, no results are returned
>>  
>> curl -XPUT 'localhost:9200/files/rawfiles/_mapping' -d '{
>>   "rawfiles": {
>> "_source": {
>>   "enabled": true
>> },
>> "_all": {
>>   "enabled": false
>> },
>> "_timestamp": {
>>   "enabled": true,
>>   "path": "timestamp"
>> },
>> "properties": {
>>   "fileid": {
>> "store": true,
>> "index": "not_analyzed",
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "userid": {
>> "store": true,
>> "index": "not_analyzed",
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "filesize": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "long"
>>   },
>>   "filename": {
>> "store": true,
>> "omit_norms": true,
>> "index_analyzer": "def_analyzer",
>> "search_analyzer": "def_analyzer_search",
>> "type": "string"
>>   },
>>   "extension": {
>> "analyzer": "keyword",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "modificationdate": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "date"
>>   },
>>   "timestamp": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "date"
>>   },
>>   "category": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "long"
>>   },
>> "content_ukw": {
>> "analyzer": "def_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_br": {
>> "analyzer": "br_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_da": {
>> "analyzer": "da_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_de": {
>> "analyzer": "de_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_en": {
>> "analyzer": "en_analyzer",
>> "store": true,
>> "omit_n

Re: Elasticsearch Version Upgrade

2015-04-21 Thread David Pilato
Only post 1.0


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 22 Apr 2015, at 01:14, Norberto Meijome wrote:
> 
> David, is this the case with older versions (both client and server on 0.90.x 
> versions using java client), and across the 0.90 to 1.x boundary, or only 
> post 1.x?
> 
>> On 22/04/2015 12:03 am, "David Pilato"  wrote:
>> This should work both ways.
>> 
>> The client knows the node's version.
>> The node knows the client's version.
>> 
>> So basically, if one knows it should not send new data because the other 
>> one is too old, it will simply ignore it.
>> Same for reading: if your node is newer, it knows that the client won’t 
>> provide value X or Y, so it won’t try to read it.
>> 
>> 
>> That said, the best thing to do is to test it! :D
>> 
>> 
>> 
>> -- 
>> David Pilato - Developer | Evangelist 
>> elastic.co
>> @dadoonet | @elasticsearchfr | @scrutmydocs
>> 
>> 
>> 
>> 
>> 
>>> On 21 Apr 2015, at 15:39, Costya Regev wrote:
>>> 
>>> Another question: if I upgrade my Elasticsearch client to version 
>>> 1.5.1 and my Elasticsearch servers stay on version 1.4.2, will it work? 
>>> Is there backward compatibility?
>>> 
 On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
 Just checking,
 
 so you are sure that there is forward compatibility, and my system will 
 work fine with an ES client version of 1.4.1 when the server's version is 
 1.5.1, right?
 
> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
> It should work fine.
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
>> On 21 Apr 2015, at 14:08, Costya Regev wrote:
>> 
>> Hi ,
>> 
>> We have Elasticsearch servers running ES version 1.4.2; our client 
>> version is 1.4.1.
>> 
>> We are about to upgrade our ES cluster to version 1.5.1. My question is:
>> 
>> Do we need to upgrade the client version to 1.5.1, or should our current 
>> version be compatible with the new one?
>> 
>> 
>> Thanks,
>> Costya.
>> 


Re: Document getting lost

2015-04-21 Thread Mark Walkom
Simply increasing the depth means more things will queue, but you still
need to catch up on that queue. If you are overloaded then this will never
happen and your queue won't be much help.
Look at the larger picture: are you running out of resources consistently,
or is it transitory?
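
A quick way to check whether that is happening is the thread pool stats; a
sketch using the 1.x _cat API:

# rejected counters that only ever climb mean work arrives faster than it drains
curl 'localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected,index.rejected'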

On 22 April 2015 at 14:08, bvnrwork  wrote:

> Hi ,
>
> Documents are getting lost due to our queue size settings, and we also see ES
> rejection exceptions, as explained in the article below:
>
>
> https://www.loggly.com/blog/nine-tips-configuring-elasticsearch-for-high-performance/
>
> I want to understand: if the right queue size is set, is there no
> chance of document/data loss, or is there another parameter I need to
> configure?
>
>
> Regards,
> Nagaraju
>



Document getting lost

2015-04-21 Thread bvnrwork
Hi ,

Documents are getting lost due to our queue size settings, and we also see ES 
rejection exceptions, as explained in the article below:

https://www.loggly.com/blog/nine-tips-configuring-elasticsearch-for-high-performance/

I want to understand: if the right queue size is set, is there no 
chance of document/data loss, or is there another parameter I need to 
configure?


Regards,
Nagaraju



Re: org.elasticsearch.index.mapper.MapperParsingException: failed to parse - need guidance

2015-04-21 Thread rastro
Note this line:

Caused by: java.lang.NumberFormatException: For input string: "Cached ad is 
better"

What's the mapping on your 'error' field?
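
If 'error' was dynamically mapped from an earlier numeric value, the mapping
will show it as a long, and any later string value will fail exactly like this.
A sketch of how to check, using the index from the log above:

curl 'localhost:9200/logstash-2015.04.21/_mapping?pretty'

If that is the case, one option (a sketch; the template name is made up) is an
index template that forces the field to string for future daily indices --
existing indices keep whatever mapping they already have:

curl -XPUT 'localhost:9200/_template/force_error_string' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "error": { "type": "string" }
      }
    }
  }
}'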


On Tuesday, April 21, 2015 at 2:41:34 PM UTC-7, Tony Chong wrote:
>
>
> Hi,
>
> Using ELK
> ES 1.5.0
> LS 1.5.0rc2
> Kibana 4.0.1
>
> I have read about similar issues but wasn't really sure what the proper 
> way to fix this is. I basically have a bunch of log files written out in 
> nested JSON. The logs can be a combination of various keys and values, 
> sometimes with the value being null (literally), as you can see in my log 
> excerpt below.
>
>
> Two things of note: if I wipe out all my indices and restart the ELK stack, 
> everything works again. This only seems to happen after about 2 to 3 weeks 
> of being online. Second point: this wasn't an issue with logstash 1.4.2 and 
> the corresponding ES version. Any assistance would be awesome. 
>
> Thanks!
>
> Tony
>
>
>
> [2015-04-21 21:19:09,286][DEBUG][action.bulk  ] 
> [elasticsearch03] [logstash-2015.04.21][19] failed to execute bulk item 
> (index) index {[logstash-2015.04.21][requestAds][AUzd2VGHPBf0vCHmQV4j], 
> source[{"country":"BR","region":"18","city":"Curitiba","latitude":null,"longitude":null,"device_language":"pt","browser_user_agent":"VungleDroid/3.3.0","is_sd_card_available":1,"device_make":"motorola","device_model":"XT1033","device_height":1184,"device_width":720,"os_version":"5.0.2","platform":"android","sound_enabled":false,"volume":0,"device_id":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","ifa":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","isu":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","user_age":null,"user_gender":null,"ip_address":"189.123.219.242","connection":"wifi","network_operator":"TIM","pub_app_id":"507686ae771615941001aca5","pub_app_bundle_id":"com.kiloo.subwaysurf","ad_app_id":null,"campaign_id":null,"creative_id":null,"event_id":null,"sleep":-1,"strategy":null,"expiry":null,"post_bundle":null,"video_url":null,"show_close":null,"show_close_incentivized":null,"video_height":null,"video_width":null,"call_to_action_url":null,"call_to_action_destination":null,"countdown":null,"delay":null,"error":"Cached
>  
> ad is 
> better","shouldStream":false,"message":"","is_test":false,"host":"ip-10-155-170-179","level":"info","timestamp":"2015-04-21
>  
> 21:18:21.280","@version":"1","@timestamp":"2015-04-21T21:19:07.996Z","type":"requestAds"}]}
> org.elasticsearch.index.mapper.MapperParsingException: failed to parse 
> [error]
> at 
> org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:410)
> at 
> org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
> at 
> org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
> at 
> org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
> at 
> org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
> at 
> org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:438)
> at 
> org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:432)
> at 
> org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:149)
> at 
> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:515)
> at 
> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:422)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NumberFormatException: For input string: "Cached ad 
> is better"
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:441)
> at java.lang.Long.parseLong(Long.java:483)
> at 
> org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:145)
> at 
> org.elasticsearch.index.mapper.core.LongFieldMapper.innerParseCreateField(LongFieldMapper.java:300)
> at 
> org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:236)
> at 
> org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:400)
> ... 12 more
> [2015-04-21 21:19:09,286][DEBUG][action.bulk  ] 
> [elasticsearch03] [logstash-2015.04.21][8] failed to execute bulk item 
> (index) index {[logstash-2015.04.21][albatross][AUzd2VGOtXgWXEfZvOn3], 
> source[{"log_type":"albatross_vast_error","error":"Cannot have empty 
> VAST","uri":"
> http://api.adfalcon.com/AdRequest/GetAd/?R_SID=a0cdeb3e8f6f4c4aa25b04146e7eb25c&R_F=vast2&R_ADTYPE=v&R_V=api-all-2.2.0&R_SSID=BrainWars%3A%20Com

Re: How to diagnose slow queries every 10 minutes exactly?

2015-04-21 Thread Dave Reed
Ok, I've identified the problem, and it has nothing to do with ES :) It's 
something funky with the network card or its driver. I isolated the problem 
by testing each of my two nodes independently and identified that it was only a 
problem on one of the two. Then I spun up a simple Node.js HTTP listener and was 
able to reproduce the problem with that, thus eliminating ES as even being 
involved.

Still, this thread was very helpful, as I had a few other issues that 
reducing my shard count seems to have resolved.

Thank you so much for your help. ES rocks.

On Tuesday, April 21, 2015 at 2:20:22 PM UTC-7, Dave Reed wrote:
>
> Thanks for the info, but there's no load balancer involved here. No VMs 
> either.. nothing fancy.
>
> On Tuesday, April 21, 2015 at 1:55:18 PM UTC-7, AlexR wrote:
>>
>> It could be entirely unrelated, but if I recall correctly, someone reported 
>> similar regular-interval slowness. It proved to be the load balancer they 
>> used. 
>>
>



Re: Elasticsearch Version Upgrade

2015-04-21 Thread Norberto Meijome
David, is this the case with older versions (both client and server on
0.90.x versions using java client), and across the 0.90 to 1.x boundary, or
only post 1.x?
On 22/04/2015 12:03 am, "David Pilato"  wrote:

> This should work both ways.
>
> The client knows the node's version.
> The node knows the client's version.
>
> So basically, if one knows it should not send new data because the other
> one is too old, it will simply ignore it.
> Same for reading: if your node is newer, it knows that the client won’t
> provide value X or Y, so it won’t try to read it.
>
>
> That said, the best thing to do is to test it! :D
>
>
>
> --
> *David Pilato* - Developer | Evangelist
> *elastic.co*
> @dadoonet | @elasticsearchfr | @scrutmydocs
>
> On 21 Apr 2015, at 15:39, Costya Regev wrote:
>
> Another question: if I upgrade my Elasticsearch client to version
> 1.5.1 and my Elasticsearch servers stay on version 1.4.2, will it work?
> Is there backward compatibility?
>
> On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
>>
>> Just checking,
>>
>> so you are sure that there is forward compatibility, and my system will
>> work fine with an ES client version of 1.4.1 when the server's version is
>> 1.5.1, right?
>>
>> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
>>>
>>> It should work fine.
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>> On 21 Apr 2015, at 14:08, Costya Regev wrote:
>>>
>>> Hi ,
>>>
>>> We have Elasticsearch servers running ES version 1.4.2; our client
>>> version is 1.4.1.
>>>
>>> We are about to upgrade our ES cluster to version 1.5.1. My question is:
>>>
>>> Do we need to upgrade the client version to 1.5.1, or should our current
>>> version be compatible with the new one?
>>>
>>>
>>> Thanks,
>>> Costya.
>>>


Re: Elasticsearch service often goes down or gets killed

2015-04-21 Thread Tony Chong
I have seen these types of issues because the heap size was not big enough. 
It WILL just die and you will not know what happened. 
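
If heap is the suspect, this is where it is usually set for the packaged 1.x
service; a sketch (the value is an example -- common advice is roughly half the
machine's RAM, and not more than ~30 GB):

# /etc/default/elasticsearch on Debian/Ubuntu, /etc/sysconfig/elasticsearch on RPM systems
ES_HEAP_SIZE=4g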



org.elasticsearch.index.mapper.MapperParsingException: failed to parse - need guidance

2015-04-21 Thread Tony Chong

Hi,

Using ELK
ES 1.5.0
LS 1.5.0rc2
Kibana 4.0.1

I have read about similar issues but wasn't really sure what the proper 
way to fix this is. I basically have a bunch of log files written out in 
nested JSON. The logs can be a combination of various keys and values, 
sometimes with the value being null (literally), as you can see in my log 
excerpt below.


Two things of note: if I wipe out all my indices and restart the ELK stack, 
everything works again. This only seems to happen after about 2 to 3 weeks 
of being online. Second point: this wasn't an issue with logstash 1.4.2 and 
the corresponding ES version. Any assistance would be awesome. 

Thanks!

Tony



[2015-04-21 21:19:09,286][DEBUG][action.bulk  ] 
[elasticsearch03] [logstash-2015.04.21][19] failed to execute bulk item 
(index) index {[logstash-2015.04.21][requestAds][AUzd2VGHPBf0vCHmQV4j], 
source[{"country":"BR","region":"18","city":"Curitiba","latitude":null,"longitude":null,"device_language":"pt","browser_user_agent":"VungleDroid/3.3.0","is_sd_card_available":1,"device_make":"motorola","device_model":"XT1033","device_height":1184,"device_width":720,"os_version":"5.0.2","platform":"android","sound_enabled":false,"volume":0,"device_id":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","ifa":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","isu":"5ec495a3-80cb-4682-a0a9-c66c2ca122ea","user_age":null,"user_gender":null,"ip_address":"189.123.219.242","connection":"wifi","network_operator":"TIM","pub_app_id":"507686ae771615941001aca5","pub_app_bundle_id":"com.kiloo.subwaysurf","ad_app_id":null,"campaign_id":null,"creative_id":null,"event_id":null,"sleep":-1,"strategy":null,"expiry":null,"post_bundle":null,"video_url":null,"show_close":null,"show_close_incentivized":null,"video_height":null,"video_width":null,"call_to_action_url":null,"call_to_action_destination":null,"countdown":null,"delay":null,"error":"Cached
 
ad is 
better","shouldStream":false,"message":"","is_test":false,"host":"ip-10-155-170-179","level":"info","timestamp":"2015-04-21
 
21:18:21.280","@version":"1","@timestamp":"2015-04-21T21:19:07.996Z","type":"requestAds"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse 
[error]
at 
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:410)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
at 
org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at 
org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at 
org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:438)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:432)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:149)
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:515)
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:422)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: For input string: "Cached ad is 
better"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:441)
at java.lang.Long.parseLong(Long.java:483)
at 
org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:145)
at 
org.elasticsearch.index.mapper.core.LongFieldMapper.innerParseCreateField(LongFieldMapper.java:300)
at 
org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:236)
at 
org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:400)
... 12 more
[2015-04-21 21:19:09,286][DEBUG][action.bulk  ] 
[elasticsearch03] [logstash-2015.04.21][8] failed to execute bulk item 
(index) index {[logstash-2015.04.21][albatross][AUzd2VGOtXgWXEfZvOn3], 
source[{"log_type":"albatross_vast_error","error":"Cannot have empty 
VAST","uri":"http://api.adfalcon.com/AdRequest/GetAd/?R_SID=a0cdeb3e8f6f4c4aa25b04146e7eb25c&R_F=vast2&R_ADTYPE=v&R_V=api-all-2.2.0&R_SSID=BrainWars%3A%20Competitive%20brain%20training%20game%20Brain%20Wars&R_SSName=BrainWars%3A%20Competitive%20brain%20training%20game%20Brain%20Wars&R_SSUrl=https%3A%2F%2Fitunes.apple.com%2Fus%2Fapp%2Fbrainwars-competitive-brain%2Fid845044428%3Fmt%3D8%26uo%3D4&R_SSMID=jp.co.translimit.brainwars&R_CT=Games&D_UID_IDFA=13f2156e-e269-4d2b-8228-52bda8f0d426&D_UID_DNT=false&R_IP=62.61.173.102&D_UA

Re: How to diagnose slow queries every 10 minutes exactly?

2015-04-21 Thread Dave Reed
Thanks for the info, but there's no load balancer involved here. No VMs 
either.. nothing fancy.

On Tuesday, April 21, 2015 at 1:55:18 PM UTC-7, AlexR wrote:
>
> It could be entirely unrelated, but if I recall correctly, someone reported 
> similar regular-interval slowness. It proved to be the load balancer they 
> used. 
>



Re: How to diagnose slow queries every 10 minutes exactly?

2015-04-21 Thread AlexR
It could be entirely unrelated, but if I recall correctly, someone reported 
similar regular-interval slowness. It proved to be the load balancer they 
used. 



Re: Elasticsearch service often goes down or gets killed

2015-04-21 Thread Mark Walkom
You need to monitor the cluster with something like Marvel, kopf or HQ to
find out what is happening.
ES may die if the cluster is overloaded (think OOM), but you should see
something in the logs on that.
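
When nothing shows up in the ES logs at all, it is often the Linux OOM killer
terminating the JVM from outside the process; a sketch of two quick checks:

# did the kernel kill the java process?
dmesg | grep -i 'killed process'

# how close is the heap to its limit right now?
curl 'localhost:9200/_nodes/stats/jvm?pretty'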

On 20 April 2015 at 22:13, Sébastien Vassaux  wrote:

> Hello!
>
> My webserver is running ubuntu 14.10 with elasticsearch 1.5.0 and java
> 1.7u55
>
> For some reason,* the elasticsearch service often goes down,* resulting
> in my website not being available to my users anymore (using
> FOSElasticaBundle with symfony).
>
> I am using systemctl to restart it automatically, but I would prefer a
> good fix once and for all. I feel the logs I have are not descriptive
> enough.
> Being pretty new to managing a server, I need some help.
>
> Can someone help me figure out the reason for this failure? What are the
> right files I can output here to better understand the issue?
>
> Thanks!
>
> *My systemctl status gives :*
>
> elasticsearch.service - ElasticSearch
>Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled)
>Active: active (running) since Mon 2015-04-20 12:04:24 CEST; 1h 56min
> ago  <- Here it means restarted 1h56 ago. Why did it
> fail in the first place ?
>  Main PID: 9120 (java)
>CGroup: /system.slice/elasticsearch.service
>└─9120 /usr/bin/java -Xms256m -Xmx1g -Djava.awt.headless=true
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingO...
>
> *In my journalctl, I have :*
>
> Apr 18 18:56:19 xx.ovh.net sshd[29397]: error: open /dev/tty failed -
> could not set controlling tty: Permission denied
> Apr 20 13:52:45 xx.ovh.net sshd[9764]: error: open /dev/tty failed -
> could not set controlling tty: Permission denied
>


Re: How to export dashboard and visualization by using elasticdump?

2015-04-21 Thread Mark Walkom
Take a look at https://github.com/taskrabbit/elasticsearch-dump
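
A minimal sketch, assuming Kibana 4 (which keeps dashboards and visualizations
in the .kibana index); the file name and hosts are examples:

# elasticdump is an npm package
npm install -g elasticdump

# export the Kibana objects to a file
elasticdump --input=http://localhost:9200/.kibana --output=kibana-objects.json --type=data

# import them into another cluster
elasticdump --input=kibana-objects.json --output=http://otherhost:9200/.kibana --type=data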

On 21 April 2015 at 23:25, Priya G  wrote:

> Can anyone tell me the steps how to install elasticdump and how to export
> and import dashboards?
>


Re: selecting a server - a single quad socket, or two dual socket

2015-04-21 Thread Mark Walkom
It may make sense to do this; you probably also want to look into running
multiple instances on the host to maximise capacity.

On 21 April 2015 at 19:08, Tzahi jakubovitz  wrote:

> Today we can buy very performant servers at very reasonable price points.
>
> e.g. – the price of two dual socket servers with 512 GB memory is
> comparable to a single quad socket server with 1024 GB (1 TB) memory.
> (Assuming same number of cores and MHz on each CPU)
>
> My gut feeling is that a single quad server will give better performance
> since balancing shards and indexes across servers is simpler – especially
> if a query targets certain shards.
>
> Thanks for your opinion.
>
> Tzahi
>


Re: deploying ElasticSearch to a large memory server

2015-04-21 Thread Mark Walkom
It's definitely reasonable to run multiple instances per physical machine here.
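
A sketch of what that can look like on 1.x, using system property overrides
(node names, paths and ports are made up). Each instance needs its own data
path and ports, and cluster.routing.allocation.same_shard.host is worth a look
so a primary and its replica never land on the same machine:

bin/elasticsearch -d -Des.node.name=node-a -Des.path.data=/data/es-a -Des.http.port=9200 -Des.transport.tcp.port=9300
bin/elasticsearch -d -Des.node.name=node-b -Des.path.data=/data/es-b -Des.http.port=9201 -Des.transport.tcp.port=9301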

On 21 April 2015 at 19:22, Tzahi jakubovitz  wrote:

> Hi all,
> I have a server with 1.5 TB memory.
> I can either use it with a single ES process, or launch few separate
> instances (using either VM, docker, or just different ports on the same
> server OS).
>
> What will be a reasonable number of instances for such a server ?
>
> Thanks,
> Tzahi
>


Re: Rebuilding master node caused data loss

2015-04-21 Thread Mark Walkom
Default is still yes.

What happened in the logs on the data nodes?
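
Separately from the dangling-indices question, a single dedicated master is
itself a single point of failure; the usual guard is three master-eligible
nodes with minimum_master_nodes set to a majority. A sketch of setting it
dynamically on 1.x:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'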

On 22 April 2015 at 00:23, Brian  wrote:

> I have a cluster with 5 data nodes and 1 master node.  I decided to test
> a master node failure, and clearly I misunderstood exactly what is stored
> on the master.  I turned down the VM running the master node, built a
> new one from scratch, and then added it to the cluster as a master.  When
> this came online, I lost all the data that was in the cluster previously, and it
> started making new, clean indexes again.  Now this isn't critical data, this
> is my test setup, but it still confused me.
>
> I have looked into this and it would seem there is a default setting, 
> gateway.local.auto_import_dangled.
> As I understand it, this was put in place for people like me who didn't
> understand what would happen if you lost a master node, and should by
> default have imported the old data from each data node.  If this had
> defaulted to "no" and just deleted the data, I would know exactly what
> happened.  I have looked at my configuration, and I haven't set this to "no",
> and yet the data was deleted.
>
> Can someone clarify if this setting is no longer valid, or if the default
> has been changed and not documented?
>


Re: How to diagnose slow queries every 10 minutes exactly?

2015-04-21 Thread Dave Reed
Ok,

I have deleted everything and restarted the ES service, on both nodes, and 
rebuilt the indexes. I have changed my shard count to 1 -- thank you for 
explaining that, I didn't realize the overhead was as large as it was. I do 
have some indexes that are much larger than others. The largest one has 
around 400k issues. Search performance seems fine with the 1 shard though, 
so I'll leave it at that. I can introduce a per-project shard count setting 
if needed later on.
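
For reference, a sketch of how such a per-index setting is applied (the index
name is made up); the shard count is fixed at creation time, while the replica
count can be changed later:

curl -XPUT 'localhost:9200/project_big' -d '{
  "settings": { "number_of_shards": 2, "number_of_replicas": 0 }
}'

curl -XPUT 'localhost:9200/project_big/_settings' -d '{
  "number_of_replicas": 1
}'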

Results:

My search count issue appears to be resolved, it was probably related to my 
large shard count.

However, the slow queries once every 10 minutes remain. It's even still on 
the same schedule -- always at a wall-clock time ending in 9 minutes, e.g. 
12:29pm. Here is a hot_threads call I ran while the slow searches were 
running (I replaced machine names and IPs with tokens for obscurity):

::: [][TbKES0BpQ---dAP1J9kKKQ][][inet[/:9300]]
   Hot threads at 2015-04-21T19:29:27.841Z, interval=500ms, busiestThreads=3, 
ignoreIdleThreads=true:

::: [][aPof4TmzQlm7Ck7frfGeuA][][inet[/:9300]]
   Hot threads at 2015-04-21T19:29:26.250Z, interval=500ms, busiestThreads=3, 
ignoreIdleThreads=true:
   
0.0% (0s out of 500ms) cpu usage by thread 'Attach Listener'
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot
 unique snapshot



Here is the slow search log for the same timeframe (first # is the time to 
get the response, second # is the .took field from the ES response json)

4/21/2015, 12:29:27 PM  8586 8361
4/21/2015, 12:29:27 PM  9681 9454
4/21/2015, 12:29:27 PM  7685 7457

There's nothing in ES logs after the indexes were initially created, and 
it's just info type things like this:
[2015-04-21 10:48:05,029][INFO ][cluster.metadata ] [] 
updating number_of_replicas to [1] for indices []

It's so interesting that it's every 10 minutes and always at x9 minutes, 
regardless of the fact that I've restarted ES. There still could be an 
external reason for this, but like I said, there's no CPU usage or Disk I/O 
going on during these periods, so it's very strange. I have no leads other 
than these slow search times.

Thanks so much for your help. Any other troubleshooting ideas would be 
greatly appreciated.
-Dave

On Tuesday, April 21, 2015 at 12:12:38 AM UTC-7, David Pilato wrote:
>
> Some notes. You are using defaults. So you have 5 shards per index. 1000 
> primary shards.
> With replicas, it means 1000 shards per Node. Which means 1000 Lucene 
> instances.
>
> First thing to do is IMO to use only one shard per index unless you need 
> more for "big" indices.
>
> Then, have 3 nodes and set minimum_master_nodes to 2. You probably ran 
> into a split brain issue which could explain the difference you are seeing.
> I would probably set replica to 0 and then to 1 again. But if you follow 
> my first advice, you have to reindex, so reindex with 1 shard, 0 replica, 
> then set replica to 1.
>
>
> My 2 cents
>
> David
>
> On 21 Apr 2015, at 08:31, Dave Reed wrote:
>
> The logs are basically empty. There's activity from when I am creating the 
> indexes, but that's about it. Is there a logging level that could be 
> increased? I will run hot_thread as soon as I can in the morning and post 
> results when I can.
>
> I have each index set to 1 replica. If it matters, I first import them 
> with 0 replicas then set it to 1 when they are done. Shard count I will 
> have to check and get back to you, but it's whatever the default would be, 
> we haven't tweaked that. 
>
> I have 200+ indexes because I have 200+ different "projects" represented, 
> and each one has its own set of mappings -- mappings which could collide on 
> name. I originally tried having a single index with each project 
> represented by a Type instead, but a conversation I had on this forum about 
> that led me away from it, due to the fact that two different projects may 
> sometimes have the same field names but with different types. Most searches 
> (99.9%) are done on a per-project basis, and changes to projects can create 
> a need to reindex the project, so having the segregation is nice, lest I 
> have to reimport the entire thing.
>
> In case this is related, I also found that a document was changed and 
> reindexed, but searches would sometimes include it and sometimes not. I 
> could literally just refresh the search over and over again, and it would 
> appear in the results roughly 50% of the time. 0 results, then 0, 0, then 1 
> result, 1, 1, then 0 again, etc.. I was running the search against one of 
> the nodes via _head. The cluster health was green during all of this. That 
> was surprising and something I was going to investigate more on my own, but 
> perhaps these two problems are related.
>
> I'm really hitting the limit of what I know how to troubleshoot with ES, 
> hence I am really hoping for help here :) 
>
>

Re: Scoring - queryNorm differs for documents during one query

2015-04-21 Thread Jakub Neubauer
Just some thoughts: as the queryNorm is calculated from term frequencies, it 
seems to me that it is calculated only from those terms of the query 
that somehow "matched" the document in some clause. So in our example, for the 
first document the terms "a" and "b" were used to calculate the queryNorm, but 
for the second document only the term "b". But this is not what one would 
expect from the documentation! I would expect that _all_ query terms would be 
used in the calculation, to satisfy the statement that queryNorm is fixed for 
all hits.
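
For reference, the definition the discussion hinges on: in Lucene's classic
TF/IDF similarity, queryNorm is (boosts aside)

queryNorm(q) = 1 / sqrt( sum over t in q of idf(t)^2 )

If the sum really ran over all query terms, it would be a per-query constant,
which is what the documentation promises and what the observed scores
contradict.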



Re: Bulk Index from Remote Host

2015-04-21 Thread TB
David and Christopher, thanks for your advice. I did split the files into 
12 MB chunks, which was found to be optimal after testing various sizes.
I wanted to draw from your experience of potential issues with respect to bulk 
indexing from a local host vs. bulk indexing from a remote one.
I chose to bulk index locally.
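
For anyone searching later, a sketch of that kind of chunking (file names and
counts are examples); splitting on an even line count keeps each bulk action
line together with its source line, assuming one source line per action and no
embedded newlines:

# 20,000 lines per chunk; tune until the chunks land near your sweet spot (~12 MB here)
split -l 20000 bulk_data.json chunk_

for f in chunk_*; do
  curl -s -XPOST 'localhost:9200/_bulk' --data-binary @"$f" -o /dev/null
done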

On Monday, April 20, 2015 at 5:40:43 PM UTC-5, TB wrote:
>
> We are planning to bulk insert about 10 GB of data; however, we are being 
> forced to do this from a remote host.
> Is this a good practice? And are there any potential issues I should watch 
> out for?
>
> any advice would be great
>



Re: Index Size and Replica Impact

2015-04-21 Thread TB
Hi, I did not change the default; it was set to the default of 5, 
and the shards were allocated as you mentioned.
My index is more search-intensive than index-intensive, and all nodes are 
configured as master and data nodes.
David, could you point me to a resource on how you derived the shard size 
across all nodes?
As Norberto mentioned below, I now have an understanding of replica shards.

I also did an optimize with max_num_segments=1; how should I choose this? I 
have read advice from experts not to tinker with optimize and the number of 
segments.
If I have a search-intensive app, would it make sense to have 2 shards and 
1 replica for HA?
What would be an optimal way to measure this?
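
To see how the shards and replicas are actually laid out across the three
nodes, a sketch using the _cat API (the index name is made up); each row shows
the shard number, p(rimary) or r(eplica), size, and the node it lives on:

curl 'localhost:9200/_cat/shards/myindex?v'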


On Tuesday, April 21, 2015 at 12:33:56 AM UTC-5, David Pilato wrote:
>
> You don't have to set replicas to 3. It depends on the number of shards 
> you have for your index.
> If you are using default (5), then you probably have today something like:
>
> Node 1 : 4 shards
> Node 2 : 3 shards
> Node 3 : 3 shards
>
> Each shard should be around 600mb size (If using all defaults).
>
> What are your exact index settings today?
>
> David
>
> On 20 Apr 2015, at 23:54, TB wrote:
>
> I have my index size at 6 GB currently, with replicas set to 1.
> I have a 3-node cluster; in order to utilize the cluster, my understanding 
> is that I would have to set the replica count to 3.
> If I do that, would my index size grow beyond 6 GB on each node?
>


Search by 'all values from a document must be contained in query'

2015-04-21 Thread Max Melentiev
Hi!

I'm indexing documents with a list of required features (int[] A), and then I 
want to filter docs by a list of available features (int[] B), so that 
all required features must be in the list of available ones (A & B = A).

I haven't found an appropriate filter for this, and I understand that this 
problem is hard to solve with an inverted index.
I wonder if there is any tricky way to do it? Maybe it's possible to set the 
minimum_should_match of a bool filter to the number of required features 
(A.length), or something else.

Any thoughts will be helpful.

Thanks!
Max 
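
One hedged option on 1.x is a script filter that checks every indexed value
against the query-time list. Field and parameter names here are made up,
dynamic Groovy scripting must be enabled, and it carries the usual
script-filter performance cost; doc values come back as longs, hence the
intValue() before the contains check:

curl -XPOST 'localhost:9200/items/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "doc[\"required_features\"].values.every { available.contains(it.intValue()) }",
          "params": { "available": [1, 2, 3, 5] }
        }
      }
    }
  }
}'

The minimum_should_match idea breaks down because the required count varies per
document; the script effectively reads that count from the document instead.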



Re: Script to return array for scripted metric aggregation from combine

2015-04-21 Thread Colin Goodheart-Smithe
Vineeth,

You can return any standard Groovy object (by this I mean primitives, 
strings, arrays or maps) from the combine script, and it will be passed to 
the reduce script. Below is a Sense recreation script for a more complex 
example which counts the number of occurrences of each word in a field 
(basically a crude version of the terms aggregation). Please note that 
these scripts are for test purposes and should not be used in a production 
environment, not least because they are written in Groovy and require 
dynamic scripting to be enabled.

DELETE test


POST /test/doc/1
{
  "l": 10,
  "s": "ten"
}
POST /test/doc/2
{
  "l": 4,
  "s": "four"
}
POST /test/doc/3
{
  "l": 10,
  "s": "ten"
}
POST /test/doc/4
{
  "l": 7,
  "s": "seven"
}
POST /test/doc/5
{
  "l": 10,
  "s": "ten"
}
POST /test/doc/6
{
  "l": 4,
  "s": "four"
}
POST /test/doc/7
{
  "l": 6,
  "s": "six"
}
POST /test/doc/8
{
  "l": 6,
  "s": "six"
}


# Output of the combine script on each shard is a map with a key for every word 
# and a value for the number of occurrences of that word
GET /test/_search?search_type=count
{
  "aggs": {
"scripted_terms": {
  "scripted_metric": {
"init_script": "_agg['words'] = []",
"map_script": "word = doc['s']; _agg.words.add(word.value)",
"combine_script": "combined = [:]; for (word in _agg.words) { if 
(combined[word]) { combined[word] += 1 } else { combined[word] = 1 } }; 
return combined"
  }
}
  }
}


# The reduce script uses the map from each shard and adds together the values 
# for common keys to produce a final map as output
GET /test/_search?search_type=count
{
  "aggs": {
"scripted_terms": {
  "scripted_metric": {
"init_script": "_agg['words'] = []",
"map_script": "word = doc['s']; _agg.words.add(word.value)",
"combine_script": "combined = [:]; for (word in _agg.words) { if 
(combined[word]) { combined[word] += 1 } else { combined[word] = 1 } }; 
return combined",
"reduce_script": "reduced = [:]; for (a in _aggs) { for (entry in 
a) { word = entry.key; if (reduced[word]) { reduced[word] += entry.value } 
else { reduced[word] = entry.value } } }; return reduced"
  }
}
  }
}

Hope this helps,

Colin

On Tuesday, April 21, 2015 at 4:31:21 PM UTC+1, vineeth mohan wrote:
>
> Hi,
>
> For the scripted metric aggregation, in the example shown in the 
> documentation, the combine script returns a single number.
>
> Instead, can I pass an array or hash here? I tried doing it; though it 
> did not return any error, I am not able to access those values from the 
> reduce script. In the reduce script, per shard I am getting an instance 
> which, when converted to a string, reads as 'Script2$_run_closure1@52ef3bd9'
>
> Kindly let me know if this can be accomplished in any way.
> Thanks
>    Vineeth
>



Script to return array for scripted metric aggregation from combine

2015-04-21 Thread vineeth mohan
Hi,

For the scripted metric aggregation, in the example shown in the
documentation, the combine script returns a single number.

Instead, can I pass an array or hash here? I tried doing it; though it
did not return any error, I am not able to access those values from the
reduce script. In the reduce script, per shard I am getting an instance
which, when converted to a string, reads as 'Script2$_run_closure1@52ef3bd9'

Kindly let me know if this can be accomplished in any way.
Thanks
   Vineeth



Rebuilding master node caused data loss

2015-04-21 Thread Brian
I have a cluster with 5 data nodes and 1 master node.  I decided to test a 
master node failure, and clearly I misunderstood exactly what is stored 
on the master.  I turned down the VM running the master node, built a 
new one from scratch, and then added it to the cluster as a master.  When 
this came online, I lost all the data that was in the cluster previously, and it 
started making new, clean indexes again.  Now this isn't critical data, this 
is my test setup, but it still confused me.

I have looked into this and it would seem there is a default setting, 
gateway.local.auto_import_dangled. 
As I understand it, this was put in place for people like me who didn't 
understand what would happen if you lost a master node, and should by 
default have imported the old data from each data node.  If this had 
defaulted to "no" and just deleted the data, I would know exactly what 
happened.  I have looked at my configuration, and I haven't set this to "no", 
and yet the data was deleted.

Can someone clarify if this setting is no longer valid, or if the default 
has been changed and not documented?



Re: No hits if fields are not stored

2015-04-21 Thread David Pilato
I’m not saying that you need to send all your data, but send at least one document 
which is supposed to match.
Then we can play with your script and try to fix it.



-- 
David Pilato - Developer | Evangelist 
elastic.co
@dadoonet | @elasticsearchfr | @scrutmydocs





> On 21 Apr 2015, at 14:33, Zaid Amir wrote:
> 
> If by data you mean the indexing calls, then I'm afraid they are too big to 
> be of any relevance. Also, I'm not sure what this would help with, since I have no 
> issues with creating, mapping or indexing data. As I said, once I change my 
> fields' 'store' property to false, my queries stop returning hits; setting 
> the 'store' property back to true makes it work again. Both are done with 
> the _source field enabled.
> 
> On Tuesday, April 21, 2015 at 3:10:56 PM UTC+3, David Pilato wrote:
> A full script is close to what you sent.
> 
> The data is just missing here.
> Also, could you use a Gist to post it?
> 
> 
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> On 21 Apr 2015, at 13:54, Zaid Amir wrote:
> 
>> Sorry, not sure what you mean by a recreation script. But here is how my 
>> index is created and mapped along with the search query
>> 
>> #Create an Index
>> curl -XPUT 'localhost:9200/files'
>>  
>> #Create index mapping enabling _source and disabling _all fields
>> #Without the store property set to true, no results are returned
>>  
>> curl -XPUT /files/rawfiles/_mapping -d '{
>>   "rawfiles": {
>> "_source": {
>>   "enabled": true
>> },
>> "_all": {
>>   "enabled": false
>> },
>> "_timestamp": {
>>   "enabled": true,
>>   "path": "timestamp"
>> },
>> "properties": {
>>   "fileid": {
>> "store": true,
>> "index": "not_analyzed",
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "userid": {
>> "store": true,
>> "index": "not_analyzed",
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "filesize": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "long"
>>   },
>>   "filename": {
>> "store": true,
>> "omit_norms": true,
>> "index_analyzer": "def_analyzer",
>> "search_analyzer": "def_analyzer_search",
>> "type": "string"
>>   },
>>   "extension": {
>> "analyzer": "keyword",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "modificationdate": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "date"
>>   },
>>   "timestamp": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "date"
>>   },
>>   "category": {
>> "store": true,
>> "index": "not_analyzed",
>> "type": "long"
>>   },
>> "content_ukw": {
>> "analyzer": "def_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_br": {
>> "analyzer": "br_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_da": {
>> "analyzer": "da_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_de": {
>> "analyzer": "de_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_en": {
>> "analyzer": "en_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   },
>>   "content_es": {
>> "analyzer": "es_analyzer",
>> "store": true,
>> "omit_norms": true,
>> "type": "string"
>>   }
>> }
>>   }
>> }'
>>  
>> #After indexing some data, do a search query
>> curl -XPOST /files/rawfiles/_search -d '{
>>   "from": 0,
>>   "size": 50,
>>   "sort": [
>> {
>>   "_score": {
>> "order": "desc"
>>   }
>> }
>>   ],
>>   "fields": [
>> "filename",
>> "filesize"
>>   ],
>>   "aggs": {
>> "extension": {
>>   "terms": {
>> "field": "extension"
>>   }
>> },
>> "pagecount": {
>>   "terms": {
>> "field": "pagecount"
>>   }
>> },
>> "dates": {
>>   "date_range": {
>> "field": "modificationdate",
>> "format": "MM-",
>> "ranges": [
>>   {
>> "from": "now-1M/M",
>> "to": "now",
>> "key": "one_month"
>>   },
>>   {
>> "from": "now-3M/M",
>> "to": "now-1M/M",
>> "key": "three_months"
>>   },
>>   {
>> "from": "0",
>> "to": "now-3M/M",
>> "key": 

Re: Elasticsearch Version Upgrade

2015-04-21 Thread David Pilato
This should work both ways.

The client knows the node's version.
The node knows the client's version.

So basically, if one side knows it should not send a newer piece of data because 
the other side is too old, it will simply not send it.
The same goes for reading: if your node is newer, it knows that the client won’t 
provide value X or Y, so it won’t try to read it.


That said, the best thing to do is to test it! :D
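
A quick, hedged way to check what each side is actually running (host and port 
are placeholders):

# The server reports its version at the root endpoint
curl -s 'http://localhost:9200/' | grep number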



-- 
David Pilato - Developer | Evangelist 
elastic.co
@dadoonet  | @elasticsearchfr 
 | @scrutmydocs 






> On 21 Apr 2015, at 15:39, Costya Regev wrote:
> 
> Another question: if I upgrade my Elasticsearch client to version 1.5.1 
> and my Elasticsearch servers stay on version 1.4.2, will it work? Is 
> there backward compatibility?
> 
> On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
> Just checking ,
> 
> so you are sure that there is forward compatibility... and my system will 
> work fine with Es Client version of 1.4.1 when the server's version will be 
> 1.5.1 , right ?
> 
> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
> It should work fine.
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> On 21 Apr 2015, at 14:08, Costya Regev wrote:
> 
>> Hi ,
>> 
>> We have Elasticsearch Servers running with Es Version 1.4.2,our client 
>> version is 1.4.1.
>> 
>> We are about to upgrade our Es cluster Version to 1.5.1 , my question is :
>> 
>> Do we need to upgrade the client version to 1.5.1 or our current version 
>> should be compatible with the new Version?
>> 
>> 
>> Thanks,
>> Costya.
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com
>>  
>> .
>> For more options, visit https://groups.google.com/d/optout 
>> .
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/6bd32750-ed76-457a-ac79-24e810b689be%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/B456AA03-5B37-4E04-9BD0-DE472DFB4AF2%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch Version Upgrade

2015-04-21 Thread Costya Regev
Another question: if I upgrade my Elasticsearch client to version 
1.5.1 and my Elasticsearch servers stay on version 1.4.2, will it work? 
Is there backward compatibility?

On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
>
> Just checking ,
>
> so you are sure that there is forward compatibility... and my system will 
> work fine with Es Client version of 1.4.1 when the server's version will be 
> 1.5.1 , right ?
>
> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
>>
>> It should work fine.
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 21 Apr 2015, at 14:08, Costya Regev wrote:
>>
>> Hi ,
>>
>> We have Elasticsearch Servers running with Es Version 1.4.2,our client 
>> version is 1.4.1.
>>
>> We are about to upgrade our Es cluster Version to 1.5.1 , my question is :
>>
>> Do we need to upgrade the client version to 1.5.1 or our current version 
>> should be compatible with the new Version?
>>
>>
>> Thanks,
>> Costya.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6bd32750-ed76-457a-ac79-24e810b689be%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Applicative version type mapping

2015-04-21 Thread Yarden Bar
Hi there,

I was looking for a type mapping for application versions. 
Meaning that the following series 
["3.9.1", "4.0.0", "2.5.3.1.alpha", "6.3.1.beta"] can be 
queried/searched/sorted?

Does ES have support for that?

Thanks,
Yarden
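
As far as I know there is no built-in version type in 1.x. A common workaround 
is to index a zero-padded sort key next to the raw string; a hedged sketch in 
which the index, type and field names are invented, and version_sort is assumed 
to be mapped as a not_analyzed string:

curl -XPUT 'localhost:9200/apps/app/1' -d '{
  "version": "3.9.1",
  "version_sort": "0003.0009.0001"
}'

# Lexicographic order on the padded key now matches numeric version order
curl -XPOST 'localhost:9200/apps/app/_search' -d '{
  "sort": [ { "version_sort": "asc" } ]
}'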

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9aa3a12f-96e1-41b9-ad8d-b4bf32f45d87%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


How to export dashboard and visualization by using elasticdump?

2015-04-21 Thread Priya G
Can anyone tell me the steps to install elasticdump and how to export 
and import dashboards?
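
A hedged sketch, assuming a Kibana 4 setup where dashboards and visualizations 
live in the .kibana index (the index name and hosts are assumptions; adjust them 
to your environment):

# elasticdump is an npm package (requires node.js)
npm install -g elasticdump

# Export the Kibana objects to a file
elasticdump --input=http://localhost:9200/.kibana --output=kibana-export.json --type=data

# Import them into another cluster
elasticdump --input=kibana-export.json --output=http://other-host:9200/.kibana --type=data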

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/aac98611-7dfa-4cf1-bcab-bd1f43816e44%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch Version Upgrade

2015-04-21 Thread Costya Regev
Just checking,

so you are sure that there is forward compatibility... and my system will 
work fine with an ES client version of 1.4.1 when the server's version is 
1.5.1, right?

On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
>
> It should work fine.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 21 Apr 2015, at 14:08, Costya Regev wrote:
>
> Hi ,
>
> We have Elasticsearch Servers running with Es Version 1.4.2,our client 
> version is 1.4.1.
>
> We are about to upgrade our Es cluster Version to 1.5.1 , my question is :
>
> Do we need to upgrade the client version to 1.5.1 or our current version 
> should be compatible with the new Version?
>
>
> Thanks,
> Costya.
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/96b56559-f0fb-481d-a2d8-a87dba56b2f5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Creating Snapshot Repository on Windows cluster

2015-04-21 Thread Sam Judson
Hi

Thanks for the reply. Unfortunately I've tried all of those things already. 
As I say, I've got the permissions wide open (Full Control for Everyone) for 
both the share and the file system underneath.

You have to escape the UNC path in JSON; I've tried double escaping etc. 
with no luck. I've tried the following:

file:gbr-t-ess-003/Snapshots/backup2
file://gbr-t-ess-003/Snapshots/backup2
gbr-t-ess-003/Snapshots/backup2
//gbr-t-ess-003/Snapshots/backup2
file:gbr-t-ess-003//Snapshots//backup2
gbr-t-ess-003\\Snapshots\\backup2
gbr-t-ess-003\Snapshots\backup2

I've read the Eclipse page on UNC paths 
here: http://wiki.eclipse.org/Eclipse/UNC_Paths

None of which has got me any closer to getting this to work unfortunately.

The full error if it helps:

{
   "error": "RepositoryException[[main_backup] failed to create 
repository]; nested: CreationException[Guice creation errors:\r\n\r\n1) 
Error injecting constructor, 
org.elasticsearch.common.blobstore.BlobStoreException: Failed to create 
directory at [gbr-t-ess-003\\Snapshots\\backup2]\r\n  at 
org.elasticsearch.repositories.fs.FsRepository.<init>(Unknown Source)\r\n 
 while locating org.elasticsearch.repositories.fs.FsRepository\r\n  while 
locating org.elasticsearch.repositories.Repository\r\n\r\n1 error]; nested: 
BlobStoreException[Failed to create directory at 
[gbr-t-ess-003\\Snapshots\\backup2]]; ",
   "status": 500
}

Sam
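
Two hedged suggestions rather than a confirmed fix: in JSON the leading \\ of a 
UNC path has to be escaped as well, so the location needs four leading 
backslashes; and the Windows account running the Elasticsearch service must 
itself have rights on the share (Local System typically cannot reach network 
shares). A sketch of the registration with that escaping:

PUT _snapshot/main_backup
{
  "type": "fs",
  "settings": {
    "location": "\\\\gbr-t-ess-003\\Snapshots\\backup2",
    "compress": true
  }
}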

On Tuesday, 21 April 2015 12:43:47 UTC+1, deepak.chauhan wrote:
>
> the problem might be in your location syntax . 
> please change the location path to 
>  "location" ;" \\gbr-t-ess-003\\Snapshots\\backup2\\"
>
> On Tue, Apr 21, 2015 at 5:04 PM, Deepak Chauhan  > wrote:
>
>> this problem may be due to the reason that the directory you are 
>> providing is not accepting Binary data 
>> you can try by changing the directory and provide read-write permissions 
>> to the directory
>>
>> On Tue, Apr 21, 2015 at 4:40 PM, Sam Judson > > wrote:
>>
>>> Sorry to bump, but anyone have any idea on this one, as I'm stumped.
>>>
>>> Does anyone need any more information?
>>>
>>> Sam
>>>
>>> On Monday, 20 April 2015 09:57:42 UTC+1, Sam Judson wrote:

 Hi

 I'm having some trouble creating a snapshot repository on a cluster 
 running on Windows.

 PUT _snapshot/main_backup
 {
   "type": "fs",
   "settings": {
 "location": "gbr-t-ess-003\\Snapshots\\backup2\\",
 "compress": true
   }
 }

 The above fails with a BlobStoreException: Failed to create directory.

 I've set the Share to Everyone Read/Write access to hopefully get this 
 working but still no good.

 I've tried creating symbolic links on each machine and using local 
 directory path, but no luck either.

 Anyone got this working?

 Sam

>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com .
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/26f31ff5-d81c-47d3-8cea-94bdc5da65f6%40googlegroups.com
>>>  
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8b8d07c7-2d0e-4aa5-854a-e1ac6f56ff0e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: No hits if fields are not stored

2015-04-21 Thread Zaid Amir
If by data you mean the indexing calls, then I'm afraid they are too big to 
be of any relevance. Also, I am not sure how this could help, since I have no 
issues with creating, mapping or indexing data. As I said, what happens is 
that once I change my fields' 'store' property to false, my queries stop 
returning hits. Setting the 'store' property for the fields back to true 
makes it work again. Both are done with the _source field enabled.

On Tuesday, April 21, 2015 at 3:10:56 PM UTC+3, David Pilato wrote:
>
> A full script is close to what you sent.
>
> Data are just missing here.
> Also, could you use GIST to post it?
>
>
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 21 Apr 2015, at 13:54, Zaid Amir wrote:
>
> Sorry, not sure what you mean by a recreation script. But here is how my 
> index is created and mapped along with the search query
>
> #Create an Index
> curl -XPUT 'localhost:9200/files'
>  
> #Create index mapping enabling _source and disabling _all fields
> #Without the store property set to true, no results are returned
>  
> curl -XPUT /files/rawfiles/_mapping -d '{
>   "rawfiles": {
> "_source": {
>   "enabled": true
> },
> "_all": {
>   "enabled": false
> },
> "_timestamp": {
>   "enabled": true,
>   "path": "timestamp"
> },
> "properties": {
>   "fileid": {
> "store": true,
> "index": "not_analyzed",
> "omit_norms": true,
> "type": "string"
>   },
>   "userid": {
> "store": true,
> "index": "not_analyzed",
> "omit_norms": true,
> "type": "string"
>   },
>   "filesize": {
> "store": true,
> "index": "not_analyzed",
> "type": "long"
>   },
>   "filename": {
> "store": true,
> "omit_norms": true,
> "index_analyzer": "def_analyzer",
> "search_analyzer": "def_analyzer_search",
> "type": "string"
>   },
>   "extension": {
> "analyzer": "keyword",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "modificationdate": {
> "store": true,
> "index": "not_analyzed",
> "type": "date"
>   },
>   "timestamp": {
> "store": true,
> "index": "not_analyzed",
> "type": "date"
>   },
>   "category": {
> "store": true,
> "index": "not_analyzed",
> "type": "long"
>   },
>  "content_ukw": {
> "analyzer": "def_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_br": {
> "analyzer": "br_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_da": {
> "analyzer": "da_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_de": {
> "analyzer": "de_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_en": {
> "analyzer": "en_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_es": {
> "analyzer": "es_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   }
> }
>   }
> }'
>  
> #After indexing some data, do a search query
> curl -XPOST /files/rawfiles/_search -d '{
>   "from": 0,
>   "size": 50,
>   "sort": [
> {
>   "_score": {
> "order": "desc"
>   }
> }
>   ],
>   "fields": [
> "filename",
> "filesize"
>   ],
>   "aggs": {
> "extension": {
>   "terms": {
> "field": "extension"
>   }
> },
> "pagecount": {
>   "terms": {
> "field": "pagecount"
>   }
> },
> "dates": {
>   "date_range": {
> "field": "modificationdate",
> "format": "MM-",
> "ranges": [
>   {
> "from": "now-1M/M",
> "to": "now",
> "key": "one_month"
>   },
>   {
> "from": "now-3M/M",
> "to": "now-1M/M",
> "key": "three_months"
>   },
>   {
> "from": "0",
> "to": "now-3M/M",
> "key": "old"
>   }
> ]
>   }
> }
>   },
>   "query": {
> "multi_match": {
>   "query": "retracted",
>   "fields": [
> "filename^3",
> "content_*^2",
> "content_ukw"
>   ]
> }
>   }
> }'
>  
>  
> #No hits will be returned without setting the store property to true for 
> filename and content fields
>
>
> On Tuesday, April 21, 2015 at 1:59:33 PM UTC+3, David Pilato wrote:
>>
>> I don’t understand. Could you GIST a full recreation scripts which 
>> demonstrate what you are seeing?
>>
>>
>>
>> -- 
>> *David Pilato* - Developer | Evangelist 

Re: Elasticsearch Version Upgrade

2015-04-21 Thread David Pilato
It should work fine.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 21 Apr 2015, at 14:08, Costya Regev wrote:
> 
> Hi ,
> 
> We have Elasticsearch Servers running with Es Version 1.4.2,our client 
> version is 1.4.1.
> 
> We are about to upgrade our Es cluster Version to 1.5.1 , my question is :
> 
> Do we need to upgrade the client version to 1.5.1 or our current version 
> should be compatible with the new Version?
> 
> 
> Thanks,
> Costya.
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/D1278DA3-D71F-4683-BC6A-5EA10D467D17%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: No hits if fields are not stored

2015-04-21 Thread David Pilato
A full script is close to what you sent.

Data are just missing here.
Also, could you use GIST to post it?



--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 21 Apr 2015, at 13:54, Zaid Amir wrote:
> 
> Sorry, not sure what you mean by a recreation script. But here is how my 
> index is created and mapped along with the search query
> 
> #Create an Index
> curl -XPUT 'localhost:9200/files'
>  
> #Create index mapping enabling _source and disabling _all fields
> #Without the store property set to true, no results are returned
>  
> curl -XPUT /files/rawfiles/_mapping -d '{
>   "rawfiles": {
> "_source": {
>   "enabled": true
> },
> "_all": {
>   "enabled": false
> },
> "_timestamp": {
>   "enabled": true,
>   "path": "timestamp"
> },
> "properties": {
>   "fileid": {
> "store": true,
> "index": "not_analyzed",
> "omit_norms": true,
> "type": "string"
>   },
>   "userid": {
> "store": true,
> "index": "not_analyzed",
> "omit_norms": true,
> "type": "string"
>   },
>   "filesize": {
> "store": true,
> "index": "not_analyzed",
> "type": "long"
>   },
>   "filename": {
> "store": true,
> "omit_norms": true,
> "index_analyzer": "def_analyzer",
> "search_analyzer": "def_analyzer_search",
> "type": "string"
>   },
>   "extension": {
> "analyzer": "keyword",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "modificationdate": {
> "store": true,
> "index": "not_analyzed",
> "type": "date"
>   },
>   "timestamp": {
> "store": true,
> "index": "not_analyzed",
> "type": "date"
>   },
>   "category": {
> "store": true,
> "index": "not_analyzed",
> "type": "long"
>   },
>  "content_ukw": {
> "analyzer": "def_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_br": {
> "analyzer": "br_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_da": {
> "analyzer": "da_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_de": {
> "analyzer": "de_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_en": {
> "analyzer": "en_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   },
>   "content_es": {
> "analyzer": "es_analyzer",
> "store": true,
> "omit_norms": true,
> "type": "string"
>   }
> }
>   }
> }'
>  
> #After indexing some data, do a search query
> curl -XPOST /files/rawfiles/_search -d '{
>   "from": 0,
>   "size": 50,
>   "sort": [
> {
>   "_score": {
> "order": "desc"
>   }
> }
>   ],
>   "fields": [
> "filename",
> "filesize"
>   ],
>   "aggs": {
> "extension": {
>   "terms": {
> "field": "extension"
>   }
> },
> "pagecount": {
>   "terms": {
> "field": "pagecount"
>   }
> },
> "dates": {
>   "date_range": {
> "field": "modificationdate",
> "format": "MM-",
> "ranges": [
>   {
> "from": "now-1M/M",
> "to": "now",
> "key": "one_month"
>   },
>   {
> "from": "now-3M/M",
> "to": "now-1M/M",
> "key": "three_months"
>   },
>   {
> "from": "0",
> "to": "now-3M/M",
> "key": "old"
>   }
> ]
>   }
> }
>   },
>   "query": {
> "multi_match": {
>   "query": "retracted",
>   "fields": [
> "filename^3",
> "content_*^2",
> "content_ukw"
>   ]
> }
>   }
> }'
>  
>  
> #No hits will be returned without setting the store property to true for 
> filename and content fields
> 
>> On Tuesday, April 21, 2015 at 1:59:33 PM UTC+3, David Pilato wrote:
>> I don’t understand. Could you GIST a full recreation scripts which 
>> demonstrate what you are seeing?
>> 
>> 
>> 
>> -- 
>> David Pilato - Developer | Evangelist 
>> elastic.co
>> @dadoonet | @elasticsearchfr | @scrutmydocs
>> 
>> 
>> 
>> 
>> 
>>> On 21 Apr 2015, at 12:50, Zaid Amir wrote:
>>> 
>>> Hi,
>>> 
>>> I am having issues with ES. I have configured ES to store the _source 
>>> field, however when I query I do not get any hits unless I "store" the 
>>> fields that I want to query. This is what my query request looks like:
>>> 
>>> {
>>>   "from": 0,
>>>   "size": 50,
>>>   "sort": [
>>> {
>>>   "_score": {
>>> "order": "desc"
>>>   }
>>> }
>>>   ],

Elasticsearch Version Upgrade

2015-04-21 Thread Costya Regev
Hi ,

We have Elasticsearch servers running ES version 1.4.2; our client 
version is 1.4.1.

We are about to upgrade our ES cluster version to 1.5.1. My question is:

Do we need to upgrade the client version to 1.5.1, or should our current 
version be compatible with the new version?


Thanks,
Costya.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: No hits if fields are not stored

2015-04-21 Thread Zaid Amir
Sorry, not sure what you mean by a recreation script. But here is how my 
index is created and mapped along with the search query

#Create an Index
curl -XPUT 'localhost:9200/files'
 
#Create index mapping enabling _source and disabling _all fields
#Without the store property set to true, no results are returned
 
curl -XPUT /files/rawfiles/_mapping -d '{
  "rawfiles": {
"_source": {
  "enabled": true
},
"_all": {
  "enabled": false
},
"_timestamp": {
  "enabled": true,
  "path": "timestamp"
},
"properties": {
  "fileid": {
"store": true,
"index": "not_analyzed",
"omit_norms": true,
"type": "string"
  },
  "userid": {
"store": true,
"index": "not_analyzed",
"omit_norms": true,
"type": "string"
  },
  "filesize": {
"store": true,
"index": "not_analyzed",
"type": "long"
  },
  "filename": {
"store": true,
"omit_norms": true,
"index_analyzer": "def_analyzer",
"search_analyzer": "def_analyzer_search",
"type": "string"
  },
  "extension": {
"analyzer": "keyword",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "modificationdate": {
"store": true,
"index": "not_analyzed",
"type": "date"
  },
  "timestamp": {
"store": true,
"index": "not_analyzed",
"type": "date"
  },
  "category": {
"store": true,
"index": "not_analyzed",
"type": "long"
  },
   "content_ukw": {
"analyzer": "def_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "content_br": {
"analyzer": "br_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "content_da": {
"analyzer": "da_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "content_de": {
"analyzer": "de_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "content_en": {
"analyzer": "en_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  },
  "content_es": {
"analyzer": "es_analyzer",
"store": true,
"omit_norms": true,
"type": "string"
  }
}
  }
}'
 
#After indexing some data, do a search query
curl -XPOST /files/rawfiles/_search -d '{
  "from": 0,
  "size": 50,
  "sort": [
{
  "_score": {
"order": "desc"
  }
}
  ],
  "fields": [
"filename",
"filesize"
  ],
  "aggs": {
"extension": {
  "terms": {
"field": "extension"
  }
},
"pagecount": {
  "terms": {
"field": "pagecount"
  }
},
"dates": {
  "date_range": {
"field": "modificationdate",
"format": "MM-",
"ranges": [
  {
"from": "now-1M/M",
"to": "now",
"key": "one_month"
  },
  {
"from": "now-3M/M",
"to": "now-1M/M",
"key": "three_months"
  },
  {
"from": "0",
"to": "now-3M/M",
"key": "old"
  }
]
  }
}
  },
  "query": {
"multi_match": {
  "query": "retracted",
  "fields": [
"filename^3",
"content_*^2",
"content_ukw"
  ]
}
  }
}'
 
 
#No hits will be returned without setting the store property to true for 
filename and content fields
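
A hedged way to narrow this down before changing mappings: validate a simple 
query against the index, and ask Elasticsearch to explain one specific document 
that should match (SOME_DOC_ID is a placeholder for a real id from your data):

# Check how a query parses and rewrites against this index
curl -XGET 'localhost:9200/files/rawfiles/_validate/query?explain=true&q=content_en:retracted'

# Explain why one known document does or does not match the multi_match
curl -XGET 'localhost:9200/files/rawfiles/SOME_DOC_ID/_explain' -d '{
  "query": {
    "multi_match": {
      "query": "retracted",
      "fields": [ "filename^3", "content_*^2", "content_ukw" ]
    }
  }
}'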


On Tuesday, April 21, 2015 at 1:59:33 PM UTC+3, David Pilato wrote:
>
> I don’t understand. Could you GIST a full recreation scripts which 
> demonstrate what you are seeing?
>
>
>
> -- 
> *David Pilato* - Developer | Evangelist 
> *elastic.co *
> @dadoonet  | @elasticsearchfr 
>  | @scrutmydocs 
> 
>
>
>
>
>  
> On 21 Apr 2015, at 12:50, Zaid Amir wrote:
>
> Hi,
>
> I am having issues with ES. I have configured ES to store the _source 
> field, however when I query I do not get any hits unless I "store" the 
> fields that I want to query. This is how my query request looks like:
>
> {
>   "from": 0,
>   "size": 50,
>   "sort": [
> {
>   "_score": {
> "order": "desc"
>   }
> }
>   ],
>   "fields": [
> "filename",
> "filesize"
>   ],
>   "aggs": {
> "extension": {
>   "terms": {
> "field": "extension"
>   }
> },
> "pagecount": {
>   "terms": {
> "field": "pagecount"
>   }
> },
> "dates": {
>   "date_range": {
> "field": "modificationdate",
> "format": "MM-",
> "ranges": [
>   {
> "from": "now-1M/M",
> "to": "now",
> "key": "one_month"
>  

Re: Creating Snapshot Repository on Windows cluster

2015-04-21 Thread Deepak Chauhan
The problem might be in your location syntax.
Please change the location path to
 "location": "\\gbr-t-ess-003\\Snapshots\\backup2\\"

On Tue, Apr 21, 2015 at 5:04 PM, Deepak Chauhan <
deepak.chau...@daffodilsw.com> wrote:

> this problem may be due to the reason that the directory you are providing
> is not accepting Binary data
> you can try by changing the directory and provide read-write permissions
> to the directory
>
> On Tue, Apr 21, 2015 at 4:40 PM, Sam Judson  wrote:
>
>> Sorry to bump, but anyone have any idea on this one, as I'm stumped.
>>
>> Does anyone need any more information?
>>
>> Sam
>>
>> On Monday, 20 April 2015 09:57:42 UTC+1, Sam Judson wrote:
>>>
>>> Hi
>>>
>>> I'm having some trouble creating a snapshot repository on a cluster
>>> running on Windows.
>>>
>>> PUT _snapshot/main_backup
>>> {
>>>   "type": "fs",
>>>   "settings": {
>>> "location": "gbr-t-ess-003\\Snapshots\\backup2\\",
>>> "compress": true
>>>   }
>>> }
>>>
>>> The above fails with a BlobStoreException: Failed to create directory.
>>>
>>> I've set the Share to Everyone Read/Write access to hopefully get this
>>> working but still no good.
>>>
>>> I've tried creating symbolic links on each machine and using local
>>> directory path, but no luck either.
>>>
>>> Anyone got this working?
>>>
>>> Sam
>>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/26f31ff5-d81c-47d3-8cea-94bdc5da65f6%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHzCMprNTnaD2iSMqKQCKrSH0Fnwhwk4cYu4-d-%3DV%3DMUk0_4pw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Creating Snapshot Repository on Windows cluster

2015-04-21 Thread Deepak Chauhan
This problem may be because the directory you are providing
is not accepting binary data.
You can try changing the directory and providing read-write permissions to
the directory.

On Tue, Apr 21, 2015 at 4:40 PM, Sam Judson  wrote:

> Sorry to bump, but anyone have any idea on this one, as I'm stumped.
>
> Does anyone need any more information?
>
> Sam
>
> On Monday, 20 April 2015 09:57:42 UTC+1, Sam Judson wrote:
>>
>> Hi
>>
>> I'm having some trouble creating a snapshot repository on a cluster
>> running on Windows.
>>
>> PUT _snapshot/main_backup
>> {
>>   "type": "fs",
>>   "settings": {
>> "location": "gbr-t-ess-003\\Snapshots\\backup2\\",
>> "compress": true
>>   }
>> }
>>
>> The above fails with a BlobStoreException: Failed to create directory.
>>
>> I've set the Share to Everyone Read/Write access to hopefully get this
>> working but still no good.
>>
>> I've tried creating symbolic links on each machine and using local
>> directory path, but no luck either.
>>
>> Anyone got this working?
>>
>> Sam
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/26f31ff5-d81c-47d3-8cea-94bdc5da65f6%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHzCMpr2PS4%2BbmnNZO6iojkpzZFSEa8c7oVha-8HEcnt_BK5Gw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Creating Snapshot Repository on Windows cluster

2015-04-21 Thread Sam Judson
Sorry to bump, but does anyone have any idea on this one? I'm stumped.

Does anyone need any more information?

Sam

On Monday, 20 April 2015 09:57:42 UTC+1, Sam Judson wrote:
>
> Hi
>
> I'm having some trouble creating a snapshot repository on a cluster 
> running on Windows.
>
> PUT _snapshot/main_backup
> {
>   "type": "fs",
>   "settings": {
> "location": "gbr-t-ess-003\\Snapshots\\backup2\\",
> "compress": true
>   }
> }
>
> The above fails with a BlobStoreException: Failed to create directory.
>
> I've set the Share to Everyone Read/Write access to hopefully get this 
> working but still no good.
>
> I've tried creating symbolic links on each machine and using local 
> directory path, but no luck either.
>
> Anyone got this working?
>
> Sam
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/26f31ff5-d81c-47d3-8cea-94bdc5da65f6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: No hits if fields are not stored

2015-04-21 Thread David Pilato
I don’t understand. Could you GIST a full recreation script which demonstrates 
what you are seeing?



-- 
David Pilato - Developer | Evangelist 
elastic.co
@dadoonet  | @elasticsearchfr 
 | @scrutmydocs 






> On 21 Apr 2015, at 12:50, Zaid Amir wrote:
> 
> Hi,
> 
> I am having issues with ES. I have configured ES to store the _source field, 
> however when I query I do not get any hits unless I "store" the fields that I 
> want to query. This is what my query request looks like:
> 
> {
>   "from": 0,
>   "size": 50,
>   "sort": [
> {
>   "_score": {
> "order": "desc"
>   }
> }
>   ],
>   "fields": [
> "filename",
> "filesize"
>   ],
>   "aggs": {
> "extension": {
>   "terms": {
> "field": "extension"
>   }
> },
> "pagecount": {
>   "terms": {
> "field": "pagecount"
>   }
> },
> "dates": {
>   "date_range": {
> "field": "modificationdate",
> "format": "MM-",
> "ranges": [
>   {
> "from": "now-1M/M",
> "to": "now",
> "key": "one_month"
>   },
>   {
> "from": "now-3M/M",
> "to": "now-1M/M",
> "key": "three_months"
>   },
>   {
> "from": "0",
> "to": "now-3M/M",
> "key": "old"
>   }
> ]
>   }
> }
>   },
>   "query": {
> "multi_match": {
>   "query": "some string",
>   "fields": [
> "filename^3",
> "content_*^2",
> "content_ukw"
>   ]
> }
>   }
> }
> 
> 
> Unfortunately without storing the filename and the content_ fields, I do not 
> get any results and ES returns no hits. 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/4aeb3736-ba27-4934-b655-15b53d14b8a1%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2B522B56-1928-48E2-89BA-15792D5E1B33%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


No hits if fields are not stored

2015-04-21 Thread Zaid Amir
Hi,

I am having issues with ES. I have configured ES to store the _source 
field, however when I query I do not get any hits unless I "store" the 
fields that I want to query. This is what my query request looks like:

{
  "from": 0,
  "size": 50,
  "sort": [
{
  "_score": {
"order": "desc"
  }
}
  ],
  "fields": [
"filename",
"filesize"
  ],
  "aggs": {
"extension": {
  "terms": {
"field": "extension"
  }
},
"pagecount": {
  "terms": {
"field": "pagecount"
  }
},
"dates": {
  "date_range": {
"field": "modificationdate",
"format": "MM-",
"ranges": [
  {
"from": "now-1M/M",
"to": "now",
"key": "one_month"
  },
  {
"from": "now-3M/M",
"to": "now-1M/M",
"key": "three_months"
  },
  {
"from": "0",
"to": "now-3M/M",
"key": "old"
  }
]
  }
}
  },
  "query": {
"multi_match": {
  "query": "some string",
  "fields": [
"filename^3",
"content_*^2",
"content_ukw"
  ]
}
  }
}


Unfortunately without storing the filename and the content_ fields, I do 
not get any results and ES returns no hits. 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4aeb3736-ba27-4934-b655-15b53d14b8a1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


about org.elasticsearch.indices.recovery.RecoveryFailedException error

2015-04-21 Thread guoyiqincn
*Hi All,*

*I have a 5 nodes cluster. *


*now my cluster eror *

[2015-04-21 18:32:26,032][WARN ][indices.cluster  ] [i-bxtszyhz] 
[blacklist][1] failed to start shard
org.elasticsearch.indices.recovery.RecoveryFailedException: [blacklist][1]: 
Recovery failed from 
[i-5tar85fu][9ofd50CbQmiYwcdppvAYVw][i-5tar85fu][inet[/192.168.101.53:9300]] 
into 
[i-bxtszyhz][AF98fGO3RCG0gta1LKrd-A][i-bxtszyhz][inet[/192.168.101.54:9300]]
at 
org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:308)
at 
org.elasticsearch.indices.recovery.RecoveryTarget.access$200(RecoveryTarget.java:65)
at 
org.elasticsearch.indices.recovery.RecoveryTarget$2.run(RecoveryTarget.java:177)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.transport.RemoteTransportException: 
[i-5tar85fu][inet[/192.168.101.53:9300]][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: 
[blacklist][1] Phase[1] Execution failed
at 
org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1120)
at 
org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:654)
at 
org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:137)
at 
org.elasticsearch.indices.recovery.RecoverySource.access$2600(RecoverySource.java:74)
at 
org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:464)
at 
org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:450)
at 
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: 
org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: 
[blacklist][1] Failed to transfer [1] files with total size of [2mb]
at 
org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:276)
at 
org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1116)
... 9 more
Caused by: org.elasticsearch.transport.RemoteTransportException: 
[i-bxtszyhz][inet[/192.168.101.54:9300]][internal:index/shard/recovery/file_chunk]
Caused by: java.io.FileNotFoundException: 
/var/data/elasticsearch/9FbankCloud/nodes/0/indices/blacklist/1/index/recovery.1429612346025.segments_5
 
(Permission denied)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
at 
org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:389)
at 
org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:282)
at 
org.apache.lucene.store.FileSwitchDirectory.createOutput(FileSwitchDirectory.java:152)
at 
org.apache.lucene.store.RateLimitedFSDirectory.createOutput(RateLimitedFSDirectory.java:40)
at 
org.elasticsearch.index.store.DistributorDirectory.createOutput(DistributorDirectory.java:118)
at 
org.apache.lucene.store.FilterDirectory.createOutput(FilterDirectory.java:69)
at 
org.elasticsearch.index.store.Store.createVerifyingOutput(Store.java:336)
at 
org.elasticsearch.indices.recovery.RecoveryStatus.openAndPutIndexOutput(RecoveryStatus.java:120)
at 
org.elasticsearch.indices.recovery.RecoveryTarget$FileChunkTransportRequestHandler.messageReceived(RecoveryTarget.java:573)
at 
org.elasticsearch.indices.recovery.RecoveryTarget$FileChunkTransportRequestHandler.messageReceived(RecoveryTarget.java:535)
at 
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)


I really hope to get help

*Thanks !!!*
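
The last cause in the trace is the telling part: 'Permission denied' while 
creating the recovery file under /var/data/elasticsearch. A hedged sketch of the 
usual fix, assuming the node runs as an 'elasticsearch' user (adjust the user, 
group and path to your install):

# Give the Elasticsearch user ownership of the data path from the stack trace
sudo chown -R elasticsearch:elasticsearch /var/data/elasticsearch

# Then restart the affected node so shard recovery can retry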

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1694480e-1017-4986-b7fb-cb756cccf73e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: I have got a little Problem with my synonym filter ....

2015-04-21 Thread Ivan Brusic
What kind of query are you executing? Are you querying against a specific
field? A match query against the title field should work.

When using the analyze API, explicitly state the field and not the analyzer
for a more accurate picture of what really goes on.

Cheers,

Ivan
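
A hedged sketch of both suggestions against the sample index from this thread:

# Analyze with the title field's configured analyzer instead of naming one
curl -XGET 'localhost:9200/testindex/_analyze?field=title&pretty' -d 'aaa test daten eintrag'

# Match query against the title field; with the index-time synonym expansion,
# a search for "ccc" should hit documents 1 and 2
curl -XPOST 'localhost:9200/testindex/testitem/_search' -d '{
  "query": { "match": { "title": "ccc" } }
}'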
On Apr 21, 2015 11:40 AM, "Ste Phan"  wrote:

> *... I build a little sample of what I do.*
>
> *My Test Synonyms file is (test.syn placed into my /etc/elasticsearch
> folder):*
>
> aaa,bbb,ccc,ddd
> www,xxx,yyy,zzz
> eee,fff,ggg,hhh => 111
> sss,ttt,uuu,vvv => 222
> rrr => 333,444,555
>
> *I created an index like so:*
>
> PUT /testindex?pretty
> {
> "settings": {
> "analysis": {
> "analyzer": {
> "myIndexAnalyzer": {
> "tokenizer": "standard",
> "filter": [
> "lowercase",
> "mySynonymsFilter"
> ]
> },
> "mySearchAnalyzer": {
> "tokenizer": "standard",
> "filter": [
> "lowercase"
> ]
> }
> },
> "filter": {
> "mySynonymsFilter": {
> "type": "synonym",
> "ignore_case": true,
> "synonyms_path": "test.syn"
> }
> }
> }
> },
> "mappings": {
> "testitem": {
> "properties": {
> "title": {
> "type": "string",
> "index_analyzer": "myIndexAnalyzer",
> "search_analyzer": "mySearchAnalyzer"
> }
> }
> }
> }
> }
>
> *and added some data:*
>
> POST /_bulk
> { "index": { "_index": "testindex", "_type": "testitem", "_id": "1" }}
> { "title":"aaa test daten eintrag." }
> { "index": { "_index": "testindex", "_type": "testitem", "_id": "2" }}
> { "title":"bbb test daten eintrag." }
> { "index": { "_index": "testindex", "_type": "testitem", "_id": "3" }}
> { "title":"eee test daten eintrag." }
>
> *Testing the myIndexAnalyzer using*
>
> POST /testindex/_analyze?analyzer=myIndexAnalyzer&pretty
> {aaa test daten eintrag}
>
> *Results to:*
>
> {
>"tokens": [
>   {
>  "token": "aaa",
>  "start_offset": 1,
>  "end_offset": 4,
>  "type": "SYNONYM",
>  "position": 1
>   },
>   {
>  "token": "bbb",
>  "start_offset": 1,
>  "end_offset": 4,
>  "type": "SYNONYM",
>  "position": 1
>   },
>   {
>  "token": "ccc",
>  "start_offset": 1,
>  "end_offset": 4,
>  "type": "SYNONYM",
>  "position": 1
>   },
>   {
>  "token": "ddd",
>  "start_offset": 1,
>  "end_offset": 4,
>  "type": "SYNONYM",
>  "position": 1
>   },
>   {
>  "token": "test",
>  "start_offset": 5,
>  "end_offset": 9,
>  "type": "",
>  "position": 2
>   }
>]
> }
>
> *Which to me seems to be fine.*
>
> Searching this index, I expected to find record ids 1 and 2 when searching
> for "aaa", "bbb", "ccc", or "ddd".
>
> What am I doing wrong?
>
> TIA
> Ste Phan
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/64e90076-f905-4490-bfe8-3b1607e5e98a%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQB6A9nq1GC52sdugQx1%2BM_pJJvdo6ti0ofQYfbOqK6P2A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: I have got a little Problem with my synonym filter ....

2015-04-21 Thread Ste Phan
 

I forgot to mention: if I search for "aaa" I receive record _id = 1;
searching for "bbb" I receive record _id = 2 ... nothing else.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2a74f413-bd31-4c9a-a3a0-95084cc2fc0d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


I have got a little Problem with my synonym filter ....

2015-04-21 Thread Ste Phan
*... I built a little sample of what I do.*

*My Test Synonyms file is (test.syn placed into my /etc/elasticsearch 
folder):*

aaa,bbb,ccc,ddd
www,xxx,yyy,zzz
eee,fff,ggg,hhh => 111
sss,ttt,uuu,vvv => 222
rrr => 333,444,555

*I created an index like so:*

PUT /testindex?pretty
{
"settings": {
"analysis": {
"analyzer": {
"myIndexAnalyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"mySynonymsFilter"
]
},
"mySearchAnalyzer": {
"tokenizer": "standard",
"filter": [
"lowercase"
]
}
},
"filter": {
"mySynonymsFilter": {
"type": "synonym",
"ignore_case": true,
"synonyms_path": "test.syn"
}
}
}
},
"mappings": {
"testitem": {
"properties": {
"title": {
"type": "string",
"index_analyzer": "myIndexAnalyzer",
"search_analyzer": "mySearchAnalyzer"
}
}
}
}
}

*and added some data:*

POST /_bulk
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "1" }}
{ "title":"aaa test daten eintrag." }
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "2" }}
{ "title":"bbb test daten eintrag." }
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "3" }}
{ "title":"eee test daten eintrag." }

*Testing the myIndexAnalyzer using*

POST /testindex/_analyze?analyzer=myIndexAnalyzer&pretty
{aaa test daten eintrag}

*Results to:*

{
   "tokens": [
  {
 "token": "aaa",
 "start_offset": 1,
 "end_offset": 4,
 "type": "SYNONYM",
 "position": 1
  },
  {
 "token": "bbb",
 "start_offset": 1,
 "end_offset": 4,
 "type": "SYNONYM",
 "position": 1
  },
  {
 "token": "ccc",
 "start_offset": 1,
 "end_offset": 4,
 "type": "SYNONYM",
 "position": 1
  },
  {
 "token": "ddd",
 "start_offset": 1,
 "end_offset": 4,
 "type": "SYNONYM",
 "position": 1
  },
  {
 "token": "test",
 "start_offset": 5,
 "end_offset": 9,
 "type": "",
 "position": 2
  }
   ]
}

*Which to me seems to be fine.*

Searching this index, I expected to find record ids 1 and 2 when searching 
for "aaa", "bbb", "ccc", or "ddd".

What am I doing wrong?

TIA
Ste Phan


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/64e90076-f905-4490-bfe8-3b1607e5e98a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Using serialized doc_value instead of _source to improve read latency

2015-04-21 Thread Itai Frenkel
The answer is these changes in elasticsearch.yml:
script.groovy.sandbox.class_whitelist: 
com.fasterxml.jackson.databind.ObjectMapper
script.groovy.sandbox.package_whitelist: com.fasterxml.jackson.databind

for some reason these classes are not shaded even though the pom.xml does 
shade them.

On Tuesday, April 21, 2015 at 5:21:58 AM UTC+3, Itai Frenkel wrote:
>
> If I could focus the question better :  How do I whitelist a specific 
> class in the groovy script inside transform ?
>
> On Tuesday, April 21, 2015 at 1:18:03 AM UTC+3, Itai Frenkel wrote:
>>
>> Hi,
>>
>> We are having a performance problem in which for each hit, elasticsearch 
>> parses the entire _source then generates a new Json with only the requested 
>> query _source fields. In order to overcome this issue we would like to use 
>> mapping transform script that serializes the requested query fields (which 
>> are known in advance) into a doc_value. Does that make sense?
>>
>> The actual problem with the transform script is  SecurityException that 
>> does not allow using any json serialization mechanism. A binary 
>> serialization would also be ok.
>>
>>
>> Itai
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b7787495-500b-4ed7-b0e6-4fad7fda1aa2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


deploying ElasticSearch to a large memory server

2015-04-21 Thread Tzahi jakubovitz
Hi all,
I have a server with 1.5 TB memory.
I can either use it with a single ES process, or launch a few separate 
instances (using VMs, Docker, or just different ports on the same 
server OS).

What would be a reasonable number of instances for such a server?

Thanks,
Tzahi

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8909b6ad-2435-4804-900a-bfdec2aaddea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


selecting a server - a single quad socket, or two dual socket

2015-04-21 Thread Tzahi jakubovitz


Today we can buy very performant servers at very reasonable price points.

For example, the price of two dual-socket servers with 512 GB memory each is 
comparable to that of a single quad-socket server with 1024 GB (1 TB) memory 
(assuming the same number of cores and MHz on each CPU).

My gut feeling is that a single quad-socket server will give better performance, 
since balancing shards and indexes across servers is simpler, especially 
if a query targets certain shards.

Thanks for your opinion.

Tzahi

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/de40706d-972a-4349-98a2-ba55ee580177%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Bulk Index from Remote Host

2015-04-21 Thread Christopher Blasnik
Hi,

The best way to approach this is to restrict the size of your bulk requests 
and/or the number of documents in each request.

I tend to do both; the best sizes seem to be in the 5 to 10 MiB range. 
I also restrict the maximum number of documents per request (e.g. 5000), 
though that isn't really necessary.

You should check out this link from the documentation:
http://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html

- Chris
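
A hedged sketch of chunking a pre-built bulk file on the remote host; the file 
name and chunk size are assumptions, and --data-binary matters because plain -d 
strips the newlines the bulk format depends on:

# Each document is an action line plus a source line, so 10000 lines ~ 5000 docs
split -l 10000 bulk-data.ndjson chunk_
for f in chunk_*; do
  curl -s -XPOST 'http://es-host:9200/_bulk' --data-binary "@$f" -o /dev/null
done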



On Tuesday, 21 April 2015 00:40:43 UTC+2, TB wrote:
>
> We are planning to bulk insert about 10 Gig data ,however we are being 
> forced to do this from a remote host.
> Is this a good practice? And are there any potential issues I should watch 
> out for?
>
> any advice would be great
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/93802357-b65f-4a85-8a86-58c6a8fa7cfc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Cohort analysis using the query DSL

2015-04-21 Thread Christopher Blasnik
Thanks for this Mark - I only got around to going through the slides today.

I will still have to find a solution involving a more general approach that 
can be applied to any of our indexed fields (logging data from 
several systems, different document layouts, and more systems joining in the 
near future).
I'll formulate a new approach using several queries and then bring the 
data together programmatically. I'll fill you in on the solution when I get 
around to doing this.

Cheers,
- Chris

On Friday, 17 April 2015 11:50:49 UTC+2, ma...@elastic.co wrote:
>
> Check out the talk I gave at elasticon on "entity centric indexing"
>
> https://www.elastic.co/elasticon/2015/sf/building-entity-centric-indexes
>
> The video is yet to be released but the slides are there.
> Web session analysis is one example use case.
>
> Cheers
> Mark
>
>
>
> On Friday, April 17, 2015 at 9:11:30 AM UTC+1, Christopher Blasnik wrote:
>>
>> Dear all!
>>
>> I am trying to perform a cohort analysis with Elasticsearch. For a quick 
>> primer on what cohort analysis actually is, please take a look at this 
>> wikipedia article (which IMO does not carry a lot of information, but it's 
>> good enough to get an idea): http://en.wikipedia.org/wiki/Cohort_analysis
>>
>>
>> I have found a solution with which to tackle the problem, however, I am 
>> not entirely happy with it. Let me describe my data and the DSL query that 
>> I've come up with
>>
>>
>> Data:
>>
>> We collect data on our website traffic, which results in about 50k to 
>> 100k unique visits a day.
>>
>> Cohort analysis:
>>
>> Find the percentage of users within a 24-hour period which register at 
>> the website and then actually go to our purchasing page (calculate the 
>> percentages of how many users do this within the first, second, third etc. 
>> hour after registration).
>>
>> Two very abbreviated sample documents:
>>
>> - sessionId: our unique identifier for performing counts
>> - url: the url for evaluating cohorts
>> - time: unix timestamp for event
>>
>> {
>>   "sessionId": "some-random-id",
>>   "time": 142823880, (unix timestamp: Apr 5th, 3:00 pm)
>>   "url": "/register"
>> }
>>
>>
>> {
>>   "sessionId": "some-random-id",
>>   "time": 142824150, (unix timestamp: Apr 5th, 3:45 pm)
>>   "url": "/buy"
>> }
>>
>>
>> The query I've come up with does the following:
>>
>> - Basic query: restrict data set by time range & the target urls
>>
>> - Aggregations:
>>   - perform a terms agg on the unique identifier (sessionId)
>> For each resulting bucket, do:
>> 
>> - A date_histogram on the "time" field for the requested period (hour)
>>   For each resulting bucket, do:
>>   
>>   - perform a filter aggregation for the "start state" (url = 
>> /register)
>>   - perform a filter aggregation for the "target state" (url = /buy)
>>   
>>   
>> The result is then parsed, the users accumulated and the percentages 
>> calculated programmatically.
>>
>>
>> Here's the query for the above pseudo code:
>>
>>
>> {
>>   "query": {
>> "filtered": {
>>   "filter": {
>> "bool": {
>>   "must": [
>> {
>>   "range": {
>> "time": {
>>   "gte": 142819200, (Apr 4th, 0:00 am)
>>   "lt": 142827840, (Apr 5th, 0:00 am)
>> }
>>   }
>> },
>> {
>>   "terms": {
>> "url": [
>>   "/register",
>>   "/buy"
>> ]
>>   }
>> }
>>   ],
>>   "must_not": [
>> {
>>   "missing": {
>> "field": "sessionId"
>>   }
>> }
>>   ]
>> }
>>   }
>> }
>>   },
>>   "size": 0,
>>   "aggs": {
>> "uniques": {
>>   "terms": {
>> "field": "sessionId",
>> "size": 1
>>   },
>>   "aggs": {
>> "bucket_hour_start_state": {
>>   "date_histogram": {
>> "field": "time",
>> "interval": "hour"
>>   },
>>   "aggs": {
>> "start_state": {
>>   "filter": {
>> "bool": {
>>   "must": [
>> {
>>   "term": {
>> "url": "/register"
>>   }
>> }
>>   ]
>> }
>>   }
>> }
>>   }
>> },
>> "bucket_hour_target_state": {
>>   "date_histogram": {
>> "field": "time",
>> "interval": "hour"
>>   },
>>   "aggs": {
>> "target_state": {
>>   "filter": {
>> "bool": {
>>   "must": [
>> {
>>   "term": {
>> "url": "/buy"
>>   }
>> }
>>                   ]
>>                 }
>>               }
>>             }
>>           }
>>         }
>>       }
>>     }
>>   }
>> }

Re: How to diagnose slow queries every 10 minutes exactly?

2015-04-21 Thread David Pilato
Some notes: you are using the defaults, so you have 5 shards per index, which 
means 1000 primary shards across your 200 indices.
With replicas, that is 2000 shards in total, i.e. 1000 shards per node, which 
means 1000 Lucene instances per node.

First thing to do is IMO to use only one shard per index unless you need more 
for "big" indices.

Then, have 3 nodes and set minimum_master_nodes to 2. You probably ran into a 
split-brain issue, which could explain the inconsistent results you are seeing.
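
As a sketch, minimum_master_nodes can be set in elasticsearch.yml, or applied 
to a running cluster through the cluster settings API, e.g.:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'
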
I would probably set replicas to 0 and then back to 1. But if you follow my 
first advice you have to reindex anyway, so reindex with 1 shard and 0 
replicas, then set replicas to 1.
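
The replica count is a dynamic setting, so the toggle itself is just (again 
with a placeholder index name):

# drop replicas while reindexing, then add them back
curl -XPUT 'localhost:9200/myindex/_settings' -d '{"index": {"number_of_replicas": 0}}'
curl -XPUT 'localhost:9200/myindex/_settings' -d '{"index": {"number_of_replicas": 1}}'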


My 2 cents

David

> On 21 Apr 2015, at 08:31, Dave Reed  wrote:
> 
> The logs are basically empty. There's activity from when I am creating the 
> indexes, but that's about it. Is there a logging level that could be 
> increased? I will run hot_thread as soon as I can in the morning and post 
> results when I can.
> 
> I have each index set to 1 replica. If it matters, I first import them with 0 
> replicas and then set that to 1 when they are done. The shard count I will have 
> to check and get back to you, but it's whatever the default is; we haven't 
> tweaked that. 
> 
> I have 200+ indexes because I have 200+ different "projects" represented, and 
> each one has its own set of mappings - mappings which could collide on name. 
> I originally tried having a single index with each project represented by a 
> type instead, but a conversation I had on this forum led me away from that, 
> because two different projects may sometimes have the same field names but 
> with different types. Most searches (99.9%) are done on a per-project basis, 
> and changes to a project can create a need to reindex it, so having the 
> segregation is nice, lest I have to reimport the entire thing.
> 
> In case this is related, I also found that a document was changed and 
> reindexed, but searches would sometimes include it and sometimes not. I could 
> literally just refresh the search over and over again, and it would appear in 
> the results roughly 50% of the time. 0 results, then 0, 0, then 1 result, 1, 
> 1, then 0 again, etc. I was running the search against one of the nodes via 
> _head. The cluster health was green during all of this. That was surprising 
> and something I was going to investigate more on my own, but perhaps these 
> two problems are related.
> 
> I'm really hitting the limit of what I know how to troubleshoot with ES, 
> hence I am really hoping for help here :) 
> 
> 
>> On Monday, April 20, 2015 at 10:23:46 PM UTC-7, David Pilato wrote:
>> Could you run a hot_threads API call when this happens?
>> Anything in logs about GC?
>> 
>> BTW 200 indices is a lot for 2 nodes. And how many shards/replicas do you 
>> have?
>> Why do you need so many indices for 2m docs?
>> 
>> 
>> David
>> 
>>> On 21 Apr 2015, at 01:16, Dave Reed  wrote:
>>> 
>>> I have a 2-node cluster running on some beefy machines. 12g and 16g of heap 
>>> space. About 2.1 million documents, each relatively small in size, spread 
>>> across 200 or so indexes. The refresh interval is 0.5s (while I don't need 
>>> realtime I do need relatively quick refreshes). Documents are continuously 
>>> modified by the app, so reindex requests trickle in constantly. By trickle 
>>> I mean maybe a dozen a minute. All index requests are made with _bulk, 
>>> although a lot of the time there's only 1 in the list.
>>> 
>>> Searches are very fast -- normally taking 50ms or less.
>>> 
>>> But oddly, exactly every 10 minutes, searches slow down for a moment. The 
>>> exact same query that normally takes <50ms takes 9000ms, for example. Any 
>>> other queries regardless of what they are also take multiple seconds to 
>>> complete. Once this moment passes, search queries return to normal.
>>> 
>>> I have a tester I wrote that continuously posts the same query and logs the 
>>> results, which is how I discovered this pattern.
>>> 
>>> Here's an excerpt. Notice that query time is great at 3:49:10, then at :11 
>>> things stop for 10 seconds. At :21 the queued up searches finally come 
>>> through. The numbers reported are the "took" field from the ES search 
>>> response. Then things resume as normal. This is true no matter which node I 
>>> run the search against.
>>> 
>>> This pattern repeats like this every 10 minutes, to the second, for days 
>>> now.
>>> 
>>> 3:49:09, 47
>>> 3:49:09, 63
>>> 3:49:10, 31
>>> 3:49:10, 47
>>> 3:49:11, 47
>>> 3:49:11, 62
>>> 3:49:21, 8456
>>> 3:49:21, 5460
>>> 3:49:21, 7457
>>> 3:49:21, 4509
>>> 3:49:21, 3510
>>> 3:49:21, 515
>>> 3:49:21, 1498
>>> 3:49:21, 2496
>>> 3:49:22, 2465
>>> 3:49:22, 2636
>>> 3:49:22, 6506
>>> 3:49:22, 7504
>>> 3:49:22, 9501
>>> 3:49:22, 4509
>>> 3:49:22, 1638
>>> 3:49:22, 6646
>>> 3:49:22, 9641
>>> 3:49:22, 655
>>> 3:49:22, 3667
>>> 3:49:22, 31
>>> 3:49:22, 78
>>> 3:49:22, 47
>>> 3:49:23, 47
>>> 3:49:23, 47
>>> 3:49:24, 93
>>> 
>>> I've ruled out any obvious periodical process running on eit