Re: High Disk Watermark exceeded on one or more nodes

2014-12-16 Thread Mark Walkom
It looks like this -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk

What is your actual disk usage? Can you run a curl -XGET
localhost:9200/_cluster/settings and see if it mentions those settings?
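
For reference, the thresholds that log message refers to are the
cluster.routing.allocation.disk.watermark.low/high settings, and they can be
inspected and, if you really want to, adjusted through the cluster settings
API. A minimal sketch, assuming a node reachable on localhost:9200 (the
percentages below are just example values, not a recommendation; raising the
thresholds only postpones the problem if the disk keeps filling up):

curl -XGET 'localhost:9200/_cluster/settings?pretty'

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.disk.watermark.low" : "85%",
    "cluster.routing.allocation.disk.watermark.high" : "90%"
  }
}'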

On 16 December 2014 at 23:28, Pauline Kelly wrote:
>
> I'm running an ELK + Redis stack on this machine, and just started
> collecting event logs via GELF from a Windows server.
>
> I had a look at the logs recently, and this came up:
>
> [2014-12-17 09:31:03,820][WARN ][cluster.routing.allocation.decider]
> [logstash test] high disk watermark [10%] exceeded on
> [7drCr113QgSM8wcjNss_Mg][Blur] free: 632.3mb[8.4%], shards will be
> relocated away from this node
>
> [2014-12-17 09:31:03,820][INFO ][cluster.routing.allocation.decider]
> [logstash test] high disk watermark exceeded on one or more nodes,
> rerouting shards
>
> I had a look at the size of Elasticsearch's logs in /var/ and it's about
> 23 GB.
> I see that Elasticsearch has its own memory heuristics, but I'm not
> entirely sure how that works, or whether it's affecting this - but the logs
> aren't being deleted after a week as I thought they should be.
>
> Could someone explain to me a bit more about what is going on here?
>


Re: File Descriptors

2014-12-16 Thread Chetan Dev
Hi,

I am on Windows.

Thanks 

On Wednesday, December 17, 2014 2:12:16 AM UTC+5:30, Andrew Selden wrote:
>
> What OS are you on? My guess would be that the library (sigar) that reads 
> that value can’t find it.
>
>
> On Dec 16, 2014, at 2:15 AM, Chetan Dev wrote:
>
> Hi,
>
> What does a file descriptor value of -1 mean?
> What is the default value for it?
>
> Thanks 
>
>
>   },
> "WkgDi0joTYSrs5sO3_bndQ" : {
>   "name" : "AEPLPERF2",
>   "transport_address" : "inet[/192.168.1.13:9300]",
>   "host" : "AEPLPERF1",
>   "ip" : "192.168.1.13",
>   "version" : "1.4.1",
>   "build" : "89d3241",
>   "http_address" : "inet[/192.168.1.13:9200]",
>   "process" : {
> "refresh_interval_in_millis" : 1000,
> "id" : 28240,
> "max_file_descriptors" : -1,
> "mlockall" : false
>   }
>
>
>


Re: FiltrES - A language that compiles to ElasticSearch Query DSL

2014-12-16 Thread David Pilato
Cool stuff Abe.

Just a note, I think it’s a nice feature for developers but I’m more concerned 
with "end users".

A dev knows that == means equal, but an end user will use field=x.
The same goes for "not equal": != does not mean anything to a non-developer.

My 2 cents.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

> On 17 Dec 2014 at 00:53, Abe Haskins wrote:
> 
> Hi folks!
> 
> I wanted to share FiltrES.js , a 
> tool for compiling simple human readable expressions (i.e. '(height <= 73 or 
> (favorites.color == "green" and height != 73)) and firstname ~= "o.+")') into 
> ES queries. This is useful for times when you want end users (or developers 
> who aren't ES experts) to be able to query based on arbitrary filters. It 
> doesn't use script filters, so it's safe and easy to use. 
> 
> I'd love to get any thoughts/feedback as I am not an ES expert and FiltrES 
> was written so I could use it, but I'm happy to expand it for more 
> complex/interesting use cases.
> 
> Best,
> Abe
> 


Re: $ES_HEAP_SIZE

2014-12-16 Thread 潘飞


On Wednesday, 13 February 2013 at 06:43:40 UTC+8, kimchy wrote:
>
> Note, even if you run just one ES instance on a 128gb box, with 30gb 
> allocated to ES, you potentially don't lose much unless you really need ES 
> to have more memory. Depending on your index size, the file system cache 
> will kick in and use whatever it can, making searches faster.
>
> You can also set index.store.type to mmapfs, which will use memory mapped 
> files for the index files, boosting the speed of your search even further.
>

Now we already have about 2 TB of index data in our Elasticsearch cluster. Is it 
still OK to change index.store.type to mmapfs?
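
For reference, a minimal sketch of where the setting goes (this does not
answer whether switching an existing 2 TB index is advisable; the index name
below is just an example). Node-wide it can be set in config/elasticsearch.yml,
which generally requires a restart, or per index at creation time:

index.store.type: mmapfs

curl -XPUT 'localhost:9200/my_new_index' -d '{
  "settings" : { "index.store.type" : "mmapfs" }
}'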

>
> I would start with just 1 instance on that box, and see if you really need 
> another one.
>
> On Feb 8, 2013, at 3:21 PM, Shawn Ritchie wrote:
>
> Sorry again, what if I wanted to run them as services? I tried looking in 
> the /service/elasticsearch.conf and init.d but with no luck.
>
> On Friday, 8 February 2013 13:40:19 UTC+1, Clinton Gormley wrote:
>>
>> On Fri, 2013-02-08 at 04:27 -0800, Shawn Ritchie wrote: 
>> > Already read that post, but from what I understood or misunderstood, 
>> > it's making the assumption you will have 1 instance of Elasticsearch 
>> > running on a machine. 
>> > 
>> > What I'd like to do is, with 1 Elasticsearch installation, launch 2 
>> > instances of Elasticsearch with different /config /data /log and node 
>> > names. 
>> > 
>> > 
>> > Or is it that multiple versions of Elasticsearch on the same machine 
>> > run at a directory level, that is, 2 instances which are sharing 
>> > the /config /data and /log directories together with the node name? 
>>
>> You can run multiple instances with the same paths (including logging 
>> and data). 
>>
>> If you just want to specify a different node name, then you could do so 
>> on the command line: 
>>
>> ./bin/elasticsearch -Des.node.name=node_1 
>> ./bin/elasticsearch -Des.node.name=node_2 
>>
>> If you want to change more than that, you could specify a specific 
>> config file: 
>>
>> ./bin/elasticsearch -Des.config=/path/to/config/file_1 
>> ./bin/elasticsearch -Des.config=/path/to/config/file_2 
>>
>> clint 
>>
>>
>>


Re: User free text query

2014-12-16 Thread David Pilato
I don't know if you are aware of the analysis process. If not, you could start 
reading this: 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-analysis.html

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

> On 17 Dec 2014 at 04:55, Bruno Kamiche wrote:
> 
> Any pointers on that?
> 
> On Tuesday, December 16, 2014 10:53:26 PM UTC-5, David Pilato wrote:
> It’s an analysis issue I believe.
> Mostly depends on your mapping for this field.
> 
> 
> -- 
> David Pilato | Technical Advocate | Elasticsearch.com 
> 
> @dadoonet  | @elasticsearchfr 
>  | @scrutmydocs 
> 
> 
> 
> 
>> On 17 Dec 2014 at 03:40, Bruno Kamiche wrote:
>> 
>> Ok, it works, but I've found a new problem, it only works if the field 
>> "category" is exactly "0003", it seems that is not taking in account that 
>> some records have that field as "0001,0003" or 
>> "value1,value2,value3,...,0003,0005" and alike...
>> 
>> 
>> 
>> On Tuesday, December 16, 2014 9:22:22 PM UTC-5, David Pilato wrote:
>> Replace match with a simpleQueryString query.
>> 
>> David
>> 
>> On 17 Dec 2014 at 03:13, Bruno Kamiche wrote:
>> 
>>> My application interface offers fields for filtering (and they are working 
>>> on elastic), but for power users there is always the free text query 
>>> option, my query in elastic is as follows:
>>> 
>>>"query" : {
>>> "filtered" : {
>>> "query" : {
>>> "match" : {
>>> "texto" : {
>>> "query" : "computer",
>>> "operator" : "and"
>>> }
>>> }
>>> },
>>> "filter" : {
>>> "bool" :  {
>>> "must" : [
>>> {
>>> "range" : {
>>> "datefield" 
>>> : {
>>> 
>>> "gte" : "some date here",
>>> 
>>> "lte" : "some date here"
>>> }
>>> }
>>> },
>>> {
>>> "term" : { 
>>> "filter1" : 1 }
>>> },
>>> {
>>> "term" : { 
>>> "filter2" : 1 }
>>> },
>>> {
>>> "term" : { 
>>> "filter3" : [5] }
>>> }
>>> ]
>>> }
>>> }
>>> }
>>> }
>>> 
>>> That is the structure that I use for every query, the filter values vary 
>>> upon the choices entered by the user, and the "query" itself gets its value 
>>> from a text entry field.
>>> 
>>> Up to this point everything works fine, but when I put "category:0003 
>>> computer" in the query field, it does not return results, although there 
>>> are results the comply...
>>> 
>>> 
>>> 
>>> On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>>> Actually, elasticsearch will search computer in _all field and as you said 
>>> 0003 in category field.
>>> You should may be disable _all field and use the copy_to feature instead 
>>> BTW.
>>> 
>>> If your interface has different inputs for text and category, then you 
>>> should probably better use the QueryDSL with for example a FilteredQuery, 
>>> with a matchQuery and a termFilter. 
>>> 
>>> My 2 cents 
>>> David
>>> 
>>> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>>> 
 I need to query elasticsearch and let the user specify on which other 
 fields to search for certain attributes...
 
 I have a field named "texto" with the actual text on which queries are 
 done.
 
 I have a field name "category" with values like "0001,0003" (meaning the 
 record is on categories 1 and 3), or "000

Re: User free text query

2014-12-16 Thread Bruno Kamiche
Any pointers on that?

On Tuesday, December 16, 2014 10:53:26 PM UTC-5, David Pilato wrote:
>
> It’s an analysis issue I believe.
> Mostly depends on your mapping for this field.
>
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
> *
> @dadoonet  | @elasticsearchfr 
>  | @scrutmydocs 
> 
>
>
>  
> On 17 Dec 2014 at 03:40, Bruno Kamiche wrote:
>
> Ok, it works, but I've found a new problem, it only works if the field 
> "category" is exactly "0003", it seems that is not taking in account that 
> some records have that field as "0001,0003" or 
> "value1,value2,value3,...,0003,0005" and alike...
>
>
>
> On Tuesday, December 16, 2014 9:22:22 PM UTC-5, David Pilato wrote:
>>
>> Replace match with a simpleQueryString query.
>>
>> David
>>
>> On 17 Dec 2014 at 03:13, Bruno Kamiche wrote:
>>
>> My application interface offers fields for filtering (and they are 
>> working on elastic), but for power users there is always the free text 
>> query option, my query in elastic is as follows:
>>
>>"query" : {
>> "filtered" : {
>> "query" : {
>> "match" : {
>> "texto" : {
>> "query" : "computer",
>> "operator" : "and"
>> }
>> }
>> },
>> "filter" : {
>> "bool" :  {
>> "must" : [
>> {
>> "range" : {
>> 
>> "datefield" : {
>> 
>> "gte" : "some date here",
>> 
>> "lte" : "some date here"
>> }
>> }
>> },
>> {
>> "term" : { 
>> "filter1" : 1 }
>> },
>> {
>> "term" : { 
>> "filter2" : 1 }
>> },
>> {
>> "term" : { 
>> "filter3" : [5] }
>> }
>> ]
>> }
>> }
>> }
>> }
>>
>> That is the structure that I use for every query, the filter values vary 
>> upon the choices entered by the user, and the "query" itself gets its value 
>> from a text entry field.
>>
>> Up to this point everything works fine, but when I put "category:0003 
>> computer" in the query field, it does not return results, although there 
>> are results the comply...
>>
>>
>>
>> On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>>>
>>> Actually, elasticsearch will search computer in _all field and as you 
>>> said 0003 in category field.
>>> You should may be disable _all field and use the copy_to feature instead 
>>> BTW.
>>>
>>> If your interface has different inputs for text and category, then you 
>>> should probably better use the QueryDSL with for example a FilteredQuery, 
>>> with a matchQuery and a termFilter. 
>>>
>>> My 2 cents 
>>> David
>>>
>>> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>>>
>>> I need to query elasticsearch and let the user specify on which other 
>>> fields to search for certain attributes...
>>>
>>> I have a field named "texto" with the actual text on which queries are 
>>> done.
>>>
>>> I have a field name "category" with values like "0001,0003" (meaning the 
>>> record is on categories 1 and 3), or "0001,0005,0007", etc...
>>>
>>> If the user enters a search criteria like "computer" it will search on 
>>> field "texto" for the text computer, that works fine.
>>>
>>> But I guess there is an option, in which you can specify things like 
>>> "category:0003 computer", this query would need to find records with the 
>>> text "computer" on the field "texto", and additionaly have the value "0003" 
>>> present on the field "category", is this possible?
>>>
>>> Bruno
>>>
>>>

Re: User free text query

2014-12-16 Thread David Pilato
It’s an analysis issue I believe.
Mostly depends on your mapping for this field.


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

> On 17 Dec 2014 at 03:40, Bruno Kamiche wrote:
> 
> Ok, it works, but I've found a new problem, it only works if the field 
> "category" is exactly "0003", it seems that is not taking in account that 
> some records have that field as "0001,0003" or 
> "value1,value2,value3,...,0003,0005" and alike...
> 
> 
> 
> On Tuesday, December 16, 2014 9:22:22 PM UTC-5, David Pilato wrote:
> Replace match with a simpleQueryString query.
> 
> David
> 
> On 17 Dec 2014 at 03:13, Bruno Kamiche wrote:
> 
>> My application interface offers fields for filtering (and they are working 
>> on elastic), but for power users there is always the free text query option, 
>> my query in elastic is as follows:
>> 
>>"query" : {
>> "filtered" : {
>> "query" : {
>> "match" : {
>> "texto" : {
>> "query" : "computer",
>> "operator" : "and"
>> }
>> }
>> },
>> "filter" : {
>> "bool" :  {
>> "must" : [
>> {
>> "range" : {
>> "datefield" 
>> : {
>> 
>> "gte" : "some date here",
>> 
>> "lte" : "some date here"
>> }
>> }
>> },
>> {
>> "term" : { "filter1" 
>> : 1 }
>> },
>> {
>> "term" : { "filter2" 
>> : 1 }
>> },
>> {
>> "term" : { "filter3" 
>> : [5] }
>> }
>> ]
>> }
>> }
>> }
>> }
>> 
>> That is the structure that I use for every query, the filter values vary 
>> upon the choices entered by the user, and the "query" itself gets its value 
>> from a text entry field.
>> 
>> Up to this point everything works fine, but when I put "category:0003 
>> computer" in the query field, it does not return results, although there are 
>> results the comply...
>> 
>> 
>> 
>> On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>> Actually, elasticsearch will search computer in _all field and as you said 
>> 0003 in category field.
>> You should may be disable _all field and use the copy_to feature instead BTW.
>> 
>> If your interface has different inputs for text and category, then you 
>> should probably better use the QueryDSL with for example a FilteredQuery, 
>> with a matchQuery and a termFilter. 
>> 
>> My 2 cents 
>> David
>> 
>> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>> 
>>> I need to query elasticsearch and let the user specify on which other 
>>> fields to search for certain attributes...
>>> 
>>> I have a field named "texto" with the actual text on which queries are done.
>>> 
>>> I have a field name "category" with values like "0001,0003" (meaning the 
>>> record is on categories 1 and 3), or "0001,0005,0007", etc...
>>> 
>>> If the user enters a search criteria like "computer" it will search on 
>>> field "texto" for the text computer, that works fine.
>>> 
>>> But I guess there is an option, in which you can specify things like 
>>> "category:0003 computer", this query would need to find records with the 
>>> text "computer" on the field "texto", and additionaly have the value "0003" 
>>> present on the field "category", is this possible?
>>> 
>>> Bruno
>>> 
>>> 

elk cluster plan with 7000EPS an 100/s search

2014-12-16 Thread Wang Yong
Hi folks,

I am building an ELK cluster to index and search a large volume of HTTP access
logs, more than 7,000 events per second, and there will also be more than 100
concurrent searches.

I have 2 machines. One of them has 24 CPU cores, 64 GB memory and a 2 TB SATA
disk (no RAID). The other one is much more powerful, with 24 CPU cores, 384 GB
memory and 8 x 300 GB SAS disks.

My plan is to build a 3-node Elasticsearch cluster, with one node running on the
small server and the other two running on the big one. Can I route all index
requests to one node and all search requests to the other two nodes? Is this a
good idea? Any comments?

Thank you guys and happy holidays!

Alan



Re: User free text query

2014-12-16 Thread Bruno Kamiche
Ok, it works, but I've found a new problem: it only works if the field 
"category" is exactly "0003". It seems it is not taking into account that 
some records have that field as "0001,0003" or 
"value1,value2,value3,...,0003,0005" and the like...



On Tuesday, December 16, 2014 9:22:22 PM UTC-5, David Pilato wrote:
>
> Replace match with a simpleQueryString query.
>
> David
>
> On 17 Dec 2014 at 03:13, Bruno Kamiche wrote:
>
> My application interface offers fields for filtering (and they are working 
> on elastic), but for power users there is always the free text query 
> option, my query in elastic is as follows:
>
>"query" : {
> "filtered" : {
> "query" : {
> "match" : {
> "texto" : {
> "query" : "computer",
> "operator" : "and"
> }
> }
> },
> "filter" : {
> "bool" :  {
> "must" : [
> {
> "range" : {
> 
> "datefield" : {
> 
> "gte" : "some date here",
> 
> "lte" : "some date here"
> }
> }
> },
> {
> "term" : { 
> "filter1" : 1 }
> },
> {
> "term" : { 
> "filter2" : 1 }
> },
> {
> "term" : { 
> "filter3" : [5] }
> }
> ]
> }
> }
> }
> }
>
> That is the structure that I use for every query, the filter values vary 
> upon the choices entered by the user, and the "query" itself gets its value 
> from a text entry field.
>
> Up to this point everything works fine, but when I put "category:0003 
> computer" in the query field, it does not return results, although there 
> are results the comply...
>
>
>
> On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>>
>> Actually, elasticsearch will search computer in _all field and as you 
>> said 0003 in category field.
>> You should may be disable _all field and use the copy_to feature instead 
>> BTW.
>>
>> If your interface has different inputs for text and category, then you 
>> should probably better use the QueryDSL with for example a FilteredQuery, 
>> with a matchQuery and a termFilter. 
>>
>> My 2 cents 
>> David
>>
>> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>>
>> I need to query elasticsearch and let the user specify on which other 
>> fields to search for certain attributes...
>>
>> I have a field named "texto" with the actual text on which queries are 
>> done.
>>
>> I have a field name "category" with values like "0001,0003" (meaning the 
>> record is on categories 1 and 3), or "0001,0005,0007", etc...
>>
>> If the user enters a search criteria like "computer" it will search on 
>> field "texto" for the text computer, that works fine.
>>
>> But I guess there is an option, in which you can specify things like 
>> "category:0003 computer", this query would need to find records with the 
>> text "computer" on the field "texto", and additionaly have the value "0003" 
>> present on the field "category", is this possible?
>>
>> Bruno
>>

Re: Search sort using a field with an "index_name" results in "No mapping found"

2014-12-16 Thread Dave Reed
Just in the interest of having a two-way link, I opened a github issue 
about it here:
https://github.com/elasticsearch/elasticsearch/issues/8980


On Monday, December 15, 2014 9:41:01 AM UTC-8, Dave Reed wrote:
>
> I see you tried adding index_name to the inner field as well. Nope, I'm 
> afraid that did not work. 
>
> Anyone have any thoughts here? This definitely seems like a bug :)
>
>>
>>



Re: User free text query

2014-12-16 Thread David Pilato
Replace match with a simpleQueryString query.

David
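
A minimal sketch of that replacement, using the field name from this thread
(the fields and default_operator settings are assumptions, mirroring the
original match query):

"query" : {
    "simple_query_string" : {
        "query" : "computer",
        "fields" : ["texto"],
        "default_operator" : "and"
    }
}

This would drop in where the match clause sits inside the filtered query
quoted below.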

> On 17 Dec 2014 at 03:13, Bruno Kamiche wrote:
> 
> My application interface offers fields for filtering (and they are working on 
> elastic), but for power users there is always the free text query option, my 
> query in elastic is as follows:
> 
>"query" : {
> "filtered" : {
> "query" : {
> "match" : {
> "texto" : {
> "query" : "computer",
> "operator" : "and"
> }
> }
> },
> "filter" : {
> "bool" :  {
> "must" : [
> {
> "range" : {
> "datefield" : 
> {
> "gte" 
> : "some date here",
> "lte" 
> : "some date here"
> }
> }
> },
> {
> "term" : { "filter1" 
> : 1 }
> },
> {
> "term" : { "filter2" 
> : 1 }
> },
> {
> "term" : { "filter3" 
> : [5] }
> }
> ]
> }
> }
> }
> }
> 
> That is the structure that I use for every query, the filter values vary upon 
> the choices entered by the user, and the "query" itself gets its value from a 
> text entry field.
> 
> Up to this point everything works fine, but when I put "category:0003 
> computer" in the query field, it does not return results, although there are 
> results the comply...
> 
> 
> 
>> On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>> Actually, elasticsearch will search computer in _all field and as you said 
>> 0003 in category field.
>> You should may be disable _all field and use the copy_to feature instead BTW.
>> 
>> If your interface has different inputs for text and category, then you 
>> should probably better use the QueryDSL with for example a FilteredQuery, 
>> with a matchQuery and a termFilter. 
>> 
>> My 2 cents 
>> David
>> 
>>> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>>> 
>>> I need to query elasticsearch and let the user specify on which other 
>>> fields to search for certain attributes...
>>> 
>>> I have a field named "texto" with the actual text on which queries are done.
>>> 
>>> I have a field name "category" with values like "0001,0003" (meaning the 
>>> record is on categories 1 and 3), or "0001,0005,0007", etc...
>>> 
>>> If the user enters a search criteria like "computer" it will search on 
>>> field "texto" for the text computer, that works fine.
>>> 
>>> But I guess there is an option, in which you can specify things like 
>>> "category:0003 computer", this query would need to find records with the 
>>> text "computer" on the field "texto", and additionaly have the value "0003" 
>>> present on the field "category", is this possible?
>>> 
>>> Bruno
>>> 

Re: User free text query

2014-12-16 Thread Bruno Kamiche
My application interface offers fields for filtering (and they are working 
on elastic), but for power users there is always the free text query 
option, my query in elastic is as follows:

   "query" : {
"filtered" : {
"query" : {
"match" : {
"texto" : {
"query" : "computer",
"operator" : "and"
}
}
},
"filter" : {
"bool" :  {
"must" : [
{
"range" : {
"datefield" 
: {

"gte" : "some date here",

"lte" : "some date here"
}
}
},
{
"term" : { 
"filter1" : 1 }
},
{
"term" : { 
"filter2" : 1 }
},
{
"term" : { 
"filter3" : [5] }
}
]
}
}
}
}

That is the structure that I use for every query; the filter values vary 
with the choices entered by the user, and the "query" itself gets its value 
from a text entry field.

Up to this point everything works fine, but when I put "category:0003 
computer" in the query field, it does not return results, although there 
are results that comply...



On Tuesday, December 16, 2014 8:55:01 PM UTC-5, David Pilato wrote:
>
> Actually, elasticsearch will search computer in _all field and as you said 
> 0003 in category field.
> You should may be disable _all field and use the copy_to feature instead 
> BTW.
>
> If your interface has different inputs for text and category, then you 
> should probably better use the QueryDSL with for example a FilteredQuery, 
> with a matchQuery and a termFilter. 
>
> My 2 cents 
> David
>
> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
>
> I need to query elasticsearch and let the user specify on which other 
> fields to search for certain attributes...
>
> I have a field named "texto" with the actual text on which queries are 
> done.
>
> I have a field name "category" with values like "0001,0003" (meaning the 
> record is on categories 1 and 3), or "0001,0005,0007", etc...
>
> If the user enters a search criteria like "computer" it will search on 
> field "texto" for the text computer, that works fine.
>
> But I guess there is an option, in which you can specify things like 
> "category:0003 computer", this query would need to find records with the 
> text "computer" on the field "texto", and additionaly have the value "0003" 
> present on the field "category", is this possible?
>
> Bruno
>


Re: User free text query

2014-12-16 Thread David Pilato
Actually, elasticsearch will search for computer in the _all field and, as you 
said, for 0003 in the category field.
BTW, you could maybe disable the _all field and use the copy_to feature instead.

If your interface has different inputs for text and category, then you would 
probably be better off using the Query DSL, for example a FilteredQuery with a 
matchQuery and a termFilter. 

My 2 cents 
David
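
A minimal sketch of that kind of mapping on a newly created index (the index
name, type name and the fulltext field are placeholders; only texto and
category come from this thread):

curl -XPUT 'localhost:9200/myindex' -d '{
  "mappings" : {
    "mytype" : {
      "_all" : { "enabled" : false },
      "properties" : {
        "texto"    : { "type" : "string", "copy_to" : "fulltext" },
        "category" : { "type" : "string", "copy_to" : "fulltext" },
        "fulltext" : { "type" : "string" }
      }
    }
  }
}'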

> On 17 Dec 2014 at 02:27, Bruno Kamiche wrote:
> 
> I need to query elasticsearch and let the user specify on which other fields 
> to search for certain attributes...
> 
> I have a field named "texto" with the actual text on which queries are done.
> 
> I have a field name "category" with values like "0001,0003" (meaning the 
> record is on categories 1 and 3), or "0001,0005,0007", etc...
> 
> If the user enters a search criteria like "computer" it will search on field 
> "texto" for the text computer, that works fine.
> 
> But I guess there is an option, in which you can specify things like 
> "category:0003 computer", this query would need to find records with the text 
> "computer" on the field "texto", and additionaly have the value "0003" 
> present on the field "category", is this possible?
> 
> Bruno
> 


Re: ES shard placement on different nodes

2014-12-16 Thread David Pilato
Yes, it's possible, but you should not be afraid of this.
In the end, both primaries and replicas do the same "index"/"search" 
operations.

HTH 

David
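
If you want to see where primaries and replicas actually end up, the cat API
shows it per shard (the prirep column marks p/r):

curl -XGET 'localhost:9200/_cat/shards?v'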

> On 17 Dec 2014 at 02:29, Nidhi Gupta wrote:
> 
> Hi,
> I have multi index ES cluster.  I read ES tries to equally distributes the 
> shards on different nodes. 
> Question : 
> ES tries the equal distribution across all nodes of primary+replica shards 
> per index or primary shards per index and then distribute replica shards per 
> index? 
> so its it possible that all primary shards may end up on the same node? 
> 
> Thanks,
> Nidhi
> 
> 


ES shard placement on different nodes

2014-12-16 Thread Nidhi Gupta
Hi,
I have a multi-index ES cluster. I read that ES tries to distribute the 
shards equally across different nodes. 
Question: 
does ES aim for an equal distribution across all nodes of primary+replica 
shards per index, or does it place primary shards per index and then 
distribute replica shards per index? 
So is it possible that all primary shards may end up on the same node? 

Thanks,
Nidhi




User free text query

2014-12-16 Thread Bruno Kamiche
I need to query elasticsearch and let the user specify on which other 
fields to search for certain attributes...

I have a field named "texto" with the actual text on which queries are done.

I have a field named "category" with values like "0001,0003" (meaning the 
record is in categories 1 and 3), or "0001,0005,0007", etc...

If the user enters a search term like "computer" it will search the 
field "texto" for the text computer; that works fine.

But I guess there is an option in which you can specify things like 
"category:0003 computer". This query would need to find records with the 
text "computer" in the field "texto" that additionally have the value "0003" 
present in the field "category". Is this possible?

Bruno



Re: duplicate data on hadoop

2014-12-16 Thread Mungeol Heo
Try to use one index instead of multiple indexes.
For instance, change 'es.resource' = 'event-conversion*/conversion' to
'es.resource' = 'event-conversion-01/conversion'.
I think es-hadoop does not support a multiple-index setting for now.

On Wed, Nov 19, 2014 at 11:57 AM, Jimmy Carter  wrote:
> Dear All,
>
> I have a problem importing data from ES to Hadoop (querying using Hive).
> The problem is I have 3 records in my ES, but when I select from the Hive table
> it returns 21 results; when I check, each record in ES is duplicated 6 times.
> here the query table creation in hadoop :
> CREATE EXTERNAL TABLE conversion (
> ip string,
> user_id string,
> tracker_id string,
> time_log bigint,
> session_id string,
> spot_id string,
> traker_type_id bigint,
>
> )
> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
> TBLPROPERTIES('es.resource' = 'event-conversion*/conversion',
> 'es.mapping.names' = 'ip:ip, user_id:user_id,
> tracker_id:tracker_id, time_log:datetime_log,
> session_id:session_id,spot_id:spot_id, tracker_type_id:tracker_type_id');
>
> so is there something wrong in my create table query?
>
> thank you
>
>
> jimmy
>


Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
I figured it out. The queries executed by the straight Elasticsearch REST 
test and the Java test are not exactly identical. The difference is that 
the "current date" used by the straight Elasticsearch test is pulled from a 
randomized CSV file whereas the Java service always injects the actual 
current date down to the millisecond. So, even though I switched the query 
and the Java to leverage filters, the straight Elasticsearch REST test is 
the only test that takes advantage of the cache. The Java service query 
doesn't take advantage of the cache because the current date changes every 
millisecond.

To confirm this I hardcoded the milliseconds in both the REST test and the 
Java service. Once I did that, the CPU rates (and everything else) were 
identical.
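
For anyone hitting the same thing, a minimal sketch of the workaround
(hypothetical field name and granularity, not the actual code from this
thread): truncate "now" so that consecutive requests build byte-for-byte
identical range filters and can therefore reuse the filter cache.

import org.elasticsearch.index.query.FilterBuilder;
import org.elasticsearch.index.query.FilterBuilders;

// Round the upper bound down to the minute; every request within that
// minute produces the same filter, so the cached result can be reused.
long now = System.currentTimeMillis();
long roundedToMinute = now - (now % 60000L);
FilterBuilder pubDateFilter =
        FilterBuilders.rangeFilter("pub_date").lte(roundedToMinute);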

The other telltale sign that there was a problem was the indices filter 
cache graph in Marvel. Once the straight REST test had run once, it could be 
re-run without ever causing an increase in what is in the filter cache. The Java 
service, on the other hand, was populating the filter cache every time.

Thanks to everyone that weighed in on this.

Jeff



FiltrES - A language that compiles to ElasticSearch Query DSL

2014-12-16 Thread Abe Haskins
Hi folks!

I wanted to share FiltrES.js, a 
tool for compiling simple human-readable expressions (e.g. '(height <= 73 or 
(favorites.color == "green" and height != 73)) and firstname ~= "o.+"') 
into ES queries. This is useful for times when you want end users (or 
developers who aren't ES experts) to be able to query based on arbitrary 
filters. It doesn't use script filters, so it's safe and easy to use. 

I'd love to get any thoughts/feedback as I am *not* an ES expert and 
FiltrES was written so I could use it, but I'm happy to expand it for more 
complex/interesting use cases.

Best,
Abe



Re: zen disco socket usage and port

2014-12-16 Thread Paul Baclace
Now that you have pointed out the additional "es." prefix needed when specifying 
settings on the command line, I can see that
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html
shows it via color in an example and with "instead of". Apparently, that 
was not obvious enough for me!

Thanks for the transport.netty.worker_count tip.



Re: zen disco socket usage and port

2014-12-16 Thread joergpra...@gmail.com
ES uses a netty worker pool in order to be able to connect to multiple
nodes. The size of the pool does not automatically take into consideration
that you want a single-node cluster only; it does not shrink or grow. You
can reduce the size of the netty worker pool with the
setting transport.netty.worker_count.

Any command line parameter must be prefixed by "es.", so you should use
-Des.transport.tcp.port=

Jörg
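
A minimal sketch of both suggestions (the port and worker count values are
only examples):

# on the command line, note the es. prefix
bin/elasticsearch -Des.transport.tcp.port=9301

# or in config/elasticsearch.yml
transport.tcp.port: 9301
transport.netty.worker_count: 2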

On Tue, Dec 16, 2014 at 11:08 PM, Paul Baclace wrote:
>
> More info:  I tried to set the transport port like this:
>  -Dtransport.tcp.port= on the elasticsearch command line, but it still
> uses port 9300.
>
>
> On Tuesday, December 16, 2014 12:42:12 PM UTC-8, Paul Baclace wrote:
>>
>> Is it normal for a single node elasticsearch process to open 13 sockets
>> to itself? This seems like an excessive zen disco party and inexplicable. I
>> am trying out v1.4.
>>
>> Is it possible to set the transport protocol port?
>>


Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread joergpra...@gmail.com
The difference between the DSL and the Java code is that in Java you use the
search type QUERY_AND_FETCH, which is slow.

Jörg
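
If so, the search type can be set explicitly on the request; a minimal sketch
with the 1.x Java API (index name and query are placeholders, and client is
an existing Client instance):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.QueryBuilders;

// QUERY_THEN_FETCH is the default that the REST endpoint uses.
SearchResponse response = client.prepareSearch("myindex")
        .setSearchType(SearchType.QUERY_THEN_FETCH)
        .setQuery(QueryBuilders.matchAllQuery())
        .execute()
        .actionGet();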

On Mon, Dec 15, 2014 at 3:27 PM, Jeff Potts  wrote:
>
> Yes, updated the gist. Thanks for taking a look at this.
>
> Jeff
>
> On Sunday, December 14, 2014 11:52:35 AM UTC-6, Jörg Prante wrote:
>>
>> Can you post the full query code for better recreation?
>>
>> Jörg
>>
>> On Fri, Dec 12, 2014 at 6:44 PM, Jeff Potts  wrote:
>>>
>>> I should mention that the Elasticsearch node, the Java service, and the
>>> JMeter test client are all on different machines.
>>>
>>> Jeff
>>>


High Disk Watermark exceeded on one or more nodes

2014-12-16 Thread Pauline Kelly
I'm running an ELK + Redis stack on this machine, and just started 
collecting event logs via GELF from a Windows server.

I had a look at the logs recently, and this came up:

[2014-12-17 09:31:03,820][WARN ][cluster.routing.allocation.decider] 
[logstash test] high disk watermark [10%] exceeded on 
[7drCr113QgSM8wcjNss_Mg][Blur] free: 632.3mb[8.4%], shards will be 
relocated away from this node

[2014-12-17 09:31:03,820][INFO ][cluster.routing.allocation.decider] 
[logstash test] high disk watermark exceeded on one or more nodes, 
rerouting shards

I had a look at the size of Elasticsearch's logs in /var/ and it's about 
23 GB. 
I see that Elasticsearch has its own memory heuristics, but I'm not 
entirely sure how that works, or whether it's affecting this - but the logs 
aren't being deleted after a week as I thought they should be.

Could someone explain to me a bit more about what is going on here?



Re: zen disco socket usage and port

2014-12-16 Thread Paul Baclace
More info:  I tried to set the transport port like this: 
 -Dtransport.tcp.port= on the elasticsearch command line, but it still 
uses port 9300. 


On Tuesday, December 16, 2014 12:42:12 PM UTC-8, Paul Baclace wrote:
>
> Is it normal for a single node elasticsearch process to open 13 sockets to 
> itself? This seems like an excessive zen disco party and inexplicable. I am 
> trying out v1.4. 
>
> Is it possible to set the transport protocol port? 
>
>



Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
Scratch what I said about total shard search query time. I had started a 
client node and was pointing my REST test at that.

CPU difference still there.

Also, I've upgraded to 1.3.7.

Jeff



Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Jeff Potts
Yes, getReadClient() gets a Node Client that is instantiated by Spring and 
then injected as a dependency. I have tried the Transport Client as well 
and it makes no difference.

An interesting finding is that I have isolated the performance degradation 
to the Range filter. When my service has only these filters, the test using REST 
causes the ES node to use 9% CPU while the Java service uses 22% CPU:

filtersToApply.add(FilterBuilders.termFilter(Constants.PROP_PAGE_ID, pageId));
filtersToApply.add(FilterBuilders.termFilter(Constants.PROP_GEO_CODE, meta.getGeoCode()));
filtersToApply.add(FilterBuilders.termFilter(Constants.PROP_PLACEMENT_ID, meta.getPlacementId()));

When I add a date range, the difference in performance increases 
dramatically with the REST call staying at around 9% and the Java service 
driving the CPU util to 81%:

filtersToApply.add(FilterBuilders.rangeFilter(Constants.PROP_PUB_DATE).lte(curDate.getTime()));

I've also noted that while the index search shard query rate is identical 
between the two tests, the total shard search query time is dramatically 
different with the Java service driving a much higher query time compared 
to the REST-based test. 

Jeff

On Tuesday, December 16, 2014 2:38:51 PM UTC-6, Brian wrote:
>
> Jeff,
>
> Does getReadClient() get a reference to a previously created singleton 
> TransportClient (or NodeClient, as the case may be)? I am guessing yes, but 
> just asking to be sure.
>
> In my own set-up, I have created a thin HTTP REST layer (also using Netty, 
> with the LMAX Disruptor to minimize thread usage while maximizing 
> throughput). It contains our business logic, and issues queries and updates 
> to Elasticsearch via the TransportClient. It creates a singleton instance 
> of the TransportClient, and shares this singleton's reference throughout 
> the service. Using seige, I have hammered the combination of my thin server 
> plus Elasticsearch and don't see any performance issues as you describe.
>
> Just a thought...
>
> Regards,
> Brian
>
> On Monday, December 15, 2014 9:27:18 AM UTC-5, Jeff Potts wrote:
>>
>> Yes, updated the gist. Thanks for taking a look at this.
>>
>> Jeff
>>
>



Re: File Descriptors

2014-12-16 Thread Andrew Selden
What OS are you on? My guess would be that the library (sigar) that reads that 
value can’t find it.
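
For reference, that value comes from the nodes info API, e.g.

curl -XGET 'localhost:9200/_nodes/process?pretty'

so -1 there just means the limit could not be determined (as above, most
likely sigar not finding it on that platform), not an actual configured limit.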


> On Dec 16, 2014, at 2:15 AM, Chetan Dev  wrote:
> 
> Hi,
> 
> What does a file descriptor value of -1 mean?
> What is the default value for it?
> 
> Thanks 
> 
> 
>   },
> "WkgDi0joTYSrs5sO3_bndQ" : {
>   "name" : "AEPLPERF2",
>   "transport_address" : "inet[/192.168.1.13:9300]",
>   "host" : "AEPLPERF1",
>   "ip" : "192.168.1.13",
>   "version" : "1.4.1",
>   "build" : "89d3241",
>   "http_address" : "inet[/192.168.1.13:9200]",
>   "process" : {
> "refresh_interval_in_millis" : 1000,
> "id" : 28240,
> "max_file_descriptors" : -1,
> "mlockall" : false
>   }
> 
> 


zen disco socket usage and port

2014-12-16 Thread Paul Baclace
Is it normal for a single node elasticsearch process to open 13 sockets to 
itself? This seems like an excessive zen disco party and inexplicable. I am 
trying out v1.4. 

Is it possible to set the transport protocol port? 





Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Brian
Jeff,

Does getReadClient() get a reference to a previously created singleton 
TransportClient (or NodeClient, as the case may be)? I am guessing yes, but 
just asking to be sure.

In my own set-up, I have created a thin HTTP REST layer (also using Netty, 
with the LMAX Disruptor to minimize thread usage while maximizing 
throughput). It contains our business logic, and issues queries and updates 
to Elasticsearch via the TransportClient. It creates a singleton instance 
of the TransportClient, and shares this singleton's reference throughout 
the service. Using siege, I have hammered the combination of my thin server 
plus Elasticsearch and don't see any performance issues as you describe.
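
For reference, a minimal sketch of that kind of shared singleton client with
the 1.x API (cluster name and host are placeholders for your own values):

import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Built once at startup and shared across the service; the client is
// thread-safe, so there is no need to create one per request.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "my_cluster")
        .build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("es-host", 9300));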

Just a thought...

Regards,
Brian

On Monday, December 15, 2014 9:27:18 AM UTC-5, Jeff Potts wrote:
>
> Yes, updated the gist. Thanks for taking a look at this.
>
> Jeff
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e0f039df-95fc-4fb9-a787-84ee56d77e84%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Same query, different CPU util when run with Java API versus REST

2014-12-16 Thread Marie Jacob
I have the same issue, except that CPU utilization is approximately the same 
while response times are very different. I'm using ES 1.3.4. I have two 
JMeter tests simulating concurrent load with the exact same configuration. 
The test using the Java Sampler with the TransportClient shows response 
times roughly twice those of the test with HTTP samplers. Could the 
connection channels in the ES client be overloaded? 

Any insight on this would definitely be great.

Thanks,
Marie.
 

On Monday, December 15, 2014 9:27:18 AM UTC-5, Jeff Potts wrote:
>
> Yes, updated the gist. Thanks for taking a look at this.
>
> Jeff
>
> On Sunday, December 14, 2014 11:52:35 AM UTC-6, Jörg Prante wrote:
>>
>> Can you post the full query code for better recreation?
>>
>> Jörg
>>
>> On Fri, Dec 12, 2014 at 6:44 PM, Jeff Potts  wrote:
>>>
>>> I should mention that the Elasticsearch node, the Java service, and the 
>>> JMeter test client are all on different machines.
>>>
>>> Jeff
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/20ee0ee8-5b4c-476e-8403-1db175de4158%40googlegroups.com
>>>  
>>> 
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7ff35b47-9e32-4f85-8899-2bbef4e632e8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread Elvar Böðvarsson
Elasticsearch supports tribe nodes, so you can combine multiple clusters; 
you then query the tribe node to access data on all of them.
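
A hedged sketch of what the tribe node's elasticsearch.yml might look like
(the cluster names are placeholders):

tribe:
  t1:
    cluster.name: cluster_one
  t2:
    cluster.name: cluster_two

The tribe node joins both clusters as a client and merges their cluster
states, so a search sent to it can span indices from either cluster.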

On Monday, December 15, 2014 9:52:45 PM UTC, Yifan Wang wrote:
>
> If I understand correctly, ElasticSearch directly sends query to and 
> collects aggregated results from each shard. With number of shards 
> increases, Reduce phase on the Client node will become overwhelmed. 
>
> One would assume, if ElasticSearch support node level aggregation, the 
> "Reduce" becomes distributed so Client node will not become overwhelmed for 
> large clusters with lots of shards. I am wondering if ElasticSearch 
> supports node level reduce. If not, why? I think this is critical if we 
> like ElasticSearch to be truly scalable for analytics. 
>
> Thanks.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/baffc6eb-2f28-4498-adf7-8b9628da7d0d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread AlexR
ES is already doing aggregations on each node; it is not shipping row-level 
query data back to the master for aggregation. 
In fact, one unpleasant side effect is that aggregation results are not 
guaranteed to be precise, due to the distributed nature of the aggregation, 
for multi-bucket aggs ordered by count such as terms.
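
A common way to tighten (though not fully guarantee) those counts is to raise
shard_size on the terms aggregation, so each shard returns more candidate
buckets before the final reduce. A sketch, with an illustrative field name:

{
  "aggs": {
    "top_terms": {
      "terms": {
        "field": "username",
        "size": 10,
        "shard_size": 100
      }
    }
  }
}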

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4a9aaac6-7273-44e6-be5e-9403e12a5249%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Sachin Divekar
Hi Rod,

What you need to use is a multi-level terms aggregation. The general format of
such a query is as follows:

{
  "aggs": {
    "agg1": {
      "terms": { "field": "field1" },
      "aggs": {
        "agg2": {
          "terms": { "field": "field2" },
          "aggs": {
            "agg3": {
              "terms": { "field": "field3" }
            }
          }
        }
      }
    }
  }
}

In your case you can use the following query:

{
  "aggs": {
    "users": {
      "terms": { "field": "username" },
      "aggs": {
        "workstations": {
          "terms": { "field": "workstation" }
        }
      }
    }
  }
}

Just to understand how it works, you can play with the order of the aggs
(users and workstations) and see how the output changes.
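
To run it against one of Rod's daily indices, restricted to failed logons and
with no hits payload, something along these lines should work (assuming the
username and workstation fields are not_analyzed or single-token):

curl -XPOST "http://localhost:9200/logstash-2014.11.19/_search?pretty" -d '{
  "query": { "match": { "status": "FAILURE" } },
  "size": 0,
  "aggs": {
    "users": {
      "terms": { "field": "username" },
      "aggs": {
        "workstations": { "terms": { "field": "workstation" } }
      }
    }
  }
}'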

Regards
Sachin Divekar


--
Sent from phone

On Tue, Dec 16, 2014, 9:14 PM Rod Clayton  wrote:

> Dear Sachin,
>
> I want to aggregate them by username and workstation and get a count.  I
> need to produce a report if there are too many failures for an account.
>
> I figured out how to limit the search to a particular day by saying
> http://localhost:9200/logstash-2014.11.19/_search?q=status:%20FAILURE&pretty
>
> I am looking for an example to aggregate on a couple of fields and get a
> count by value.
>
> Is that possible?
>
> Thanks,
> Rod
>
>
> On Tuesday, December 16, 2014 9:38:12 AM UTC-5, Sachin Divekar wrote:
>
>> I had mistakenly put extra space in the URL. Corrected URL is
>> http://localhost:9200/_search?q=status:FAILURE&pretty
>>
>> Regards
>> Sachin Divekar
>>
>> On Tue Dec 16 2014 at 8:01:37 PM Sachin Divekar  wrote:
>>
> Hi Rod,
>>>
>>> Try following URL
>>>
>>> http://localhost:9200/_search?q=status: FAILURE&pretty
>>>
>>> In output you will find something like following
>>>
>>> 
>>>
>>> "hits": {
>>> "total": 7,
>>> "max_score": 1,
>>> "hits": [
>>>
>>> -
>>>
>>> So in "hits" block value of "total" field is your count of failed logon
>>> requests.
>>>
>>> For understanding search API and output of search query refer
>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/
>>> current/_the_search_api.html
>>>
>>> Regards
>>> Sachin Divekar
>>>
>>> On Tue Dec 16 2014 at 7:01:02 PM Rod Clayton  wrote:
>>>
>> Dear Sachin,

 Here is the GIST with the output you requested:

 ka3bhy  / *gist:082a5410d36264521ccb
 *



 On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote:

> Hi,
>
> Share output of http://localhost:9200/foo/_search?pretty=true&q=*:*
> substitute foo with name of your index.
> Use gist to share the output. I suggest, read
> http://www.elasticsearch.org/help/
>
> Sachin Divekar
>
> On Tue, Dec 16, 2014, 1:38 AM Rod Clayton  wrote:
>
 The logstash debug for the input logs look like:
>>
>> {
>>   "message" => "37208057\tSecurity\tMicrosoft
>> -Windows-Security-Auditing\tSUCCESS AUDIT\tserver.myorg.org\t11/18/2014
>> 12:13:32 AM\t4776\tNone\t\"The computer attempted to validate the
>> credentials for an account.Authentication Package:
>> MICROSOFT_AUTHENTICATION_PACKAGE_V1_0  Logon Account: joe  Source
>> Workstation: joescomputer  Error Code: 0x0  \"",
>>  "@version" => "1",
>>"@timestamp" => "2014-11-18T05:13:32.000Z",
>>  "host" => "0:0:0:0:0:0:0:1:51947",
>>  "type" => "logons",
>> "recno" => "37208057",
>>   "logtype" => "Security",
>>"status" => "SUCCESS",
>>  "hostname" => "server.myorg.org",
>> "eventCode" => "4776",
>>  "username" => "joe",
>>   "workstation" => "joescomputer",
>> "retcd" => "0x0",
>>   "received_at" => "2014-12-15 19:25:49 UTC",
>> "received_from" => "0:0:0:0:0:0:0:1:51947"
>> }
>>
>> I have obscured the host names and accounts, but the fields are the
>> same.
>>
>> I am hoping for output like:
>>
>> username  workstation name  error code  Count
>> root      maryscomputer     6a          100
>> joe       lab1              6a          5
>> joe       lab2              6a          2
>> mary      maryscomputer     6a          1
>>
>> This assumes that the detail records were all dated the same day.
>> I am expecting that this is going to come back in a JSON format that
>> I will have to format to look like above.
>>
>> Is this what you wanted?
>>
>>
>>
>> On Monday, December 15, 2014 1:03:07 PM UTC-5, Sachin Divekar wrote:
>>
>>> Hi,
>>>
>>> Can you share some sample data and desired output?
>>>
>>> Sachin Divekar
>>>
>>> On Mon, Dec 15, 2014, 10:00 PM Rod Clayton 
>>> wrote:
>>>
>> I have loaded login data into Elasticsearch using Logstash.

 I have fields: username retcd workstation.

 I want 

Re: How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
*Does not work (using JAVA API):*

String script = "ctx._source.objectsList += newObject";
> UpdateRequestBuilder prepareUpdate = client.prepareUpdate("indexName",
> "typeName", "id");
> prepareUpdate.setScriptLang("groovy");
> prepareUpdate.setScript(script, ScriptType.INLINE);
> prepareUpdate.addScriptParam("newObject", "{\"status\":\"aasdsd\"}");
> prepareUpdate.get();



 Is there a way to reproduce the working REST API behavior with the Java
API?
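
One thing worth trying (an untested sketch, not a confirmed fix): build the
parameter as a Map instead of a pre-serialized JSON string, so the Java API
writes it out as an object, which is the same shape the working REST call
sends. This assumes the same client, index, and script as in the snippet
above:

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.update.UpdateRequestBuilder;
import org.elasticsearch.script.ScriptService.ScriptType;

// Build the nested object as a Map, not as a JSON string, so it is
// serialized as an object rather than as a quoted string value.
Map<String, Object> newObject = new HashMap<String, Object>();
newObject.put("status", "aasdsd");

UpdateRequestBuilder prepareUpdate = client.prepareUpdate("indexName", "typeName", "id");
prepareUpdate.setScriptLang("groovy");
prepareUpdate.setScript("ctx._source.objectsList += newObject", ScriptType.INLINE);
prepareUpdate.addScriptParam("newObject", newObject);
prepareUpdate.get();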

2014-12-16 15:17 GMT-02:00 Roger de Cordova Farias <
roger.far...@fontec.inf.br>:
>
> Ok, I found out that I can send a JSON as a script parameter and just
> append it to the nested objects list (with list += newObject or
> list.add(newObject) ) using groovy and it works
>
> But it is not working with the Java API, I can only get it to work using
> the REST API.
>
> When using Java the JSON is treated as a string, then I get the error:
>
> object mapping [objectsList] trying to serialize a value with no field
>> associated with it, current value [{"field":"value"}]
>
>
> I can reproduce the error in the REST API by wrapping the JSON parameter
> with quotes:
>
> *Works (using REST API):*
>
> {
>>   "script": "ctx._source.objectsList += newObject",
>>   "params": {
>> "newObject": {"field": "value"}
>>   },
>>   "lang": "groovy"
>> }
>
>
> *Does not work (using REST API):*
>
> {
>>   "script": "ctx._source.objectsList += newObject",
>>   "params": {
>> "newObject": "{\"field\": \"value\"}"
>>   },
>>   "lang": "groovy"
>> }
>
>
> *Does not work (using JAVA API):*
>
> String script = "ctx._source.objectsList += newObject";
>
>
>
>
> 2014-12-16 13:04 GMT-02:00 Roger de Cordova Farias <
> roger.far...@fontec.inf.br>:
>
>> Hello
>>
>> I'm trying to update a document whose root object contains a list of
>> nested objects. I need to send an object of the nested type as a script
>> parameter to append to the list
>>
>> How can I append the json (a string type) to the nested objects list of
>> the root object using Groovy? or should I use another script lang?
>>
>> I tried using JsonSlurper  in Groovy,
>> that converts between json and Groovy objects, but I always get:
>>
>> Caused by:
>>> org.elasticsearch.script.groovy.GroovyScriptCompilationException:
>>> MultipleCompilationErrorsException[startup failed:
>>> Script3.groovy: 2: unable to resolve class JsonSlurper
>>>  @ line 2, column 19.
>>>def jsonSlurper = new JsonSlurper();
>>>  ^
>>> 1 error
>>> ]
>>> at
>>> org.elasticsearch.script.groovy.GroovyScriptEngineService.compile(GroovyScriptEngineService.java:117)
>>> at
>>> org.elasticsearch.script.ScriptService.getCompiledScript(ScriptService.java:368)
>>> at org.elasticsearch.script.ScriptService.compile(ScriptService.java:354)
>>> at
>>> org.elasticsearch.script.ScriptService.executable(ScriptService.java:497)
>>> at
>>> org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:149)
>>> ... 8 more
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJp2530t5o9jcpgGRsJo1zV%2BaSvD7Uk8QyTKha6VR-RoHQuqsQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
Ok, I found out that I can send a JSON as a script parameter and just
append it to the nested objects list (with list += newObject or
list.add(newObject) ) using groovy and it works

But it is not working with the Java API, I can only get it to work using
the REST API.

When using Java the JSON is treated as a string, then I get the error:

object mapping [objectsList] trying to serialize a value with no field
> associated with it, current value [{"field":"value"}]


I can reproduce the error in the REST API by wrapping the JSON parameter
with quotes:

*Works (using REST API):*

{
>   "script": "ctx._source.objectsList += newObject",
>   "params": {
> "newObject": {"field": "value"}
>   },
>   "lang": "groovy"
> }


*Does not work (using REST API):*

{
>   "script": "ctx._source.objectsList += newObject",
>   "params": {
> "newObject": "{\"field\": \"value\"}"
>   },
>   "lang": "groovy"
> }


*Does not work (using JAVA API):*

String script = "ctx._source.objectsList += newObject";




2014-12-16 13:04 GMT-02:00 Roger de Cordova Farias <
roger.far...@fontec.inf.br>:
>
> Hello
>
> I'm trying to update a document whose root object contains a list of
> nested objects. I need to send an object of the nested type as a script
> parameter to append to the list
>
> How can I append the json (a string type) to the nested objects list of
> the root object using Groovy? or should I use another script lang?
>
> I tried using JsonSlurper  in Groovy,
> that converts between json and Groovy objects, but I always get:
>
> Caused by:
>> org.elasticsearch.script.groovy.GroovyScriptCompilationException:
>> MultipleCompilationErrorsException[startup failed:
>> Script3.groovy: 2: unable to resolve class JsonSlurper
>>  @ line 2, column 19.
>>def jsonSlurper = new JsonSlurper();
>>  ^
>> 1 error
>> ]
>> at
>> org.elasticsearch.script.groovy.GroovyScriptEngineService.compile(GroovyScriptEngineService.java:117)
>> at
>> org.elasticsearch.script.ScriptService.getCompiledScript(ScriptService.java:368)
>> at org.elasticsearch.script.ScriptService.compile(ScriptService.java:354)
>> at
>> org.elasticsearch.script.ScriptService.executable(ScriptService.java:497)
>> at
>> org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:149)
>> ... 8 more
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJp2531fsDup%2B0%3DtSR48ugsVkphLG%2B1s4QbOjLP7GjrMncBbTA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Where is the data stored? ElasticSearch YARN

2014-12-16 Thread Costin Leau

I recommend reading the project documentation [1]; there's a dedicated section 
that covers storage [2].

[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/index.html
[2] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/ey-setup.html#_storage

On 12/16/14 6:06 PM, Rafael Pellon wrote:

Hi

We are testing elasticsearch in a HDP environment using YARN.

We follow the instructions in the link 
http://www.elasticsearch.org/blog/elasticsearch-yarn-and-ssl/  and upload a lot
of data but

Where is the data stored? Is it in local file system / HDFS? Is it persisted? 
What is the default configuration of
ES-yarn version? In the standalone version without using Yarn, you could 
configure all of this in the config file.

Any information about this, will be useful.

Thanks in advance,
Rafa

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to
elasticsearch+unsubscr...@googlegroups.com 
.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/28fc11fb-b55b-4664-9767-f893a6af0738%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.


--
Costin

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54905F46.4060008%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: AWS machine for ES master

2014-12-16 Thread Jeff Keller

Just curious, what version of ES are you running?

Thanks,
Jeff


On Sunday, December 14, 2014 10:48:07 AM UTC-5, Yoav Melamed wrote:
>
> Thanks
>
> On Sunday, December 14, 2014 11:01:58 AM UTC+2, Yoav Melamed wrote:
>>
>> Hello,
>>
>> I run Elasticsearch cluser in AWS based on c3.8xlarge machines.
>> Can I use smaller machine for the masters ?
>> What should be enough ?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/bc2043f4-764f-4ebb-beba-edbf7c0a1aac%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Where is the data stored? ElasticSearch YARN

2014-12-16 Thread Rafael Pellon
Hi

We are testing Elasticsearch in an HDP environment using YARN.

We followed the instructions in the link 
http://www.elasticsearch.org/blog/elasticsearch-yarn-and-ssl/ and uploaded a 
lot of data, but:

Where is the data stored? Is it in the local file system or in HDFS? Is it 
persisted? What is the default configuration of the ES-YARN version? In the 
standalone version, without YARN, you could configure all of this in the 
config file.

Any information about this, will be useful.

Thanks in advance,
Rafa

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/28fc11fb-b55b-4664-9767-f893a6af0738%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Rod Clayton
Dear Sachin,

I want to aggregate them by username and workstation and get a count.  I 
need to produce a report if there are too many failures for an account.

I figured out how to limit the search to a particular day by saying 
http://localhost:9200/logstash-2014.11.19/_search?q=status:%20FAILURE&pretty

I am looking for an example to aggregate on a couple of fields and get a 
count by value.

Is that possible?

Thanks,
Rod

On Tuesday, December 16, 2014 9:38:12 AM UTC-5, Sachin Divekar wrote:
>
> I had mistakenly put extra space in the URL. Corrected URL is 
> http://localhost:9200/_search?q=status:FAILURE&pretty
>
> Regards
> Sachin Divekar
>
> On Tue Dec 16 2014 at 8:01:37 PM Sachin Divekar  > wrote:
>
>> Hi Rod,
>>
>> Try following URL
>>
>> http://localhost:9200/_search?q=status: FAILURE&pretty
>>
>> In output you will find something like following
>>
>> 
>>
>> "hits": {
>> "total": 7,
>> "max_score": 1,
>> "hits": [
>>
>> -
>>
>> So in "hits" block value of "total" field is your count of failed logon 
>> requests. 
>>
>> For understanding search API and output of search query refer 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_the_search_api.html
>>
>> Regards
>> Sachin Divekar
>>
>> On Tue Dec 16 2014 at 7:01:02 PM Rod Clayton > > wrote:
>>
>>> Dear Sachin,
>>>
>>> Here is the GIST with the output you requested:
>>>
>>> ka3bhy  / *gist:082a5410d36264521ccb 
>>> *
>>>
>>>
>>>
>>> On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote:
>>>
 Hi,

 Share output of http://localhost:9200/foo/_search?pretty=true&q=*:* 
 substitute foo with name of your index.
 Use gist to share the output. I suggest, read 
 http://www.elasticsearch.org/help/

 Sachin Divekar

 On Tue, Dec 16, 2014, 1:38 AM Rod Clayton  wrote:

>>> The logstash debug for the input logs look like:
>
> {
>   "message" => "37208057\tSecurity\tMicrosoft
> -Windows-Security-Auditing\tSUCCESS AUDIT\tserver.myorg.org\t11/18/2014 
> 12:13:32 AM\t4776\tNone\t\"The computer attempted to validate the 
> credentials for an account.Authentication Package: 
> MICROSOFT_AUTHENTICATION_PACKAGE_V1_0  Logon Account: joe  Source 
> Workstation: joescomputer  Error Code: 0x0  \"",
>  "@version" => "1",
>"@timestamp" => "2014-11-18T05:13:32.000Z",
>  "host" => "0:0:0:0:0:0:0:1:51947",
>  "type" => "logons",
> "recno" => "37208057",
>   "logtype" => "Security",
>"status" => "SUCCESS",
>  "hostname" => "server.myorg.org",
> "eventCode" => "4776",
>  "username" => "joe",
>   "workstation" => "joescomputer",
> "retcd" => "0x0",
>   "received_at" => "2014-12-15 19:25:49 UTC",
> "received_from" => "0:0:0:0:0:0:0:1:51947"
> }
>
> I have obscured the host names and accounts, but the fields are the 
> same.
>
> I am hoping for output like:
>
> username  workstation name  error code  Count
> root      maryscomputer     6a          100
> joe       lab1              6a          5
> joe       lab2              6a          2
> mary      maryscomputer     6a          1
>
> This assumes that the detail records were all dated the same day.
> I am expecting that this is going to come back in a JSON format that I 
> will have to format to look like above.
>
> Is this what you wanted?
>
>
>
> On Monday, December 15, 2014 1:03:07 PM UTC-5, Sachin Divekar wrote:
>
>> Hi,
>>
>> Can you share some sample data and desired output?
>>
>> Sachin Divekar
>>
>> On Mon, Dec 15, 2014, 10:00 PM Rod Clayton  
>> wrote:
>>
> I have loaded login data into Elasticsearch using Logstash.
>>>
>>> I have fields: username retcd workstation.
>>>
>>> I want to query and get a count of failed logon requests by username 
>>> and workstation on a given day.
>>>
>>> The indexes are named like logstash-2014.11.18.
>>>
>>> What would a query for this look like on the day listed above?
>>>
>>> Thanks,
>>> Rod
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>>
>> To unsubscribe from this group and stop receiving emails from it, 
>>> send an email to elasticsearc...@googlegroups.com.
>>
>>
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/dd8ca3ed-c9e
>>> 6-478a-ad77-9418e5822296%40googlegroups.com 
>>> 

How to use json in update script

2014-12-16 Thread Roger de Cordova Farias
Hello

I'm trying to update a document whose root object contains a list of nested
objects. I need to send an object of the nested type as a script parameter
to append to the list

How can I append the json (a string type) to the nested objects list of the
root object using Groovy? or should I use another script lang?

I tried using JsonSlurper  in Groovy,
that converts between json and Groovy objects, but I always get:

Caused by:
> org.elasticsearch.script.groovy.GroovyScriptCompilationException:
> MultipleCompilationErrorsException[startup failed:
> Script3.groovy: 2: unable to resolve class JsonSlurper
>  @ line 2, column 19.
>def jsonSlurper = new JsonSlurper();
>  ^
> 1 error
> ]
> at
> org.elasticsearch.script.groovy.GroovyScriptEngineService.compile(GroovyScriptEngineService.java:117)
> at
> org.elasticsearch.script.ScriptService.getCompiledScript(ScriptService.java:368)
> at org.elasticsearch.script.ScriptService.compile(ScriptService.java:354)
> at
> org.elasticsearch.script.ScriptService.executable(ScriptService.java:497)
> at
> org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:149)
> ... 8 more

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJp2531Qm2GZbvM7CMZSd8sqjUF-VQ%3DN6YUKQam5EOPd9pBvRA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Recover data from corrupted index

2014-12-16 Thread Octavian
Hy,

I have a corrupted index in an Elasticsearch cluster. The index is corrupted 
due to bad mappings. As you can see in the example, there are two fields with 
the same name and different mappings: one is string with doc_values and one 
is date. In Lucene this is not possible, due to the "doc_values" settings.

{
"test1": {
"properties": {
"date": {
"format": "dd//: HH: mm: ssZ",
"type": "date"
}
}
}
}


{
"test2": {
"properties": {
"date": {
"index": "not_analyzed",
"fielddata": {
"format": "doc_values"
},
"doc_values": true
}
}
}
}

The problem is that I'm not able to start the index in order to reindex the 
data from it. (The error is [WARN ][cluster.action.shard ] [cluster] 
[index][1] sending failed shard for [index][1], 
node[b2yMUmvXQFy8LFd8ei6DZQ], [P], s[INITIALIZING], indexUUID 
[I_zoR-4SS02WEfRNKFc6MA], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[index][1] failed recovery]; nested: 
IllegalArgumentException[cannot change DocValues type from BINARY to 
SORTED_SET for field "date"]; ]]).

Is it possible to somehow recover the data from corrupted index and put it 
in a healthy index ? (I'm using Elasticsearch 1.1.1)

Thank you,

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8d3f0501-9fb8-4406-ae82-b85f7f29185a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Sachin Divekar
I had mistakenly put extra space in the URL. Corrected URL is
http://localhost:9200/_search?q=status:FAILURE&pretty

Regards
Sachin Divekar

On Tue Dec 16 2014 at 8:01:37 PM Sachin Divekar  wrote:

> Hi Rod,
>
> Try following URL
>
> http://localhost:9200/_search?q=status: FAILURE&pretty
>
> In output you will find something like following
>
> 
>
> "hits": {
> "total": 7,
> "max_score": 1,
> "hits": [
>
> -
>
> So in "hits" block value of "total" field is your count of failed logon
> requests.
>
> For understanding search API and output of search query refer
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_the_search_api.html
>
> Regards
> Sachin Divekar
>
> On Tue Dec 16 2014 at 7:01:02 PM Rod Clayton 
> wrote:
>
>> Dear Sachin,
>>
>> Here is the GIST with the output you requested:
>>
>> ka3bhy  / *gist:082a5410d36264521ccb
>> *
>>
>>
>>
>> On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote:
>>
>>> Hi,
>>>
>>> Share output of http://localhost:9200/foo/_search?pretty=true&q=*:*
>>> substitute foo with name of your index.
>>> Use gist to share the output. I suggest, read
>>> http://www.elasticsearch.org/help/
>>>
>>> Sachin Divekar
>>>
>>> On Tue, Dec 16, 2014, 1:38 AM Rod Clayton  wrote:
>>>
>> The logstash debug for the input logs look like:

 {
   "message" => "37208057\tSecurity\tMicrosoft-Windows-Security-
 Auditing\tSUCCESS AUDIT\tserver.myorg.org\t11/18/2014 12:13:32
 AM\t4776\tNone\t\"The computer attempted to validate the credentials for an
 account.Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
  Logon Account: joe  Source Workstation: joescomputer  Error Code: 0x0  
 \"",
  "@version" => "1",
"@timestamp" => "2014-11-18T05:13:32.000Z",
  "host" => "0:0:0:0:0:0:0:1:51947",
  "type" => "logons",
 "recno" => "37208057",
   "logtype" => "Security",
"status" => "SUCCESS",
  "hostname" => "server.myorg.org",
 "eventCode" => "4776",
  "username" => "joe",
   "workstation" => "joescomputer",
 "retcd" => "0x0",
   "received_at" => "2014-12-15 19:25:49 UTC",
 "received_from" => "0:0:0:0:0:0:0:1:51947"
 }

 I have obscured the host names and accounts, but the fields are the
 same.

 I am hoping for output like:

 username  workstation name  error code  Count
 root      maryscomputer     6a          100
 joe       lab1              6a          5
 joe       lab2              6a          2
 mary      maryscomputer     6a          1

 This assumes that the detail records were all dated the same day.
 I am expecting that this is going to come back in a JSON format that I
 will have to format to look like above.

 Is this what you wanted?



 On Monday, December 15, 2014 1:03:07 PM UTC-5, Sachin Divekar wrote:

> Hi,
>
> Can you share some sample data and desired output?
>
> Sachin Divekar
>
> On Mon, Dec 15, 2014, 10:00 PM Rod Clayton  wrote:
>
 I have loaded login data into Elasticsearch using Logstash.
>>
>> I have fields: username retcd workstation.
>>
>> I want to query and get a count of failed logon requests by username
>> and workstation on a given day.
>>
>> The indexes are named like logstash-2014.11.18.
>>
>> What would a query for this look like on the day listed above?
>>
>> Thanks,
>> Rod
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "elasticsearch" group.
>>
> To unsubscribe from this group and stop receiving emails from it, send
>> an email to elasticsearc...@googlegroups.com.
>
>
>> To view this discussion on the web visit https://groups.google.com/d/
>> msgid/elasticsearch/dd8ca3ed-c9e6-478a-ad77-9418e5822296%40goo
>> glegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>  --
 You received this message because you are subscribed to the Google
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.

>>> To view this discussion on the web visit https://groups.google.com/d/ms
 gid/elasticsearch/f6d94667-d81d-40de-a927-de088e2bee69%40goo
 glegroups.com
 

Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Sachin Divekar
Hi Rod,

Try following URL

http://localhost:9200/_search?q=status: FAILURE&pretty

In output you will find something like following



"hits": {
"total": 7,
"max_score": 1,
"hits": [

-

So in "hits" block value of "total" field is your count of failed logon
requests.

For understanding search API and output of search query refer
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_the_search_api.html

Regards
Sachin Divekar

On Tue Dec 16 2014 at 7:01:02 PM Rod Clayton  wrote:

> Dear Sachin,
>
> Here is the GIST with the output you requested:
>
> ka3bhy  / *gist:082a5410d36264521ccb
> *
>
>
>
> On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote:
>
>> Hi,
>>
>> Share output of http://localhost:9200/foo/_search?pretty=true&q=*:*
>> substitute foo with name of your index.
>> Use gist to share the output. I suggest, read
>> http://www.elasticsearch.org/help/
>>
>> Sachin Divekar
>>
>> On Tue, Dec 16, 2014, 1:38 AM Rod Clayton  wrote:
>>
> The logstash debug for the input logs look like:
>>>
>>> {
>>>   "message" => 
>>> "37208057\tSecurity\tMicrosoft-Windows-Security-Auditing\tSUCCESS
>>> AUDIT\tserver.myorg.org\t11/18/2014 12:13:32 AM\t4776\tNone\t\"The
>>> computer attempted to validate the credentials for an account.
>>>  Authentication Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0  Logon
>>> Account: joe  Source Workstation: joescomputer  Error Code: 0x0  \"",
>>>  "@version" => "1",
>>>"@timestamp" => "2014-11-18T05:13:32.000Z",
>>>  "host" => "0:0:0:0:0:0:0:1:51947",
>>>  "type" => "logons",
>>> "recno" => "37208057",
>>>   "logtype" => "Security",
>>>"status" => "SUCCESS",
>>>  "hostname" => "server.myorg.org",
>>> "eventCode" => "4776",
>>>  "username" => "joe",
>>>   "workstation" => "joescomputer",
>>> "retcd" => "0x0",
>>>   "received_at" => "2014-12-15 19:25:49 UTC",
>>> "received_from" => "0:0:0:0:0:0:0:1:51947"
>>> }
>>>
>>> I have obscured the host names and accounts, but the fields are the same.
>>>
>>> I am hoping for output like:
>>>
>>> username  workstation name  error code  Count
>>> root      maryscomputer     6a          100
>>> joe       lab1              6a          5
>>> joe       lab2              6a          2
>>> mary      maryscomputer     6a          1
>>>
>>> This assumes that the detail records were all dated the same day.
>>> I am expecting that this is going to come back in a JSON format that I
>>> will have to format to look like above.
>>>
>>> Is this what you wanted?
>>>
>>>
>>>
>>> On Monday, December 15, 2014 1:03:07 PM UTC-5, Sachin Divekar wrote:
>>>
 Hi,

 Can you share some sample data and desired output?

 Sachin Divekar

 On Mon, Dec 15, 2014, 10:00 PM Rod Clayton  wrote:

>>> I have loaded login data into Elasticsearch using Logstash.
>
> I have fields: username retcd workstation.
>
> I want to query and get a count of failed logon requests by username
> and workstation on a given day.
>
> The indexes are named like logstash-2014.11.18.
>
> What would a query for this look like on the day listed above?
>
> Thanks,
> Rod
>
> --
> You received this message because you are subscribed to the Google
> Groups "elasticsearch" group.
>
 To unsubscribe from this group and stop receiving emails from it, send
> an email to elasticsearc...@googlegroups.com.


> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/dd8ca3ed-c9e6-478a-ad77-9418e5822296%40goo
> glegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>>
>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/f6d94667-d81d-40de-a927-de088e2bee69%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.

Re: Is ElasticSearch truly scalable for analytics?

2014-12-16 Thread Adrien Grand
What do you mean by "node level reduce"?

On Mon, Dec 15, 2014 at 10:52 PM, Yifan Wang 
wrote:
>
> If I understand correctly, ElasticSearch directly sends query to and
> collects aggregated results from each shard. With number of shards
> increases, Reduce phase on the Client node will become overwhelmed.
>
> One would assume, if ElasticSearch support node level aggregation, the
> "Reduce" becomes distributed so Client node will not become overwhelmed for
> large clusters with lots of shards. I am wondering if ElasticSearch
> supports node level reduce. If not, why? I think this is critical if we
> like ElasticSearch to be truly scalable for analytics.
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a8f56d70-9738-4768-9637-9159e6955cc2%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Adrien Grand

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j4QddJBSNc-LTL_K9-e9GmawBKXPiWgFJo7kp9uL2ii%3DA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Return Logstash Failed User logons by day and return code.

2014-12-16 Thread Rod Clayton
Dear Sachin,

Here is the GIST with the output you requested:

ka3bhy  / *gist:082a5410d36264521ccb 
*



On Monday, December 15, 2014 10:13:02 PM UTC-5, Sachin Divekar wrote:
>
> Hi,
>
> Share output of http://localhost:9200/foo/_search?pretty=true&q=*:* 
> substitute foo with name of your index.
> Use gist to share the output. I suggest, read 
> http://www.elasticsearch.org/help/
>
> Sachin Divekar
>
> On Tue, Dec 16, 2014, 1:38 AM Rod Clayton  > wrote:
>
>> The logstash debug for the input logs look like:
>>
>> {
>>   "message" => 
>> "37208057\tSecurity\tMicrosoft-Windows-Security-Auditing\tSUCCESS AUDIT\
>> tserver.myorg.org\t11/18/2014 12:13:32 AM\t4776\tNone\t\"The computer 
>> attempted to validate the credentials for an account.Authentication 
>> Package: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0  Logon Account: joe  Source 
>> Workstation: joescomputer  Error Code: 0x0  \"",
>>  "@version" => "1",
>>"@timestamp" => "2014-11-18T05:13:32.000Z",
>>  "host" => "0:0:0:0:0:0:0:1:51947",
>>  "type" => "logons",
>> "recno" => "37208057",
>>   "logtype" => "Security",
>>"status" => "SUCCESS",
>>  "hostname" => "server.myorg.org",
>> "eventCode" => "4776",
>>  "username" => "joe",
>>   "workstation" => "joescomputer",
>> "retcd" => "0x0",
>>   "received_at" => "2014-12-15 19:25:49 UTC",
>> "received_from" => "0:0:0:0:0:0:0:1:51947"
>> }
>>
>> I have obscured the host names and accounts, but the fields are the same.
>>
>> I am hoping for output like:
>>
>> username  workstation name  error code  Count
>> root      maryscomputer     6a          100
>> joe       lab1              6a          5
>> joe       lab2              6a          2
>> mary      maryscomputer     6a          1
>>
>> This assumes that the detail records were all dated the same day.
>> I am expecting that this is going to come back in a JSON format that I 
>> will have to format to look like above.
>>
>> Is this what you wanted?
>>
>>
>>
>> On Monday, December 15, 2014 1:03:07 PM UTC-5, Sachin Divekar wrote:
>>
>>> Hi,
>>>
>>> Can you share some sample data and desired output?
>>>
>>> Sachin Divekar
>>>
>>> On Mon, Dec 15, 2014, 10:00 PM Rod Clayton  wrote:
>>>
>> I have loaded login data into Elasticsearch using Logstash.

 I have fields: username retcd workstation.

 I want to query and get a count of failed logon requests by username 
 and workstation on a given day.

 The indexes are named like logstash-2014.11.18.

 What would a query for this look like on the day listed above?

 Thanks,
 Rod

 -- 
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.

>>> To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
>>>
>>>
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/dd8ca3ed-c9e6-478a-ad77-9418e5822296%
 40googlegroups.com 
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/f6d94667-d81d-40de-a927-de088e2bee69%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8267fbc1-63cd-400d-a969-5f6191203991%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Aggregegation buckets count

2014-12-16 Thread Rich Somerfield
Happy to help.

On Tuesday, December 16, 2014 10:37:25 AM UTC, Tom wrote:
>
> Hi Rich,
>
> perfect, that's it, thx a lot.
>
> Cheers, Tom
>
> On Tuesday, 16 December 2014 at 11:02:04 UTC+1, Rich Somerfield wrote:
>>
>> Hi Tom,
>>
>> I think the "Cardinality" aggregation is what you want.
>>
>> e.g. :
>>
>> {
>>   ...query...
>> },
>> "aggregations": {
>>   "totalUniqueUsers": {
>> "cardinality": {
>>   "field": "username"
>> }
>>   }
>> }
>>
>> -Rich
>>
>> On Tuesday, December 16, 2014 8:48:51 AM UTC, Tom wrote:
>>>
>>> Hi,
>>>
>>> is there a way to get just the count of buckets (not the count of docs, 
>>> which works i know) of an aggregation without receiving the whole buckets 
>>> content?
>>>
>>> thx, Tom
>>>
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/470ec648-20c8-4967-9c38-fa6ab407c8fe%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Help with request

2014-12-16 Thread Валерий Хвалев
Hi, all

I'm a newbie to Elasticsearch. Usually I use Sphinx in my projects, but due to 
some limitations I decided to try Elasticsearch, and I like it.

But now I'm stuck on a tricky request and need your help. Could you please 
point me in the right direction for building such a request?

I have an index (sample):
{
"Name":"Car oil TOTAL"
"attributes":{"Attrib1"=>"5w40","Fuel"=>"disel"}
"applicability":["BMW","AUDI","SEAT"] 
}


Could you please advise the right syntax to find results for a query like "Oil 
for disel BMW".
Now I use:
"query_string"=>array (
 "query"=>'Oil*',
 "default_operator"=>'or',
 "allow_leading_wildcard"=>true,
 "fields"=> array(
  "Name^3",
 )
),
"constant_score"=>array(
 "filter"=>array(
  "terms"=>array(
"applicability"=>array('BMW')
  )
  )
)



and that gives the expected result. But if I change the query to "Oil for 
BMW", the results are empty.
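
A rough sketch of one way to express this in the 1.x query DSL: a full-text
match on Name with OR semantics, combined with a terms filter on
applicability. The field values come from the sample above, and depending on
the analyzer the filter value may need to be lowercased:

{
  "query": {
    "filtered": {
      "query": {
        "match": {
          "Name": {
            "query": "Oil for disel BMW",
            "operator": "or"
          }
        }
      },
      "filter": {
        "terms": {
          "applicability": ["BMW"]
        }
      }
    }
  }
}

With the operator set to or, the words that do not occur in Name should not
force an empty result, while the filter still restricts hits to documents
applicable to BMW.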

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d127f8dd-da57-44e8-8a04-5c296e950a97%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ElasticSearch hadoop - .EsHadoopSerializationException

2014-12-16 Thread Costin Leau

Having multiple types shouldn't be an issue - ES is a document store so it's 
pretty common to have different types.
In other words, this is not the intended behavior - can you please create a 
small sample/snippet that reproduces the error
and raise an issue for it [1] ?

Thanks!

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/troubleshooting.html

On 12/15/14 6:03 PM, Kamil Dziublinski wrote:

Hi,

I had only one jar on classpath and none in hadoop cluster.
I had different types of values in my MapWritable tho. It turns out this was 
the problem.
So I had always Text as a key, but depending on type Text, LongWritable, 
BooleanWritable or DoubleWritable as value in
that map.
When I changed everything to be Text it started working.

Is this intended behaviour?

Cheers,
Kamil.

On Friday, December 12, 2014 8:37:03 PM UTC+1, Costin Leau wrote:

Hi,

This error is typically tied to a classpath issue - make sure you have only 
one elasticsearch-hadoop jar version in
your
classpath and on the Hadoop cluster.

On 12/12/14 5:56 PM, Kamil Dziublinski wrote:
> Hi guys,
>
> I am trying to run a MR job that reads from HDFS and stores into 
ElasticSearch cluster.
>
> I am getting following error:
> Error: 
org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: Cannot 
handle type [class
> org.apache.hadoop.io.MapWritable], instance 
[org.apache.hadoop.io.MapWritable@3879429f] using writer
> [org.elasticsearch.hadoop.mr.WritableValueWriter@3fc8f1a2]
>  at 
org.elasticsearch.hadoop.serialization.builder.ContentBuilder.value(ContentBuilder.java:259)
>  at 
org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.doWriteObject(TemplatedBulk.java:68)
>  at 
org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:55)
>  at 
org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
>  at 
org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:159)
>  at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
>  at 
org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>  at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>  at 
com.teradata.cybershot.mr.es.userprofile.EsOnlineProfileMapper.map(EsOnlineProfileMapper.java:35)
>  at 
com.teradata.cybershot.mr.es.userprofile.EsOnlineProfileMapper.map(EsOnlineProfileMapper.java:20)
>  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>  at 
org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:55)
>  at 
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>
> We are using cdh5.1.0 and es-hadoop dependency 2.0.2
>
> I have this set in my job configuration:
> job.setOutputFormatClass(EsOutputFormat.class);
> job.setMapOutputValueClass(MapWritable.class);
>
> together with nodes and resource props like it is described on ES page.
>
> in my mapper I simply write: context.write(NullWritable.get(), esMap); 
where esMap is org.apache.hadoop.io.MapWritable.
>
> I do not know why it's failing as everything looks ok to me. Maybe you 
will have some ideas.
>
> Thanks in advance,
> Kamil.
>
> --
> You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
email to
>elasticsearc...@googlegroups.com  
.
> To view this discussion on the web visit

>https://groups.google.com/d/msgid/elasticsearch/71c57e2a-2210-47c0-aa9e-cbbf164ef05b%40googlegroups.com


> 
>.

> For more options, visithttps://groups.google.com/d/optout 
.

--
Costin


Re: Write amplification and SSD

2014-12-16 Thread AndrewK
Thank you for the feedback!

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6dc59c9e-6504-4786-95e9-7a951b2694d7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: processors configuration (available_processors)

2014-12-16 Thread joergpra...@gmail.com
You have to set up a container and assign virtual processors to it, and
install ES in there so the Java JVM can use only these cores.

With 16 cores, you have 32 processors because of Intel's hyperthreading
technology which doubles the number of available processing units in a core.

Thread pool is a different topic.

Jörg

On Tue, Dec 16, 2014 at 12:13 PM, Robin Clarke  wrote:
>
> And in the spirit of answering your own questions, I've already been
> helped to the answer:
>
> curl -XGET "http://localhost:9200/_nodes/thread_pool?pretty"
>
> ...
> "index" : {
>   "type" : "fixed",
>   "min" : 16,
>   "max" : 16,
>   "queue_size" : "200"
> ...
>
> It seems this cannot be set using the config API
>
> Cheers,
> -Robin-
>
> On Tuesday, 16 December 2014 12:04:03 UTC+1, Robin Clarke wrote:
>>
>> I have a machine with 132GB of memory and 32 cores on which I am running
>> two elasticsearch nodes.  Each node should have only half the total number
>> of CPU cores available so that both nodes can work at full capacity and not
>> block each other.
>> I believe the correct configuration option would be:
>>
>> processors: 16
>>
>> And I thought that this should change the reported value for
>>
>> curl -XGET "http://localhost:9200/_nodes/os?pretty";
>>
>> should be reporting the new value here:
>>   "nodes" : {
>> "2GK7gBPNSSqqbRnRN_WmVg" : {
>>   ...
>>   "os" : {
>>   ...
>> "available_processors" : 32,<-- I expect to see 16 here
>>   ...
>>
>> Any ideas what I am doing wrong here, and how to set / confirm the number
>> of processors that an elasticsearch node should use.
>>
>> Cheers,
>> -Robin-
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/abd0aa9f-2f2e-45b0-8822-00c64160c34d%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH9SKwrek6Zv7KHQPRvziF_jeQFK-533L23pgErSidepA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: processors configuration (available_processors)

2014-12-16 Thread Robin Clarke
And in the spirit of answering your own questions, I've already been helped 
to the answer:

curl -XGET "http://localhost:9200/_nodes/thread_pool?pretty"

...
"index" : {
  "type" : "fixed",
  "min" : 16,
  "max" : 16,
  "queue_size" : "200"
...

It seems this cannot be set using the config API
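
For what it's worth, available_processors under os appears to report what the
JVM sees from the operating system, while the derived thread pool sizes above
are what honour the processors setting, so the 16s in the thread_pool output
suggest the setting did take effect. A sketch of the per-node static settings
in elasticsearch.yml (values are illustrative):

# elasticsearch.yml, per node; static settings, applied on restart
processors: 16
# optional explicit override instead of relying on the processors-derived default
threadpool.index.size: 16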

Cheers,
-Robin-

On Tuesday, 16 December 2014 12:04:03 UTC+1, Robin Clarke wrote:
>
> I have a machine with 132GB of memory and 32 cores on which I am running 
> two elasticsearch nodes.  Each node should have only half the total number 
> of CPU cores available so that both nodes can work at full capacity and not 
> block each other.
> I believe the correct configuration option would be:
>
> processors: 16
>
> And I thought that this should change the reported value for 
>
> curl -XGET "http://localhost:9200/_nodes/os?pretty";
>
> should be reporting the new value here:
>   "nodes" : {
> "2GK7gBPNSSqqbRnRN_WmVg" : {
>   ...
>   "os" : {
>   ...
> "available_processors" : 32,<-- I expect to see 16 here
>   ...
>
> Any ideas what I am doing wrong here, and how to set / confirm the number 
> of processors that an elasticsearch node should use.
>
> Cheers,
> -Robin-
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/abd0aa9f-2f2e-45b0-8822-00c64160c34d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


processors configuration (available_processors)

2014-12-16 Thread Robin Clarke
I have a machine with 132GB of memory and 32 cores on which I am running 
two elasticsearch nodes.  Each node should have only half the total number 
of CPU cores available so that both nodes can work at full capacity and not 
block each other.
I believe the correct configuration option would be:

processors: 16

And I thought that this should change the reported value for 

curl -XGET "http://localhost:9200/_nodes/os?pretty";

should be reporting the new value here:
  "nodes" : {
"2GK7gBPNSSqqbRnRN_WmVg" : {
  ...
  "os" : {
  ...
"available_processors" : 32,<-- I expect to see 16 here
  ...

Any ideas what I am doing wrong here, and how to set / confirm the number 
of processors that an elasticsearch node should use.

Cheers,
-Robin-

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/90cfb708-05db-4053-a289-ff0fbc561bbd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Aggregegation buckets count

2014-12-16 Thread Tom
Hi Rich,

perfect, that's it, thx a lot.

Cheers, Tom

On Tuesday, 16 December 2014 at 11:02:04 UTC+1, Rich Somerfield wrote:
>
> Hi Tom,
>
> I think the "Cardinality" aggregation is what you want.
>
> e.g. :
>
> {
>   ...query...
> },
> "aggregations": {
>   "totalUniqueUsers": {
> "cardinality": {
>   "field": "username"
> }
>   }
> }
>
> -Rich
>
> On Tuesday, December 16, 2014 8:48:51 AM UTC, Tom wrote:
>>
>> Hi,
>>
>> is there a way to get just the count of buckets (not the count of docs, 
>> which works i know) of an aggregation without receiving the whole buckets 
>> content?
>>
>> thx, Tom
>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a36ad00b-5f7e-4e36-9c75-928befb63fea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Have a problem to map lat long field

2014-12-16 Thread Anoop P R
Thanks David, now it works for me.
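
For anyone hitting the same thing, a sketch of the kind of explicit mapping
David describes, created before the first document is indexed. The index and
type names are placeholders; the pin.location path matches the quoted sample
below:

curl -XPUT "http://localhost:9200/myindex" -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "pin": {
          "properties": {
            "location": { "type": "geo_point" }
          }
        }
      }
    }
  }
}'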

On Tuesday, December 16, 2014 12:55:45 PM UTC+5:30, David Pilato wrote:
>
> There is no autodetection of geo_point.
> So you need to provide first a mapping before sending the first document.
>
> David
>
> On 16 Dec 2014, at 07:47, Anoop P R wrote:
>
> Hi, I have a problem with mapping latlong in elastic search
>
> Here is my input data to elastic search server 
>
> {
> "member_id": "8",
> "keywords": [
> "Sample 2",
> "Sample 3",
> "s",
> "sample"
> ],
> "long_descriptions": "Des",
> "general_desc": "Sample",
> "service_location": 1,
> "categories": [
> "Entertainment Entertainment Entertainment Entertai",
> "DJ’s",
> "Personal Driver",
> "Transportation Services"
> ],
>  *   "pin": {*
> *"location": {*
> *"lat": 33.8101772,*
> *"lon": -118.3520389*
> *}*
> *},*
> "status": 1,
> "state": true
> }
>
> and when I check the mapping it shows like 
>
> "member_id": {
> "type": "string"
> },
> *"pin": {*
> *"properties": {*
> *"location": {*
> *"properties": {*
> *"lat": {*
> *"type": "double"*
> *},*
> *"lon": {*
> *"type": "double"*
> *}*
> *}*
> *}*
> *}*
> },
>
> I am currently using elasticsearch v1.0.1, is this something related to 
> the installation of elasticsearch? 
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/084dd688-b7c8-4d31-a0bc-1ad4ecdc0be3%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2596df36-55a8-494b-be53-abdfc99d9a17%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


File Descriptors

2014-12-16 Thread Chetan Dev
Hi,

What does a file descriptor value of -1 mean?
What is the default value for it?

Thanks 


  },
"WkgDi0joTYSrs5sO3_bndQ" : {
  "name" : "AEPLPERF2",
  "transport_address" : "inet[/192.168.1.13:9300]",
  "host" : "AEPLPERF1",
  "ip" : "192.168.1.13",
  "version" : "1.4.1",
  "build" : "89d3241",
  "http_address" : "inet[/192.168.1.13:9200]",
  "process" : {
"refresh_interval_in_millis" : 1000,
"id" : 28240,
"max_file_descriptors" : -1,
"mlockall" : false
  }
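
For context, the snippet above is nodes info output; a call along these
lines returns it (host and port as configured locally):

# fetch the process section of the nodes info API
curl -XGET 'localhost:9200/_nodes/process?pretty'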




Re: Aggregation buckets count

2014-12-16 Thread Rich Somerfield
Hi Tom,

I think the "Cardinality" aggregation is what you want.

e.g. :

{
  ...query...
},
"aggregations": {
  "totalUniqueUsers": {
"cardinality": {
  "field": "username"
}
  }
}

-Rich

On Tuesday, December 16, 2014 8:48:51 AM UTC, Tom wrote:
>
> Hi,
>
> is there a way to get just the count of buckets (not the count of docs, 
> which works i know) of an aggregation without receiving the whole buckets 
> content?
>
> thx, Tom
>
>
>



Re: Write amplification and SSD

2014-12-16 Thread joergpra...@gmail.com
All SSDs have internal rewrites due to wear leveling and garbage
collection. The issue is not just that random writes from the application
trigger them, but that too many internal rewrites reduce SSD performance
and lifetime.

I think the contribution of reducing write amplification at the application
layer is rather small, because SSD performance in that area depends mainly
on the controller algorithms. E.g. SandForce controllers use compression
and can achieve write amplification rates of 0.14, much less than other
controllers:
http://www.tomshardware.com/reviews/ssd-520-sandforce-review-benchmark,3124-11.html

Jörg

On Tue, Dec 16, 2014 at 10:29 AM, Michael McCandless  wrote:
>
> It means that ES works well with SSDs since Lucene is write-once under the
> hood, so it is "easy" on the SSDs, vs other approaches which do random
> writes to different places causing the higher write amplification.
>
> But, this is balanced with the fact that Lucene must also periodically
> merge the segments, which is in fact its own higher level form of write
> amplification: when you first index a doc, it's written into a new segment,
> but over that doc's lifetime in the index it may be copied another 4-5
> times or something before it lives in a "max sized" segment.  Still, that
> higher write amplification likely works out to much less stress on the SSD
> than databases that do random writes to their stores.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Dec 16, 2014 at 4:11 AM, AndrewK  wrote:
>
>> At the core training I attended last year there was a side note on SSD
>> and write amplification: roughly along the lines of: write amplification
>> can be a big problem with SSD (as writes can be around 4KB but deletes are
>> often in blocks of around 512KB, and that the problem gets worse the
>> smaller and the more random the writes are), but that write amplification
>> is never an issue in ES as all writes are sequential anyway (reading from
>> my notes here).
>>
>> What does that mean exactly? That write amplification *can* be a big
>> problem with SSD, but not with ES on SSD, or that the problem is relevant
>> with lots of random writes? (I suspect the former, but am not quite sure).
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/67ed8b75-fe8b-4f40-b41a-b66cf6eb82bc%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAD7smRc9im3H%3DfKx5hauVVZL%3DE00aUnWf-DrZtO7etPC-VzWEQ%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



Re: Write amplification and SSD

2014-12-16 Thread Michael McCandless
It means that ES works well with SSDs since Lucene is write-once under the
hood, so it is "easy" on the SSDs, vs other approaches which do random
writes to different places causing the higher write amplification.

But, this is balanced with the fact that Lucene must also periodically
merge the segments, which is in fact its own higher level form of write
amplification: when you first index a doc, it's written into a new segment,
but over that doc's lifetime in the index it may be copied another 4-5
times or something before it lives in a "max sized" segment.  Still, that
higher write amplification likely works out to much less stress on the SSD
than databases that do random writes to their stores.

Mike McCandless

http://blog.mikemccandless.com

On Tue, Dec 16, 2014 at 4:11 AM, AndrewK  wrote:

> At the core training I attended last year there was a side note on SSD and
> write amplification: roughly along the lines of: write amplification can be
> a big problem with SSD (as writes can be around 4KB but deletes are often
> in blocks of around 512KB, and that the problem gets worse the smaller and
> the more random the writes are), but that write amplification is never an
> issue in ES as all writes are sequential anyway (reading from my notes
> here).
>
> What does that mean exactly? That write amplification *can* be a big
> problem with SSD, but not with ES on SSD, or that the problem is relevant
> with lots of random writes? (I suspect the former, but am not quite sure).
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/67ed8b75-fe8b-4f40-b41a-b66cf6eb82bc%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



Write amplification and SSD

2014-12-16 Thread AndrewK
At the core training I attended last year there was a side note on SSDs and
write amplification, roughly along these lines (reading from my notes
here): write amplification can be a big problem with SSDs, as writes can be
around 4KB but deletes are often in blocks of around 512KB, and the problem
gets worse the smaller and the more random the writes are; but write
amplification is never an issue in ES as all writes are sequential anyway.

What does that mean exactly? That write amplification *can* be a big
problem with SSDs in general, but not with ES on SSDs? Or that the problem
is only relevant with lots of random writes? (I suspect the former, but am
not quite sure.)



[Kibana] group by request?

2014-12-16 Thread stephanos
Hey there,

we are using Google App Engine to host our SaaS app. Google offers a nice 
log browser but it is way too slow. So one of my colleagues suggested we 
pipe our logs to logstash and make them accessible via Kibana. So far so 
good, we managed to set everything up.

But when Kibana was shown to the other team members they weren't really 
excited. It was much faster, yes. It allowed them to make better queries, 
yes. BUT it broke the pattern they knew from the Google App Engine log 
browser:

/some-request
log message 1
log message 2
/another-request
log message 3
/yet-another-request
log message 4

While Kibana works like this:

log message 1/some-request
log message 2/some-request
log message 3/another-request
log message 4/yet-another-request

So basically App Engine groups log messages by request. To get my team on 
board, can we make Kibana do the same?
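
I guess we could get part of the way there ourselves if logstash added a
request identifier field to every event, so that all lines of one request
can at least be pulled together with a single filtered query. A rough
sketch (the index name and request_id field are assumptions on my side):

# all log messages for one request, oldest first
curl -XGET 'localhost:9200/logstash-2014.12.16/_search?pretty' -d '{
  "query": {
    "filtered": {
      "filter": {
        "term": { "request_id": "some-request-id" }
      }
    }
  },
  "sort": [ { "@timestamp": { "order": "asc" } } ]
}'

That only reconstructs one request at a time, though, not the grouped view
the team is used to.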

Stephan



Aggregation buckets count

2014-12-16 Thread Tom
Hi,

Is there a way to get just the number of buckets of an aggregation (not
the count of docs, which I know how to get) without receiving the whole
bucket contents?

thx, Tom




Re: Elasticsearch script score and decay function scores do not multiply

2014-12-16 Thread valerij . vasilcenko
There was an error in the query. It should be:
{"exp": {
"date": {
  "origin": "now",
  "scale": "1d",
  "decay" : 0.05
}
}},
{"script_score": {
"script": "_score * 10",
"lang":"groovy"
}}
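
In other words, each scoring function has to be its own entry in the
"functions" array. Put back into the full query from my first post, it
looks roughly like this (just a sketch; note the terms filter value
written as an array):

{
  "query": {
    "function_score": {
      "filter": {
        "bool": {
          "must": [
            { "terms": { "content": ["test"] } }
          ]
        }
      },
      "functions": [
        {
          "exp": {
            "date": {
              "origin": "now",
              "scale": "1d",
              "decay": 0.05
            }
          }
        },
        {
          "script_score": {
            "script": "_score * 10",
            "lang": "groovy"
          }
        }
      ],
      "score_mode": "multiply"
    }
  }
}

With the two functions as separate entries, "score_mode": "multiply"
combines both scores instead of applying only the last function.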

On Tuesday, December 16, 2014 10:24:50 AM UTC+2, 
valerij.v...@googlemail.com wrote:
>
> "query": {
> "function_score": {
> "filter" : {
> "bool" : {
> "must" : [
> { "terms" : { "content" : "test"} }
> ]
> }
>
> },
> "functions": [{
> "exp": {
> "date": {
>   "origin": "now",
>   "scale": "1d",
>   "decay" : 0.05
> }
> },
> "script_score": {
> "script": "_score * 10",
> "lang":"groovy"
> }   
> }],
> "score_mode": "multiply"
> }
> }}
>
> "functions" scores do not multiply. Score is calculated only using last 
> function. If I swap places "exp" with "script_score", "exp" score will be 
> shown. What is the problem? Notice: "script_score" is just a dummy function.
>



Re: Completion Suggesters using Java API

2014-12-16 Thread joergpra...@gmail.com
Please look at the Elasticsearch unit tests, here

https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/suggest/CompletionSuggestSearchTests.java
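
As a rough REST-side sketch of the pattern from the blog post (a separate
suggest index whose completion field carries a payload pointing back at
the main index; the index, type, field and payload names below are made
up):

# suggest index with a completion field that stores payloads
curl -XPUT 'localhost:9200/mydata_suggest' -d '{
  "mappings": {
    "suggestion": {
      "properties": {
        "suggest": {
          "type": "completion",
          "payloads": true
        }
      }
    }
  }
}'

# one suggestion whose payload carries the id of a document in the main index
curl -XPUT 'localhost:9200/mydata_suggest/suggestion/1' -d '{
  "suggest": {
    "input": ["elasticsearch", "elastic search"],
    "output": "Elasticsearch",
    "payload": { "source_id": "42" }
  }
}'

# ask for completions of "ela"
curl -XPOST 'localhost:9200/mydata_suggest/_suggest?pretty' -d '{
  "my-suggest": {
    "text": "ela",
    "completion": { "field": "suggest" }
  }
}'

The linked tests show how to build the same requests through the Java API.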

Jörg

On Mon, Dec 15, 2014 at 11:44 PM, Pradeep B  wrote:
>
> Hi
> The blog post (http://www.elasticsearch.org/blog/you-complete-me/)
> introducing the Suggestions feature in Elasticsearch recommends managing
> suggestions in a lightweight index of their own, the idea being to use the
> payload to carry a key that ties each suggestion back to the source index.
>
>
> I am trying to create two indexes :
> 1) MyDataIndex
> 2)MyDataSuggestIndex (for suggestions)
>
> When trying to create a Suggestions Index using BulkProcessor class in
> Java API, I am seeing that the suggestions are not working.
> There is no documentation around suggesters using Java API.
>
> Can someone point to examples / samples to create suggestion entries using
> Java API ? I have tried looking in the archives for this group with no luck.
>
> Any pointers / help is highly appreciated.
>
> Thanks in advance.
>
>
> Pradeep
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3c6134f4-4244-486c-a08d-30e5350a51b3%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



Elasticsearch script score and decay function scores do not multiply

2014-12-16 Thread valerij . vasilcenko


"query": {
"function_score": {
"filter" : {
"bool" : {
"must" : [
{ "terms" : { "content" : "test"} }
]
}

},
"functions": [{
"exp": {
"date": {
  "origin": "now",
  "scale": "1d",
  "decay" : 0.05
}
},
"script_score": {
"script": "_score * 10",
"lang":"groovy"
}   
}],
"score_mode": "multiply"
}
}}

"functions" scores do not multiply. Score is calculated only using last 
function. If I swap places "exp" with "script_score", "exp" score will be 
shown. What is the problem? Notice: "script_score" is just a dummy function.
