Re: Any experience with ES and Data Compressing Filesystems?

2014-07-21 Thread Patrick Proniewski
Hi,

gzip/zlib compression is very bad for performance, so it can be interesting for 
closed indices, but I would not recommend it for live data.
Also, you must know that: 

- Compression using lz4 is already enabled inside indices,
- ES/Lucene/Java usually read/write 4k blocks,

- hence, compression is achieved on 4k blocks. If your filesystem uses 4k 
blocks and you add FS compression, you will probably see a very small gain, if 
any. I've tried on ZFS:

Filesystem     Size   Used   Avail  Capacity  Mounted on
zdata/ES-lz4   1.1T   1.9G   1.1T   0%        /zdata/ES-lz4
zdata/ES       1.1T   1.9G   1.1T   0%        /zdata/ES

If you are using a larger block size, like 128k, a compressed filesystem does 
show some benefit: 

Filesystem     Size   Used   Avail  Capacity  Mounted on
zdata/ES-lz4   1.1T   1.1G   1.1T   0%        /zdata/ES-lz4   (compressratio 1.73x)
zdata/ES-gzip  1.1T   901M   1.1T   0%        /zdata/ES-gzip  (compressratio 2.27x)
zdata/ES       1.1T   1.9G   1.1T   0%        /zdata/ES

But a filesystem block size larger than 4k is very suboptimal for I/O (ES 
reads or writes one 4k block, but your FS must read or write a whole 128k block).
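For reference, a sketch of how such a comparison can be set up on ZFS (dataset names follow the listings above; assumes a pool named zdata and a system with ZFS):

```shell
# Create one dataset per compression setting (names are illustrative)
zfs create -o compression=lz4  zdata/ES-lz4
zfs create -o compression=gzip zdata/ES-gzip
zfs create -o compression=off  zdata/ES

# After copying the same index files into each dataset,
# compare the achieved compression ratios
zfs get -r compressratio zdata
```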

On 21 juil. 2014, at 07:58, horst knete baduncl...@hotmail.de wrote:

 Hey guys,
 
 we have mounted a btrfs filesystem with the zlib compression method for 
 testing purposes on our elasticsearch server and copied one of the indices 
 onto the btrfs volume. Unfortunately it had no effect and the index still 
 has a size of 50gb :/
 
 I will try it with other compression methods and report back here
 
 On Saturday, July 19, 2014 07:21:20 UTC+2, Otis Gospodnetic wrote:
 
 Hi Horst,
 
 I wouldn't bother with this for the reasons Joerg mentioned, but should 
 you try it anyway, I'd love to hear your findings/observations.
 
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/
 
 
 
 On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:
 
 Hey Guys,
 
 to save a lot of hard disk space, we are going to use a compressing 
 filesystem, which gives us transparent compression for the ES indices. (It 
 seems ES indices are very compressible; we got up to a 65% compression rate 
 in some tests.)
 
 Currently the indices live on an ext4 Linux filesystem, which unfortunately 
 doesn't have transparent compression.
 
 Does anyone of you have experience with compressing filesystems like BTRFS 
 or ZFS/OpenZFS, and can you tell us whether this led to big performance losses?
 
 Thanks for responding

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3DD72EC1-E3EC-493D-94DD-33E63151A579%40patpro.net.
For more options, visit https://groups.google.com/d/optout.


Re: LDAP authentication in Kibana

2014-07-21 Thread naveen bajaj
I configured ldap properties in httpd.conf.

AuthLDAPBindDN "uid=nabajaj,OU=Employee,OU=Cisco Users,DC=ds,DC=cisco,DC=com"
AuthLDAPBindPassword password
AuthLDAPURL "ldap://domain:389/OU=Employee,OU=Cisco Users,DC=ds,DC=cisco,DC=com?uid?sub?(objectClass=*)"
AuthType Basic
AuthBasicProvider ldap
AuthzLDAPAuthoritative Off
AuthName "some text for login prompt"
Require valid-user

But it is giving me an error like:

[error] [client x.x.x.x] user nabajaj: authentication failure for 
/kibana: Password Mismatch

Please help me here.

On Wednesday, 18 June 2014 19:10:47 UTC+5:30, dharmendra pratap singh wrote:

 Hello Friends,
 Hope you are doing well.

 In my application, I want to do authentication in Kibana using LDAP. 
 If anyone has done it before, please help me out.

 Appreciate your help.

 Regards
 Dharmendra




Kibana settings for IPFIX/Netflow

2014-07-21 Thread Janet Sullivan
Every minute, we take a 1/4096 sample of traffic using IPFIX.  I want to graph 
this data as bits/sec in a histogram.  However, my math & kibana skills are 
failing me.

Here is how I think it should be set up, but it's always too low a value for 
Gbit/s:

Chart Value: total
Value Field: bytes (bytes per minute field)
Scale: 32768 (4096 * 8 bits in a byte)
Seconds, checked
Interval 1m
Y Format bytes

Help?  Maybe I'm missing the obvious, but it's 2 a.m. and I'm mystified.
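One way to sanity-check the expected order of magnitude (the byte count below is purely illustrative): with 1/4096 sampling and per-minute aggregation, bits/sec = sampled bytes-per-minute × 4096 × 8 ÷ 60.

```shell
# bits/sec = sampled_bytes_per_minute * 4096 (sample rate) * 8 (bits/byte) / 60 (sec/min)
awk 'BEGIN { bytes_per_min = 250000; printf "%.0f\n", bytes_per_min * 4096 * 8 / 60 }'
# 250 kB of sampled bytes per minute corresponds to roughly 136.5 Mbit/s
```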



Solaris 10 mlockall error code

2014-07-21 Thread James Pace
We have been having issues running ES with the bootstrap.mlockall: true 
setting. We get the following error:

[2014-07-21 09:56:44,436][WARN ][common.jna] Unknown mlockall error 11

I have googled around and looked in the Solaris documentation for a 
description of the error codes, without success. The Solaris docs are here 
http://docs.oracle.com/cd/E26505_01/html/816-5168/mlockall-3c.html#REFMAN3Amlockall-3c
but they only list 3 error codes. Is error code 11 generated by ES?

Our box has a total of 64 gigs of RAM and we give 32 gigs to ES. We are 
running it using Oracle Java 1.7.13, 64-bit. Any help on the matter would be 
greatly appreciated!



Re: Solaris 10 mlockall error code

2014-07-21 Thread Mark Walkom
What elasticsearch version are you on?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 21 July 2014 19:09, James Pace james.a.p...@gmail.com wrote:

 We have been having issues running ES with the bootstrap.mlockall: true
 setting. We get the following error:

 [2014-07-21 09:56:44,436][WARN ][common.jna] Unknown mlockall error 11

 I have googled around and looked in the solaris documentation for the
 description of the error codes and I have been unsuccessful. The solaris
 docs are here
 http://docs.oracle.com/cd/E26505_01/html/816-5168/mlockall-3c.html#REFMAN3Amlockall-3c
  but
 they only list 3 error codes. Is the error code 11 generated by ES?

 Our box has a total of 64 gigs of RAM and we give 32gigs to ES. We are
 running it using Oracle Java 1.7.13 64bit. Any help on the matter would be
 greatly appreciated!





RE: Kibana settings for IPFIX/Netflow

2014-07-21 Thread Janet Sullivan
I'm tired and didn't explain that well: we use pmacct to do 1-minute 
aggregations.

From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On 
Behalf Of Janet Sullivan
Sent: Monday, July 21, 2014 1:50 AM
To: elasticsearch@googlegroups.com
Subject: Kibana settings for IPFIX/Netflow

Every minute, we take a 1/4096 sample of traffic using IPFIX.  I want to graph 
this data as bits/sec in a histogram.  However, my math & kibana skills are 
failing me.

Here is how I think it should be set up, but it's always too low a value for 
Gbit/s:

Chart Value: total
Value Field: bytes (bytes per minute field)
Scale: 32768 (4096 * 8 bits in a byte)
Seconds, checked
Interval 1m
Y Format bytes

Help?  Maybe I'm missing the obvious, but it's 2 a.m. and I'm mystified.




Re: Solaris 10 mlockall error code

2014-07-21 Thread James Pace
Oops, good point! We are running version 1.2.1.

On Monday, 21 July 2014 10:18:15 UTC+1, Mark Walkom wrote:

 What elasticsearch version are you on?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 21 July 2014 19:09, James Pace james@gmail.com wrote:

 We have been having issues running ES with the bootstrap.mlockall: true 
 setting. We get the following error:

 [2014-07-21 09:56:44,436][WARN ][common.jna] Unknown mlockall error 11

 I have googled around and looked in the solaris documentation for the 
 description of the error codes and I have been unsuccessful. The solaris 
 docs are here 
 http://docs.oracle.com/cd/E26505_01/html/816-5168/mlockall-3c.html#REFMAN3Amlockall-3c
  but 
 they only list 3 error codes. Is the error code 11 generated by ES?

 Our box has a total of 64 gigs of RAM and we give 32gigs to ES. We are 
 running it using Oracle Java 1.7.13 64bit. Any help on the matter would be 
 greatly appreciated!







Re: [ANN] ElasticUI AngularJS Directives - Easily Build an Interface on top of Elasticsearch

2014-07-21 Thread Yousef El-Dardiry
Thanks all for the enthusiastic responses! Very excited to see the first 
contributions today and many stars on GitHub over the last couple of weeks. 
I'd love to hear your feedback / use cases if you have any already :)

Regards,

- Yousef

http://www.elasticui.com
http://www.tweetbeam.com

On Thursday, July 3, 2014 3:06:11 PM UTC+2, Petar Djekic wrote:

 wow, this is really cool!

 On Wednesday, July 2, 2014 12:56:48 PM UTC+2, Yousef El-Dardiry wrote:

 Hi all,

 I just open sourced a set of AngularJS Directives for Elasticsearch. It 
 enables developers to rapidly build a frontend (e.g.: faceted search 
 engine) on top of Elasticsearch.

 http://www.elasticui.com (or github 
 https://github.com/YousefED/ElasticUI)

 It makes creating an aggregation and listing the buckets as simple as:

 <ul eui-aggregation="ejs.TermsAggregation('text_agg').field('text').size(10)">
   <li ng-repeat="bucket in aggResult.buckets">{{bucket}}</li>
 </ul>

 I think this was missing from the ecosystem, which is why I decided to 
 build and open source it. I'd love any kind of feedback.

 - Yousef

 *-*
 Another example: add a checkbox facet based on a field using one of the 
 built-in widgets 
 https://github.com/YousefED/ElasticUI/blob/master/docs/widgets.md:

 <eui-checklist field="facet_field" size="10"></eui-checklist>

 Resulting in
 [image: checklist screenshot]





Creating index with _timestamps

2014-07-21 Thread Surajit Roy
Hi All,

I am trying to create an index with a _timestamp mapping.

curl -XPOST "http://localhost:9200/test" -d '
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "stats": {
      "_timestamp": {
        "enabled": true,
        "store": true
      }
    }
  }
}'


When I write data into the index, I don't see the timestamp coming up:

curl -XPOST "http://localhost:9200/test/stats/1" -d '{"a":1}'


{"_index":"test","_type":"stats","_id":"1","_version":1,"created":true}

Why am I not getting _timestamp, can anyone help?


Thanks
Surajit



marvel dashboard

2014-07-21 Thread eunever32
Hi,

The Marvel overview shows 20 indices by default (third panel).

I guess there is some way to configure this 20, say to 40?

But how do I do it?

Your help appreciated.

Regards.



Re: Solaris 10 mlockall error code

2014-07-21 Thread joergpra...@gmail.com
Error 11 is a POSIX error number (errno) and means EAGAIN ("Resource
temporarily unavailable"), which is documented.

On Solaris 10, this means you must first allow the Elasticsearch user to
allocate this amount of virtual memory.

Switch to the Elasticsearch user and then check the following values:

prctl -n project.max-shm-memory $$
prctl -n process.max-address-space $$

Then the sys admin could create a project with projadd and projmod and
change resource limits for the Elasticsearch user.

The error can also mean that Solaris does not have enough memory for mlockall
because software that is already running is using the memory, or the free
memory is too fragmented.
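As a quick way to confirm the errno mapping (errno 11 is EAGAIN on Linux and Solaris alike; the one-liner below just reads the platform's errno table):

```shell
# Confirm that errno 11 maps to EAGAIN ("Resource temporarily unavailable")
python3 -c 'import errno, os; print(errno.EAGAIN, os.strerror(errno.EAGAIN))'
```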

Jörg



On Mon, Jul 21, 2014 at 11:09 AM, James Pace james.a.p...@gmail.com wrote:

 We have been having issues running ES with the bootstrap.mlockall: true
 setting. We get the following error:

 [2014-07-21 09:56:44,436][WARN ][common.jna] Unknown mlockall error 11

 I have googled around and looked in the solaris documentation for the
 description of the error codes and I have been unsuccessful. The solaris
 docs are here
 http://docs.oracle.com/cd/E26505_01/html/816-5168/mlockall-3c.html#REFMAN3Amlockall-3c
  but
 they only list 3 error codes. Is the error code 11 generated by ES?

 Our box has a total of 64 gigs of RAM and we give 32gigs to ES. We are
 running it using Oracle Java 1.7.13 64bit. Any help on the matter would be
 greatly appreciated!





Re: Solaris 10 mlockall error code

2014-07-21 Thread James Pace
Thanks for the information, I'll chase it up and let you know how I get on.

James

On Monday, 21 July 2014 12:36:09 UTC+1, Jörg Prante wrote:

 Error 11 is a POSIX error number (errno) and means EAGAIN ("Resource 
 temporarily unavailable"), which is documented.

 On Solaris 10, this means, you must first allow the Elasticsearch user to 
 allocate this amount of virtual memory.

 Switch to Elasticsearch user and then check the following values

 prctl -n project.max-shm-memory $$
 prctl -n process.max-address-space $$

 Then the sys admin could create a project with projadd and projmod and 
 change resource limits for the Elasticsearch user.

 The error can also mean that Solaris has not enough memory for mlockall 
 because there is already software running using the memory, or the free 
 memory is too fragmented.

 Jörg



 On Mon, Jul 21, 2014 at 11:09 AM, James Pace james@gmail.com wrote:

 We have been having issues running ES with the bootstrap.mlockall: true 
 setting. We get the following error:

 [2014-07-21 09:56:44,436][WARN ][common.jna] Unknown mlockall error 11

 I have googled around and looked in the solaris documentation for the 
 description of the error codes and I have been unsuccessful. The solaris 
 docs are here 
 http://docs.oracle.com/cd/E26505_01/html/816-5168/mlockall-3c.html#REFMAN3Amlockall-3c
  but 
 they only list 3 error codes. Is the error code 11 generated by ES?

 Our box has a total of 64 gigs of RAM and we give 32gigs to ES. We are 
 running it using Oracle Java 1.7.13 64bit. Any help on the matter would be 
 greatly appreciated!







Re: Creating index with _timestamps

2014-07-21 Thread David Pilato
Ask for it using fields: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html#search-request-fields

Using _timestamp does not modify the original _source.
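For example, a sketch against the index from the question (assuming the test/stats document with id 1 exists and _timestamp is stored, as in the mapping shown):

```shell
# _timestamp lives outside _source, so it must be requested explicitly
curl -XGET "http://localhost:9200/test/stats/1?fields=_timestamp,_source"
```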

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 21 July 2014, at 12:44, Surajit Roy beasurajit...@gmail.com wrote:

Hi All,

I am trying to create an index with a _timestamp mapping.

curl -XPOST "http://localhost:9200/test" -d '
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "stats": {
      "_timestamp": {
        "enabled": true,
        "store": true
      }
    }
  }
}'


When I write data into the index, I don't see the timestamp coming up:

curl -XPOST "http://localhost:9200/test/stats/1" -d '{"a":1}'


{"_index":"test","_type":"stats","_id":"1","_version":1,"created":true}

Why am I not getting _timestamp, can anyone help?


Thanks
Surajit



Index performace with large arrays

2014-07-21 Thread Steve Mee
Hi there. I'm new to ES and would appreciate some advice on design concepts 
around large arrays.

I am writing a help tip feature that pops up a message each time a user 
logs in. The user can tick a checkbox if they do not want to see this 
particular tip again.

After playing with ElasticSearch the solution I came up with involved using 
a HelpTip document which contains an array of UserIds (identifying the 
users who have flagged that they do not want to see this tip again). 

Example 1:
HelpTip
{
  "title": "Need help getting started?",
  "text": "Watch our overview video",
  "userArray": ["id1", "id2"]
}

I know ES can cope with large arrays, but I wonder if there would be 
performance issues if this array grew to 4000+ IDs. This document would be 
regularly re-indexed (each time a new user ID is added to the array). Would 
there be performance issues when indexing a document containing a large 
array field?

Is this a sensible approach, or would I be better off using a relational 
model, holding the HelpTip info and the list of users in separate documents, 
then parsing them with two separate calls from my application?
Example 2:
HelpTip
{
  "title": "Need help getting started?",
  "text": "Watch our overview video"
}

HelpTipUserFlags
{
  "HelpTipId": 1,
  "UserId": "ID1"
}

Hope this makes sense. Thanks in advance for any help.



Notifications for a query

2014-07-21 Thread P lva
Hello Everyone,

I started working with Elasticsearch recently.
I just wanted to know if there's any way of being notified when a document
matches a query (essentially to create a monitoring system).
Can I use the percolator to do this?

Thanks



Re: Clustering/Sharding impact on query performance

2014-07-21 Thread Kireet Reddy
My working assumption had been that Elasticsearch executes queries across all 
shards in parallel and then merges the results. So maybe shards = CPU cores 
would help in this case, where there is only one concurrent query. But I have 
never tested this assumption. Out of curiosity, during the 20-shard test did 
you still only see 1 CPU being used? Did you try 2 shards and get the same 
results?

On Jul 20, 2014, at 1:01 AM, 'Fin Sekun' via elasticsearch 
elasticsearch@googlegroups.com wrote:

 Hi Kireet, thanks for your answer and sorry for the late response. More 
 shards don't help: each shard takes quite some overhead to maintain a Lucene 
 index and, the smaller the shards, the bigger the overhead. Having more 
 shards improves indexing performance and allows a big index to be 
 distributed across machines, but I don't have a cluster with a lot of 
 machines. I observed these negative effects while testing with 20 shards.
 
 It would be very cool if somebody could answer/comment to the question 
 summarized at the end of my post. Thanks again.
 
 
 
 
 
 On Friday, July 11, 2014 3:02:50 AM UTC+2, Kireet Reddy wrote:
 I would test using multiple primary shards on a single machine. Since your 
 dataset seems to fit into RAM, this could help for these longer latency 
 queries.
 
 On Thursday, July 10, 2014 12:24:26 AM UTC-7, Fin Sekun wrote:
 Any hints?
 
 
 
 On Monday, July 7, 2014 3:51:19 PM UTC+2, Fin Sekun wrote:
 
 Hi,
 
 
 SCENARIO
 
 Our Elasticsearch database has ~2.5 million entries. Each entry has the 
 three analyzed fields "match", "sec_match" and "thi_match" (each contains 
 3-20 words) that will be used in this query:
 https://gist.github.com/anonymous/a8d1142512e5625e4e91
 
 
 ES runs on two types of servers:
 (1) Real servers (system has direct access to real CPUs, no virtualization) 
 of newest generation - Very performant!
 (2) Cloud servers with virtualized CPUs - Poor CPUs, but this is generic for 
 cloud services.
 
 See https://gist.github.com/anonymous/3098b142c2bab51feecc for (1) and (2) 
 CPU details.
 
 
 ES settings:
 ES version 1.2.0 (jdk1.8.0_05)
 ES_HEAP_SIZE = 512m (we also tested with 1024m with same results)
 vm.max_map_count = 262144
 ulimit -n 64000
 ulimit -l unlimited
 index.number_of_shards: 1
 index.number_of_replicas: 0
 index.store.type: mmapfs
 threadpool.search.type: fixed
 threadpool.search.size: 75
 threadpool.search.queue_size: 5000
 
 
 Infrastructure:
 As you can see above, we don't use the cluster feature of ES (1 shard, 0 
 replicas). The reason is that our hosting infrastructure is based on 
 different providers.
 Upside: We aren't dependent on a single hosting provider. Downside: Our 
 servers aren't in the same LAN.
 
 This means:
 - We cannot use ES sharding, because synchronisation via WAN (internet) does 
 not seem a useful solution.
 - So, every ES-server has the complete dataset and we configured only one 
 shard and no replicas for higher performance.
 - We have a distribution process that updates the ES data on every host 
 frequently. This process is fine for us, because updates aren't very often 
 and perfect just-in-time ES synchronisation isn't necessary for our business 
 case.
 - If a server goes down/crashes, the central load balancer removes it (the 
 resulting minimal packet loss is acceptable).
  
 
 
 
 PROBLEM
 
 For long query terms (6 or more keywords), we have very high CPU load, even 
 on the high-performance server (1), and this leads to high response times: 
 1-4 sec on server (1), 8-20 sec on server (2). The system parameters while 
 querying:
 - Very high load (usually 100%) on the CPU running the responsible thread 
 (the other CPUs are idle in our test scenario)
 - No I/O load (the harddisks are fine)
 - No RAM bottlenecks
 
 So, we think the file caching is working fine, because we have no I/O 
 problems, and the garbage collector seems to be happy (jstat shows very few 
 GCs). The CPU is the problem, and the ES hot threads point to the Scorer 
 module: https://gist.github.com/anonymous/9cecfd512cb533114b7d
 
 
 
 
 SUMMARY/ASSUMPTIONS
 
 - Our database size isn't very big and the query isn't very complex.
 - ES is designed for huge amounts of data, but the key is 
 clustering/sharding: distributing data across many servers means smaller 
 indices, and smaller indices lead to lower CPU load and shorter response 
 times.
 - So, our database isn't big, but it is too big for a single CPU, which 
 means especially low-performance (virtual) CPUs can only be used in 
 sharding environments.
 
 If we don't want to lose provider independence, we have only the following 
 two options:
 
 1) Simpler query (I think not possible in our case)
 2) Smaller database
 
 
 
 
 QUESTIONS
 
 Are our assumptions correct? Especially:
 
 - Is clustering/sharding (i.e. small indices) the main key to performance, 
 i.e. the only way to prevent overloaded (virtual) CPUs?
 - Is it right that clustering is only useful/possible in LANs?
 

Re: Index performace with large arrays

2014-07-21 Thread joergpra...@gmail.com
If the user can opt out, I assume you have fewer opt-outs than opt-ins, so
you should use the opt-outs for an and-not filter :)

In that case, I would create an opt-out index of the form index/type/id

users/optouts/userid

with docs containing a quite short array of opt-outs

{ "optouts": ["id1", "id2", ..., "idn"] }

so you can get the doc, read the opt-out array, and add it as an and-not
filter to your help tip query.

You could also add this optouts array to the user index, but this depends
on your overall design. If you want to remove the opt-outs, you could
simply drop the optouts mapping type.

Regarding the array length, you can add as many values as you like; ES can
handle that. But if the docs get long (I mean thousands of entries), they
will take substantial time just to fetch, so I think you should prefer a
model that keeps the data as short as possible.
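A sketch of what the resulting query could look like in the ES 1.x filter DSL (the index/type name "helptips"/"helptip" and the IDs are just illustrative, following this thread's examples):

```shell
# Exclude tips the user has opted out of via a not/ids filter
curl -XGET "http://localhost:9200/helptips/helptip/_search" -d '
{
  "query": {
    "filtered": {
      "query":  { "match_all": {} },
      "filter": { "not": { "ids": { "values": ["id1", "id2"] } } }
    }
  }
}'
```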

Jörg








On Mon, Jul 21, 2014 at 4:58 PM, Steve Mee st...@genialgenetics.com wrote:

 Hi there. I'm new to ES and would appreciate some advice on design
 concepts around large arrays.

 I am writing a help tip feature that pops up a message each time a user
 logs in. The user can flag a checkbox if they do not want to see this
 particular tip again.

 After playing with ElasticSearch the solution I came up with involved
 using a HelpTip document which contains an array of UserIds (identifying
 the users who have flagged that they do not want to see this tip again).

 Example 1:
 HelpTip
 {
   "title": "Need help getting started?",
   "text": "Watch our overview video",
   "userArray": ["id1", "id2"]
 }

 I know ES can cope with large arrays but I wonder if there would be
 performance issues if this array grew to 4000+ IDs. This record would be
 regularly re-indexed (each time a new user ID is added to the array). Would
 there be performance issues when indexing a document containing a large
 array field?

 Is this a sensible approach or would I be better using a relational model
 and holding the Help Tip info and the list of users in separate documents,
 then parsing them using two separate calls from my application?
 Example 2:
 HelpTip
 {
  "title": "Need help getting started?",
  "text": "Watch our overview video"
 }

 HelpTipUserFlags
 {
  "HelpTipId": 1,
  "UserId": "ID1"
 }

 Hope this makes sense. Thanks in advance for any help.

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/9cba2c32-6266-4b87-b708-83ee64499dbf%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/9cba2c32-6266-4b87-b708-83ee64499dbf%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: Index performace with large arrays

2014-07-21 Thread Steve Mee


 Thanks for the response Jörg. That tells me exactly what I need to know... 
 stay away from very large arrays here in my design :-)


Cheers - Steve 



Re: puppet-elasticsearch options

2014-07-21 Thread Andrej Rosenheinrich
Hi Richard,

another question: you are creating the elasticsearch user and group 
somewhere in the module (I haven't found exactly where yet). My problem is 
that I have to create a directory for data_dir (on a different device) that 
is needed by the class (or instance, not sure), and I need the owner and 
the group to exist in order to set them; otherwise the service won't start. Can I set 
a requirement in my file declaration to make sure that the user and the 
group already exist? Something like

  file { '/data/elasticsearch':
    ensure  => directory,
    owner   => 'elasticsearch',
    group   => 'elasticsearch',
    require => ???
  }


Once again, thanks!
Andrej

On Tuesday, 1 July 2014 at 14:37:55 UTC+2, Richard Pijnenburg wrote:

 Hi Andrej,

 Sorry for the late response. Didn't get an update email about it.

 As long as you don't set up an instance with the 'elasticsearch::instance' 
 define, it will only install the package and do nothing afterwards.
 I recently fixed that the default files from the packages are being 
 removed now.
 The memory can be set via the init_defaults hash by setting the ES_HEAP_SIZE 
 option.

 The issue with 0.90.x versions is that it automatically starts up after 
 package installation.
 Since I don't stop it, it keeps running. It's advised to run a newer 
 version of ES, since 0.90.x will be EOL'd at some point.


 On Thursday, June 26, 2014 2:24:47 PM UTC+1, Andrej Rosenheinrich wrote:

 Hi Richard,

 thanks for your answer, it for sure helped! Still, I am puzzling with a 
 few effects and questions:

 1.) I am a bit confused by your class/instance idea. I can do something 
 pretty simple like class { 'elasticsearch': version => '0.90.7' } and it 
 will install elasticsearch in the correct version using the default 
 settings you defined. Repeating this (I tested every step on a fresh debian 
 instance in a VM, no different puppet installation steps in between) with a 
 config added to the class, like 

  class { 'elasticsearch':
    version => '0.90.7',
    config  => {
      'cluster' => {
        'name' => 'andrejtest'
      },
      'http.port' => '9210'
    }
  }
 I still get elasticsearch installed, but it completely ignores everything 
 in the config. (I should be able to curl localhost:9210, but it's up and 
 running on the old default port, using the old cluster name.) You explained 
 overwriting for instances and classes a bit, so I tried the following thing 
 (again, blank image, no previous installation):

    class { 'elasticsearch':
      version => '0.90.7',
      config  => {
        'cluster' => {
          'name' => 'andrejtest'
        },
        'http.port' => '9210'
      }
    }

    elasticsearch::instance { 'es-01':
    }

 What happened is that I now have two elasticsearch instances running, one 
 with the default values and another one (es-01) that uses the provided 
 configuration. Even stranger: although I install Java 7 in my script before 
 the snippet posted, the first (default-based) elasticsearch instance uses the 
 standard OpenJDK 6 Java, while the second instance (es-01) uses Java 7. 
 So, where is my mistake, or what am I doing wrong? What would be the way 
 to install and start only one service using the provided configuration? And 
 does elasticsearch::instance require an instance name? I would really miss 
 the funny comic node names ;)

 2. As you pointed out I can define all values from elasticsearch.yml in 
 the config hash. But what about memory settings (I usually modify the 
 init.d script for that), can I configure Xms and Xmx settings in the puppet 
 module somehow?

 Logging configuration would be a nice-to-have (no must-have), just in 
 case you were wondering ;)

 I hope my questions don't sound too confusing, if you could give me a 
 hint on what I am doing wrong I would really appreciate it.

 Thanks in advance!
 Andrej


 On Friday, 20 June 2014 at 09:44:49 UTC+2, Richard Pijnenburg wrote:

 Hi Andrej,

 Thank you for using the puppet module :-)

 The 'port' and 'discovery minimum' settings are both configuration 
 settings for the elasticsearch.yml file.
 You can set those in the 'config' option variable, for example:

  elasticsearch::instance { 'instancename':
    config => { 'http.port' => '9210',
                'discovery.zen.minimum_master_nodes' => 3 }
  }


 For the logging part, management of the logging.yml file is very limited 
 at the moment, but I hope to get some feedback on extending that.
 The thresholds for the slowlogs can be set in the same config option 
 variable.
 See 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-slowlog.html#index-slow-log
  
 for more information.

 If you have any further questions, let me know.

 Cheers

 On Thursday, June 19, 2014 9:53:10 AM UTC+1, Andrej Rosenheinrich wrote:

 Hi,

 I am playing around with puppet-elasticsearch 0.4.0, which works well so far 
 (thanks!), but I am missing a few options I haven't seen in the 
 documentation. As I couldn't figure it out immediately by

Re: Kibana settings for IPFIX/Netflow

2014-07-21 Thread Dhanasekaran Anbalagan
Hi Janets,

Currently I am also trying pmacct and processing its results. I am storing the
data in elasticsearch, but I am struggling with dashboard creation. Can you
share your Kibana dashboard file? It would be very useful to me.

-Dhanasekaran.

Did I learn something today? If not, I wasted it.


On Mon, Jul 21, 2014 at 5:19 AM, Janet Sullivan jan...@nairial.net wrote:

  I’m tired and I didn’t explain that well: we use pmacct to do 1-minute
 aggregations.



 *From:* elasticsearch@googlegroups.com [mailto:
 elasticsearch@googlegroups.com] *On Behalf Of *Janet Sullivan
 *Sent:* Monday, July 21, 2014 1:50 AM
 *To:* elasticsearch@googlegroups.com
 *Subject:* Kibana settings for IPFIX/Netflow



 Every minute, we take a 1/4096 sample of traffic using IPFIX.  I want to
 graph this data as bits/sec in a histogram. However, my math and Kibana
 skills are failing me.



 Here is how I think it should be set up, but it’s always too low a value
 for Gbit/s:



 Chart Value: total

 Value Field: bytes (bytes per minute field)

 Scale: 32768 (4096 * 8 bits in a byte)

 Seconds, checked

 Interval 1m

 Y Format bytes



 Help?  Maybe I’m missing the obvious, but it’s 2 a.m. and I’m mystified.
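
 For reference, assuming the bytes field holds the sampled byte count per
 one-minute bucket, the conversion to a real rate works out to:

 ```latex
 \text{bits/sec} = \frac{\text{sampled bytes per minute} \times 4096 \times 8}{60\,\text{s}}
 ```

 Whether the division by 60 is performed by the Seconds option or has to be
 folded into Scale depends on the Kibana version, so it is worth checking
 which one the installed version does.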









Re: Sort Order when relevance is equal

2014-07-21 Thread Mateusz Kaczynski
Seconding this question.

Also, is it possible that this changed between versions 1.0 and 1.2? We're 
trying to upgrade, and in regression testing we noticed a seemingly random 
order of documents with equal relevance.

On Wednesday, 14 May 2014 00:49:55 UTC, Erich Lin wrote:

 Ignoring the bouncing-results problem with multiple shards, is the order 
 of results deterministic when sorting by relevance score or any other 
 field?

 What I mean by this is if two documents have the same score, 

 1) will they always be in the same order if we set the preference 
 parameter to an arbitrary string like the user’s session ID. 
 2) If so, is there a way to predict this deterministic order? Is it done 
 by ID of the documents as a tiebreaker etc? 
 3) If not, could we specify that or do we have to do a secondary sort on 
 ID if we wanted to do that?






Re: Need help with Java API

2014-07-21 Thread joergpra...@gmail.com
To issue 1: you create a single-node cluster without an index, and a client of
it.

To issue 2: you see the UnavailableShardsException caused by a timeout
while indexing to a replica shard. This means you may have set up a single
node cluster, but with replica level 1 (the default), which needs 2 nodes for
indexing. Maybe another node once joined the cluster and ES wants it back,
but it never came (after 60 secs). Then ES returns the timeout error. Maybe
replica level 0 helps. You should also check the cluster health: green means
everything works, yellow means there are too few nodes to satisfy the
replica condition, and red means the cluster is not in a consistent/usable
state.

To issue 3: not sure what clusterName() means. I would use settings and add
a field cluster.name; maybe it is equivalent. You must ensure you use the
same cluster.name setting throughout all nodes and clients. You also cannot
reuse data from clusters that had other names (look into the data
folder).

To issue 4: ES takes ~5 secs for discovery; the zen module pings and waits
for responding master nodes by default. If you just test locally on your
developer machine, you should disable zen, most easily by disabling
networking altogether, using NodeBuilder.nodeBuilder().local(true)...
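
As a concrete sketch of the replica-level-0 suggestion (the index name is
taken from the posted code; this is illustrative, not the only fix), the
replica count can be dropped on a live index with a PUT to /meal5/_settings
carrying this body:

```json
{
  "index": {
    "number_of_replicas": 0
  }
}
```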

Jörg




On Mon, Jul 21, 2014 at 6:53 PM, Alain Désilets alaindesile...@gmail.com
wrote:

 I am trying to get started with the Java API, using the excellent tutorial
 found here:

http://www.slideshare.net/dadoonet/hands-on-lab-elasticsearch

 But I am still having a lot of trouble.

 Below is a sample of code that I have written:

 package ca.nrc.ElasticSearch;

 import org.codehaus.jackson.map.ObjectMapper;
 import org.elasticsearch.action.get.GetResponse;
 import org.elasticsearch.action.index.IndexResponse;
 import org.elasticsearch.client.Client;
 import org.elasticsearch.node.NodeBuilder;

 public class ElasticSearchRunner {

     static ObjectMapper mapper;
     static Client client;
     static String indexName = "meal5";
     static String typeName = "beer";
     static long startTimeMSecs;

     public static void main(String[] args) throws Exception {
         startTimeMSecs = System.currentTimeMillis();
         mapper = new ObjectMapper(); // create once, reuse
         echo("Creating the ElasticSearch client...");
         // Does this create a brand new cluster?
         client = NodeBuilder.nodeBuilder().node().client();
         // Joins an existing cluster called "handson":
         // client = NodeBuilder.nodeBuilder().clusterName("handson").client(true).node().client();
         echo("DONE creating the ElasticSearch client... Elapsed time = "
                 + elapsedSecs() + " secs.");

         echo("Creating a beer object...");
         Beer beer = new Beer("Heineken", Colour.PALE, 0.33, 3);
         String jsonString = mapper.writeValueAsString(beer);
         echo("DONE Creating a beer object...");

         echo("Indexing the beer object...");
         IndexResponse ir = client.prepareIndex(indexName, typeName).setSource(jsonString)
                 .execute().actionGet();
         echo("DONE Indexing the beer object...");

         echo("Retrieving the beer object...");
         GetResponse gr = client.prepareGet(indexName, typeName, ir.getId()).execute()
                 .actionGet();
         echo("DONE Retrieving the beer object...");
     }

     public static float elapsedSecs() {
         // divide by 1000f, not 1000, to avoid integer division truncating to whole seconds
         return (System.currentTimeMillis() - startTimeMSecs) / 1000f;
     }

     public static void echo(String mess) {
         mess = mess + " (Elapsed so far: " + elapsedSecs() + " seconds)";
         System.out.println(mess);
     }
 }


 It works, sort of...

 If I use the first method for creating the client:

 client = NodeBuilder.nodeBuilder().node().client();

 Then it works fine the first time I run it. However:

 *** ISSUE 1: If I try to inspect the meal index with Marvel, I don't find
 it.

 Also,

 *** ISSUE 2: If I run the application a second time, I get the following
 output:

 Creating the ElasticSearch client... (Elapsed so far: 0.0 seconds)
 log4j:WARN No appenders could be found for logger (org.elasticsearch.node).
 log4j:WARN Please initialize the log4j system properly.
 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
 more info.
 DONE creating the ElasticSearch client... Elapsed time = 9.0 secs.
 (Elapsed so far: 9.0 seconds)
 Creating a beer object... (Elapsed so far: 9.0 seconds)
 DONE Creating a beer object... (Elapsed so far: 9.0 seconds)
 Indexing the beer object... (Elapsed so far: 9.0 seconds)
 Exception in thread main
 org.elasticsearch.action.UnavailableShardsException: [meal5][0] [2]
 shardIt, [0] active : Timeout waiting for [1m], request: index
 {[meal5][beer][B3F5ZEmSTruqdnlxhYviFg],
 source[{brand:Heineken,colour:PALE,size:0.33,price:3.0}]}
 at
 org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.raiseTimeoutFailure(TransportShardReplicationOperationAction.java:526)
 at
 org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:516)
 at
 

[ANN] Elasticsearch CouchDB River plugin 2.2.0 released

2014-07-21 Thread Elasticsearch Team

Heya,


We are pleased to announce the release of the Elasticsearch CouchDB River 
plugin, version 2.2.0.

The CouchDB River plugin allows you to hook into the CouchDB _changes feed and 
automatically index it into elasticsearch.

https://github.com/elasticsearch/elasticsearch-river-couchdb/

Release Notes - elasticsearch-river-couchdb - Version 2.2.0


Fix:
 * [67] - Race condition: NPE when starting with no database 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/67)
 * [66] - Race condition: exception while closing the river 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/66)

Update:
 * [64] - Default script engine should be mvel 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/64)
 * [62] - Use `script_type` instead of `scriptType` 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/62)
 * [60] - Default couchdb db name should be river name 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/60)
 * [56] - Update to elasticsearch 1.2.0 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/56)

New:
 * [47] - Move tests to elasticsearch test framework 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/47)
 * [17] - [TEST] Check that you can create a couchdb DB after the river 
creation 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/17)

Doc:
 * [55] - Clarify deleting documents in multi-type example 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/pull/55)
 * [51] - Add documentation and test about parent/child 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/51)
 * [45] - [TEST] Add test and documentation for removing fields using scripts 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/45)


Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-river-couchdb project repository: 
https://github.com/elasticsearch/elasticsearch-river-couchdb/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Synonym filter results in term facet

2014-07-21 Thread ravi063
Hi All,

I have a requirement in which I need to find distinct company names. I was 
using the keyword tokenizer for that field, and through a terms facet I was able 
to get distinct company names. However, the terms facet treated company names 
like "ibm suisse", "ibm corporation", and "ibm" as different companies.
Online documentation suggested using a synonym filter to solve this. My 
settings are:

curl -XPUT 'http://localhost:9200/dataindex/' -d '{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "customAnalyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": [
              "lowercase", "synonym"
            ]
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "tokenizer": "keyword",
            "synonyms_path": "analysis/synonym.txt"
          }
        }
      }
    }
  }
}'

My mapping is:

curl -XPUT 'http://localhost:9200/dataindex/tweet/_mapping' -d '
{
  "tweet": {
    "properties": {
      "company": {
        "type": "string",
        "analyzer": "customAnalyzer"
      }
    }
  }
}'

In the synonym.txt file I have: ibm suisse, ibm corporation, ibm business, 
ibm => ibm corp ltd

Indexed data:
curl -XPUT 'http://localhost:9200/dataindex/tweet/1' -d '{
  "company": "ibm"
}'
curl -XPUT 'http://localhost:9200/dataindex/tweet/2' -d '{
  "company": "ibm corporation"
}'
curl -XPUT 'http://localhost:9200/dataindex/tweet/3' -d '{
  "company": "ibm suisse"
}'
curl -XPUT 'http://localhost:9200/dataindex/tweet/4' -d '{
  "company": "ibm business"
}'

If I run a terms facet:
{
  "facets": {
    "loc_facet": {
      "terms": {
        "field": "company"
      }
    }
  }
}
I get 3 terms, i.e. {"term": "ibm corp ltd", "count": 3}, {"term": "suisse", 
"count": 1}, {"term": "corporation", "count": 1}
I want the facet result to return only one term, "ibm corp ltd", with count=3. 
This way I will get distinct company names and also map synonym names to a 
single company name.
Please correct me if I am using the wrong tokenizer or my approach is not 
correct.



Re: ES Failures and Recovery

2014-07-21 Thread Venu Nayar
Did you do a failure analysis and what were your findings?

Thanks,
Venu

On Tuesday, June 11, 2013 7:41:57 AM UTC-7, Anand Nalya wrote:

 Hi,

 We are using 0.90.1 version ES and are planning for high availability 
 testing. While the entire scheme to enable the cluster to be highly 
 available is clear, I wanted to get some idea about ES Service lifetime in 
 terms of Mean-Time to Failure and Time of Recovery in cases of failure. Any 
 historic evidences will also help, as it will be vital for us to calculate 
 the actual availability of the system across an year.

 While I understand that ES provides seamless high availability through 
 replication, but any failure, will impact the performance to some extent 
 and this calculation will help in deriving the actual number of nodes that 
 we should consider without compromising on the performance as well, while 
 the system is available.

 Any ideas/facts would be very helpful .

 Thanks,
 Anand




Can one do a singular/plural phrase match with the Query String Query?

2014-07-21 Thread Brian Jones
Can one perform the following query using wildcards (instead of two 
distinct phrases) when using a Query String Query?
"photographic film" OR "photographic films"

These do not seem to work, and return the same number of results as just 
"photographic film":
"photographic film?"
"photographic film*"

Can wildcards not be placed inside Exact Phrase queries?  Is there a way to 
mimic this?

My goal is to be able to perform queries like this:
"photo* film?"

... capturing:
photo film
photo films
photographic films
photography films
etc...
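
One way to mimic a phrase containing wildcards, assuming the standard query
DSL is an option alongside query_string, is a span_near of span_multi-wrapped
wildcard queries (a sketch; the body field name is illustrative):

```json
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_multi": { "match": { "wildcard": { "body": "photo*" } } } },
        { "span_multi": { "match": { "wildcard": { "body": "film?" } } } }
      ],
      "slop": 0,
      "in_order": true
    }
  }
}
```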



Re: [ERROR][bootstrap] {1.2.2}: Initialization Failed ... - NullPointerException[null]

2014-07-21 Thread Phillip Ulberg
Same issue here. I upgraded to ES 1.2.2 and JDK 1.7.0_60-b19; debug log 
output:

[2014-07-21 18:50:05,054][INFO ][node ] [p-esmon01] 
version[1.2.2], pid[23697], build[9902f08/2014-07-09T12:02:32Z]
[2014-07-21 18:50:05,054][INFO ][node ] [p-esmon01] 
initializing ...
[2014-07-21 18:50:05,056][DEBUG][node ] [p-esmon01] 
using home [/usr/share/elasticsearch], config [/etc/elasticsearch], data 
[[/etc/elasticsearch/data]], logs [/var/log/elasticsearch], work 
[/etc/elasticsearch/work], plugins [/usr/share/elasticsearch/plugins]
[2014-07-21 18:50:05,064][DEBUG][plugins  ] [p-esmon01] 
lucene property is not set in plugin es-plugin.properties file. Skipping 
test.
[2014-07-21 18:50:05,074][DEBUG][plugins  ] [p-esmon01] 
skipping 
[jar:file:/usr/share/elasticsearch/plugins/marvel/marvel-1.2.1.jar!/es-plugin.properties]
[2014-07-21 18:50:05,075][DEBUG][plugins  ] [p-esmon01] 
lucene property is not set in plugin es-plugin.properties file. Skipping 
test.
[2014-07-21 18:50:05,075][DEBUG][plugins  ] [p-esmon01] 
[/usr/share/elasticsearch/plugins/http-basic-server-plugin/_site] directory 
does not exist.
[2014-07-21 18:50:05,077][DEBUG][plugins  ] [p-esmon01] 
[/usr/share/elasticsearch/plugins/http-basic/_site] directory does not 
exist.
[2014-07-21 18:50:05,078][INFO ][plugins  ] [p-esmon01] 
loaded [marvel, http-basic-server-plugin], sites [marvel, head]
[2014-07-21 18:50:05,104][DEBUG][common.compress.lzf  ] using 
[UnsafeChunkDecoder] decoder
[2014-07-21 18:50:05,113][DEBUG][env  ] [p-esmon01] 
using node location [[/etc/elasticsearch/data/p_es_clust_mon01/nodes/0]], 
local_node_id [0]
[2014-07-21 18:50:05,907][ERROR][bootstrap] {1.2.2}: 
Initialization Failed ...
- ExecutionError[java.lang.NoClassDefFoundError: 
org/elasticsearch/rest/StringRestResponse]
NoClassDefFoundError[org/elasticsearch/rest/StringRestResponse]

ClassNotFoundException[org.elasticsearch.rest.StringRestResponse]
org.elasticsearch.common.util.concurrent.ExecutionError: 
java.lang.NoClassDefFoundError: org/elasticsearch/rest/StringRestResponse
at 
org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
at 
org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3934)
at 
org.elasticsearch.common.cache.LocalCache.getOrLoad(LocalCache.java:3938)
at 
org.elasticsearch.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821)
at 
org.elasticsearch.common.inject.internal.FailableCache.get(FailableCache.java:51)
at 
org.elasticsearch.common.inject.ConstructorInjectorStore.get(ConstructorInjectorStore.java:50)
at 
org.elasticsearch.common.inject.ConstructorBindingImpl.initialize(ConstructorBindingImpl.java:50)
at 
org.elasticsearch.common.inject.InjectorImpl.initializeBinding(InjectorImpl.java:372)
at 
org.elasticsearch.common.inject.BindingProcessor$1$1.run(BindingProcessor.java:148)
at 
org.elasticsearch.common.inject.BindingProcessor.initializeBindings(BindingProcessor.java:204)
at 
org.elasticsearch.common.inject.InjectorBuilder.initializeStatically(InjectorBuilder.java:119)
at 
org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:102)
at 
org.elasticsearch.common.inject.Guice.createInjector(Guice.java:93)
at 
org.elasticsearch.common.inject.Guice.createInjector(Guice.java:70)
at 
org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:59)
at 
org.elasticsearch.node.internal.InternalNode.init(InternalNode.java:188)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:70)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:203)
at 
org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
Caused by: java.lang.NoClassDefFoundError: 
org/elasticsearch/rest/StringRestResponse
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2532)
at java.lang.Class.getDeclaredConstructors(Class.java:1901)
at 
org.elasticsearch.common.inject.spi.InjectionPoint.forConstructorOf(InjectionPoint.java:177)
at 
org.elasticsearch.common.inject.ConstructorInjectorStore.createConstructor(ConstructorInjectorStore.java:59)
at 
org.elasticsearch.common.inject.ConstructorInjectorStore.access$000(ConstructorInjectorStore.java:29)
at 
org.elasticsearch.common.inject.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:37)
at 
org.elasticsearch.common.inject.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:33)
at 

Setting from Json File

2014-07-21 Thread M_20
Hi all,

I am trying to read settings from a JSON file, something like:

ImmutableSettings.settingsBuilder().loadFromSource( path );

But it seems loadFromSource doesn't work for this purpose.
Could you please point me to the right method?

Thanks



Re: Synonym filter results in term facet

2014-07-21 Thread vineeth mohan
Hello Ravi ,

Your approach is wrong.
When you use a synonym filter, it indexes all synonyms of that token, so
any synonym will match against that term.
So when you run a facet, you will get an aggregation over all synonyms rather
than just one.

A better approach would be to store the canonical name in a separate field and
facet on that field.
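
A sketch of that idea (the company_canonical field name is illustrative):
map a second, un-analyzed field, have the application write the canonical
company name into it at index time, and run the terms facet against it:

```json
{
  "tweet": {
    "properties": {
      "company": { "type": "string", "analyzer": "customAnalyzer" },
      "company_canonical": { "type": "string", "index": "not_analyzed" }
    }
  }
}
```

A facet on company_canonical then yields one term per company.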

Thanks
   Vineeth


On Mon, Jul 21, 2014 at 11:21 PM, ravi...@gmail.com wrote:

 Hi All,

 I have a requirement in which I need to find distinct company names. I was
 using Keyword tokenizer for that field and through term facet I was able
 to get distinct company names. However terms facet treated company names
 like ibm suisse, ibm corporation, ibm as different companies.
 Online documentation suggested me to use Synonym filter to solve this.
 My settings is:

 curl -XPUT 'http://localhost:9200/dataindex/' -d '{
   settings: {
 index: {
   analysis: {
 analyzer: {
  customAnalyzer: {
 type: custom,
 tokenizer: whitespace,
 filter: [
   lowercase,synonym
 ]
   }
 },
 filter: {
   synonym : {
   type : synonym,
   tokenizer: keyword,
   synonyms_path : analysis/synonym.txt
   }
 }
   }
 }
   }
 }'

 My mapping is:

 curl -XPUT 'http://localhost:9200/dataindex/tweet/_mapping' -d '
 {
 tweet : {
 properties : {
 company: {
  type: string,
  analyzer: customAnalyzer
 }
 }
 }
 }'

 In the synonym.txt file I have: ibm suisse, ibm corporation, ibm
 business, ibm => ibm corp ltd

 Indexed data:
 curl -XPUT 'http://localhost:9200/dataindex/tweet/1' -d '{
 company : ibm
 }'
 curl -XPUT 'http://localhost:9200/dataindex/tweet/2' -d '{
 company : ibm corporation
 }'
 curl -XPUT 'http://localhost:9200/dataindex/tweet/3' -d '{
 company : ibm suisse
 }'
 curl -XPUT 'http://localhost:9200/dataindex/tweet/4' -d '{
 company : ibm business
 }'

 If I run a terms facet:
 {
   facets: {
 loc_facet: {
   terms: {
 field: company
   }
 }
   }
 }
 I get 3 terms ie {term: ibm corp ltd, count: 3} {term: suisse, count: 1}
 {term: corporation, count: 1}
 I want the facet result to return only one term: ibm corp ltd with
 count=3. This way i will get distinct company names and also map synonym
 names into single company name.
 Please correct me if I am using wrong tokenizer or my approach is not
 correct.





Re: Can one do a singular/plural phrase match with the Query String Query?

2014-07-21 Thread David Pilato
I think a stemmer analyzer would fit your use case; see 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/choosing-a-stemmer.html#choosing-a-stemmer

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 21 juillet 2014 à 20:53:09, Brian Jones (tbrianjo...@gmail.com) a écrit:

Can one perform the following query using wildcards (instead of two distinct 
phrases) when using a Query String Query?
"photographic film" OR "photographic films"

These do not seem to work, and return the same number of results as just 
"photographic film":
"photographic film?"
"photographic film*"

Can wildcards not be placed inside Exact Phrase queries?  Is there a way to 
mimic this?

My goal is to be able to perform queries like this:
"photo* film?"

... capturing:
photo film
photo films
photographic films
photography films
etc...
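
One way people approximate wildcards inside a phrase is a span_near query 
wrapping span_multi clauses; query_string phrases themselves do not expand 
wildcards. A sketch (the index name myindex and field name text are 
placeholders):

```json
curl -XPOST 'http://localhost:9200/myindex/_search' -d '
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_multi": { "match": { "wildcard": { "text": "photo*" } } } },
        { "span_multi": { "match": { "wildcard": { "text": "film?" } } } }
      ],
      "slop": 0,
      "in_order": true
    }
  }
}'
```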



Re: [ANN] Log4j2 Elasticsearch appender

2014-07-21 Thread joergpra...@gmail.com
The Log4j appender batches events because it uses bulk indexing. If you are
worried, you can set the bulk action size to an extremely high value and
increase the heap by a few GB just for buffering log messages, together with
a reasonable flush interval.

Jörg


On Mon, Jul 21, 2014 at 9:14 PM, Ivan Brusic i...@brusic.com wrote:

 I was indexing events into Elasticsearch via the standard SocketAppender
 into Logstash, but I stopped doing so since the SocketAppender was not
 releasing threads. Great to see a direct approach, but I like to use
 Logstash in the middle as a buffer in order to batch events.

 --
 Ivan


 On Sat, Jul 19, 2014 at 8:59 AM, Alfredo Serafini ser...@gmail.com
 wrote:

 I'll try it as soon as I can!
 thanks,
 Alfredo
 :-)

 Il giorno venerdì 18 luglio 2014 10:08:14 UTC+2, Jörg Prante ha scritto:

 Hi,

 I released a Log4j2 Elasticsearch appender

 https://github.com/jprante/log4j2-elasticsearch

 in the hope it is useful.

 Best,

 Jörg







Re: [ERROR][bootstrap] {1.2.2}: Initialization Failed ... - NullPointerException[null]

2014-07-21 Thread joergpra...@gmail.com
You have to update your Marvel and http-basic-server-plugin plugins to get
them working with ES 1.2.2.

Jörg


On Mon, Jul 21, 2014 at 9:02 PM, Phillip Ulberg phillip.ulb...@gmail.com
wrote:

 Same issue here, I upgraded to ES 1.2.2, and jdk 1.7.0_60-b19, debug log
 output -

 [2014-07-21 18:50:05,054][INFO ][node ] [p-esmon01]
 version[1.2.2], pid[23697], build[9902f08/2014-07-09T12:02:32Z]
 [2014-07-21 18:50:05,054][INFO ][node ] [p-esmon01]
 initializing ...
 [2014-07-21 18:50:05,056][DEBUG][node ] [p-esmon01]
 using home [/usr/share/elasticsearch], config [/etc/elasticsearch], data
 [[/etc/elasticsearch/data]], logs [/var/log/elasticsearch], work
 [/etc/elasticsearch/work], plugins [/usr/share/elasticsearch/plugins]
 [2014-07-21 18:50:05,064][DEBUG][plugins  ] [p-esmon01]
 lucene property is not set in plugin es-plugin.properties file. Skipping
 test.
 [2014-07-21 18:50:05,074][DEBUG][plugins  ] [p-esmon01]
 skipping
 [jar:file:/usr/share/elasticsearch/plugins/marvel/marvel-1.2.1.jar!/es-plugin.properties]
 [2014-07-21 18:50:05,075][DEBUG][plugins  ] [p-esmon01]
 lucene property is not set in plugin es-plugin.properties file. Skipping
 test.
 [2014-07-21 18:50:05,075][DEBUG][plugins  ] [p-esmon01]
 [/usr/share/elasticsearch/plugins/http-basic-server-plugin/_site] directory
 does not exist.
 [2014-07-21 18:50:05,077][DEBUG][plugins  ] [p-esmon01]
 [/usr/share/elasticsearch/plugins/http-basic/_site] directory does not
 exist.
 [2014-07-21 18:50:05,078][INFO ][plugins  ] [p-esmon01]
 loaded [marvel, http-basic-server-plugin], sites [marvel, head]
 [2014-07-21 18:50:05,104][DEBUG][common.compress.lzf  ] using
 [UnsafeChunkDecoder] decoder
 [2014-07-21 18:50:05,113][DEBUG][env  ] [p-esmon01]
 using node location [[/etc/elasticsearch/data/p_es_clust_mon01/nodes/0]],
 local_node_id [0]
 [2014-07-21 18:50:05,907][ERROR][bootstrap] {1.2.2}:
 Initialization Failed ...
 - ExecutionError[java.lang.NoClassDefFoundError:
 org/elasticsearch/rest/StringRestResponse]
 NoClassDefFoundError[org/elasticsearch/rest/StringRestResponse]

 ClassNotFoundException[org.elasticsearch.rest.StringRestResponse]
 org.elasticsearch.common.util.concurrent.ExecutionError:
 java.lang.NoClassDefFoundError: org/elasticsearch/rest/StringRestResponse
 at
 org.elasticsearch.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
 at
 org.elasticsearch.common.cache.LocalCache.get(LocalCache.java:3934)
 at
 org.elasticsearch.common.cache.LocalCache.getOrLoad(LocalCache.java:3938)
 at
 org.elasticsearch.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821)
 at
 org.elasticsearch.common.inject.internal.FailableCache.get(FailableCache.java:51)
 at
 org.elasticsearch.common.inject.ConstructorInjectorStore.get(ConstructorInjectorStore.java:50)
 at
 org.elasticsearch.common.inject.ConstructorBindingImpl.initialize(ConstructorBindingImpl.java:50)
 at
 org.elasticsearch.common.inject.InjectorImpl.initializeBinding(InjectorImpl.java:372)
 at
 org.elasticsearch.common.inject.BindingProcessor$1$1.run(BindingProcessor.java:148)
 at
 org.elasticsearch.common.inject.BindingProcessor.initializeBindings(BindingProcessor.java:204)
 at
 org.elasticsearch.common.inject.InjectorBuilder.initializeStatically(InjectorBuilder.java:119)
 at
 org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:102)
 at
 org.elasticsearch.common.inject.Guice.createInjector(Guice.java:93)
 at
 org.elasticsearch.common.inject.Guice.createInjector(Guice.java:70)
 at
 org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:59)
 at
 org.elasticsearch.node.internal.InternalNode.init(InternalNode.java:188)
 at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
 at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:70)
 at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:203)
 at
 org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
 Caused by: java.lang.NoClassDefFoundError:
 org/elasticsearch/rest/StringRestResponse
 at java.lang.Class.getDeclaredConstructors0(Native Method)
 at java.lang.Class.privateGetDeclaredConstructors(Class.java:2532)
 at java.lang.Class.getDeclaredConstructors(Class.java:1901)
 at
 org.elasticsearch.common.inject.spi.InjectionPoint.forConstructorOf(InjectionPoint.java:177)
 at
 org.elasticsearch.common.inject.ConstructorInjectorStore.createConstructor(ConstructorInjectorStore.java:59)
 at
 org.elasticsearch.common.inject.ConstructorInjectorStore.access$000(ConstructorInjectorStore.java:29)
 at
 

Re: [ERROR][bootstrap] {1.2.2}: Initialization Failed ... - NullPointerException[null]

2014-07-21 Thread Phillip Ulberg
Turned out to be an issue with the http-basic auth plugin I was using.

On Monday, July 21, 2014 2:02:02 PM UTC-5, Phillip Ulberg wrote:

 Same issue here, I upgraded to ES 1.2.2, and jdk 1.7.0_60-b19, debug log 
 output - 


Getting _id field in elasticsearch to map to a field in HIVE

2014-07-21 Thread hiveesuser



Hi,


I am working on a project to integrate Hive and Elasticsearch, and for one 
of our use cases we are loading data from Elasticsearch into Hive.


During this process I want to store the _id field of the Elasticsearch 
document in Hive. I am able to get the fields that are part of _source, 
like message and @timestamp, but I am not able to get the _id associated 
with that particular document. 


The following is the sample table I am trying to create 

create external table eshivetable (id string,eventdate timestamp, host 
string, username string, message string) STORED BY 
'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES('es.resource' = 'logstash-*/syslog', 'es.mapping.names' = 
'id:_id,eventdate:@timestamp,host:host,username:username,message:message','es.nodes'='10.10.10.50','es.port'='9200','es.query'='?q=type:syslog');
 
So when I select id it returns a null value...
 
Can someone help me with this, please? 
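
A note for later readers: es-hadoop does not expose _id through
es.mapping.names, but newer es-hadoop releases have an es.read.metadata
option that surfaces document metadata as an extra map column. A hedged
sketch under that assumption (the column name meta is arbitrary; check that
your es-hadoop version supports these properties):

```sql
create external table eshivetable (eventdate timestamp, host string,
  username string, message string, meta map<string,string>)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource' = 'logstash-*/syslog',
  'es.read.metadata' = 'true',
  'es.read.metadata.field' = 'meta',
  'es.mapping.names' = 'eventdate:@timestamp,host:host,username:username,message:message',
  'es.nodes' = '10.10.10.50', 'es.port' = '9200', 'es.query' = '?q=type:syslog');

-- the _id would then be available as:
-- select meta['_id'], message from eshivetable;
```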
 

 



Elasticsearch recovering process took a long time.

2014-07-21 Thread anivlis
Hi,
We are using Elasticsearch 0.90.1. We had some problems with the network; 
when it goes down, convergence takes around 40s, and recovering 
Elasticsearch after such an event took a long time. We have 16 
data nodes and 16 shards with 2 replicas, and these settings:

discovery.zen.minimum_master_nodes: 4
gateway.recover_after_nodes: 4
gateway.expected_nodes: 6

Any idea about how to minimize the recovery time?
Is it a good idea to update the version?
What will happen if we increase the discovery.zen.fd.ping_interval and 
ping_timeout settings? Assuming the network is completely off, is there any 
way to wait at least those 40 seconds before starting to mark servers down? 
Does ping_timeout ignore a failure to connect to a node?


Thanks.



Re: Elasticsearch recovering process took a long time.

2014-07-21 Thread anivlis
Sorry, the setting discovery.zen.minimum_master_nodes is 8




El lunes, 21 de julio de 2014 17:20:03 UTC-3, anivlis escribió:

 Hi,
 We are using elasticsearch 0.90.1. We had some problems with the network, 
 when the network is down (the convergence time is around 40s). Recovering 
 the elastic after this event took a long time to be available. We have 16 
 data nodes and 16 shards with 2 replica and the settings:

 discovery.zen.minimum_master_nodes: 4
 gateway.recover_after_nodes: 4
 gateway.expected_nodes: 6

 Any idea about how to minimize the recovery time?
 Is it a good idea to update the version?
 What will happen if we increase discovery.zen.fd.ping_interval and 
 ping_timeout settings? Assuming the network is completely off, is there any 
 way to wait for at least those 40 seconds to start marking servers down? 
 Does the ping_timeout ignore the failure to connect to a node?


 Thanks.




Re: Elasticsearch recovering process took a long time.

2014-07-21 Thread Ivan Brusic
Recovery is throttled since version 0.90.1

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/index-modules-store.html#store-throttling

Increase indices.store.throttle.max_bytes_per_sec to a level that is
suitable for your environment. Since IO should be the main bottleneck, the
setting could vary greatly depending on SSD, platter disk or shared storage.
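
The setting is dynamic, so it can be changed on a live cluster; for example 
(the 100mb value is only illustrative):

```json
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
{
  "transient": {
    "indices.store.throttle.max_bytes_per_sec": "100mb"
  }
}'
```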

-- 
Ivan


On Mon, Jul 21, 2014 at 1:28 PM, anivlis svluc...@gmail.com wrote:

 Sorry, the setting discovery.zen.minimum_master_nodes is 8




 El lunes, 21 de julio de 2014 17:20:03 UTC-3, anivlis escribió:

 Hi,
 We are using elasticsearch 0.90.1. We had some problems with the network,
 when the network is down (the convergence time is around 40s). Recovering
 the elastic after this event took a long time to be available. We have 16
 data nodes and 16 shards with 2 replica and the settings:

 discovery.zen.minimum_master_nodes: 4
 gateway.recover_after_nodes: 4
 gateway.expected_nodes: 6

 Any idea about how to minimize the recovery time?
 Is it a good idea to update the version?
 What will happen if we increase discovery.zen.fd.ping_interval and
 ping_timeout settings? Assuming the network is completely off, is there any
 way to wait for at least those 40 seconds to start marking servers down?
 Does the ping_timeout ignore the failure to connect to a node?


 Thanks.





multiple indices per document

2014-07-21 Thread Prakash


I want to use ES to index logs coming from different processes. Assume I 
have 2 sources: ProcessA and ProcessB. Logs from the processes are formatted 
in JSON. Example log:

{"level":"DEBUG","logger":"REPOSITORY","timestamp":1405982400689,"attrs":{"profile":"ManagementServerA","organization":"FOOBAR"},"thread":"main","message":"Repository.store() : Stored successfully in /central/zone/cef9cccab964"}

How can I get ES to update multiple indexes when it sees a new document? 
In this case I want indices on the profile and organization values. Do I 
have to:

   1. Create indexes using the ES REST API before ES sees any logs.
   2. Supply an _index field to each JSON document.
   3. Have multiple values in the _index field to indicate which indexes 
      must be updated? i.e. should I have: _index: ["ManagementServerA", "FOOBAR"]

Please let me know if this is the correct way to do this.
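
Note that a document can live in only one index, so _index cannot hold 
multiple values; to make the same log line searchable under both a profile 
index and an organization index you would index it twice, e.g. with the bulk 
API (index and type names here are illustrative):

```json
curl -XPOST 'http://localhost:9200/_bulk' -d '
{ "index": { "_index": "managementservera", "_type": "logs" } }
{ "level": "DEBUG", "logger": "REPOSITORY", "message": "Stored successfully" }
{ "index": { "_index": "foobar", "_type": "logs" } }
{ "level": "DEBUG", "logger": "REPOSITORY", "message": "Stored successfully" }
'
```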



explanation of each fields of _stats api data

2014-07-21 Thread Jinyuan Zhou
Hi there,
Where can I find documentation explaining the JSON data returned by _stats 
for an index? I mean the meaning of each field, not just a high-level 
description. Specifically, for the following data:

"primaries": {
  "docs": { "count": 1789457, "deleted": 0 },
  "store": { "size_in_bytes": 2085533463, "throttle_time_in_millis": 582538 },
  "indexing": { "index_total": 297925, "index_time_in_millis": 113345,
    "index_current": 0, "delete_total": 0, "delete_time_in_millis": 0,
    "delete_current": 0 },

I can't find the meaning of the index_total and index_current fields. For 
this index the document count is 1789457, which is about 6 times the 
index_total value. 

Thanks,
Jack



Re: Handling node failure in ES cluster

2014-07-21 Thread Mark Walkom
Max and min memory should be the same; mlockall is probably not working
because they differ, as it can't lock a sliding window.
Try setting them equal and see if that helps.

Also you didn't mention your java version and release, which would be
helpful.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 July 2014 02:38, kmoore.cce kmoore@gmail.com wrote:

 I have had some issues recently as I've expanded my ES cluster, where a
 single node failure causes basically all other index/search operations to
 timeout and fail.

 I am currently running elasticsearch v1.2.1 and primarily interface with
 the indices using the elasticsearch python module.

 My cluster is 20 nodes, each an m1.large ec2 instance. I currently have
 ~18 indices each with 5 shards and 3 replicas. The average size of each
 index is ~20GB and ~10 million documents (low is ~100K documents (300mb),
 high ~40 million (35gb)).
 I run each node with ES_MAX_SIZE=4g and ES_MIN_SIZE=512m. There are no
 other services running on the elasticsearch nodes, except ssh. I use zen
 unicast discovery with a set list of nodes. I have tried to enable
 'bootstrap.mlockall', but the ulimit settings do not seem to be working and
 I keep getting 'Unable to lock JVM memory (ENOMEM)' errors when starting a
 node (note: I didn't see this log message when running 0.90.7).

 I have a fairly constant series of new or updated documents (I don't
 actually update, but rather reindex when a new document with the same id is
 found) that are being ingested all the time, and a number of users who are
 querying the data on a regular basis - most queries are set queries through
 the python API.

 The issue I have now is that while data is being ingested/indexed, I will
 hit Java heap out of memory errors. I think this is related to garbage
 collection as that seems to be the last activity in the logs nearly
 everytime this occurs. I have tried adjusting the heap max to 6g, and that
 seems to help but I am not sure it solves the issue. In conjunction with
 that, when the out of memory error occurs it seems to cause the other nodes
 to stop working effectively, timeout errors in both indexing and searching.

 My question is: what is the best way to support a node failing for this
 reason? I would obviously like to solve the underlying problem as well, but
 I would also like to be able to support a node crashing for some reason
 (whether it be because of me or because ec2 took it away). Shouldn't the
 failover in replicas support the missing node? I understand the cluster
 state would be yellow at this time, but I should be able to index and
 search data on the remaining nodes, correct?

 Are there configuration changes I can make to better support the cluster
 and identify or solve the underyling issue?

 Any help is appreciated. I understand I have a lot to learn about
 Elasticsearch, but I am hoping I can add some stability/resiliency to my
 cluster.

 Thanks in advance,
 -Kevin





How to remove a cluster setting?

2014-07-21 Thread Jeffrey Zhou
I made the following setting on my Elasticsearch cluster in order to 
decommission some old nodes. After removing those old nodes, I now need to 
re-enable the cluster to allocate shards on the '10.0.6.*' nodes. Does 
anyone know how to remove this setting? 

PUT /_cluster/settings 
{ 
   "transient": { 
      "cluster.routing.allocation.exclude._ip": "10.0.6.*" 
   } 
} 
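
One commonly used way to clear such a transient setting on this version is 
to overwrite it with an empty value (transient settings are also lost on a 
full cluster restart):

```json
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": ""
  }
}'
```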

Thanks in advance for any help! 



Sorting a nested array

2014-07-21 Thread Darin Amos
Hi Everyone,

I am looking for a little guidance on how to setup my indexes to support 
sorting a nested array. To give a simple example, we have an index of 
products in our index, below is a dummy example document:

/products/product/partnumber

{
 "name": "Toaster",
 "desc": "For making toast",
 "price": 29.99,
 "reviews": [
  { "rating": 5, "comment": "works great!" },
  { "rating": 1, "comment": "Ugh!!" },
  { "rating": 3, "comment": "Meh" }
 ]
}


I would like to be able to query for product documents, but be able to sort 
the reviews array based on the rating. A customer could ask to see the 
first 10 best reviews, or the worst 5 reviews for example. Is this 
possible? Is this possible if the reviews are in a separate index and I 
execute a join query?

Any information or guidance would be extremely helpful!

Thanks!

Darin



Re: Sorting a nested array

2014-07-21 Thread vineeth mohan
Hello Darin,

This is the wrong approach for this.
You can only sort per document, not per nested document, and you receive
output as documents, not as nested documents.
So I would rather go for the parent/child approach.

That being said, what you asked is actually possible with a little tweaking
on the client side:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_using_function_score


A function_score query can have a score_mode of min or max.
This means that if we use the rating field as the score value with that
score_mode, it will take the highest value per doc as the score.
With this, you can get the top 10 documents bearing the top 10 ratings.
Once you receive them you need to take out all the ratings and sort again
to find the top 10 individual ratings, since a single document might
contain the top 2 good reviews.

With this I believe you might need to turn include_in_parent on.
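
It is also worth noting that Elasticsearch can sort parent documents
directly by a nested field; a sketch (untested, field names from the
example document):

```json
{
  "query": { "match_all": {} },
  "sort": [
    {
      "reviews.rating": {
        "order": "desc",
        "mode": "max",
        "nested_path": "reviews"
      }
    }
  ]
}
```

This sorts products by their best review but still returns whole documents;
picking the top N individual reviews would still need client-side work.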

Thanks
   Vineeth


On Tue, Jul 22, 2014 at 8:15 AM, Darin Amos darinamos.it@gmail.com
wrote:

 Hi Everyone,

 I am looking for a little guidance on how to setup my indexes to support
 sorting a nested array. To give a simple example, we have an index of
 products in our index, below is a dummy example document:

 /products/product/partnumber

  {
   "name": "Toaster",
   "desc": "For making toast",
   "price": 29.99,
   "reviews": [
    { "rating": 5, "comment": "works great!" },
    { "rating": 1, "comment": "Ugh!!" },
    { "rating": 3, "comment": "Meh" }
   ]
  }


 I would like to be able to query for product documents, but be able to
 sort the reviews array based on the rating. A customer could ask to see the
 first 10 best reviews, or the worst 5 reviews for example. Is this
 possible? Is this possible if the reviews are in a separate index and I
 execute a join query?

 Any information or guidance would be extremely helpful!

 Thanks!

 Darin





Re: Common website analytics aggregation formulas

2014-07-21 Thread Anki Reddy
Unique visitors 

http://www.elasticsearch.org/blog/count-elasticsearch/ 
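The post above covers unique counts with the cardinality aggregation. A minimal sketch (the events index and visitor_id field are assumptions about your tracking schema):

```json
POST /events/_search
{
  "size": 0,
  "aggs": {
    "unique_visitors": {
      "cardinality": { "field": "visitor_id" }
    }
  }
}
```

Note that cardinality is approximate (HyperLogLog++); accuracy can be tuned with precision_threshold at the cost of memory.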



On Saturday, May 3, 2014 5:17:13 AM UTC+5:30, Demetrius Nunes wrote:

 Hi guys,

 At my company we're building a platform product which has a big analytics 
 component to it.

 I am intending to use elasticsearch to power that part of the platform.

 Most of the analytics examples that I see using elasticsearch aggregations 
 are around systems logs & monitoring.

 There are quite a few metrics that I have to provide reporting on that are very 
 typical of website analytics, such as time spent on site, bounce rate, 
 active users, etc.

 I've already implemented all the tracking code within the system and I 
 have indexes with timestamps, user-generated events such as page hits, 
 clicks, and so on.

 So, are there any good references, best practices, plugins or even 
 formulas on how to implement these kinds of website analytics metrics using 
 elasticsearch?

 Thanks a lot,
 Demetrius





Re: Handling node failure in ES cluster

2014-07-21 Thread Otis Gospodnetic
Lots of things could be the source of problems here.  Maybe you can tune 
the JVM params.  We don't know what you are using or what your GC activity 
looks like.  Can you share GC metrics graphs?  If you don't have any GC 
monitoring, you can use SPM http://sematext.com/spm/.  Why do you have 5 
shards for all indices?  Some seem small and shouldn't need to be sharded 
so much.  Why do you have 3 replicas and not, say, just 2? (we don't know 
your query rates).
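One common tweak, in case it applies here: set the min and max heap to the same value with ES_HEAP_SIZE instead of separate MIN/MAX settings, so the heap never has to grow under load. A sketch, assuming the Debian/RPM packaging (the file path and variable names vary by distribution):

```shell
# /etc/default/elasticsearch (path varies by distribution)
ES_HEAP_SIZE=4g              # sets -Xms and -Xmx to the same value
MAX_LOCKED_MEMORY=unlimited  # lets bootstrap.mlockall actually lock the heap
```

The MAX_LOCKED_MEMORY line should also clear up the 'Unable to lock JVM memory (ENOMEM)' errors, since mlockall needs an unlimited locked-memory ulimit for the service user.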

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr  Elasticsearch Support * http://sematext.com/


On Monday, July 21, 2014 12:38:49 PM UTC-4, kmoore.cce wrote:

 I have had some issues recently as I've expanded my ES cluster, where a 
 single node failure causes basically all other index/search operations to 
 time out and fail.

 I am currently running elasticsearch v1.2.1 and primarily interface with 
 the indices using the elasticsearch python module.

 My cluster is 20 nodes, each an m1.large ec2 instance. I currently have 
 ~18 indices each with 5 shards and 3 replicas. The average size of each 
 index is ~20GB and ~10 million documents (low is ~100K documents (300 MB), 
 high ~40 million (35 GB)).
 I run each node with ES_MAX_SIZE=4g and ES_MIN_SIZE=512m. There are no 
 other services running on the elasticsearch nodes, except ssh. I use zen 
 unicast discovery with a set list of nodes. I have tried to enable 
 'bootstrap.mlockall', but the ulimit settings do not seem to be working and 
 I keep getting 'Unable to lock JVM memory (ENOMEM)' errors when starting a 
 node (note: I didn't see this log message when running 0.90.7).

 I have a fairly constant series of new or updated documents (I don't 
 actually update, but rather reindex when a new document with the same id is 
 found) that are being ingested all the time, and a number of users who are 
 querying the data on a regular basis - most queries are set queries through 
 the python API.

 The issue I have now is that while data is being ingested/indexed, I will 
 hit Java heap out of memory errors. I think this is related to garbage 
 collection as that seems to be the last activity in the logs nearly 
 every time this occurs. I have tried adjusting the heap max to 6g, and that 
 seems to help but I am not sure it solves the issue. In conjunction with 
 that, when the out of memory error occurs it seems to cause the other nodes 
 to stop working effectively, timeout errors in both indexing and searching.

 My question is: what is the best way to support a node failing for this 
 reason? I would obviously like to solve the underlying problem as well, but 
 I would also like to be able to support a node crashing for some reason 
 (whether it be because of me or because ec2 took it away). Shouldn't the 
 failover in replicas support the missing node? I understand the cluster 
 state would be yellow at this time, but I should be able to index and 
 search data on the remaining nodes, correct?

 Are there configuration changes I can make to better support the cluster 
 and identify or solve the underlying issue? 

 Any help is appreciated. I understand I have a lot to learn about 
 Elasticsearch, but I am hoping I can add some stability/resiliency to my 
 cluster.

 Thanks in advance,
 -Kevin




curl indices.memory.index_buffer_size ??

2014-07-21 Thread Chris Berry
Hello,

I am trying to set indices.memory.index_buffer_size to 30% using curl 
against my running cluster, and I am not able to make it stick.
I am doing this:

$ curl -XPUT http://foo:9200/_cluster/settings -d '{ "persistent" : { 
"indices.memory.index_buffer_size" : "30%" }}'
{"acknowledged":true,"persistent":{},"transient":{}}

But when I check settings it is not there?

Any idea what I am doing wrong??
It is probably something obvious. But I don't see it...

Thanks,
-- Chris 

$ curl -XGET http://foo:9200/_cluster/settings?pretty=true
{
  "persistent" : {
    "threadpool" : {
      "index" : {
        "type" : "cached"
      }
    }
  },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "all"
        }
      }
    }
  }
}




Re: How to remove a cluster setting?

2014-07-21 Thread David Pilato
Try:

PUT /_cluster/settings 
{ 
   "transient": { 
      "cluster.routing.allocation.exclude._ip": "" 
   } 
} 

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 22 Jul 2014, at 02:50, Jeffrey Zhou jeffreyzhou2...@gmail.com wrote:

I made the following setting on my Elasticsearch cluster in order to 
decommission some old nodes in the cluster. After removing these old nodes, I 
now need to re-enable the cluster to allocate shards on those '10.0.6.*' nodes. 
Does anyone know how to remove this setting? 

PUT /_cluster/settings 
{ 
   "transient": { 
      "cluster.routing.allocation.exclude._ip": "10.0.6.*" 
   } 
} 

Thanks in advance for any help!
