Re: increase query performance by adding more machines, shouldn't it be linear to # of machines?

2014-07-02 Thread Seungjin Lee
Yes, they are all the same machines, on which only ES with the same configuration is running.


2014-07-02 14:55 GMT+09:00 David Pilato da...@pilato.fr:

 Are you using same physical machine for all your VMs?

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 2 Jul 2014 at 07:09, Seungjin Lee sweetest0...@gmail.com wrote:

 Hi all,

 I'm testing percolator performance; the requirement is 50k/s with 3~4k
 rules.

 Right now I only have 1 simple rule, and 5 ES VMs with 1 shard and 4 replicas.

 and I am using the Java transport client like below:

 new TransportClient(settings)
 .addTransportAddresses(transportAddressList.toArray(new
 InetSocketTransportAddress[transportAddressList.size()]));


 When I add just 1 address to the transport client, percolator performance is
 about 10k/s,

 and when I add all 5 VM addresses, it is about 15k/s.

 So it increases only about 1.5x even though I added 4 more VM
 addresses.

 Is it supposed to be like this?

 What I was expecting is that if requests are distributed in, for example, round-robin
 fashion across the nodes, there should be about a 5x performance gain.

 Could you comment anything on this?
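For reference, a minimal sketch of the 1.x percolator calls being benchmarked here (index, type and field names are hypothetical); each percolate request executes against one copy of the single shard, so throughput depends on how calls are spread over the nodes:

curl -XPUT 'localhost:9200/myindex/.percolator/rule-1' -d '{
    "query" : { "match" : { "message" : "alert" } }
}'

curl -XGET 'localhost:9200/myindex/mytype/_percolate' -d '{
    "doc" : { "message" : "this document should match rule-1" }
}'

Note that the transport client round-robins only over the addresses it was given, so with one address every request is coordinated by a single node; the client.transport.sniff setting is one way to let the client discover the remaining cluster nodes by itself.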







Re: increase query performance by adding more machines, shouldn't it be linear to # of machines?

2014-07-02 Thread David Pilato
Sorry. I meant: how many physical bare-metal machines are your 5 VMs running on?


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 2 Jul 2014 at 07:59, Seungjin Lee sweetest0...@gmail.com wrote:

Yes, they are all the same machines, on which only ES with the same configuration is running.


2014-07-02 14:55 GMT+09:00 David Pilato da...@pilato.fr:
 Are you using same physical machine for all your VMs?
 
 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
 
 
 On 2 Jul 2014 at 07:09, Seungjin Lee sweetest0...@gmail.com wrote:
 
 Hi all,
 
  I'm testing percolator performance; the requirement is 50k/s with 3~4k
  rules.
  
  Right now I only have 1 simple rule, and 5 ES VMs with 1 shard and 4 replicas.
  
  and I am using the Java transport client like below:
  
  new TransportClient(settings)
  .addTransportAddresses(transportAddressList.toArray(new 
  InetSocketTransportAddress[transportAddressList.size()]));
  
  
  When I add just 1 address to the transport client, percolator performance is
  about 10k/s,
  
  and when I add all 5 VM addresses, it is about 15k/s.
  
  So it increases only about 1.5x even though I added 4 more VM
  addresses.
  
  Is it supposed to be like this?
  
  What I was expecting is that if requests are distributed in, for example, round-robin
  fashion across the nodes, there should be about a 5x performance gain.
  
  Could you comment anything on this?
 
 
 



Re: Dealing with spam in this forum

2014-07-02 Thread Paul Brown
Hi, Clinton --

May I suggest:

- Some users (e.g., me) who read this list via an email subscription regard
ANY spam on the list as an unacceptable state of affairs.  This is not a
problem with Apache lists, for example, so I would point the finger of
blame at Google Groups.

- Having N longstanding members who are willing to help ban spammers is
equivalent to having N longstanding members who are willing to quickly
admit new users.  (And you're welcome to add me as N+1.)

- Banning is ineffective.  Spammers will continuously sign up with new
accounts.

-- Paul


—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/


On Tue, Jul 1, 2014 at 11:36 AM, Clinton Gormley 
clinton.gorm...@elasticsearch.com wrote:

 Hi all

 Recently we've had a few spam emails that have made it through Google's
 filters, and there have been calls for us to change to a
 moderate-first-post policy. I am reluctant to adopt this policy for the
 following reasons:

 We get about 30 new users every day from all over the world, many of whom
 are early in their learning phase and are quite stuck - they need help as
 soon as possible. Fortunately this list is very active and helpful. In
 contrast, we've only ever banned 34 users from the list for spamming.  So
 making new users wait for timezones to swing their way feels like a heavy-handed
 solution to a small problem. Yes, spammers are annoying, but they are
 a small minority on this list.

 Instead, we have asked 10 of our long-standing members to help us with
 banning spammers.  This way we have Spam Guardians active around the globe,
 who only need to do something if a spammer raises their ugly head above the
 parapet. One or two spam emails may get through, but hopefully somebody
 will leap into action and stop their activity before it becomes too
 tiresome.

 This isn't an exclusive list. If you would like to be on it, feel free to
 email me.  Note: I expect you to be a long-standing and currently active
 member of this list to be included.

 If this solution doesn't solve the problem, then we can reconsider
 moderate-first-post, but we've managed to go 5 years without requiring it,
 and I'd prefer to keep things as easy as possible for new users.

 Clint





How to search the records with locations all in a polygon or multiPolygon?

2014-07-02 Thread 阙裕斌


I add the mappings and insert a record with 2 locations ([13, 13], [52, 52]), 
and I want to search for results whose locations are ALL in the polygon, not 
just one of the locations in the polygon. Would you please tell me how to 
write that search? 

curl -XPOST localhost:9200/test5 -d '{ 
    "mappings" : { 
        "gistype" : { 
            "properties" : { 
                "address" : { 
                    "properties" : { 
                        "location" : { "type" : "geo_point" } 
                    } 
                } 
            } 
        } 
    } 
}' 

curl -XPUT 'http://localhost:9200/test5/gistype/1' -d '{ 
    "name" : "Wind & Wetter, Berlin, Germany", 
    "address" : [ 
        { 
            "name" : "1", 
            "location" : [13, 13] 
        }, 
        { 
            "name" : "2", 
            "location" : [52, 52] 
        } 
    ] 
}' 

I searched like this, but I want the record's locations to ALL be in the 
polygon, so this is wrong: 

curl -XGET 'http://localhost:9200/test5/gistype/_search?pretty=true' -d '{ 
    "query" : { 
        "filtered" : { 
            "query" : { 
                "match_all" : {} 
            }, 
            "filter" : { 
                "geo_polygon" : { 
                    "location" : { 
                        "points" : [ 
                            { "lat" : 0,  "lon" : 0 }, 
                            { "lat" : 14, "lon" : 0 }, 
                            { "lat" : 14, "lon" : 14 }, 
                            { "lat" : 0,  "lon" : 14 } 
                        ] 
                    } 
                } 
            } 
        } 
    } 
}'


@kimchy
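One way to express the all-points-inside constraint, sketched under two assumptions (a new index name, and an ES version whose geo_shape filter supports the "within" relation): index the locations additionally as a geo_shape multipoint, then ask for indexed shapes that fall within the query polygon.

curl -XPOST 'localhost:9200/test6' -d '{
    "mappings" : {
        "gistype" : {
            "properties" : {
                "locations" : { "type" : "geo_shape" }
            }
        }
    }
}'

curl -XPUT 'localhost:9200/test6/gistype/1' -d '{
    "locations" : { "type" : "multipoint", "coordinates" : [ [13, 13], [52, 52] ] }
}'

curl -XGET 'localhost:9200/test6/gistype/_search' -d '{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : {
                "geo_shape" : {
                    "locations" : {
                        "relation" : "within",
                        "shape" : {
                            "type" : "polygon",
                            "coordinates" : [ [ [0, 0], [14, 0], [14, 14], [0, 14], [0, 0] ] ]
                        }
                    }
                }
            }
        }
    }
}'

With "within", document 1 above would not match, because [52, 52] lies outside the polygon; that is exactly the all-locations-inside semantics asked for here.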



Re: Dealing with spam in this forum

2014-07-02 Thread Patrick Proniewski
Hi,

I do agree with Paul, 200%.
I've received at least 49 spam messages in my mailbox just for 06/30. I won't call 
that a few spam emails. I've been subscribed to many mailing lists for years, and 
I'm pretty sure it would take years to get as much spam on those lists as 
I get in 1 day on the ES mailing list.

On 2 juil. 2014, at 08:18, Paul Brown p...@mult.ifario.us wrote:

 Hi, Clinton --
 
 May I suggest:
 
 - Some users (e.g., me) who read this list via an email subscription regard
 ANY spam on the list as an unacceptable state of affairs.  This is not a
 problem with Apache lists, for example, so I would point the finger of
 blame at Google Groups.
 
 - Having N longstanding members who are willing to help ban spammers is
 equivalent to having N longstanding members who are willing to quickly
 admit new users.  (And you're welcome to add me as N+1.)
 
 - Banning is ineffective.  Spammers will continuously sign up with new
 accounts.
 
 -- Paul
 
 
 --
 p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
 
 
 On Tue, Jul 1, 2014 at 11:36 AM, Clinton Gormley 
 clinton.gorm...@elasticsearch.com wrote:
 
 Hi all
 
 Recently we've had a few spam emails that have made it through Google's
  filters, and there have been calls for us to change to a
 moderate-first-post policy. I am reluctant to adopt this policy for the
 following reasons:
 
 We get about 30 new users every day from all over the world, many of whom
 are early in their learning phase and are quite stuck - they need help as
 soon as possible. Fortunately this list is very active and helpful. In
 contrast, we've only ever banned 34 users from the list for spamming.  So
  making new users wait for timezones to swing their way feels like a heavy-handed
  solution to a small problem. Yes, spammers are annoying, but they are
 a small minority on this list.
 
  Instead, we have asked 10 of our long-standing members to help us with
 banning spammers.  This way we have Spam Guardians active around the globe,
 who only need to do something if a spammer raises their ugly head above the
 parapet. One or two spam emails may get through, but hopefully somebody
 will leap into action and stop their activity before it becomes too
 tiresome.
 
 This isn't an exclusive list. If you would like to be on it, feel free to
  email me.  Note: I expect you to be a long-standing and currently active
 member of this list to be included.
 
 If this solution doesn't solve the problem, then we can reconsider
 moderate-first-post, but we've managed to go 5 years without requiring it,
 and I'd prefer to keep things as easy as possible for new users.
 
 Clint



Re: help in query

2014-07-02 Thread surfer
oops there is an it that doesn't belong
On 07/02/2014 09:24 AM, surfer wrote:
 That definitely helped it. Thank you Vineeth
 Regards
 giovanni



 On 07/01/2014 07:19 PM, vineeth mohan wrote:
 Hello Giovanni ,

 I feel this will help
 - 
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_literal_multi_match_literal_query_2.html#_wildcards_in_field_names

 Thanks
   Vineeth


 On Tue, Jul 1, 2014 at 10:19 PM, surfer sur...@crs4.it
 mailto:sur...@crs4.it wrote:

 Hi,

 I'm indexing something like:

 first doc  = { "v4" : "myvalue" }
 second doc = { "v1" : [ { "v4" : "myvalue" }, { "v5" : "anothervalue" } ] }
 third doc  = { "v1" : [ { "v2" : [ { "v4" : "myvalue" } ] } ] }
 fourth doc = { "v1" : [ { "v2" : [ { "v3" : [ { "v4" : "myvalue" } ] } ] } ] }

 so nested dictionaries and array of dictionaries.

 I was wondering if there is a query to obtain all the docs that have
 "v4" : "myvalue", with the additional condition that this must happen
 inside a "v1" dictionary, through any number of intermediate
 dictionaries (none, v2, or v2 and v3). That is, with the docs
 written above, my query should return:

 second doc, third doc and fourth doc

 Any hint is appreciated
 giovanni



Re: help in query

2014-07-02 Thread surfer
That definitely helped it. Thank you Vineeth
Regards
giovanni



On 07/01/2014 07:19 PM, vineeth mohan wrote:
 Hello Giovanni ,

 I feel this will help
 - 
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_literal_multi_match_literal_query_2.html#_wildcards_in_field_names

 Thanks
   Vineeth
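For the record, a minimal sketch of wildcard field names with multi_match (index name hypothetical; this assumes v1/v2/v3 are plain, non-nested objects, so the paths flatten to v1.v4, v1.v2.v4 and v1.v2.v3.v4):

curl -XGET 'localhost:9200/myindex/_search' -d '{
    "query" : {
        "multi_match" : {
            "query" : "myvalue",
            "fields" : [ "v1.v4", "v1.*.v4" ]
        }
    }
}'

The v1.*.v4 pattern should expand to v1.v2.v4 as well as v1.v2.v3.v4, since the wildcard can span several path segments; together with the exact v1.v4, this matches the second, third and fourth docs but not the first.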


 On Tue, Jul 1, 2014 at 10:19 PM, surfer sur...@crs4.it
 mailto:sur...@crs4.it wrote:

 Hi,

 I'm indexing something like:

 first doc  = { "v4" : "myvalue" }
 second doc = { "v1" : [ { "v4" : "myvalue" }, { "v5" : "anothervalue" } ] }
 third doc  = { "v1" : [ { "v2" : [ { "v4" : "myvalue" } ] } ] }
 fourth doc = { "v1" : [ { "v2" : [ { "v3" : [ { "v4" : "myvalue" } ] } ] } ] }

 so nested dictionaries and array of dictionaries.

 I was wondering if there is a query to obtain all the docs that have
 "v4" : "myvalue", with the additional condition that this must happen
 inside a "v1" dictionary, through any number of intermediate
 dictionaries (none, v2, or v2 and v3). That is, with the docs
 written above, my query should return:

 second doc, third doc and fourth doc

 Any hint is appreciated
 giovanni



Re: Queries with fields {...} don't return field with dot in their name

2014-07-02 Thread benq
Hello Vineeth,

the items that are indexed in elasticsearch really do contain a field named 
"response.user":

"_source": {
    "clientip": "aaa.bbb..ddd",
    "request": "http://.aa/b/c",
    "request.accept-encoding": "gzip, deflate",
    "request.accept-language": "de-ch",
    "response.content-type": "text/html; charset=UTF-8",
    "response": 200,
    "response.age": 0,
    "response.user": "userAAA",
    "@timestamp": "2014-07-01T12:18:51.501+02:00"
}


I realize there is an ambiguity between a field with a dot in its name and 
a field of a sub-object. Should fields with a dot in their name be 
avoided?

Benoît
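One workaround worth trying, as an untested sketch (index name hypothetical): skip the fields option and use source filtering instead, where a wildcard include pattern can match the literal dotted keys:

curl -XGET 'localhost:9200/myindex/_search' -d '{
    "_source" : [ "@timestamp", "clientip", "request", "response*" ],
    "query" : { "match_all" : {} }
}'

The intent is that response* brings back the response key together with the literal response.user and response.age keys, sidestepping the path-versus-name ambiguity.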

On Tuesday, July 1, 2014 19:17:41 UTC+2, vineeth mohan wrote:

 Hello Ben , 

 Can you paste a sample feed?

 Thanks
Vineeth


 On Tue, Jul 1, 2014 at 8:26 PM, benq benoit@gmail.com wrote:

 Hi all,

 I have a query that specifies the fields to be returned, as described here: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
 However, it does not return the fields with a dot in their name, like 
 "response.user".

 For example:
 {
   "size": 1000,
   "fields": ["@timestamp", "request", "response", "response.user", "clientip"],
   "query": { "match_all": {} },
   "filter": {
     "and": [
       { "range": { "@timestamp": { "from": ... 
     ]
   }
 }

 The timestamp, request, response and clientip fields are returned. The 
 response.user field is not.

 Any idea why?

 Regards,
 Benoît



Re: limitation of 2,147,483,647 terms per segment index in Lucene

2014-07-02 Thread simonw
Peter, thanks so much for raising this. This looks awful! I think we 
should move this into an issue on [1] (please feel free to create one). IMO 
we should aim the issue at preventing this from happening altogether. Along 
the same lines, we should help you recover, but I don't know 
how tricky that will be. Let's start with the issue!!

simon


[1] https://github.com/elasticsearch/elasticsearch/issues
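For context: the 2147483647 in the trace is Integer.MAX_VALUE, Lucene's hard cap on documents per Lucene index, and each ES shard is a single Lucene index. So once a shard approaches the cap, the usual way out is to reindex into an index with more shards, for example (index name and counts hypothetical):

curl -XPUT 'localhost:9200/vos_v2' -d '{
    "settings" : { "number_of_shards" : 10, "number_of_replicas" : 1 }
}'

Whether the existing over-limit shard can still be opened and read for reindexing is exactly the open question in this thread.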

On Monday, June 30, 2014 5:49:32 PM UTC+2, Peter Portante wrote:

 Is there a way to recover a segment index of a shard that has exceeded 
 Lucene's 2^31 limit?

 Thanks,
 -peter


 [2014-06-30 10:53:02,187][WARN ][indices.cluster  ] [Patriots] 
 [vos][0] failed to start shard
 org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: 
 [vos][0] failed recovery
 at 
 org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.IllegalArgumentException: Too many documents, 
 composite IndexReaders cannot exceed 2147483647
 at 
 org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
 at 
 org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:369)
 at 
 org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:43)
 at 
 org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:115)
 at 
 org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:385)
 at 
 org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
 at 
 org.apache.lucene.search.SearcherManager.<init>(SearcherManager.java:89)
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.buildSearchManager(InternalEngine.java:1364)
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:291)
 at 
 org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:709)
 at 
 org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:204)
 at 
 org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
 ... 3 more
 [2014-06-30 10:53:02,213][WARN ][cluster.action.shard ] [Patriots] 
 [vos][0] sending failed shard for [vos][0], node[bS9Lp_a9QZOjiab23Ztk4A], 
 [P], s[INITIALIZING], indexUUID [a0_HJrlgQq-UNCwL2QiVbg], reason [Failed to 
 start shard, message [IndexShardGatewayRecoveryException[[vos][0] failed 
 recovery]; nested: IllegalArgumentException[Too many documents, composite 
 IndexReaders cannot exceed 2147483647]; ]]
 [2014-06-30 10:53:02,213][WARN ][cluster.action.shard ] [Patriots] 
 [vos][0] received shard failed for [vos][0], node[bS9Lp_a9QZOjiab23Ztk4A], 
 [P], s[INITIALIZING], indexUUID [a0_HJrlgQq-UNCwL2QiVbg], reason [Failed to 
 start shard, message [IndexShardGatewayRecoveryException[[vos][0] failed 
 recovery]; nested: IllegalArgumentException[Too many documents, composite 
 IndexReaders cannot exceed 2147483647]; ]]





Min Hard Drive Requirements

2014-07-02 Thread Ophir Michaeli
 

Hi all,

I'm testing the indexing of 100 million documents; it took about 400GB of 
hard drive space.
Is there a minimum amount of free hard drive space needed for the index to work OK?
I'm asking because after we indexed the 100 million documents we tested the 
index and it worked OK, 
but then when we tried to optimize, the optimize took days and then the index 
did not respond.
The hard drive had only 10GB of free space, so we tried to copy the index to a 
new hard drive with more free space, but the index is still not 
functioning.

Thank you,
Ophir



problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Tanguy Bernard
Hello,

I try to indexing datetime mysql like this : 2013-05-01 00:00:00 
In ES it's represented like this : 2013-05-01T00:00:00.000Z
The real problem seems to be when I index this date : -00-00 00:00:00

I have used this mapping :

"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss||MM/dd/yyyy||yyyy/MM/dd",
"index" : "not_analyzed"


I obtained this error:

[2014-07-02 10:11:56,503][INFO ][cluster.metadata ] [ik-test2] 
[_river] update_mapping [source] (dynamic)
can not be represented as java.sql.Timestamp
java.io.IOException: java.sql.SQLException: Value '7918-00-00 00:00:00 
 ...

 can not be represented as java.sql.Timestamp
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at 
com.mysql.jdbc.ResultSetRow.getTimestampFast(ResultSetRow.java:1102)
at com.mysql.jdbc.BufferRow.getTimestampFast(BufferRow.java:576)
at 
com.mysql.jdbc.ResultSetImpl.getTimestampInternal(ResultSetImpl.java:6592)
at 
com.mysql.jdbc.ResultSetImpl.getTimestamp(ResultSetImpl.java:6192)
at com.mysql.jdbc.ResultSetImpl.getObject(ResultSetImpl.java:5058)
at 
org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.processRow(SimpleRiverSource.java:590)
at 
org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.nextRow(SimpleRiverSource.java:565)
at 
org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.merge(SimpleRiverSource.java:356)
at 
org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.execute(SimpleRiverSource.java:257)
at 
org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.fetch(SimpleRiverSource.java:228)
... 3 more
[2014-07-02 10:11:56,633][WARN 
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverFlow] 
aborting river
[2014-07-02 10:12:01,392][INFO 
][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverMouth] new 
bulk [1] of [69 items], 1 outstanding bulk requests
[2014-07-02 10:12:01,437][INFO ][cluster.metadata ] [ik-test2] 
[my_index] update_mapping [source] (dynamic)



Can you help me with my problem?

Thank you in advance.
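One knob worth knowing about at the JDBC layer (an assumption about the setup, since the thread does not mention it): MySQL Connector/J can be told to return zero dates as NULL instead of throwing the Timestamp error above, by adding a parameter to the river's connection URL:

"url" : "jdbc:mysql://ip:port/database?zeroDateTimeBehavior=convertToNull"

The ES-side counterpart is the ignore_malformed mapping option suggested further down in this thread.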




Re: Min Hard Drive Requirements

2014-07-02 Thread Mark Walkom
It will work until it's full, but then ES will fall over.
Merging does require a certain amount of disk space, usually the same
amount as the segment that is being merged, as it has to take a copy of the
shard to work on. So for a 10GB segment, you'd need at least 10GB free.

How many shards do you have for the index, or how many are you trying to
optimise (merge) down to?
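As a hypothetical sketch of keeping individual merges small: optimizing down to a handful of segments instead of forcing everything into one keeps each merge, and its temporary on-disk copy, smaller:

curl -XPOST 'localhost:9200/myindex/_optimize?max_num_segments=5'

A full merge to a single segment can transiently need roughly the whole shard size in free disk space, which a nearly full disk cannot provide.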

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 2 July 2014 18:13, Ophir Michaeli ophirmicha...@gmail.com wrote:

  Hi all,

 I'm testing the indexing of 100 million documents; it took about 400GB of
 hard drive space.
 Is there a minimum amount of free hard drive space needed for the index to work OK?
 I'm asking because after we indexed the 100 million documents we tested the
 index and it worked OK,
 but then when we tried to optimize, the optimize took days and then the index
 did not respond.
 The hard drive had only 10GB of free space, so we tried to copy the index to
 a new hard drive with more free space, but the index is still not
 functioning.

 Thank you,
 Ophir







One date field mapping for two different locale

2014-07-02 Thread Jahrit
Here's the problem.

I have data with a date field that can be in either the English or the German date
format (or rather, week and month naming convention).

I.e. Mittwoch, 18. Juni 2012 or Wednesday, 18. June 2012

I can set up separate mappings with separate fields for each nation's date.

{
    "website" : {
        "properties" : {
            "date_en" : {
                "type" : "date", 
                "format" : "EEE, dd. MMM yyyy", 
                "locale" : "US"
            }
        }
    }
}

{
    "website" : {
        "properties" : {
            "date_de" : {
                "type" : "date", 
                "format" : "EEE, dd. MMM yyyy", 
                "locale" : "DE"
            }
        }
    }
}

And this works properly, until I try to put an English date into the German
date field or vice versa.

I do not have the option to receive additional data carrying the date's locale
information.

What I need is a way to recognize each type of date, save it internally as a
timestamp (for sorting), and do that in one and the same field (because of
sorting and field naming conventions).

Something like this.
{
    "website" : {
        "properties" : {
            "date" : {
                "type" : "date", 
                "format" : "EEE, dd. MMM yyyy", 
                "locale" : "US||DE"
            }
        }
    }
}

What I want to achieve is to be able to sort all documents, with their
different country dates, by date. Maybe there is a different approach; I'll
gladly try other solutions.
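One possible workaround, an untested sketch (not from the thread): keep a single incoming field but index it into two date sub-fields, one per locale, with ignore_malformed so a value that a locale cannot parse is simply dropped for that sub-field:

{
    "website" : {
        "properties" : {
            "date" : {
                "type" : "multi_field",
                "fields" : {
                    "date" : { "type" : "date", "format" : "EEE, dd. MMM yyyy", "locale" : "US", "ignore_malformed" : true },
                    "de"   : { "type" : "date", "format" : "EEE, dd. MMM yyyy", "locale" : "DE", "ignore_malformed" : true }
                }
            }
        }
    }
}

Each document would then carry its timestamp in either date or date.de; sorting across both would still need to combine them, e.g. with a script sort that picks whichever sub-field is populated.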






Shards considered in write consistency

2014-07-02 Thread Varun Vasan V
Hey,

I have a question related to write consistency.

I have an elasticsearch cluster with 2 nodes. The nodes are configured as:

number_of_shards = 5
number_of_replicas = 1

If I set the action.write_consistency value to `quorum`, what is the number 
of active shards required to satisfy the quorum?
Please shed some light on the matter.

Thanks.



Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Tanguy Bernard
As it stands, when I index the date 0000-00-00 00:00:00 the indexing stops completely 
with an error (it begins working and stops instantly). 
I have tried mapping the type as "string" for my date, but it doesn't 
work.

Do you have an idea to solve my problem?



Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread vineeth mohan
Hi Tanguy ,

How is this a valid date string: java.io.IOException:
java.sql.SQLException: Value '7918-00-00 00:00:00   ?
This value can't be mapped to any date format, nor is it valid in any way.

Thanks
 Vineeth




On Wed, Jul 2, 2014 at 3:21 PM, Tanguy Bernard bernardtanguy1...@gmail.com
wrote:

 As it stands, when I index the date 0000-00-00 00:00:00 the indexing stops completely
 with an error (it begins working and stops instantly).
 I have tried mapping the type as "string" for my date, but it doesn't
 work.

 Do you have an idea to solve my problem?



Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Clinton Gormley
What you can do is set the mapping for the date field to have:

{ "type": "date", "format": "yyyy-MM-dd HH:mm:ss", "ignore_malformed": true }

then it will just ignore those invalid dates rather than throwing an error.



Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Tanguy Bernard
In my MySQL table (type: datetime):

| date_source         |
+---------------------+
| 2008-09-15 18:29:07 |
| 2013-08-29 00:00:00 |
| 2013-07-04 00:00:00 |
| 2013-07-17 00:00:00 |
| 2013-07-17 00:00:00 |
| 0000-00-00 00:00:00 |
...
If I use a mapping (type: string)

and I index:

PUT /_river/test/_meta
{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://ip:port/database",
        "user" : "user",
        "password" : "password",
        "sql" : "select id_source as _id, title_source, date_source from source",  // if I add "where date_source not like '%0000%'" it works, but values are missing for this date
        "index" : "test",
        "type" : "source",
        "max_bulk_requests" : 5
    }
}




On Wednesday, July 2, 2014 12:09:58 UTC+2, vineeth mohan wrote:

 Hi Tanguy,

 How is this a valid date string: java.io.IOException: 
 java.sql.SQLException: Value '7918-00-00 00:00:00   ?
 This value can't be mapped to any date format, nor is it valid in any way.

 Thanks
  Vineeth




 On Wed, Jul 2, 2014 at 3:21 PM, Tanguy Bernard bernardt...@gmail.com wrote:

 As it stands, when I index the date 0000-00-00 00:00:00 the indexing stops 
 completely with an error (it begins working and stops instantly). 
 I have tried mapping the type as "string" for my date, but it doesn't 
 work.

 Do you have an idea to solve my problem?



Index missing error Elasticsearch Java

2014-07-02 Thread venuchitta
Hi,

I am new to elasticsearch. I am using the Java API to establish a connection
with ES.
public void createIndex(final String index) {
    getClient().admin().indices().prepareCreate(index).execute().actionGet();
}

public void createLocalCluster(final String clusterName) {
    NodeBuilder builder = NodeBuilder.nodeBuilder();
    Settings settings = ImmutableSettings.settingsBuilder()
        // NB: the "none" gateway keeps no index state across full restarts;
        // "local" persists indices on disk and recovers them at startup.
        .put("gateway.type", "none")
        .put("cluster.name", clusterName)
        .build();
    builder.settings(settings).local(false).data(true);
    this.node = builder.node();
    this.client = node.client();
}

public boolean existsIndex(final String index) {
    IndicesExistsResponse response =
        getClient().admin().indices().prepareExists(index).execute().actionGet();
    return response.isExists();
}

public void openIndex(String name) {
    getClient().admin().indices().prepareOpen(name).execute().actionGet();
}

createLocalCluster("cerES");
createIndex("news");
System.out.println(existsIndex("news"));

When I execute the above Java code I get a true response. But when I
close the Java program and start it again with the following code:

openIndex("news");

it throws IndexMissingException. But I can see the "news" index in the data
folder in Eclipse. So how do I retrieve the data from the node created
previously? Is it lost, or am I wrong somewhere?







Re: Dealing with spam in this forum

2014-07-02 Thread Clinton Gormley


 I've received at least 49 spam messages in my mailbox just for 06/30. I won't 
 call that a few spam emails. I've been subscribed to many mailing 
 lists for years, and I'm pretty sure it would take years to get as much spam on 
 those lists as I get in 1 day on the ES mailing list. 


That's interesting... I'd only seen three spam emails, so I wondered where 
you got 49 from. I read the posts from my gmail account, so then I checked 
my spam folder and sure enough there were a lot more emails in there that I 
was unaware of.

I'm going to disable my spam filter for this group so that I get more 
visibility, and I'd ask other moderators to do the same.

Let's see how it goes for a while longer. We can always revisit this 
decision later on.

clint






Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread David Pilato
What is this date supposed to represent? 
month = 0 or day = 0 does not exist, right? 

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 2 July 2014 at 12:29:29, Tanguy Bernard (bernardtanguy1...@gmail.com) 
wrote:

In my MySQL table (type: datetime):

| date_source         |
+---------------------+
| 2008-09-15 18:29:07 |
| 2013-08-29 00:00:00 |
| 2013-07-04 00:00:00 |
| 2013-07-17 00:00:00 |
| 2013-07-17 00:00:00 |
| 0000-00-00 00:00:00 |
...

If I use a mapping (type: string)

and I index:

PUT /_river/test/_meta
{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://ip:port/database",
        "user" : "user",
        "password" : "password",
        "sql" : "select id_source as _id, title_source, date_source from source",  // if I add "where date_source not like '%0000%'" it works, but values are missing for this date
        "index" : "test",
        "type" : "source",
        "max_bulk_requests" : 5
    }
}




On Wednesday, July 2, 2014 12:09:58 UTC+2, vineeth mohan wrote:
Hi Tanguy ,

How is this a valid date string: java.io.IOException: java.sql.SQLException: 
Value '7918-00-00 00:00:00   ?
This value can't be mapped to any date format, nor is it valid in any way.

Thanks
         Vineeth




On Wed, Jul 2, 2014 at 3:21 PM, Tanguy Bernard bernardt...@gmail.com wrote:
As it stands, when I index the date 0000-00-00 00:00:00 the indexing stops completely 
with an error (it begins working and stops instantly). 
I have tried mapping the type as "string" for my date, but it doesn't work.

Do you have an idea to solve my problem?


Re: Shards considered in write consistency

2014-07-02 Thread David Pilato
In your case quorum means that you need all primaries to be allocated,
which is the case here.

The docs explain this very well: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency
Have a look in detail at: 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_how_primary_and_replica_shards_interact.html#_how_primary_and_replica_shards_interact
and 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/distrib-write.html

HTH
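For reference, the quorum formula from the docs, applied per shard to this setup:

quorum = int((primary + number_of_replicas) / 2) + 1
       = int((1 + 1) / 2) + 1
       = 2 copies of each shard

with the documented special case that when number_of_replicas is 1, a reachable primary alone is enough for the write to proceed.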


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 2 July 2014 at 11:40:19, Varun Vasan V (varunvasa...@gmail.com) wrote:

Hey,

I have a question related to write consistency.

I have an elasticsearch cluster with 2 nodes. The nodes are configured as:

number_of_shards = 5
number_of_replicas = 1

If I set the action.write_consistency value to `quorum`, what is the number of 
active shards required to satisfy the quorum?
Please shed some light on the matter.

Thanks.


have we a way to use highlight and fuzzy together ?

2014-07-02 Thread Tanguy Bernard
Hello,
Everything is in the subject.
I have to use fuzzy for my fields (title, content), and when I'm searching I 
want to see the part of the sentence where my keyword is.

This, together, doesn't work:
$params['body']['highlight']['fields'][$value]['fragment_size']=30;
$params['body']['query']['fuzzy']=0.2;

Is there a way to use highlight and fuzzy together, or an equivalent 
alternative?


Thank you in advance.
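For what it's worth, a minimal sketch of the equivalent query DSL (field names taken from the question; the PHP client accepts the same structure as its body array). Fuzziness belongs inside a match/multi_match query rather than as a bare value under query:

curl -XGET 'localhost:9200/myindex/_search' -d '{
    "query" : {
        "multi_match" : {
            "query" : "keyword",
            "fields" : [ "title", "content" ],
            "fuzziness" : 0.2
        }
    },
    "highlight" : {
        "fields" : {
            "title"   : { "fragment_size" : 30 },
            "content" : { "fragment_size" : 30 }
        }
    }
}'

Highlighting should work on the fuzzy-expanded matches as well, so the two features can be combined.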



Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Tanguy Bernard
This date is created when a document is created, but an error occurred and I 
have this 0000-00-00 ^^
The company I'm at has existed for 10 years; the database is old and it has 
this kind of error.

For the moment, I will use:

"sql" : "select id_source as _id, title_source, date_source from source"  // 
if I add "where date_source not like '%0000%'" it works, but values 
are missing for this date
Or not index date_source at all. My goal was to sort my results by date_source.

On Wednesday, July 2, 2014 12:40:58 UTC+2, David Pilato wrote:

 What is this date supposed to represent? 
 month = 0 or day = 0 does not exist, right? 

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr




Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread David Pilato
I would recommend updating the SQL database! :)

So maybe update all dates where the date is 0000-00-00 to 1970-01-01, if that fits 
with your use case.
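A minimal sketch of that cleanup, assuming the table and column names from the river config earlier in this thread:

UPDATE source SET date_source = '1970-01-01 00:00:00' WHERE date_source = '0000-00-00 00:00:00';

Depending on the sql_mode in effect, MySQL allows comparing against the zero date directly like this; otherwise the not-like predicate from the river config can be reused in the WHERE clause.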

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 2 July 2014 at 12:54:36, Tanguy Bernard (bernardtanguy1...@gmail.com) 
wrote:

This date is created when a document is created, but an error occurred and I have 
this 0000-00-00 ^^
The company I'm at has existed for 10 years; the database is old and it has 
this kind of error.

For the moment, I will use:

"sql" : "select id_source as _id, title_source, date_source from source"  // 
if I add "where date_source not like '%0000%'" it works, but values are missing 
for this date
Or not index date_source at all. My goal was to sort my results by date_source.

On Wednesday, July 2, 2014 12:40:58 UTC+2, David Pilato wrote: 
What is this date supposed to represent? 
month = 0 or day = 0 does not exist, right? 

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


[ANN] ElasticUI AngularJS Directives - Easily Build an Interface on top of Elasticsearch

2014-07-02 Thread Yousef El-Dardiry
Hi all,

I just open sourced a set of AngularJS Directives for Elasticsearch. It 
enables developers to rapidly build a frontend (e.g.: faceted search 
engine) on top of Elasticsearch.

http://www.elasticui.com (or github https://github.com/YousefED/ElasticUI)

It makes creating an aggregation and listing the buckets as simple as:

<ul eui-aggregation="ejs.TermsAggregation('text_agg').field('text').size(10)">
  <li ng-repeat="bucket in aggResult.buckets">{{bucket}}</li>
</ul>

I think this was currently missing in the ecosystem, which is why I decided 
to build and open source it. I'd love any kind of feedback.

- Yousef

-
Another example; add a checkbox facet based on a field using one of the 
built-in widgets 
(https://github.com/YousefED/ElasticUI/blob/master/docs/widgets.md):

<eui-checklist field="facet_field" size="10"></eui-checklist>

Resulting in
[image: checklist screenshot]



Re: _all analyzer advice

2014-07-02 Thread mooky
Ah. Cheers.
I had looked at that page a few times but missed that.

On Tuesday, 1 July 2014 19:04:56 UTC+1, Glen Smith wrote:


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-analyzers.html

 On Tuesday, July 1, 2014 6:23:54 AM UTC-4, mooky wrote:

 Thanks.
 So default_index and default_search have special meaning.
 Is this in the docs anywhere?

 -N



 On Monday, 30 June 2014 17:21:40 UTC+1, Glen Smith wrote:

 Totally. For example:

 "analyzer": {
     "default_index": {
         "tokenizer": "standard",
         "filter": ["standard", "lowercase"]
     },
     "default_search": {
         "tokenizer": "standard",
         "filter": ["standard", "lowercase", "stop"]
     },


 On Monday, June 30, 2014 12:19:55 PM UTC-4, mooky wrote:

 Excellent. Thanks for the info.

 Is it possible to set my custom analyser as the default analyser for an 
 index (i.e. instead of the standard analyzer)?

 -N

 On Monday, 30 June 2014 14:41:10 UTC+1, Glen Smith wrote:

 You can set up an analyser for your index...

 ...
 "my-index": {
     "analysis": {
         "analyzer": {
             "default_index": {
                 "tokenizer": "standard",
                 "filter": ["standard", "icu_fold_filter", "stop"]
             },
             "default_search": {
                 "tokenizer": "standard",
                 "filter": ["standard", "icu_fold_filter", "stop"]
             },
             "custom_index": {
                 "tokenizer": "whitespace",
                 "filter": ["lower"]
             },
             "custom_search": {
                 "tokenizer": "whitespace",
                 "filter": ["lower"]
             }
         }
     }
 }
 ...

 and then map your relevant field accordingly:

 {
     "_timestamp": {
         "enabled": true,
         "store": "yes"
     },
     "properties": {
         "my_field": {
             "type": "string",
             "index_analyzer": "custom_index",
             "search_analyzer": "custom_search"
         }
     }
 }


 Note that you can (and often should) set up index analysis and search 
 analysis differently (eg if you use synonyms, only expand search terms).

 Hope I haven't missed the point...

 On Monday, June 30, 2014 8:47:36 AM UTC-4, mooky wrote:

 Hi all,

 I have a google-style search capability in my app that uses the _all 
 field with the default (standard) analyzer (I don't configure anything - 
 so 
 its Elastic's default).

 There are a few cases where we don't quite get the behaviour we want, 
 and I am trying to work out how I tweak the analyzer configuration.

 1) if the user searches using 99.97, then they get the results they 
 expect, but if they search using 99.97%, they get nothing. They should 
 get 
 the results that match 99.97%. The default analyzer config loses the 
 %, I 
 guess.

 2) I have no idea what the text is ( : ) ) but the user wants to 
 search using 托克金通贸易 - which is in the data - but currently we get zero 
 results. It looks like the standard analyzer/tokenizer breaks on each 
 character.

 I *think* I just want a whitespace analyzer with lower-casing. However, 
 a) I am not exactly sure how to configure that, and;
 b) I am not 100% sure what I am losing/gaining vs the standard analyzer. 
 (don't need stop-words; in any case the default config for the standard 
 analyser doesn't have any, IIRC)

 (FWIW, on all our other text fields, we tend to use no analyzer)

 (Elastic 1.1.1 and 1.2 ...)

 Cheers.
 -M
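
For reference, a minimal sketch of index settings that make a
whitespace-plus-lowercase analyzer the default for both indexing and search,
along the lines discussed above (the index name is assumed):

curl -XPUT 'localhost:9200/my-index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'

An analyzer registered under the name "default" applies at both index and
search time; define "default_index" and "default_search" separately, as in
the snippets above, when the two need to differ (e.g. for synonyms).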





Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Tanguy Bernard
Yes, it's just some dates. I think they can be updated quickly. It's the 
better way :)
Thank you all.
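
For completeness, there is also an Elasticsearch-side option not raised in
the thread: the date field can be mapped with ignore_malformed so that
unparsable values are skipped instead of failing the whole document. A
minimal sketch, with index, type, and date format assumed:

curl -XPUT 'localhost:9200/myindex/mytype/_mapping' -d '{
  "mytype": {
    "properties": {
      "date_source": {
        "type": "date",
        "format": "yyyy-MM-dd",
        "ignore_malformed": true
      }
    }
  }
}'

Documents with a zero date are then indexed without that field; fixing the
data at the source, as recommended below, remains the cleaner option.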

Le mercredi 2 juillet 2014 12:56:59 UTC+2, David Pilato a écrit :

 I would recommend updating the SQL database! :)

 So maybe update all dates where the date is 0000-00-00 to 1970-01-01, if it 
 fits with your use case.

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr


 Le 2 juillet 2014 à 12:54:36, Tanguy Bernard (bernardt...@gmail.com 
 javascript:) a écrit:

 This date is created when a document is created, but an error occurred and 
 I got this 0000-00-00 ^^ 
 I'm in a company that has existed for 10 years; the database is old and it 
 has this kind of error.

 For the moment, I will use:

 sql: select id_source as _id, title_source, date_source from source
 // if I add "where date_source not like '%0000%'" it works, but values are 
 missing for this date
 Or not index date_source at all. My goal was to sort my results by 
 date_source.

 Le mercredi 2 juillet 2014 12:40:58 UTC+2, David Pilato a écrit : 

  What is this date supposed to represent? 
  month = 0 or day = 0 does not exist, right? 

  -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com* 
  @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr






Re: [ANN] ElasticUI AngularJS Directives - Easily Build an Interface on top of Elasticsearch

2014-07-02 Thread Mark Walkom
Very cool, I'll pass this onto some of our devs :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 2 July 2014 20:56, Yousef El-Dardiry yousefdard...@gmail.com wrote:

 Hi all,

 I just open sourced a set of AngularJS Directives for Elasticsearch. It
 enables developers to rapidly build a frontend (e.g.: faceted search
 engine) on top of Elasticsearch.

 http://www.elasticui.com (or github
 https://github.com/YousefED/ElasticUI)

 It makes creating an aggregation and listing the buckets as simple as:

 <ul eui-aggregation="ejs.TermsAggregation('text_agg').field('text').size(10)">
   <li ng-repeat="bucket in aggResult.buckets">{{bucket}}</li>
 </ul>

 I think this was missing in the ecosystem, which is why I
 decided to build and open source it. I'd love any kind of feedback.

  - Yousef

 -----
 Another example; add a checkbox facet based on a field using one of the
 built-in widgets
 https://github.com/YousefED/ElasticUI/blob/master/docs/widgets.md:

 <eui-checklist field='facet_field' size=10></eui-checklist>

 Resulting in
 [image: checklist screenshot]





spellcheck and completion suggester or what?

2014-07-02 Thread Bernd Fehling
Hi group, I have a special problem which I'm trying to solve.
I need search suggestions while typing text into a search box.
I tried different settings and options with ES, including term suggester, 
completion suggester and so on, but no success.

What I'm looking for: if I'm typing a search and have already typed 
"dan", I should get suggestions like:
{
  "responseHeader": {
    "status": 0,
    "QTime": 11
  },
  "spellcheck": {
    "suggestions": [
      "dan",
      {
        "numFound": 9,
        "startOffset": 0,
        "endOffset": 3,
        "suggestion": [
          "dana",
          "danach",
          "danckert",
          "dando",
          "danger",
          "dangos",
          "danguolė",
          "daniel",
          "danish"
        ]
      }
    ]
  }
}

So, just the suggestions in alphabetical order from the index.
The above example is from Solr, but I need this feature for ES.
Any idea how to achieve this?

Regards, Bernd
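
One option worth trying is the completion suggester; a minimal sketch, with
index, type, and field names assumed:

curl -XPUT 'localhost:9200/myindex/mytype/_mapping' -d '{
  "mytype": {
    "properties": {
      "name_suggest": { "type": "completion" }
    }
  }
}'

curl -XPOST 'localhost:9200/myindex/_suggest' -d '{
  "my-suggestions": {
    "text": "dan",
    "completion": { "field": "name_suggest" }
  }
}'

Note the completion suggester orders entries by weight rather than
alphabetically; for strictly alphabetical prefix suggestions straight from
the index, a terms aggregation with an "include": "dan.*" pattern on a
suitably analyzed field is an alternative worth testing.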



Something I am finding difficult, using Aggregations

2014-07-02 Thread mooky

Having used elastic aggregations for a little bit (and having used Mongo 
aggregations previously), I have been finding a couple of things a bit 
difficult/awkward.
I am not sure if its because I don't know how to do it properly - or we 
missing a feature/enhancement in elastic.

A common thing I want to do is aggregate on field x, but in the result, I 
also want field y  z (which are unique for a given x) - there doesn't seem 
to be an easy way to do that.

Lets say I have some data:
{
    "id": "94538ef6-2998-4ddd-be00-1f5dc2654955",
    "quantity": 1234567.2342,
    "commodityId": "0e918fb8-6572-4663-a692-cbebe8aca7f2",
    "commodityName": "Lead",
    "ownerId": "53e0f816-8a0a-4659-b868-c48035676b25",
    "ownerName": "Simon Chan",
    "locationId": "1cdd4bc7-76d9-43fb-ac56-8f555164211a",
    "locationName": "Shenyang - Shenyang Dongbei",
    "locationCode": "W33",
    "locationCity": "Shenyang",
    "locationCountry": "China"
}

Let's say I want to do a (terms) aggregation on ownerId (because it's 
unique, while ownerName obviously is not). I will get results where the 
bucket key is the id. However, what I want to display to the user is the 
ownerName, not the id. Looking up the name from the id could be very 
expensive, but it's also unnecessary, because the name will be unique for a 
given bucket and we have the info to hand in the index. The same issue 
arises if I want to aggregate by locationId or commodityId. We dereference 
the data associated with an id so that we can search on it, but we also 
want to use this information to create a label for a bucket when we 
aggregate.

Is there a simple way to retrieve ownerName while aggregating on ownerId?
The only way I know to do this is to:
a) make sure ownerName is not_analyzed, and
b) do a terms sub-aggregation, which will give only one result.
Is there an easier way I have missed?
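
For concreteness, a minimal sketch of that sub-aggregation workaround (index
and field names assumed; ownerName must be not_analyzed so the label comes
back as a single term):

curl -XPOST 'localhost:9200/myindex/_search' -d '{
  "size": 0,
  "aggs": {
    "by_owner": {
      "terms": { "field": "ownerId" },
      "aggs": {
        "owner_label": {
          "terms": { "field": "ownerName", "size": 1 }
        }
      }
    }
  }
}'

Each by_owner bucket then carries one owner_label sub-bucket whose key is
the display name for that owner.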

(FWIW, doing the same thing in, say, a Mongo aggregation is simply a matter 
of adding ownerName as a key field; since it's unique for a given id, it 
won't change the aggregation results, and the ownerName info is simply 
extracted from the key data in the result.)

Cheers,
M



Re: Realtime search + fast indexing

2014-07-02 Thread mooky
One thing you can consider is calling refresh() after indexing, which has 
the effect I think you are looking for. 
There are probably some performance considerations others here can comment 
on better than I can.
In any case, calling refresh() is what we do.
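
A minimal sketch of an explicit refresh after indexing (index name assumed):

curl -XPOST 'localhost:9200/myindex/_refresh'

The index API also accepts a refresh flag on a single write, e.g.
curl -XPUT 'localhost:9200/myindex/mytype/1?refresh=true' -d '{...}', which
makes just that document searchable immediately; either way the cost is the
indexing-throughput hit discussed in the rest of this thread.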



On Thursday, 26 June 2014 10:25:12 UTC+1, Nico Krijnen wrote:

 Hi,

 We have recently migrated our application from 'bare Lucene + Zoie for 
 realtime search' to Elastic Search. Elastic search is awesome and next to 
 scalability, it gives us lots of additional features. The one thing we 
 really miss though is realtime search.

 Search is the core of our application. All our data is stored in the index 
 (primary data store). When a user adds a file or makes a change, their 
 subsequent search must reflect that change. With Zoie, the data was indexed 
 very quickly into a temporary Lucene memory index. Not having to write+read 
 it on disk makes the documents available for search much faster than NRT 
 Lucene. The memory index is flushed to disk asynchrounously from time to 
 time, not impacting indexing or search performance. Zoie also allows you to 
 wait for a specific 'version of the index' to be available for searching. 
 That way we could make the user's thread wait until their data was indexed 
 in memory, only pausing the thread of that user without having any 
 performance impact for all the other users.

 Result: realtime search and insanely fast indexing.

 With Elastic Search we have to do a refresh to make data available for 
 search. Lots of refreshes or the 1 second refresh interval will cause 
 significant slower indexing speed. We don't know beforehand when our users 
 will import documents or make lots of changes, so we cannot really increase 
 the refresh interval when needed to make indexing faster. We know that 
 'get' is realtime and we make use of that as much as possible, but in lots 
 of cases we really require a search to find the data.

 Our plan is to implement some mechanism in Elastic Search to get the same 
 realtime search + fast indexing behavior that we had with Zoie. We need 
 some pointers though on what would be the best place in Elastic Search to 
 do something like this. After all it hooks into low level Elastic Search 
 and Lucene stuff.

 I can imagine that 'realtime-search while indexing' is important for many 
 other Elastic Search users too. What are the chances of something like this 
 getting merged back into the main branch?

 I'm planning to be at the Friday drinks tomorrow in Amsterdam. Is there 
 anyone attending with whom I could do some sparring with on this matter?

 Thanks,
 Nico




cluster.routing.allocation.enable behavior (sticky shard allocation not working as expected)

2014-07-02 Thread Andrew Davidoff
Hi,

I am trying to use cluster.routing.allocation.enable to speed up node 
restarts. As I understand it, if I set cluster.routing.allocation.enable to 
none, restart a node, then set cluster.routing.allocation.enable to 
all, the shards that go UNASSIGNED when the node goes down should start 
back up on the same node they were assigned to previously. But in practice 
when I do this, the shards get assigned across the entire cluster when I 
set cluster.routing.allocation.enable back to all, and then after that, 
some amount of rebalancing happens.

How can I avoid this, and make shards on a restarted node come back on the 
same node?

To be clear, here's exactly the sequence of events:

1) curl -XPUT -s $host:$port/_cluster/settings?pretty=1 -d 
'{"persistent":{"cluster.routing.allocation.enable": "none"}}'
2) service elasticsearch stop on one node of a 3 node cluster 
(discovery.zen.minimum_master_nodes: 2)
3) shards that were assigned to the now stopped node show as UNASSIGNED
4) service elasticsearch start on the same node as in (2)
5) wait a few minutes - shards mentioned in (3) still show as UNASSIGNED, 
each node sees the full cluster (/_cat/nodes)
6) curl -XPUT -s $host:$port/_cluster/settings?pretty=1 -d 
'{"persistent":{"cluster.routing.allocation.enable": "all"}}'
7) UNASSIGNED shards mentioned in (3) begin being assigned across all nodes 
in the cluster
8) After all UNASSIGNED nodes are assigned, some start rebalancing 
(migrating to other nodes)
9) Cluster is happy

The amount of data in this cluster is very large, and this process can take 
close to 24 hours. So I'd like very much to avoid that for routine restarts.

Thanks.
Andy



Re: Dealing with spam in this forum

2014-07-02 Thread mooky
I fall on the side of caring less about spam emails (since I have a decent 
spam filter on my email) and would rate easy access to the group much 
higher.
I tend to add/remove myself from groups all the time, so adding a delay to 
joining a group would be a big PITA for me.

-M


On Wednesday, 2 July 2014 11:34:05 UTC+1, Clinton Gormley wrote:

 I've received in my mailbox at least 49 spams just for the 06/30. I won't 
 call this a few spam email. I'm subscribed for years on many mailing 
 lists, and I'm pretty sure that it would take years to get as much spam on 
 those lists as I get in 1 day on ES mailing list. 


 That's interesting... I'd only seen three spam emails, so I wondered where 
 you got 49 from. I read the posts from my gmail account, so then I checked 
 my spam folder and sure enough there were a lot more emails in there that I 
 was unaware of.

 I'm going to disable my spam filter for this group so that I get more 
 visibility, and I'd ask other moderators to do the same.

 Let's see how it goes for a while longer. We can always revisit this 
 decision later on.

 clint






Looking to build a logging solution with threshold alerting.

2014-07-02 Thread Joshua Hall
I am looking to build a logging solution and wanted to make sure that I am 
not missing any key components.

The logs that I have are currently stored in a database which there is 
limited access due to locking risks from bad queries.

My plan is to have the dba's write the logs from the database tables to a 
file on a set interval then have logstash pick up the logs and write it to 
elastic search.  Then for viewing/searching the logs I will be using 
kibana.  Everything up to this point I have been able to make a proof of 
concept for but the other request was to have alerting.

I have spent some time looking at this, and the general response seems to be 
to use percolation, but that only makes sense if you want to send an alert 
when a single error matches a query; from what I have seen there is no way 
to build a threshold alerting system using percolation.

My thought for threshold alerting is to create a simple web UI that lets the 
user enter a query to search for, a threshold, a time frame, and the emails 
to send the alert to, all stored in Elasticsearch. Then an app (running as 
a Windows service or cron job) pulls the alert definitions on some interval, 
runs the queries, and checks the count against the time frame and threshold. 
If the count surpasses the threshold, it sends an email to the stored 
addresses.
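
A minimal sketch of that periodic check, using the count API with a
time-range filter (index pattern, query string, and window are assumptions):

curl -XPOST 'localhost:9200/logstash-*/_count' -d '{
  "query": {
    "filtered": {
      "query": { "query_string": { "query": "level:ERROR" } },
      "filter": { "range": { "@timestamp": { "gte": "now-15m" } } }
    }
  }
}'

The job would compare the returned count against the stored threshold and
send mail when it is exceeded.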

I know that SPM seems to cover this and more, but we are currently looking 
to see if we can do this without buying another product.

Is this the correct approach to take or should I be looking at doing 
something else?



Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-07-02 Thread Brian


 Patrick, 




 "Well, I did answer your question. But probably not from the direction you 
 expected." hmm no, you didn't. My question was: "it looks like I can't 
 retrieve/display [_all fields] content. Any idea?" and you replied with 
 your logstash template where _all is disabled. I'm interested in disabling 
 _all, but that was not my question at this point.


Fair enough. I don't know the inner details; I am just an enthusiastic end 
user.

To the best of my knowledge, there is no content for the _all field; I view 
this as an Elasticsearch pseudo field whose name is _all and whose index 
terms are taken from all fields (by default), but still there is no actual 
content for it.

And after I got into the habit of disabling the _all field, my hands-on 
exploration of its nuances have ended. It's time for the experts to explain!
 

   
 "Your answer to my second message, below, is informative and interesting 
 but fails to answer my second question too. I simply asked whether I need 
 to feed the complete modified mapping of my template or if I can just push 
 the modified part (i.e. the _all: {enabled: false} part)."


 Again, I have never done this, so I can only tell you what I do. I just 
cannot tell you all the nuances of what Elasticsearch is capable of.

My recommendation is to try it. Elasticsearch is great at letting you 
experiment and then telling you clearly if your attempt succeeds or fails.

So, try your scenario. If it fails, then it didn't work or you did 
something wrong. If it succeeds, then you can see exactly what 
Elasticsearch actually accepted as your mapping. For example:

curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo

This particular query looks at one of my logstash-generated indices, and it 
lets me verify that Elasticsearch and Logstash conspired to create the 
mappings I expected. I used this command quite a bit until I finally got 
everything configured correctly. (I actually verify the mapping via 
Elasticsearch Head, but under the covers it's the same command.)
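
On the second question, a minimal sketch of a template that carries only the
_all change (the template name and order are assumptions; whether it merges
as intended with your existing logstash template is exactly what the
try-it-and-inspect approach above should verify):

curl -XPUT 'localhost:9200/_template/disable_all' -d '{
  "template": "logstash-*",
  "order": 1,
  "mappings": {
    "_default_": {
      "_all": { "enabled": false }
    }
  }
}'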

Brian



Problem accessing fields from a native script during percolation

2014-07-02 Thread petchemaite
  Hi all,

We're trying to figure out how to access fields from within a native 
AbstractSearchScript when it's called from a percolate request that 
contains the document to percolate.
We tried to use the _source mechanism and stored fields, to no avail (no 
errors, but no matches).
The same scripts are working fine for classic searches.

This was tried with Elasticsearch release 1.1.1 and a snapshot of 2.0.0.

We're running out of ideas, any help would be really appreciated
Thanks!


curl -XPOST http://localhost:9200/index1
curl -XPOST http://localhost:9200/index1/mytype/_mapping -d '{
  "properties": {
    "source_field": { "type": "string" },
    "stored_field": { "type": "string", "store": true }
  }
}'

curl -XPUT "http://localhost:9200/index1/.percolator/1" -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "cooccurenceScript",
          "params": {
            "map": { "list": [ "a" ] }
          },
          "lang": "native"
        }
      }
    }
  }
}'

curl -XPUT "http://localhost:9200/index1/.percolator/2" -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "cooccurenceStoredScript",
          "params": {
            "map": { "list": [ "a" ] }
          },
          "lang": "native"
        }
      }
    }
  }
}'

Native scripts:

package test;

import java.util.Map;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.component.AbstractComponent;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;


public class CooccurenceScriptFactory extends AbstractComponent implements NativeScriptFactory {

    private final Node node;

    @SuppressWarnings("unchecked")
    @Inject
    public CooccurenceScriptFactory(Node node, Settings settings) {
        super(settings);
        this.node = node;
    }

    @Override
    public ExecutableScript newScript(@Nullable Map<String, Object> params) {
        return new CooccurenceScript(node.client(), logger, params);
    }
}

package test;

import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.logging.ESLogger;
import org.elasticsearch.common.xcontent.support.XContentMapValues;
import org.elasticsearch.script.AbstractSearchScript;
import org.elasticsearch.search.lookup.SourceLookup;

import java.util.List;
import java.util.Map;


public class CooccurenceScript extends AbstractSearchScript {

    private List<String> list = null;

    @SuppressWarnings("unchecked")
    public CooccurenceScript(Client client, ESLogger logger, @Nullable Map<String, Object> params) {
        Map<String, Object> map = params == null ? null : XContentMapValues.nodeMapValue(params.get("map"), null);
        if (map == null) {
            throw new ElasticsearchIllegalArgumentException("Missing the 'map' parameter");
        }
        list = (List<String>) map.get("list");
        if (list == null || list.isEmpty()) {
            throw new ElasticsearchIllegalArgumentException("Missing the 'list' parameter or 'list' is empty");
        }
    }

    @Override
    public Object run() {
        SourceLookup source = source();
        @SuppressWarnings("unchecked")
        List<Object> values = (List<Object>) source.get("source_field");
        if (values == null || values.isEmpty()) {
            return false;
        }
        for (Object localValue : values) {
            boolean result = true;
            for (String s : list) {
                result = ((String) localValue).contains(s);
            }
            if (result) {
                return true;
            }
        }
        return false;
    }

}

package test;

import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.component.AbstractComponent;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;

import java.util.Map;


public class CooccurenceStoredScriptFactory extends AbstractComponent implements NativeScriptFactory {

    private final Node node;

    @SuppressWarnings("unchecked")
    @Inject
    public CooccurenceStoredScriptFactory(Node node, Settings settings) {
        super(settings);
        this.node = node;
    }

    @Override
    public ExecutableScript newScript(@Nullable Map<String, Object> params) {
        return new CooccurenceStoredScript(node.client(), logger, params);
    }
}

package test;

import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.logging.ESLogger;
import org.elasticsearch.common.xcontent.support.XContentMapValues;
import ...

Re: have we a way to use highlight and fuzzy together ?

2014-07-02 Thread Nikolas Everett
On Wed, Jul 2, 2014 at 6:47 AM, Tanguy Bernard bernardtanguy1...@gmail.com
wrote:

Hello,
 Everything is in the subject.
 I have to use fuzzy for my fields (title, content), and when I'm searching
 I want to see the part of the sentence where my keyword is.

 This, together, doesn't work:
 $params['body']['highlight']['fields'][$value]['fragment_size'] = 30;
 $params['body']['query']['fuzzy'] = 0.2;

 Do we have a way to use highlight and fuzzy together, or an equivalent
 alternative?


Usually it's better to show a recreation with curl.  PHP isn't always
understood.

Vocabulary point: fuzzy, prefix, and regex queries are called multi term
queries.

Anyway, there are three highlighters built in to Elasticsearch, all of which
have different feature sets.  I'm not sure whether the plain highlighter
supports multi term queries, but you can try the fast vector highlighter or
the postings highlighter, both of which do.  See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html
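
For illustration, a minimal sketch of a fuzzy query highlighted with the
fast vector highlighter (index and field names assumed; the field must be
mapped with "term_vector": "with_positions_offsets" for the fvh type):

curl -XPOST 'localhost:9200/myindex/_search' -d '{
  "query": {
    "fuzzy": { "title": { "value": "keyword", "fuzziness": 1 } }
  },
  "highlight": {
    "fields": {
      "title": { "type": "fvh", "fragment_size": 30 }
    }
  }
}'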

For completeness sake I should mention that I maintain a fourth highlighter
that also supports multi term queries but it is a plugin:
https://github.com/wikimedia/search-highlighter

Nik



Re: Dealing with spam in this forum

2014-07-02 Thread Jack Park
The behavior in my gmail-operated spam filter has been to toss out
lots of emails from this list as false positives. So, I keep sending
them back to my in box; pretty soon, gmail asks me to forward the good
ones to them to study, so I do. The result of that is that they catch
NONE of those spams.  They also don't put enough information in the
header to allow me to see if all those spams come from the same IP
address. Otherwise, it might be possible for the group list to block
certain IP addresses.

On Wed, Jul 2, 2014 at 3:34 AM, Clinton Gormley cl...@traveljury.com wrote:
 I've received in my mailbox at least 49 spams just for the 06/30. I won't
 call this a few spam email. I'm subscribed for years on many mailing
 lists, and I'm pretty sure that it would take years to get as much spam on
 those lists as I get in 1 day on ES mailing list.


 That's interesting... I'd only seen three spam emails, so I wondered where
 you got 49 from. I read the posts from my gmail account, so then I checked
 my spam folder and sure enough there were a lot more emails in there that I
 was unaware of.

 I'm going to disable my spam filter for this group so that I get more
 visibility, and I'd ask other moderators to do the same.

 Let's see how it goes for a while longer. We can always revisit this
 decision later on.

 clint





Re: Index missing error Eelasticseach java

2014-07-02 Thread joergpra...@gmail.com
Use gateway type "local" instead of "none"; then your index persists across
cluster restarts.

Jörg


On Wed, Jul 2, 2014 at 12:35 AM, venuchitta venu.chitta1...@gmail.com
wrote:

 Hi,

 I am new to elasticsearch. I am using the Java API to establish a
 connection with ES.

 public void createIndex(final String index) {

 getClient().admin().indices().prepareCreate(index).execute().actionGet();
 }


 public void createLocalCluster(final String clusterName) {
     NodeBuilder builder = NodeBuilder.nodeBuilder();
     Settings settings = ImmutableSettings.settingsBuilder()
             .put("gateway.type", "none")
             .put("cluster.name", clusterName)
             .build();
     builder.settings(settings).local(false).data(true);
     this.node = builder.node();
     this.client = node.client();
 }

 public boolean existsIndex(final String index) {
 IndicesExistsResponse response =
 getClient().admin().indices().prepareExists(index).execute().actionGet();
 return response.isExists();
 }

 public void openIndex(String name){

 getClient().admin().indices().prepareOpen(name).execute().actionGet();
 }

 createLocalCluster("cerES");
 createIndex("news");
 System.out.println(existsIndex("news"));

 When I execute the above Java code I get a true response. But when I
 close the Java program and start it again with the following code:
 openIndex("news");

 it throws IndexMissingException. But I can see the news index in the data
 folder in Eclipse. So how do I retrieve the data from the node created
 previously? Is it lost? Or am I wrong somewhere?





 --
 View this message in context:
 http://elasticsearch-users.115913.n3.nabble.com/Index-missing-error-Eelasticseach-java-tp4059080.html
 Sent from the ElasticSearch Users mailing list archive at Nabble.com.





Re: Inter-document Queries

2014-07-02 Thread Theo Harris
Together with Zennet we brainstormed a solution building on top of Itamar's 
proposal. 

In one string field we append the current path to all previous ones, and 
since we are talking about funnels we only need to store them on the last 
event/document generated, e.g. SessionEndedEvent.
Then we can use regex pattern matching to identify whether the sequence of 
steps occurs anywhere in the stored paths string. This solution appears to 
be extremely fast. 
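
A minimal sketch of that kind of check as a regexp filter (index name, field
name, and the serialized step tokens are all assumptions):

curl -XPOST 'localhost:9200/events/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "regexp": {
          "all_paths": ".*/page/1 .*401 .*/sale/C.*"
        }
      }
    }
  }
}'

Note the regexp filter matches against indexed terms, so the concatenated
path history would need to be a single not_analyzed value for this to behave
as intended.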

On Wednesday, June 11, 2014 1:14:59 AM UTC+3, Zennet Wheatcroft wrote:

 I simplified the actual problem in order to avoid explaining the domain 
 specific details. Allow me to add back more detail.

 We want to be able to search for multiple points of user action, towards a 
 conversion funnel, and condition on multiple fields. Let's add another 
 field (response) to the above model:
 { ..., "path": "/promo/A", "response": 200, ... }
 { ..., "path": "/page/1", "response": 401, ... }
 { ..., "path": "/promo/D", "response": 200, ... }
 { ..., "path": "/page/23", "response": 301, ... }
 { ..., "path": "/page/2", "response": 418, ... }
 Let's say we define three points through the conversion funnel:
 A: Visited path=/page/1
 B: Got response=401 from some path
 C: Exited at path=/sale/C

 And we want to know how many users did steps A-B-C in that order. If we 
 add an array prev_response like we did for prev_path, then we can use a 
 term filter to find documents with term path=/sale/C and prev_path=/page/1 
 and prev_response=401. But this will not distinguish between A-B-C and 
 B-A-C. Perhaps I could use the script filter for the last mile: from the 
 term-filtered results, throw out B-A-C; it should run more quickly because 
 of the reduced document set.

 Is there another way to implement this query?

 Zennet


 On Wednesday, June 4, 2014 5:01:19 PM UTC-7, Itamar Syn-Hershko wrote:

 You need to be able to form buckets that can be reduced again, either 
 using the aggregations framework or a query. One model that will allow you 
 to do that is something like this:

 { "userid": "xyz", "path": "/sale/B", "previous_paths": [...], "tstamp": "...", ... }

 So whenever you add a new path, you denormalize and add previous paths 
 that could be relevant. This might bloat your storage a bit and be slower 
 on writes, but it is very optimized for reads since now you can do an 
 aggregation that queries for the desired path and buckets on the user. To 
 check the condition of the previous path you should be able to bucket again 
 using a script, or maybe even with a query on a nested type.

 This is just off the top of my head, but it should definitely work if you 
 can get to that model

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 2:36 AM, Zennet Wheatcroft zwhea...@atypon.com 
 wrote:

 Yes. I can re-index the data or transform it in any way to make this 
 query efficient. 

 What would you suggest?



 On Wednesday, June 4, 2014 2:14:09 PM UTC-7, Itamar Syn-Hershko wrote:

 This model is not efficient for this type of querying. You cannot do 
 this in one query using this model, and the pre-processing work you do now 
 + traversing all documents is very costly.

 Is it possible for you to index the data (even as a projection) into 
 Elasticsearch using a different model, so you can use ES properly using 
 queries or the aggregations framework?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Jun 5, 2014 at 12:04 AM, Zennet Wheatcroft zwhea...@atypon.com
  wrote:

  Hi,

 I am looking for an efficient way to do inter-document queries in 
 Elasticsearch. Specifically, I want to count the number of users that 
 went 
 through an exit point B after visiting point A.

 In general terms, say we have some event log data about users actions 
 on a website:
 
 {"userid": "xyz", "machineid": "110530745", "path": "/promo/A", "country": "US", "tstamp": "2013-04-01 00:01:01"}
 {"userid": "pdq", "machineid": "110519774", "path": "/page/1", "country": "CN", "tstamp": "2013-04-01 00:02:11"}
 {"userid": "xyz", "machineid": "110530745", "path": "/promo/D", "country": "US", "tstamp": "2013-04-01 00:06:31"}
 {"userid": "abc", "machineid": "110527022", "path": "/page/23", "country": "DE", "tstamp": "2013-04-01 00:08:00"}
 {"userid": "pdq", "machineid": "110519774", "path": "/page/2", "country": "CN", "tstamp": "2013-04-01 00:08:55"}
 {"userid": "xyz", "machineid": "110530745", "path": "/sale/B", "country": "US", "tstamp": "2013-04-01 00:09:46"}
 {"userid": "abc", "machineid": "110527022", "path": "/promo/A", "country": "DE", "tstamp": "2013-04-01 00:10:46"}
 
 And we have 500+M such entries.

 We want a count of the number of userids that visited path=/sale/B 
 after visiting path=/promo/A.

 What I did is to preprocess the data, sorting by userid, tstamp, 
 then compacting all events by the same userid into the same document. 
 Then 
 I wrote a script filter which traverses the path array 

Re: Memory issues on ES client node

2014-07-02 Thread joergpra...@gmail.com
I'm not sure, but it looks like a node tries to move some GB of document
hits around. This might have triggered timeouts at other places (probably
with node disconnects), and maybe the GB chunk is not yet GC-collected, so
you see it in your heap analyzer tool.

Whether such a heavy search result is expected depends on the search results
and search hits you generated, so it would be useful to know more about your
queries.

Jörg


On Wed, Jul 2, 2014 at 3:29 AM, Venkat Morampudi venkatmoramp...@gmail.com
wrote:

 Thanks for the reply, Jörg. I don't have any logs. I will try to enable
 them, but it will take some time. If there is anything in particular that
 we need to enable, please let me know.

 -VM


 On Tuesday, July 1, 2014 12:58:21 PM UTC-7, Jörg Prante wrote:

 Do you have anything in your logs, i.e. many disconnects/reconnects?

 Jörg


 On Tue, Jul 1, 2014 at 7:59 PM, Venkat Morampudi venkatm...@gmail.com
 wrote:

 In our Elasticsearch deployment we are seeing random client node
 crashes due to out-of-memory exceptions. I got the memory dump from one of
 the crashes and analysed it using the Eclipse memory analyzer. I have
 attached the leak suspect report. Apparently 242 objects of type
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction
 are holding almost 8gb of memory. I have spent some time on the source
 code but couldn't find anything obvious.


 I would really appreciate any help with this issue.


 -VM







ElasticSearch 1.2.1 doesnt run on JDK 1.6?

2014-07-02 Thread David Marko
We have been using an older Elasticsearch version here; upgrading to 1.2.1 
shows us 'unknown class version' errors on JDK 1.6. The docs said that JDK 
1.6 is supported (and it was). Is there some update here? What is the latest 
Elasticsearch version available for JDK 1.6?



Re: Elastic Search

2014-07-02 Thread Jamie A


Thanks Mark,

Yeah, sorry, I realized after the post that I should have used pastebin, but 
I couldn't edit my post. Yes, I am using the logstash dashboard. I changed 
the number of pages to a max record size of 10,000 results. I also realized 
that my query in kibana was only selecting the last day's worth of records. 
So in the end I'm a dumbass. Works now after I changed the date for the 
query.

:)

Jamie



Re: ElasticSearch 1.2.1 doesnt run on JDK 1.6?

2014-07-02 Thread joergpra...@gmail.com
Docs say at least Java 7 is required from ES 1.2.0 on

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html

For Java 6, you have to use ES versions < 1.2.0

Jörg


On Wed, Jul 2, 2014 at 4:21 PM, David Marko dmarko...@gmail.com wrote:

 We have been using older Elasticsearch version here upgrading to 1.2.1
 shows use 'unknown class version errors' on JDK 1.6 . Docs says that JDK
 1.6 is support (and it was) . Is there some update here? What latest
 Elasticsearch version is available fo JDK 1.6?





Re: cluster.routing.allocation.enable behavior (sticky shard allocation not working as expected)

2014-07-02 Thread Gregoire Seux
On Wed, Jul 02, 2014 at 05:43:26AM -0700, Andrew Davidoff wrote:

 How can I avoid this, and make shards on a restarted node come back on the 
 same node?

Hello,

I have exactly the same issue.
My objective is to make a rolling restart script which wait for green
cluster state before restarting a node.
I use:
 curl -XPUT -s $host:$port/_cluster/settings -d 
 '{"transient":{"cluster.routing.allocation.enable": "new_primaries"}}'

to allow the cluster to work (and be able to create indices) during
restart.

But same issue: the node is back up, but nothing happens until I enable all
allocation again.

I have gone through the Elasticsearch documentation related to recovery,
gateway, and cluster settings without finding any parameter to activate or
configure this initial recovery of local indices.

-- 
Grégoire



Re: does snapshot restore lead to a memory leak?

2014-07-02 Thread Igor Motov
So, your search-only machines are running out of memory, while your 
index-only machines are doing fine. Did I understand you correctly? Could 
you send me nodes stats (curl localhost:9200/_nodes/stats?pretty) from 
the machine that runs out of memory? Please run stats a few times with a 1 
hour interval; I would like to see how memory consumption is increasing 
over time. Please also run nodes info once (curl localhost:9200/_nodes) 
and post here (or send me by email) the results. Thanks!
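
A minimal way to collect those hourly stats dumps (the file naming is just
an assumption):

while true; do
  curl -s 'localhost:9200/_nodes/stats?pretty' > nodes-stats-$(date +%Y%m%d-%H%M).json
  sleep 3600
done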

On Wednesday, July 2, 2014 10:15:46 AM UTC-4, JoeZ99 wrote:

 Hey, Igor, thanks for answering! and sorry for the delay. Didn't catch the 
 update.

 I explain:

- we have one cluster of one machine which is only meant for serving 
 search requests. The goal is not to index anything to it. It contains 
 1.7k indices, give or take. 
- every day, those 1.7k indices are reindexed and snapshotted in pairs 
 to an S3 repository (producing 850 snapshots). 
- every day, the read-only cluster of the first point restores those 850 
 snapshots to update its 1.7k indices from that same S3 repository. 

 It works like a real charm. Load has dropped dramatically, and we can set 
 a farm of temporary machines to do the indexing duties. 

 But memory consumption never stops growing.

we don't get any out-of-memory error or anything. In fact, there is 
nothing in the logs that shows any error, but after a week or a few days 
the host has its memory almost exhausted and elasticsearch is not 
responding. The memory consumption is of course way ahead of the HEAP_SIZE. 
We have to restart it, and when we do we get the following error:

 java.util.concurrent.RejectedExecutionException: Worker has already been 
 shutdown
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
 at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.
 executeInIoThread(NioWorker.java:36)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
 at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.
 executeInIoThread(NioWorker.java:36)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 execute(DefaultChannelPipeline.java:636)
 at org.elasticsearch.common.netty.channel.Channels.
 fireExceptionCaughtLater(Channels.java:496)
 at org.elasticsearch.common.netty.channel.AbstractChannelSink.
 exceptionCaught(AbstractChannelSink.java:46)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 notifyHandlerException(DefaultChannelPipeline.java:658)
 at org.elasticsearch.common.netty.channel.
 DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(
 DefaultChannelPipeline.java:781)
 at org.elasticsearch.common.netty.channel.Channels.write(Channels.
 java:725)
 at org.elasticsearch.common.netty.handler.codec.oneone.
 OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
 at org.elasticsearch.common.netty.handler.codec.oneone.
 OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 sendDownstream(DefaultChannelPipeline.java:591)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 sendDownstream(DefaultChannelPipeline.java:582)
 at org.elasticsearch.common.netty.channel.Channels.write(Channels.
 java:704)
 at org.elasticsearch.common.netty.channel.Channels.write(Channels.
 java:671)
 at org.elasticsearch.common.netty.channel.AbstractChannel.write(
 AbstractChannel.java:248)
 at org.elasticsearch.http.netty.NettyHttpChannel.sendResponse(
 NettyHttpChannel.java:158)
 at org.elasticsearch.rest.action.search.RestSearchAction$1.
 onResponse(RestSearchAction.java:106)
 at org.elasticsearch.rest.action.search.RestSearchAction$1.
 onResponse(RestSearchAction.java:98)
 at org.elasticsearch.action.search.type.
 TransportSearchQueryAndFetchAction$AsyncAction.innerFinishHim(
 TransportSearchQueryAndFetchAction.java:94)
 at org.elasticsearch.action.search.type.
 TransportSearchQueryAndFetchAction$AsyncAction.moveToSecondPhase(
 TransportSearchQueryAndFetchAction.java:77)
 at org.elasticsearch.action.search.type.
 TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(
 TransportSearchTypeAction.java:425)
 at org.elasticsearch.action.search.type.
 TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(
 TransportSearchTypeAction.java:243)
 at org.elasticsearch.action.search. ...


geo_polygon filter with non-zero rule filling

2014-07-02 Thread Clément SALAÜN
Is it possible to apply a geo_polygon filter with a non-zero rule 
(https://en.wikipedia.org/wiki/Nonzero-rule)?



Re: Queries with fields {...} don't return field with dot in their name

2014-07-02 Thread vineeth mohan
Hello Ben ,

This is definitely an ambiguity.

By "request.user", in the usual case ES expects data like

"request": {
    "user": "vm"
}

Try "request\.user" or something, i.e. some mechanism to escape the dot.

Thanks
  Vineeth



On Wed, Jul 2, 2014 at 1:13 PM, benq benoit.quart...@gmail.com wrote:

 Hello Vineeth,

 the items that are indexed in elasticsearch really contain a field named
 "response.user":

 "_source": {
     "clientip": "aaa.bbb..ddd",
     "request": "http://.aa/b/c",
     "request.accept-encoding": "gzip, deflate",
     "request.accept-language": "de-ch",
     "response.content-type": "text/html; charset=UTF-8",
     "response": 200,
     "response.age": 0,
     "response.user": "userAAA",
     "@timestamp": "2014-07-01T12:18:51.501+02:00"
 }


 I realize there is an ambiguity between a field with a dot in its name and
 a field of a child document. Should fields with dots in their names be
 avoided?

 Benoît

 Le mardi 1 juillet 2014 19:17:41 UTC+2, vineeth mohan a écrit :

 Hello Ben ,

 Can you paste a sample feed.

 Thanks
Vineeth


 On Tue, Jul 1, 2014 at 8:26 PM, benq benoit@gmail.com wrote:

 Hi all,

 I have a query that specify the fields to be returned as described here:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/
 current/search-request-fields.html
 However, it does not return the fields with a dot in their name, like
 response.user.

 For example,
 Ex:
 {
   "size": 1000,
   "fields": ["@timestamp", "request", "response", "response.user", "clientip"],
   "query": { "match_all": {} },
   "filter": {
     "and": [
       { "range": { "@timestamp": { "from": ...

     ]
   }
 }

 The timestamp, request, response and clientip fields are returned. The
 response.user is not.

 Any idea why?

 Regards,
 Benoît







Re: get rid of _all to optimize storage and perfs (Re: Splunk vs. Elastic search performance?)

2014-07-02 Thread Steve Mayzak
All,

This seems apropos to the current discussion and could help clear up some 
confusion about recommendations etc.  We, Elasticsearch, are hosting a 
webinar on ELK, given by the Logstash creator, Jordan Sissel.

It's today, in 40 minutes:
http://www.elasticsearch.org/webinars/introduction-elk-stack/


On Wednesday, July 2, 2014 6:08:34 AM UTC-7, Brian wrote:


 Patrick, 




 "Well, I did answer your question. But probably not from the direction you 
 expected." hmm no, you didn't. My question was: "it looks like I can't 
 retrieve/display [_all fields] content. Any idea?" and you replied with 
 your logstash template where _all is disabled. I'm interested in disabling 
 _all, but that was not my question at this point.


 Fair enough. I don't know the inner details; I am just an enthusiastic end 
 user.

 To the best of my knowledge, there is no content for the _all field; I 
 view this as an Elasticsearch pseudo field whose name is _all and whose 
 index terms are taken from all fields (by default), but still there is no 
 actual content for it.

 And after I got into the habit of disabling the _all field, my hands-on 
 exploration of its nuances have ended. It's time for the experts to explain!
  

   
 "Your answer to my second message, below, is informative and interesting 
 but fails to answer my second question too. I simply asked whether I need 
 to feed the complete modified mapping of my template or if I can just push 
 the modified part (i.e. the _all: {enabled: false} part)."


  Again, I have never done this, so I can only tell you what I do. I just 
 cannot tell you all the nuances of what Elasticsearch is capable of.

 My recommendation is to try it. Elasticsearch is great at letting you 
 experiment and then telling you clearly if your attempt succeeds or fails.

 So, try your scenario. If it fails, then it didn't work or you did 
 something wrong. If it succeeds, then you can see exactly what 
 Elasticsearch actually accepted as your mapping. For example:

 curl 'http://localhost:9200/logstash-2014.06.30/_mapping?pretty=true' && echo

 This particular query looks at one of my logstash-generated indices, and 
 it lets me verify that Elasticsearch and Logstash conspired to create the 
 mappings I expected. I used this command quite a bit until I finally got 
 everything configured correctly. (I actually verify the mapping via 
 Elasticsearch Head, but under the covers it's the same command.)
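
 As for the narrower question above (pushing only the modified part), one
 commonly used route is an index template that carries nothing but the _all
 setting; a minimal sketch, with a hypothetical template name (whether this
 is enough for your case is exactly what the try-it-and-check approach
 above will confirm):

   curl -XPUT 'http://localhost:9200/_template/disable_all' -d '{
     "template": "logstash-*",
     "mappings": {
       "_default_": {
         "_all": { "enabled": false }
       }
     }
   }'

 Note that a template only applies to indices created after it is
 registered, so the _mapping check above is still the way to verify it took
 effect.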

 Brian




Re: does snapshot restore lead to a memory leak?

2014-07-02 Thread JoeZ99
Igor.
Yes, that's right. My index-only machines are just machines that are
booted just for the indexing-snapshotting task. Once there are no more tasks
in the queue, those machines are terminated. They only handle a few indices
each time (their only purpose is to snapshot).

I will do as you tell me. I guess I'd better wait for the timeframe in
which most of the restores occur, because that's when the memory
consumption grows most, so expect those postings in 5 or 6 hours.

On Wednesday, July 2, 2014 10:29:53 AM UTC-4, Igor Motov wrote:

 So, your search-only machines are running out of memory, while your
 index-only machines are doing fine. Did I understand you correctly? Could
 you send me node stats (curl localhost:9200/_nodes/stats?pretty) from
 the machine that runs out of memory? Please run stats a few times at
 1-hour intervals; I would like to see how memory consumption is increasing
 over time. Please also run nodes info once (curl localhost:9200/_nodes)
 and post here (or send me by email) the results. Thanks!
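
 A minimal way to capture what Igor asks for, assuming a plain shell loop
 and arbitrary file names:

   # run nodes info once, then poll node stats hourly
   curl -s 'localhost:9200/_nodes?pretty' > nodes-info.json
   while true; do
     curl -s 'localhost:9200/_nodes/stats?pretty' > "nodes-stats-$(date +%Y%m%d-%H%M).json"
     sleep 3600
   done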

 On Wednesday, July 2, 2014 10:15:46 AM UTC-4, JoeZ99 wrote:

 Hey, Igor, thanks for answering! and sorry for the delay. Didn't catch 
 the update.

 I explain:

- we have one cluster of one machine which is only meant for serving
 search requests. The goal is not to index anything to it. It contains 1.7k
 indices, give or take.
- every day, those 1.7k indices are reindexed and snapshotted in pairs
 to an S3 repository (producing 850 snapshots).
- every day, the read-only cluster from the first point restores those
 850 snapshots to update its 1.7k indices from that same S3 repository.

 It works like a real charm. Load has dropped dramatically, and we can set 
 a farm of temporary machines to do the indexing duties. 
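
For reference, the daily cycle described above maps onto the snapshot/restore
API roughly like this; a sketch with hypothetical repository, snapshot and
index names:

  # register the S3 repository once (on both clusters)
  curl -XPUT 'localhost:9200/_snapshot/s3_repo' -d '{
    "type": "s3",
    "settings": { "bucket": "my-snapshot-bucket" }
  }'
  # on an indexing machine: snapshot a pair of indices
  curl -XPUT 'localhost:9200/_snapshot/s3_repo/snap_pair_1?wait_for_completion=true' -d '{
    "indices": "index_a,index_b"
  }'
  # on the search-only cluster: restore (existing target indices
  # generally need to be closed first)
  curl -XPOST 'localhost:9200/_snapshot/s3_repo/snap_pair_1/_restore'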

 But memory consumption never stops growing.

We don't get any out-of-memory error or anything. In fact, there is
nothing in the logs that shows any error, but after a week or a few days
the host has its memory almost exhausted and Elasticsearch is not
responding. The memory consumption is of course way ahead of the HEAP_SIZE.
We have to restart it, and when we do we get the following error:

 java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
     at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
     at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
     at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
     at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
     at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)
     at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:725)
     at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
     at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
     at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:704)
     at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:671)
     at org.elasticsearch.common.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
     at org.elasticsearch.http.netty.NettyHttpChannel.sendResponse(NettyHttpChannel.java:158)
     at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:106)
     at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:98)
     at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.innerFinishHim(TransportSearchQueryAndFetchAction.java:94)

Re: does snapshot restore lead to a memory leak?

2014-07-02 Thread joergpra...@gmail.com
This memory issue report might be related

https://groups.google.com/forum/#!topic/elasticsearch/EH76o1CIeQQ

Jörg



Re: Min Hard Drive Requirements

2014-07-02 Thread Ophir Michaeli
When I tried to optimize, the index had 51 shards.
Regards, Ophir

On Wednesday, July 2, 2014 11:27:50 AM UTC+3, Mark Walkom wrote:

 It will work until it's full, but then ES will fall over.
 Merging does require a certain amount of disk space, usually the same 
 amount as the segment that is being merged as it has to take a copy of the 
 shard to work on. So for a 10GB segment, you'd need at least 10GB free.
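
 As an illustration, the optimize call that triggers such merges can bound
 how far the merge goes; a sketch with a hypothetical index name:

   curl -XPOST 'localhost:9200/myindex/_optimize?max_num_segments=1'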

 How many shards do you have for the index, or how many are you trying to 
 optimise (merge) down to?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 2 July 2014 18:13, Ophir Michaeli ophirm...@gmail.com wrote:

  Hi all,

 I'm testing the indexing of 100 million documents; it took about 400GB of
 the hard drive.
 Is there a minimum free hard drive space needed for the index to work OK?
 I'm asking because after we indexed 100 million documents we tested the
 index and it worked OK, but then the optimize took days and the index
 stopped responding.
 The hard drive had only 10 GB of free space, so we tried to copy the index
 to a new hard drive with more free space, but the index is still not
 functioning.

 Thank you,
 Ophir

   

  
  






Re: Custom Query variables ?

2014-07-02 Thread Ivan Brusic
If you enable explanations, you can see the rationale behind Lucene's
scoring:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html

You are probably correct in that the array length is influencing the
scoring. By default, Lucene scores fields with fewer terms higher, using
length normalization. You can disable norms on the field:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#norms

You can fine-tune better by learning how to read Lucene's explanations. It
is difficult at first, but it is a useful skill.
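
For the norms link above, a sketch of what disabling norms on the field
could look like at index-creation time (index and type names hypothetical;
the field name is taken from the discussion):

  curl -XPUT 'localhost:9200/cities' -d '{
    "mappings": {
      "city": {
        "properties": {
          "search_field": {
            "type": "string",
            "norms": { "enabled": false }
          }
        }
      }
    }
  }'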

Cheers,

Ivan


On Tue, Jul 1, 2014 at 1:02 AM, Pierrick Boutruche pboutru...@octo.com
wrote:

 Up ? Any ideas ?

 On Monday, June 30, 2014 at 17:48:54 UTC+2, Pierrick Boutruche wrote:

 Hi everyone,

 I'm creating a little geocoder on my own. My goal is to be able to
 retrieve a big city or a country with a string as input. This string can be
 mistyped, so I indexed geonames cities5000 data (cities > 5000 inhab.), and
 crossed these data with countries & admin data. So I got a 46000-city
 index with country, admin & pop.

 I created a search_field in which I put country, admin & city name plus
 alternate names provided in the cities5000 file.

 I want to search for a string within this array.

 Currently, I'm just searching with a MatchQuery, like "Paris" in
 search_field. Unfortunately, the first result is Paris... in Canada...

 Still, the search_field data is this one, for Paris (CA) and Paris (FR):

 [u'Paris', u'Paris', u'Canada', u'Ontario', u'Ontario']

 [u'Paris', u'Paris', u'France', u'\xcele-de-France', u'Ile-de-France',
 u'Paris', u'Paris']

 I don't understand why Paris, CA is first, because there are so many more
 occurrences of Paris in the second one...


 Anyway, is there any way to make the number of appearances of the query
 terms make the difference? Because with alternate names, there will be so
 many more occurrences of Paris that it has to count.

 Actually I think the array length matters in the scoring and I don't want
 it to... I thought of a custom query score, but I don't think I'm able to
 get the query term in the script query.


 Any ideas ?


 Thanks !





Re: Custom Query variables ?

2014-07-02 Thread joergpra...@gmail.com
For geo search, it would be a good approach to respect the searcher's
preference by using a locale, so I suggest adding a locale "fr" filter to
the search.
Or an origin is added to the initial query and all cities are ordered by geo
distance in relation to the origin. For country search, the origin could be
the capital city...
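
A sketch of the second idea, assuming the documents carry a geo_point field
named "location" (the coordinates here merely stand in for the searcher's
origin):

  curl 'localhost:9200/cities/_search' -d '{
    "query": { "match": { "search_field": "paris" } },
    "sort": [
      { "_geo_distance": {
          "location": { "lat": 48.86, "lon": 2.35 },
          "order": "asc",
          "unit": "km"
      } }
    ]
  }'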

Jörg






Defing default mapping to enable _timestamp for all indices

2014-07-02 Thread kazoompa
Hi,

I have the following ES setting defined in my YAML file:

http.enabled: false
discovery.zen.ping.multicast.enabled: false
index:
  mappings:
_default_:
  _timestamp:
enabled: true
store : true
  analysis:
analyzer:
  mica_index_analyzer:
type: custom
tokenizer: standard
filter: [standard,lowercase,mica_nGram_filter]
  mica_search_analyzer:
type: custom
tokenizer: standard
filter: [standard,lowercase]
filter:
  mica_nGram_filter:
type: nGram
min_gram: 2
max_gram: 20


My intention is to enable the _timestamp field for all created indices. The
above does not seem to work. Is the error in the YAML syntax, or am I
missing a step?

Thanks.
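
One alternative worth trying if the node-level YAML route stays stubborn:
the same _default_ mapping can be applied to all new indices through an
index template instead; a sketch, with an arbitrary template name:

  curl -XPUT 'localhost:9200/_template/default_timestamp' -d '{
    "template": "*",
    "mappings": {
      "_default_": {
        "_timestamp": { "enabled": true, "store": true }
      }
    }
  }'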



Re: Wrong Scoring using match query on Sense

2014-07-02 Thread Ivan Brusic
If you enable explanations, you would see that length normalization is
scoring the document with the shorter field higher than the document with a
term frequency of 2.

The fieldNorm is incredibly lossy since it uses only 1 byte, so there must
be some inconsistencies between the example and your test case. The example
has a fieldNorm of 0.375, while it is 0.3125 in your case (and mine as
well). The example might not have deleted all the documents in the index
before the test.
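
For reference, explanations can be requested directly in the search body; a
minimal sketch against the index from the example:

  curl 'localhost:9200/my_index/my_type/_search?pretty' -d '{
    "explain": true,
    "query": { "match": { "title": "QUICK!" } }
  }'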

Cheers,

Ivan


On Tue, Jul 1, 2014 at 1:43 AM, rayman idan.f...@gmail.com wrote:

 I am trying to exercise the following example using Sense:
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/match-query.html

 However when I ran

 GET /my_index/my_type/_search
 {
   "query": {
     "match": {
       "title": "QUICK!"
     }
   }
 }

 I got wrong scoring. I expected to see the doc with id 3 score higher, but
 the doc with id 1 got a higher score. Any idea?

 {
    "took": 2,
    "timed_out": false,
    "_shards": {
       "total": 1,
       "successful": 1,
       "failed": 0
    },
    "hits": {
       "total": 3,
       "max_score": 0.5,
       "hits": [
          {
             "_index": "my_index",
             "_type": "my_type",
             "_id": "1",
             "_score": 0.5,
             "_source": {
                "title": "The quick brown fox"
             }
          },
          {
             "_index": "my_index",
             "_type": "my_type",
             "_id": "3",
             "_score": 0.44194174,
             "_source": {
                "title": "The quick brown fox jumps over the quick dog"
             }
          },
          {
             "_index": "my_index",
             "_type": "my_type",
             "_id": "2",
             "_score": 0.3125,
             "_source": {
                "title": "The quick brown fox jumps over the lazy dog"
             }
          }
       ]
    }
 }

 Thanks.








[ANN] Elasticsearch Servlet Transport plugin 2.2.0 released

2014-07-02 Thread Elasticsearch Team

Heya,


We are pleased to announce the release of the Elasticsearch Servlet Transport 
plugin, version 2.2.0.

The wares transport plugin allows using the REST interface over servlets.

https://github.com/elasticsearch/elasticsearch-transport-wares/

Release Notes - elasticsearch-transport-wares - Version 2.2.0



Update:
 * [21] - Update to elasticsearch 1.2.0 
(https://github.com/elasticsearch/elasticsearch-transport-wares/pull/21)

New:
 * [22] - Add plugin release semi-automatic script 
(https://github.com/elasticsearch/elasticsearch-transport-wares/issues/22)
 * [17] - NodeServlet should use an elasticsearch node created elsewhere in the 
webapp 
(https://github.com/elasticsearch/elasticsearch-transport-wares/issues/17)



Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-transport-wares project repository: 
https://github.com/elasticsearch/elasticsearch-transport-wares/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Re: Problem Configuring AWS S3 for Backups

2014-07-02 Thread sabdalla80
Unfortunately, I tried with and without the region setting, no difference.

On Tuesday, July 1, 2014 7:43:21 PM UTC-4, Glen Smith wrote:

 I'm not sure it matters, but I noticed you aren't setting a region in 
 either your config or when registering your repo.
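
 For reference, a region can be given in both places; a sketch with an
 illustrative region value:

   # elasticsearch.yml
   cloud:
     aws:
       access_key: <key>
       secret_key: <secret>
       region: us-east-1

   # or when registering the repository
   PUT /_snapshot/es_repository
   {
     "type": "s3",
     "settings": { "bucket": "esbucket", "region": "us-east-1" }
   }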

 On Tuesday, July 1, 2014 7:08:28 PM UTC-4, sabdalla80 wrote:

 I am not sure the version is the problem; I guess I can upgrade from V1.1
 to latest.
 As for "Unable to load AWS credentials from any provider in the chain":
 any idea how this error is generated? Is there any other place that my
 credentials need to be besides the .yml file?
 Note, I am able to write/read to S3 remotely, so I don't have any
 privilege problems that I can think of.

 On Tuesday, July 1, 2014 4:44:17 PM UTC-4, David Pilato wrote:

 I think 2.1.1 should work fine as well.

 That said, you should upgrade to latest 1.1 (or 1.2)...

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 1 Jul 2014, at 22:13, Glen Smith gl...@smithsrock.com wrote:

 According to
 https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.1
 you should use v2.1.0 of the plugin with ES 1.1.0.

 On Tuesday, July 1, 2014 9:03:04 AM UTC-4, sabdalla80 wrote:

 I am having a problem setting up the backup and restore part of AWS on S3.
 I have the 2.1.1 AWS plugin & ElasticSearch V1.1.0.

 My yml:

 cloud:
   aws:
     access_key: #
     secret_key: #
 discovery:
   type: ec2

 When I try to register a repository:

 PUT /_snapshot/es_repository
 {
   "type": "s3",
   "settings": {
     "bucket": "esbucket"
   }
 }


 I get this error; it complains about loading my credentials! Is this an
 Elasticsearch problem or an AWS one?

 Note I am running as root user ubuntu on EC2 and also running AWS with
 root privileges as opposed to an IAM role; not sure if it's a problem or
 not.

   {
     "error": "RepositoryException[[es_repository] failed to create repository]; nested: CreationException[Guice creation errors:\n\n1) Error injecting constructor, com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain\n  at org.elasticsearch.repositories.s3.S3Repository.<init>(Unknown Source)\n  while locating org.elasticsearch.repositories.s3.S3Repository\n  while locating org.elasticsearch.repositories.Repository\n\n1 error]; nested: AmazonClientException[Unable to load AWS credentials from any provider in the chain]; ",
     "status": 500
   }










ES doesn't work with rexster gremlin extension

2014-07-02 Thread Aaliyah
The problem is as the subject says. I'm not sure if I misunderstood something
or missed some configuration. ES works fine in usual situations, but doesn't
work with the rexster gremlin extension.

In java, I configured the graph as follows:

https://lh3.googleusercontent.com/-kd6vKDQdH6g/U7RbYPE-ciI/AHk/sSU5e5R3DMM/s1600/1.PNG

https://lh3.googleusercontent.com/-9HiEPJmS1FQ/U7Rbjmg4oCI/AHs/2FgdAtjiBHc/s1600/2.PNG

In the extension, I wrote:

String query = "v." + propKey + ":(" + propValue + ")";

if (((TitanGraph) graph).indexQuery("search", query).vertices().iterator().hasNext()) { ... }

When I invoked the extension, Rexster reported: null pointer exception,
unknown index, etc. As follows:

https://lh4.googleusercontent.com/-I0w-hVgu0F0/U7RdQCcjOZI/AH4/4l13192eR2k/s1600/3.PNG

https://lh3.googleusercontent.com/-phHCRqzQN6I/U7RdUkyE8oI/AIA/rhkTlffd8lI/s1600/4.PNG

After this, I searched for advice on Google and made some configuration
changes in rexster.xml

https://lh4.googleusercontent.com/-_6zRCaQtvRw/U7ReJ41k4jI/AIQ/Nwgon5WqPtU/s1600/6.PNG

Then the problem showed up in another way. (As you can see, first_example is
the name of the graph.)

https://lh5.googleusercontent.com/-r8TvH5CxjqA/U7Rfb9NFNEI/AIs/epemSCk5-8c/s1600/8.PNG

Besides, when I invoked the extension, I was told: the graph is
not configured.

https://lh4.googleusercontent.com/-uPWbHmgKL28/U7RfO3PoVlI/AIk/6ndP888fGYA/s1600/7.PNG

I've also tried the embedded mode. The problems seem to be the same.

Plus,  I'm using:

Titan 0.4.2

 Tinkerpop 0.2.4

 Cassandra 2.0.7

ElasticSearch 1.2.1 


This problem's driving me crazy. Any pointer would be appreciated! Thanks 
in advance!













Re: Recommended Hardware Specs Sharding\Index Strategy

2014-07-02 Thread mrno42
When you say "do not let a shard grow bigger than your JVM heap (this is
really a rough estimation) so segment merging will work flawlessly",
are we counting all the primary and replica shards of all indexes on that
node? So for example, say we had two indexes on a 10-node cluster, and each
index has 10 shards and 1 replica (40 shards total in the cluster).

So per node, the heap size should be larger than:
 
1 shard for first index
1 shard for replica of first index
1 shard for second index
1 shard for replica second index

the four shards combined?  

Thanks again for your advice



On Saturday, August 10, 2013 6:50:27 AM UTC-7, Jörg Prante wrote:

 Your concern is a single shard getting too big. If you use 64bit JVM and 
 mmapfs (quite common), you can open even the largest files. So from this 
 point of view, a node can handle the biggest files. There is no real limit. 

 Another question is throughput performance with large shard files. For 
 example, the more mixed read/write operations are in the workload, the 
 smaller the Lucene indexes should be, to allow the JVM/OS a better load 
 distribution.

 For selecting a total number of shards and shard size, here are some 
 general rules of thumb:

 - do not select a smaller number of shards than your total number of nodes 
 you will add to the cluster. Each node should hold at least one shard.

 - do not let a shard grow bigger than your JVM heap (this is really a 
 rough estimation) so segment merging will work flawlessly

 - if you want fast recovery, or if you want to move shards around (not a 
 common case), the smaller a shard is the faster the operation will get done

 In case you are worried about shards getting out of bounds, you can 
 reindex with a higher number of shards (having the _source enabled is 
 always an advantage for reindexing) with your favorite custom tool. 
 Reindexing can take significant time, and may not be an option if you can't 
 stop indexing.

 Jörg


 On Fri, Aug 9, 2013 at 4:32 PM, David Arata david...@gmail.com wrote:

 My concern is what would be the best strategy so that an index or a
 single shard in an index does not get too big for a node to handle, and if
 it's approaching that size, what can be done?






Documents not being stored

2014-07-02 Thread Joseph Johnson
Hello,

I am attempting to set up a large scale ELK setup at work. Here is a
basic setup of what we have so far:

```
Nodes (approx 150)
[logstash]

  |
  |
  +---+
  |   |
Indexer1 Indexer2
[Redis]  [Redis]
[Logstash]   [Logstash]
  |   |
  |   |
  ++--+
   |
   |
 ES Master -- Kibana3
 [Master: yes]
 [Data: no]
   |
   |
 ES Data (4 data nodes)
 [Master: no]
 [Data: yes]
```

In case the formatting does not hold with the above, I've created a
paste here: https://baneofswitches.privatepaste.com/c8dfc2c30b


The Setup
=

* We have approximately 150 nodes configured to send to a shuffled
Redis instance on either Indexer1 or Indexer2. A sanitized version of
the node Logstash config is here:
https://baneofswitches.privatepaste.com/345b94064d

* Each indexer is identical. They both run their own independent Redis
service. They then each have a Logstash service that pulls events from
Redis and pushes them to the ES Master. They are using the http
protocol. A sanitized version of their config is here:
https://baneofswitches.privatepaste.com/e19eae690f

* The ES Master is configured to only be a Master, and is not set to be
a data node. It has 32 GB of RAM.

* There are 4 ES data nodes, configured to be data nodes only, they have
been configured to be ineligible to be elected as Masters. They have 62
GB RAM and the storage for ES is on SSDs

* We have Kibana3 configured to search from the ES Master.

* Average # of logs generated by all nodes total seems to be
approximately 7k/sec, with peaks up to about 16k/s.

* Indexer throughput seems to be good enough that one indexer can work
just fine during normal usage.

* We are using the default 5 shards with 1 replica


The Problem
===

When this setup is loaded as mentioned above, we are noticing that some
logs are being dropped. We were able to test this by running something like:

seq 1 5000 | xargs -I{} -n 1 -P 40 logger Testing unqString {} of 5000

Sometimes we would see all 5000 show up in Kibana, other times a subset
of them (for example 4800 events).


Troubleshooting
===

We have taken a number of steps to eliminate possibilities. We have
confirmed that logs are being reliably transferred from nodes to Redis
and from Redis through Logstash. We confirmed this by monitoring counts
over many trials. The Redis -> Logstash leg was tested by outputting to a
file and comparing counts.

That left the Logstash -> ES leg. We tested this by writing a script
that pushed fake events via the bulk API. We were unable to reproduce
the problem with one request. However, when the cluster is under load
(we let 'real' logs flow) and we push via the bulk API with our script
we occasionally see partial loss of data.

It's important to note that partial loss here means that the request
succeeds (200 return code), and much of the data in the bulk request is
then searchable, however not all will be. For example, if we put the
cluster under load and push a request with a bulk of 5000 events in, we
will see 4968 of the 5000 in our subsequent search.
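
One detail worth double-checking here: a 200 on a bulk request only means
the request itself was accepted; each action in the body gets its own
status in the response. A sketch for inspecting that, assuming jq is
available and events.json holds the bulk body:

```
curl -s -XPOST 'localhost:9200/_bulk' --data-binary @events.json > resp.json
jq '.errors' resp.json                                     # true if any action failed
jq '[.items[] | .[] | select(.error)] | length' resp.json  # how many failed
```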

We have tried increasing the bulk api threadpool as well as giving a
greater percentage (50%) to the indexing buffer. Neither has fixed the
issue.


Conclusion


I am looking for feedback on how to troubleshoot this further and find
the cause. I am also looking for information to see if anyone else out
there is getting these sorts of incoming volume and what sorts of things
they had to do to get their setup working. I appreciate all feedback.



Re: [ANN] ElasticUI AngularJS Directives - Easily Build an Interface on top of Elasticsearch

2014-07-02 Thread Matthew Morrison
Great idea. I'll give it a try ASAP.


On Wednesday, July 2, 2014 10:56:48 PM UTC+12, Yousef El-Dardiry wrote:

 Hi all,

 I just open sourced a set of AngularJS Directives for Elasticsearch. It 
 enables developers to rapidly build a frontend (e.g.: faceted search 
 engine) on top of Elasticsearch.

 http://www.elasticui.com (or github 
 https://github.com/YousefED/ElasticUI)

 It makes creating an aggregation and listing the buckets as simple as:

 <ul eui-aggregation="ejs.TermsAggregation('text_agg').field('text').size(10)">
   <li ng-repeat="bucket in aggResult.buckets">{{bucket}}</li>
 </ul>

 I think this was currently missing in the ecosystem, which is why I 
 decided to build and open source it. I'd love any kind of feedback.

 - Yousef

 Another example: add a checkbox facet based on a field using one of the
 built-in widgets
 (https://github.com/YousefED/ElasticUI/blob/master/docs/widgets.md):

 <eui-checklist field="facet_field" size="10"></eui-checklist>

 Resulting in
 [image: checklist screenshot]





Re: Min Hard Drive Requirements

2014-07-02 Thread Mark Walkom
Ok, how many were you reducing to? How big is the index?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com






Re: Are there any facets that can be used to co-relate log events ?

2014-07-02 Thread Matthew Morrison
Hi Aditya,

I'm looking to do something similar; did you have any success with this
problem?

Thanks
Matt

On Wednesday, January 22, 2014 11:53:36 PM UTC+13, Aditya Pavan Kumar 
Vegesna wrote:

  Hi

 I am looking for a way to correlate multiple log events and then calculate
 the time duration between those events.

 e.g.: request log event & response log event, to calculate the difference
 in timestamps to assess the performance of the application.

 Can anyone help me how this can be achieved.

 Thanks

 Pavan Kumar





Re: Kibana browser compatibility issues

2014-07-02 Thread Laura
We are using Logstash + Elasticsearch + Kibana and just want to be able to
open Kibana's index.html. What is the necessary plugin that will allow us
to do this in something other than Firefox?

On Monday, June 2, 2014 11:56:35 AM UTC-7, Binh Ly wrote:

 If you simply point the browser at the file system index.html, in my
 experience, that only works in Firefox (and only if you explicitly do
 "http://server:9200"). The Kibana default assumes that you actually run
 Kibana from a web server (or as an ES site plugin if you prefer) and that
 ES is accessible from the same host as where Kibana is being served from.




Re: Kibana browser compatibility issues

2014-07-02 Thread Brian
Laura,

The simplest way is to install Kibana as a site plug-in on the same node on 
which you run Elasticsearch. Not the best way from a performance and 
security perspective, but certainly the easiest way to start with an 
absolute minimum of extra levers to pull and knobs to turn, so to speak.

So what does that really mean, a site plugin?

Assume you configure Elasticsearch to look for plugins within the 
/opt/elk/plugins directory.

Then you unpack the Kibana3 distribution within /opt/kibana3. That means 
you'll see the following files within /opt/kibana3/kibana-3.1.0:
app  build.txt  config.js  css  favicon.ico  font  img  index.html 
 LICENSE.md  README.md  vendor

So then create the /opt/elk/plugins/kibana3 directory. Then:
$ ln -s  /opt/kibana3/kibana-3.1.0 /opt/elk/plugins/kibana3/_site

Now when you start ES and point it to the correct configuration file which 
in turn points it to the plugins directory as described above, Kibana will 
be available at the following URL (assuming you're on the same host; change 
localhost as needed, of course):

http://localhost:9200/_plugin/kibana3/

Hope this helps!

Brian



Visibility

2014-07-02 Thread smonasco
Hi,

I'm trying to get a lot more visibility and metrics into what's going on 
under the hood.

Occasionally, we see spikes in memory. I'd like to get heap mem used on a
per-shard basis. If I'm not mistaken, somewhere somehow, the Lucene index
that is a shard is using memory in the heap, and I'd like to collect that
metric.

It may also be an operation somewhere higher up at the elasticsearch level
where we are merging results from shards or results from indexes (maybe
elasticsearch doesn't bother to merge twice but merges once); that's also a
memory space I'd like to collect data on.

I think a per query mem use would also be something interesting, though, 
perhaps obviously too much to keep up with for every query (maybe a future 
opt-in feature, unless it's already there and I'm missing it).

Other cluster events like nodes entering and exiting the cluster or the 
changing of the master would be nice to collect.

I'm guessing some of this isn't available and some of it is, but my 
Google-Fu seems to be lacking.  I'm pretty sure I can poll to figure out
whether the events happened, but was wondering if there was something in the java
client node where I could get a Future or some other hook to turn it into a 
push instead of a pull.

Any help will be appreciated.  I'm aware it's a wide net though.

--Shannon Monasco



downside to using Bulk API for small/single-doc sets?

2014-07-02 Thread Nikita Tovstoles
Hi,

I am using ES Java API to talk to an ES server. Sometimes I need to index a 
single doc, sometimes dozens or hundreds at a time. I'd prefer to keep my 
code simple (am a contrarian thinker) and wonder if I can get away with 
always using bulk API (ie BulkRequestBuilder). so that my interface to ES 
would look like so:

void indexDoc(Doc doc);
void indexDocs(Collection<Doc> docs);

...but the impl would always delegate to BulkRequestBuilder, with the number
of actions sometimes being ~1.

Is there a performance (or other) downside to this approach. Specifically, 
would bulk index updates (with set of size == 1) take significantly longer 
than non-bulk updates?
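
For what it's worth, a single-action bulk body only adds one metadata line
per document over a plain index request; a sketch over HTTP (index/type
names hypothetical), which builds the same kind of action list the Java
BulkRequestBuilder assembles:

  echo '{"index":{"_index":"myindex","_type":"doc","_id":"1"}}
  {"field":"value"}' | curl -s -XPOST 'localhost:9200/_bulk' --data-binary @-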

thanks,
-nikita



Re: Recommended Hardware Specs Sharding\Index Strategy

2014-07-02 Thread Mark Walkom
The heap should be as big as your largest shard, irrespective of what index
it belongs to or if it's a replica.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com






Re: Looking to build a logging solution with threshold alerting.

2014-07-02 Thread Mark Walkom
There was another thread on this very recently, and some people are using
riemann for this.
Take a look in the archives and you can probably find some useful info.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 2 July 2014 22:53, Joshua Hall joshuadeanh...@gmail.com wrote:

 I am looking to build a logging solution and wanted to make sure that I am
 not missing any key components.

 The logs that I have are currently stored in a database to which there is
 limited access due to locking risks from bad queries.

 My plan is to have the DBAs write the logs from the database tables to a
 file on a set interval, then have Logstash pick up the logs and write them
 to Elasticsearch. Then for viewing/searching the logs I will be using
 Kibana. Everything up to this point I have been able to make a proof of
 concept for, but the other request was to have alerting.

 I have spent some time looking at this, and the general response seems to
 be to use percolation. But that seems to only make sense if you want to
 send an alert when a single error matches a query, and from what I have
 seen there is no way to build a threshold alerting system using
 percolation.

 My thought for solving the threshold alerting is to create a simple web UI
 that allows the user to enter a query to search for, a threshold, a time
 frame, and email addresses to send the alert to, all of which would get
 stored in Elasticsearch. Then an app (running as a Windows service or cron
 job) would pull the alerts, run the queries, and check the time frame and
 threshold on some interval. If the count surpasses the threshold, it would
 send an email to the stored addresses.
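
 The count check itself maps naturally onto the count API; a sketch, with
 index pattern, field names and query purely illustrative:

   curl -s 'localhost:9200/logstash-*/_count' -d '{
     "query": {
       "filtered": {
         "query": { "query_string": { "query": "level:ERROR" } },
         "filter": { "range": { "@timestamp": { "gte": "now-15m" } } }
       }
     }
   }'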

 I know that SPM seems to cover this and more, but we are currently looking
 to see if we can do this without buying another product.

 Is this the correct approach to take or should I be looking at doing
 something else?



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624Z2f%3DD9H1LfWX98oTNNJia2R1-NEwkpiEtZ63FiKrOmGA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Visibility

2014-07-02 Thread Mark Walkom
Depends what you want to do really.

There are plugins like ElasticHQ, Marvel, kopf and bigdesk that will give
you some info. You can also hook collectd into the stack and take metrics,
or use plugins from nagios etc.
What monitoring platforms do you have in place now?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 3 July 2014 07:49, smonasco smona...@gmail.com wrote:

 Hi,

 I'm trying to get a lot more visibility and metrics into what's going on
 under the hood.

 Occasionally, we see spikes in memory.  I'd like to get heap memory used on
 a per-shard basis.  If I'm not mistaken, somewhere, somehow, the Lucene
 index that backs a shard is using memory in the heap, and I'd like to
 collect that metric.

 It may also be an operation somewhere higher up at the elasticsearch level,
 where we are merging results from shards or results from indexes (maybe
 elasticsearch doesn't bother to merge twice but merges once); that's also a
 memory space I'd like to collect data on.

 I think per-query memory use would also be interesting, though it is
 probably too much to keep up with for every query (maybe a future
 opt-in feature, unless it's already there and I'm missing it).

 Other cluster events like nodes entering and exiting the cluster or the
 changing of the master would be nice to collect.

 I'm guessing some of this isn't available and some of it is, but my
 Google-Fu seems to be lacking.  I'm pretty sure I can poll to figure out
 that these events happened, but I was wondering if there is something in
 the java client node where I could get a Future or some other hook to turn
 it into a push instead of a pull.

 Any help will be appreciated.  I'm aware it's a wide net though.

 --Shannon Monasco
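
 (In the meantime, per-node heap can at least be polled from the Java
 client; a sketch against the 1.x stats API, assuming a connected Client
 named client:)

     import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
     import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;

     // Pull JVM and index stats for every node in the cluster.
     NodesStatsResponse stats = client.admin().cluster().prepareNodesStats()
             .setJvm(true)
             .setIndices(true)
             .get();

     for (NodeStats node : stats.getNodes()) {
         System.out.println(node.getNode().getName() + " heap used: "
                 + node.getJvm().getMem().getHeapUsed());
     }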



Bulk API: different values for the same parameter at batch and operation level?

2014-07-02 Thread Yawei
Hi there,

I noticed that in the Java bulk API, some parameters can be set at both the
per-batch-request level and the per-operation level, e.g. the consistency
level parameter: BulkRequestBuilder#setConsistencyLevel vs.
IndexRequestBuilder#setConsistencyLevel.

What if the parameter has different values at these two levels? Will
the per-operation one override the per-batch-request one?

Thanks
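
(For concreteness, the two call sites in question look roughly like this; a
sketch only, assuming a connected Client named client, and it deliberately
doesn't answer which value wins:)

    import org.elasticsearch.action.WriteConsistencyLevel;
    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.action.bulk.BulkResponse;

    // Batch-level setting on the bulk request itself.
    BulkRequestBuilder bulk = client.prepareBulk()
            .setConsistencyLevel(WriteConsistencyLevel.QUORUM);

    // Per-operation setting on an index request added to the same bulk.
    bulk.add(client.prepareIndex("idx", "doc", "1")
            .setSource("{\"field\":1}")
            .setConsistencyLevel(WriteConsistencyLevel.ONE));

    BulkResponse resp = bulk.get();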



Re: Splunk vs. Elastic search performance?

2014-07-02 Thread Steve Mayzak
In the latest version of Logstash, you can use the elasticsearch output and 
just set the protocol to http.  The elasticsearch_http output will be 
removed eventually.
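
(A minimal output stanza for that, as a sketch against a Logstash 1.4-style
config; the host value is an assumption:)

    output {
      elasticsearch {
        host => "localhost"
        protocol => "http"
      }
    }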

On Monday, June 23, 2014 9:22:28 AM UTC-7, Ivan Brusic wrote:

 I agree. I thought elasticsearch_http was actually the recommended route.
 Also, I have seen no reported issues with different client/server versions
 since 1.0. My current logstash setup (which is not production level, simply
 a dev logging tool) uses Elasticsearch 1.2.1 with Logstash 1.4.1 over the
 non-HTTP interface.

 -- 
 Ivan


 On Fri, Jun 20, 2014 at 3:29 PM, Mark Walkom ma...@campaignmonitor.com
 wrote:

 I wasn't aware that the elasticsearch_http output wasn't recommended?
 When I spoke to a few of the ELK devs a few months ago, they indicated
 that there was minimal performance difference, with the added benefit of
 not being locked to specific LS+ES versioning.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 21 June 2014 02:43, Brian brian@gmail.com wrote:

 Thomas,

 Thanks for your insights and experiences. As I am someone who has 
 explored and used ES for over a year but is relatively new to the ELK 
 stack, your data points are extremely valuable. Let me offer some of my own 
 views.

 Re: double the storage. I strongly recommend that ELK users disable the
 _all field. The entire text of the log events generated by logstash ends up
 in the message field (and not @message, as many people incorrectly
 post). So the _all field is just redundant overhead with no added value.
 The result is a dramatic drop in database file sizes and a dramatic
 increase in load performance. Of course, you then need to configure ES to
 use the message field as the default field for Lucene queries in Kibana
 (sketched below).
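
 (One way to apply both of those, sketched with the 1.x Java admin client;
 the template name and pattern are assumptions, and the same JSON body works
 through the REST template API:)

     import org.elasticsearch.common.settings.ImmutableSettings;

     // Disable _all and make "message" the default query field for new
     // logstash-* indices.
     client.admin().indices().preparePutTemplate("logstash")
             .setTemplate("logstash-*")
             .setSettings(ImmutableSettings.settingsBuilder()
                     .put("index.query.default_field", "message")
                     .build())
             .addMapping("_default_",
                     "{\"_default_\":{\"_all\":{\"enabled\":false}}}")
             .get();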

 During the year that I've used ES and watched this group, I have been on
 the front line of a brand new product with a smart and dedicated
 development team working steadily to improve it. Six months ago, the ELK
 stack eluded me and reports weren't encouraging (with the sole exception of
 the Kibana web site's marketing pitch). But ES has come a long way since
 then, and the ELK stack is much more closely integrated.

 The Splunk UI is carefully crafted to isolate users from each other and 
 prevent external (to the Splunk db itself, not to our company) users from 
 causing harm to data. But Kibana seems to be meant for a small cadre of 
 trusted users. What if I write a dashboard with the same name as someone 
 else's? Kibana doesn't even begin to discuss user isolation. But I am 
 confident that it will.

 How can I tell Kibana to set the default Lucene query operator to AND
 instead of OR? Google is not my friend here: I keep getting references to
 the Ruby versions of Kibana, and that's ancient history by now. Kibana is
 cool and promising, but it has a long way to go before deployment to all of
 the folks in our company who currently have access to Splunk.

 Logstash has a nice book that's been very helpful, and logstash itself 
 has been an excellent tool for prototyping. The book has been invaluable in 
 helping me extract dates from log events and handling all of our different 
 multiline events. But it still doesn't explain why the date filter needs a 
 different array of matching strings to get the date that the grok filter 
 has already matched and isolated. And recommendations to avoid the 
 elasticsearch_http output and use elasticsearch (via the Node client) 
 directly contradict the fact that logstash's 1.1.1 version of the ES client 
 library is not compatible with the most recent 1.2.1 version of ES.

 And logstash is also a resource hog, so we eventually plan to replace it
 with Perl and Apache Flume (already in use) and pipe the data into my Java
 bulk load tool (which is always kept up-to-date with the versions of ES we
 deploy!!). Because we send the data via Flume to our data warehouse, any
 losses in ES will be annoying but won't be catastrophic. And the front-end
 following of rotated log files will be done using the GNU *tail -F* command
 and option. GNU tail with its uppercase -F option follows rotated log files
 perfectly. I doubt that logstash can do the same, and we currently see that
 neither can Splunk (so we sporadically lose log events in Splunk too). So
 GNU tail -F piped into logstash's stdin input works perfectly in my
 evaluation setup and will likely form the first stage of any log forwarder
 we end up deploying.

 Brian

 On Thursday, June 19, 2014 8:48:34 AM UTC-4, Thomas Paulsen wrote:

 We had a 2.2 TB/day installation of Splunk and ran it on VMware with 12
 indexers and 2 search heads. Each indexer had 1,000 IOPS guaranteed
 assigned. The system is slow but OK to use.

 We tried Elasticsearch, and we were able to get the same performance
 with the same number of machines.

elasticsearch high cpu usage

2014-07-02 Thread vincent Park
Hi,
I have 5 clustered nodes, and each node has 1 replica.
The total document size is 216 MB, across 853,000 docs.
I am suffering from very high CPU usage, every hour and every early morning
from about 05:00 to 09:00, as you can see in my cacti graph.

Elasticsearch is the only thing running on this server.

I thought something was wrong with the ES process, but there are only a few
server requests at CPU peak time, and there is not even a cron job. I don't
know what elasticsearch is doing at those times; can somebody tell me what
is happening there? Please..

$ ./elasticsearch -v
Version: 1.1.1, Build: f1585f0/2014-04-16T14:27:12Z, JVM: 1.7.0_55

$ java -version
java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

and I installed these plugins on elasticsearch:
HQ, bigdesk, head, kopf, sense

es log at cpu peak time:

[2014-07-03 08:01:00,045][DEBUG][action.search.type       ] [node1] [search][4], node[GJjzCrLvQQ-ZRRoqL13MrQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@451f9e7c] lastShard [true]
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 300) on org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4@68ab486b
        at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:293)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:300)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:190)
        at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
        at org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
        at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
        at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
        at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
        at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
        at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
        at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
        at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
        at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
        at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
        at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
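
(The exception above is the search thread pool's queue filling up; one way
to watch the rejected counts from the 1.x Java client, as a sketch assuming
a connected Client named client:)

    import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
    import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
    import org.elasticsearch.threadpool.ThreadPoolStats;

    // Pull thread pool stats and print the rejected count for "search".
    NodesStatsResponse stats = client.admin().cluster().prepareNodesStats()
            .setThreadPool(true)
            .get();
    for (NodeStats node : stats.getNodes()) {
        for (ThreadPoolStats.Stats s : node.getThreadPool()) {
            if ("search".equals(s.getName())) {
                System.out.println(node.getNode().getName()
                        + " search rejected: " + s.getRejected());
            }
        }
    }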

Re: Visibility

2014-07-02 Thread Shannon Monasco
I currently record basically everything in bigdesk: all the numerics from
cluster health, cluster state, nodes info, node stats, index status, and
segments.

I want memory allocated on a per-shard level for Lucene-level actions and
query-level actions (outside the field and filter caches), plus hooks into
events like nodes entering and exiting the cluster, new indexes, alias and
other administrative changes, and master elections.

Basically, when it comes to memory I'd like to have all parts of the heap
accounted for.  The field + filter caches do not account for whatever
process is spiking, nor do they explain most of the heap.  With 29 GB of
heap in use and garbage collection taking minutes while reclaiming almost
nothing, elastic is only reporting 7 GB in cache.  We can discuss my
particular memory problems and solutions, but mostly I'm after the
visibility.

--Shannon Monasco
On Jul 2, 2014 5:50 PM, Mark Walkom ma...@campaignmonitor.com wrote:

 Depends what you want to do really.

 There are plugins like ElasticHQ, Marvel, kopf and bigdesk that will give
 you some info. You can also hook collectd into the stack and take metrics,
 or use plugins from nagios etc.
 What monitoring platforms do you have in place now?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 3 July 2014 07:49, smonasco smona...@gmail.com wrote:

 Hi,

 I'm trying to get a lot more visibility and metrics into what's going on
 under the hood.

 Occasionally, we see spikes in memory.  I'd like to get heap memory used on
 a per-shard basis.  If I'm not mistaken, somewhere, somehow, the Lucene
 index that backs a shard is using memory in the heap, and I'd like to
 collect that metric.

 It may also be an operation somewhere higher up at the elasticsearch level,
 where we are merging results from shards or results from indexes (maybe
 elasticsearch doesn't bother to merge twice but merges once); that's also a
 memory space I'd like to collect data on.

 I think per-query memory use would also be interesting, though it is
 probably too much to keep up with for every query (maybe a future
 opt-in feature, unless it's already there and I'm missing it).

 Other cluster events like nodes entering and exiting the cluster or the
 changing of the master would be nice to collect.

 I'm guessing some of this isn't available and some of it is, but my
 Google-Fu seems to be lacking.  I'm pretty sure I can poll to figure out
 that these events happened, but I was wondering if there is something in
 the java client node where I could get a Future or some other hook to turn
 it into a push instead of a pull.

 Any help will be appreciated.  I'm aware it's a wide net though.

 --Shannon Monasco



Re: Are there any facets that can be used to co-relate log events ?

2014-07-02 Thread Aditya Pavan Kumar Vegesna
Hey Matthew,

Sorry, no luck with that.

Cheers,
Aditya
On Jul 3, 2014 2:22 AM, Matthew Morrison mmorri...@broadsoft.com wrote:

 Hi Aditya,

 I'm looking to do something similar. Did you have any success with this
 problem?

 Thanks
 Matt

 On Wednesday, January 22, 2014 11:53:36 PM UTC+13, Aditya Pavan Kumar
 Vegesna wrote:

  Hi,

 I am looking for a way to correlate multiple log events and then calculate
 the time duration between those events.

 e.g. a request log event and a response log event: calculate the difference
 in timestamps to assess the performance of the application.

 Can anyone tell me how this can be achieved?

 Thanks

 Pavan Kumar
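
 (One way to compute the duration client-side, as a rough 1.x Java sketch;
 the index name, the shared request_id field, the timestamp format, and the
 connected Client named client are all assumptions:)

     import java.text.SimpleDateFormat;
     import java.util.TimeZone;

     import org.elasticsearch.action.search.SearchResponse;
     import org.elasticsearch.index.query.QueryBuilders;
     import org.elasticsearch.search.SearchHit;
     import org.elasticsearch.search.sort.SortOrder;

     // Fetch the request and response events that share one request_id.
     SearchResponse resp = client.prepareSearch("logs")
             .setQuery(QueryBuilders.termQuery("request_id", requestId))
             .addSort("@timestamp", SortOrder.ASC)
             .setSize(2)
             .get();

     SearchHit[] hits = resp.getHits().getHits();
     if (hits.length == 2) {
         // Logstash-style ISO timestamps; handle ParseException in real code.
         SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
         iso.setTimeZone(TimeZone.getTimeZone("UTC"));
         long start = iso.parse((String) hits[0].getSource().get("@timestamp")).getTime();
         long end = iso.parse((String) hits[1].getSource().get("@timestamp")).getTime();
         System.out.println("duration ms: " + (end - start));
     }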




