Re: ElasticSearch Version problem

2014-05-22 Thread Mukul Gupta
There seems to be some problem when indexing MySQL data into ES. These are
the ES logs:

[][DEBUG][NodeClient   ] after bulk [18650] [succeeded=93255]
[failed=0] [5ms]
[][DEBUG][NodeClient   ] before bulk [18654] of 5 items, 2407
bytes, 1 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18651] [succeeded=93260]
[failed=0] [3ms]
[][DEBUG][NodeClient   ] after bulk [18653] [succeeded=93265]
[failed=0] [3ms]
[][DEBUG][NodeClient   ] before bulk [18655] of 5 items, 2381
bytes, 1 outstanding bulk requests
[][DEBUG][NodeClient   ] before bulk [18656] of 5 items, 2357
bytes, 2 outstanding bulk requests
[][DEBUG][NodeClient   ] before bulk [18657] of 5 items, 2399
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18656] [succeeded=93270]
[failed=0] [3ms]
[][DEBUG][NodeClient   ] before bulk [18658] of 5 items, 2242
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18655] [succeeded=93275]
[failed=0] [5ms]
[][DEBUG][NodeClient   ] before bulk [18659] of 5 items, 2316
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18658] [succeeded=93280]
[failed=0] [2ms]
[][DEBUG][NodeClient   ] after bulk [18657] [succeeded=93285]
[failed=0] [4ms]
[][DEBUG][NodeClient   ] before bulk [18660] of 5 items, 2336
bytes, 2 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18654] [succeeded=93290]
[failed=0] [7ms]
[][DEBUG][NodeClient   ] before bulk [18661] of 5 items, 2202
bytes, 2 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18659] [succeeded=93300]
[failed=0] [3ms]
[][DEBUG][NodeClient   ] before bulk [18662] of 5 items, 2234
bytes, 1 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18660] [succeeded=93295]
[failed=0] [2ms]
[][DEBUG][NodeClient   ] before bulk [18663] of 5 items, 2176
bytes, 2 outstanding bulk requests
[][DEBUG][NodeClient   ] before bulk [18664] of 5 items, 2332
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18661] [succeeded=93305]
[failed=0] [7ms]
[][DEBUG][NodeClient   ] before bulk [18665] of 5 items, 2275
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18662] [succeeded=93310]
[failed=0] [8ms]
[][DEBUG][NodeClient   ] before bulk [18666] of 5 items, 2358
bytes, 3 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18664] [succeeded=93315]
[failed=0] [4ms]
[][DEBUG][NodeClient   ] after bulk [18665] [succeeded=93320]
[failed=0] [2ms]
[][DEBUG][NodeClient   ] before bulk [18667] of 5 items, 2307
bytes, 2 outstanding bulk requests
[][DEBUG][NodeClient   ] after bulk [18663] [succeeded=93325]
[failed=0] [6ms]
[][DEBUG][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource]
merged 93338 rows

Though the total count looks fine, the document count in my index is
exactly half of what it is showing (93338). Any idea what is wrong here?
Also, I can see some outstanding bulk requests, which I think is creating
the problem:

Here are the params I'm using in my JDBC river config:
fetchsize : 100,
maxconcurrentbulkactions : 5,
maxbulkactions : 5,
max_retries : 15,
autocommit : true

I tried changing these params to avoid the outstanding bulk requests above,
but it did not help.

Thanks,


On Tue, May 20, 2014 at 11:32 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 1. I'd be thankful if you could post an issue on the JDBC river GitHub page
 with a (small) example demonstrating that the JDBC river is not able to
 index all data. In my own projects I index millions of rows without
 skipping any data. There are many possible reasons why this can happen, but
 I can try to track down such issues. A start would be to increase the log
 level to DEBUG and follow the messages the JDBC river writes about the
 number of indexed documents. This can easily be compared to the database
 rows (assuming you transfer the primary key of a table to the _id field).

 2. MySQL functions that return binary data (non-UTF-8) cannot be indexed
 into ES without base64 encoding. Maybe you should switch the character set
 of the JDBC MySQL connection URL if you are unsure whether you really get
 UTF-8 from the database.
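 For example, a MySQL connection URL that forces UTF-8 on the driver side
 could look like this (a sketch; host and database name are placeholders):

 ```
 jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=UTF-8
 ```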

 Jörg





 On Tue, May 20, 2014 at 7:00 PM, Mukul Gupta mukulnit...@gmail.com wrote:

 Yeah, I switched to JDBC 1.1.0.2 and it is now updating indexed docs, but
 a few issues remain.

 1. Though JDBC can now index docs, not all MySQL rows are getting
 indexed. I tried playing with
 maxbulkactions - the length of each bulk index request submitted
 maxconcurrentbulkactions - the maximum number of concurrent bulk
 requests
  but was not able to solve the issue. I thought 

index parameters

2014-05-22 Thread anass benjelloun
Hello,

I need to index 100,000 documents of about 1 MB each.
This is the configuration of my Elasticsearch index:

"index": {
  "type": "doc",
  "bulk_size": 100,
  "number_of_shards": 5,
  "number_of_replicas": 2
}

I need to know what effect each parameter has.



Creating Index not with field _id, but with my field/column name.

2014-05-22 Thread dharmendra pratap singh
Hi Guys,
Can someone guide me on how to create the index using my own field/column
as the document ID, rather than the default _id?

Looking for some help from your end.

Regards
Dharmendra



Number of characters in field

2014-05-22 Thread David Nielsen
Hi.
I am trying to find a way to express a character count filter in a
query string, for instance: I need to find all documents whose field
"subject" holds fewer than 20 characters.
How would I do that in a query string?

/David



Re: Aliases and percolators

2014-05-22 Thread Martijn v Groningen
Yes, that is correct.

Martijn


On 21 May 2014 02:34, Mark Dodwell m...@mkdynamic.co.uk wrote:

 Many thanks, that is a super clear answer.

 So, until that issue is addressed, am I correct in thinking I should do
 this when percolating an existing document:

 ```
 curl http://0.0.0.0:9200/idx-1/docs/123/_percolate -d '{
   "filter": {
     "term": {
       "account_id": 1
     }
   }
 }'
 ```

 Thanks again.

 On Tuesday, May 20, 2014 2:18:02 AM UTC-7, Martijn v Groningen wrote:

 1. Yes, the routing will be taken into account when adding the percolator query.
 2. At the moment only the routing will be taken into account, the filter
 will not be taken into account.

 I opened the issue "percolator against an alias with a filter":
 https://github.com/elasticsearch/elasticsearch/issues/6241


 On 17 May 2014 03:54, Mark Dodwell ma...@mkdynamic.co.uk wrote:

  Consider an index `idx`, with a mapping for a single type `docs`.
 Consider aliases of the format `idx-{ACCOUNT-ID}`, with a term filter and a
 routing value set to the account id, like so:

 ```
 $ curl http://0.0.0.0:9200/idx/_aliases?pretty=1
 {
   "idx" : {
     "aliases" : {
       "idx-1" : {
         "filter" : {
           "term" : {
             "account_id" : 1
           }
         },
         "index_routing" : "1",
         "search_routing" : "1"
       },
       ...
     }
   }
 }
 ```

 ### Questions

 1. When indexing a percolator query, if you do that via an alias, will
 it respect the routing?

 ```
 # will the alias routing from idx-1 apply to this operation?
 curl -XPUT http://0.0.0.0:9200/idx-1/.percolator/1 -d '{"query":...,
 "account_id":1}'
 ```

 2. When percolating an existing document from one of those aliases, will
 the routing and term filter from the alias be used when retrieving/checking
 the matching percolator documents?

 ```
 # will the alias routing and term filter from idx-1 apply to the
 percolators?
 curl http://0.0.0.0:9200/idx-1/docs/123/_percolate

 ```

 Any insight much appreciated.





 --
 Met vriendelijke groet,

 Martijn van Groningen




-- 
Met vriendelijke groet,

Martijn van Groningen



Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hey guys,

in order to meet the German laws for logging, I have been asked to store the
Elasticsearch indices in a revision/audit-proof way (indices cannot be
edited/changed after storage).

Are there any best practices or tips for doing such a thing (maybe any
plugins)?

Thanks for your feedback.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Yeah, it looks like this would do the job, thanks for the response.

On Thursday, 22 May 2014 10:40:19 UTC+2, Mark Walkom wrote:

 You can set indexes to readonly - 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html
 Is that what you're after?
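 A minimal sketch of what that looks like via the update settings API
 (index name is a placeholder):

 ```
 curl -XPUT "http://localhost:9200/my_index/_settings" -d '{
   "index" : { "blocks" : { "read_only" : true } }
 }'
 ```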

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 18:36, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the german laws for logging, i got the order to store 
 the elasticsearch indices in a revision/audit-proof way(Indices cannot be 
 edited/changed after the storage).

 Are there any best practices or tips for doing such a thing?(maybe any 
 plugins?)

 Thanks for your feedback.







Elasticsearch Facets + limit results

2014-05-22 Thread Martijn Dwars
I'm trying to construct the following SQL query in Elasticsearch:

SELECT companyId, COUNT(*) c FROM visits GROUP BY companyId ORDER BY c DESC 
LIMIT 2

I came up with the following JSON body for the query:

{
  "facets": {
    "company": {
      "filter": {
        "term": {
          "entityType": "companypage"
        }
      },
      "terms": {
        "field": "entityId",
        "size": 2
      }
    }
  }
}

When I use size: 2, I get the following result:

"facets": {
  "company": {
    "_type": "terms",
    "missing": 0,
    "total": 4,
    "other": 0,
    "terms": [{
      "term": 2,
      "count": 3
    },
    {
      "term": 20,
      "count": 1
    }]
  }
}

When I use size: 1, I get the following result:

"facets": {
  "company": {
    "_type": "terms",
    "missing": 0,
    "total": 4,
    "other": 2,
    "terms": [{
      "term": 2,
      "count": 2
    }]
  }
}

How is it possible that the count for term 2 is 3 in the first response, 
but 2 in the second response?



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread Mark Walkom
Keep us up to date with your project; I'm sure there would be interest
from others in a similar setup.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 May 2014 18:46, horst knete baduncl...@hotmail.de wrote:

 Yeah it looks like that this would do the job, thanks for response

 On Thursday, 22 May 2014 10:40:19 UTC+2, Mark Walkom wrote:

 You can set indexes to readonly -
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html
 Is that what you're after?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 18:36, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the german laws for logging, i got the order to store
 the elasticsearch indices in a revision/audit-proof way(Indices cannot be
 edited/changed after the storage).

 Are there any best practices or tips for doing such a thing?(maybe any
 plugins?)

 Thanks for your feedback.







Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread joergpra...@gmail.com
You have to add a facility to your middleware that can trace all authorized
operations to your index (access, read, write, modify, delete) and you must
write this to an append-only logfile with timestamps.

If there is interest I could write such a plugin (assuming it can run in a
trusted environment regarding authorization tokens), but I think the best
place is in a middleware, where an ES client runs in a broader application
context (e.g. with transaction awareness).

Jörg


On Thu, May 22, 2014 at 10:36 AM, horst knete baduncl...@hotmail.de wrote:

 Hey guys,

 in order to meet the german laws for logging, i got the order to store the
 elasticsearch indices in a revision/audit-proof way(Indices cannot be
 edited/changed after the storage).

 Are there any best practices or tips for doing such a thing?(maybe any
 plugins?)

 Thanks for your feedback.





How to speed up indexing by using Python API

2014-05-22 Thread 潘飞
Hi all,

I am trying to index my logs using the Elasticsearch Python API, but I only
get an indexing speed of about 600 records/s.

On the same ES cluster, with the same data, logstash (redis -> logstash ->
elasticsearch) can index data at about 3000 records/s.

Any advice on how to speed up indexing with the Python API?

Thanks very much.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hi Jörg,

thanks for your offer.

I will contact you if there's a need for such a plugin in our company.

Also I will keep you up to date if there are breaking changes in our project.



On Thursday, 22 May 2014 10:55:44 UTC+2, Jörg Prante wrote:

 You have to add a facility to your middleware that can trace all 
 authorized operations to your index (access, read, write, modify, delete) 
 and you must write this to an append-only logfile with timestamps.

 If there is interest I could write such a plugin (assuming it can run in a 
 trusted environment regarding authorization tokens) but I think best place 
 is in a middleware (where an ES client runs in a broader application 
 context e.g. transaction awareness).

 Jörg


 On Thu, May 22, 2014 at 10:36 AM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the german laws for logging, i got the order to store 
 the elasticsearch indices in a revision/audit-proof way(Indices cannot be 
 edited/changed after the storage).

 Are there any best practices or tips for doing such a thing?(maybe any 
 plugins?)

 Thanks for your feedback.







Re: how manage insert and update sql (river) ?

2014-05-22 Thread joergpra...@gmail.com
If you use the column name _id, you can control the ID of the ES document
created from your SQL result. If you do not use _id, a random doc ID is
generated.

See the README at https://github.com/jprante/elasticsearch-river-jdbc
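For illustration, a minimal sketch of a river SQL statement that feeds the
table's primary key into the document ID (column names taken from the
question quoted below):

```
"sql" : "select id_user as _id, name_user, firstname_user from user"
```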

Jörg


On Thu, May 22, 2014 at 11:43 AM, Tanguy Bernard 
bernardtanguy1...@gmail.com wrote:

 Hello,
 I would like to know a way to manage INSERT and UPDATE.
 Am I forced to delete and then re-index my data?
 Maybe there is a way to index again without duplicating my data
 (INSERT)?
 Can you help with my problem?

 I use this :

 PUT /_river/user/_meta
 {
   "type" : "jdbc",
   "jdbc" : {
     "url" : "jdbc:mysql://my_adress/my_index",
     "user" : "my_user",
     "password" : "my_password",
     "sql" : "select name_user, firstname_user, id_user from user",
     "index" : "my_index",
     "type" : "user",
     "max_bulk_requests" : 5
   }
 }

 Thanks in advance.





issue with client.admin().cluster().prepareGetSnapshots(...)

2014-05-22 Thread Chetana
The call to prepareGetSnapshots(...) for a snapshot that does not exist
throws SnapshotMissingException. I would expect it to instead return a
response with a list of zero snapshots (getSnapshots()) or at least
isExist=false.

Is there any other way to check the existence of a snapshot?
 
Thanks



File Descriptors

2014-05-22 Thread Shawn Ritchie
Hi guys,

Kind of stuck with a fresh installation of an ElasticSearch cluster.
Everything is installed and the file descriptor limits are set, yet when I run

curl -XGET "http://10.0.8.62:9200/_nodes?os=true&process=true&pretty=true" > stats.txt

I get 

  "process" : {
    "refresh_interval" : 1000,
    "id" : 1200,
    "max_file_descriptors" : 4096,
    "mlockall" : true
  },

yet when I run 

ulimit -n I get

65535

ElasticSearch is installed as a service and running as root. Any idea why
this is happening?



Re: File Descriptors

2014-05-22 Thread Mark Walkom
What OS and how did you install it?

(Running as root is a really bad idea by the way!)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 May 2014 20:19, Shawn Ritchie xritc...@gmail.com wrote:

 Hi guys,

 Kind of stuck with a fresh installation of an ElasticSearch cluster.
 everything is installed file descriptor limits are set yet when I run

 curl -XGET http://10.0.8.62:9200/_nodes?os=trueprocess=truepretty=true;
  stats.txt

 I get

   process : {
 refresh_interval : 1000,
 id : 1200,
 max_file_descriptors : 4096,
 mlockall : true
   },

 yet when I run

 ulimit -n I get

 65535

 ElasticSearch is installed as a service and running as root? Any idea why
 this is happening?





Re: File Descriptors

2014-05-22 Thread Shawn Ritchie
So this issue only occurs on server restart (reboot). If I restart the
elasticsearch service afterwards, it loads the correct number of file
descriptors.

Regards
Shawn


On Thu, May 22, 2014 at 12:19 PM, Shawn Ritchie xritc...@gmail.com wrote:

 Hi guys,

 Kind of stuck with a fresh installation of an ElasticSearch cluster.
 everything is installed file descriptor limits are set yet when I run

 curl -XGET http://10.0.8.62:9200/_nodes?os=trueprocess=truepretty=true;
  stats.txt

 I get

   process : {
 refresh_interval : 1000,
 id : 1200,
 max_file_descriptors : 4096,
 mlockall : true
   },

 yet when I run

 ulimit -n I get

 65535

 ElasticSearch is installed as a service and running as root? Any idea why
 this is happening?





-- 
Thanks
Shawn Ritchie



Re: File Descriptors

2014-05-22 Thread Shawn Ritchie
CentOS 6.5 and Java 1.7u55


On Thu, May 22, 2014 at 12:28 PM, Shawn Ritchie xritc...@gmail.com wrote:

 so this issue only occurs on server restart. If I had to restart
 elasticsearch service it would load the correct number of file descriptors.

 Regards
 Shawn


 On Thu, May 22, 2014 at 12:19 PM, Shawn Ritchie xritc...@gmail.com wrote:

 Hi guys,

 Kind of stuck with a fresh installation of an ElasticSearch cluster.
 everything is installed file descriptor limits are set yet when I run

 curl -XGET http://10.0.8.62:9200/_nodes?os=trueprocess=truepretty=true;
  stats.txt

 I get

   process : {
 refresh_interval : 1000,
 id : 1200,
 max_file_descriptors : 4096,
 mlockall : true
   },

 yet when I run

 ulimit -n I get

 65535

 ElasticSearch is installed as a service and running as root? Any idea why
 this is happening?





 --
 Thanks
 Shawn Ritchie




-- 
Thanks
Shawn Ritchie



Re: File Descriptors

2014-05-22 Thread Mark Walkom
Did you use the RPMs? Where are you setting the ulimit?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 May 2014 20:30, Shawn Ritchie xritc...@gmail.com wrote:

 CentOS 6.5 and Java 1.7u55


 On Thu, May 22, 2014 at 12:28 PM, Shawn Ritchie xritc...@gmail.com wrote:

 so this issue only occurs on server restart. If I had to restart
 elasticsearch service it would load the correct number of file descriptors.

 Regards
 Shawn


 On Thu, May 22, 2014 at 12:19 PM, Shawn Ritchie xritc...@gmail.com wrote:

 Hi guys,

 Kind of stuck with a fresh installation of an ElasticSearch cluster.
 everything is installed file descriptor limits are set yet when I run

 curl -XGET 
 http://10.0.8.62:9200/_nodes?os=trueprocess=truepretty=true; 
 stats.txt

 I get

   process : {
 refresh_interval : 1000,
 id : 1200,
 max_file_descriptors : 4096,
 mlockall : true
   },

 yet when I run

 ulimit -n I get

 65535

 ElasticSearch is installed as a service and running as root? Any idea
 why this is happening?





 --
 Thanks
 Shawn Ritchie




 --
 Thanks
 Shawn Ritchie





Re: File Descriptors

2014-05-22 Thread Shawn Ritchie
No, I did not use the RPM; I used the .tar for the installation, and my
ulimit settings are in:

 /etc/security/limits.conf
* - nofile 65535

/etc/sysctl.conf
fs.file-max = 512000
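For reference, limits.conf only applies to PAM login sessions, so a service
started at boot may not pick it up; one common workaround (a sketch, assuming
a custom /etc/init.d/elasticsearch from the tar install) is to raise the
limit in the init script itself before the daemon is launched:

```
# /etc/init.d/elasticsearch (excerpt), before the daemon is started
ulimit -n 65535
```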



On Thu, May 22, 2014 at 12:37 PM, Mark Walkom ma...@campaignmonitor.com wrote:

 Did you use the RPMs? Where are you setting the ulimit?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 20:30, Shawn Ritchie xritc...@gmail.com wrote:

 CentOS 6.5 and Java 1.7u55


 On Thu, May 22, 2014 at 12:28 PM, Shawn Ritchie xritc...@gmail.com wrote:

 so this issue only occurs on server restart. If I had to restart
 elasticsearch service it would load the correct number of file descriptors.

 Regards
 Shawn


 On Thu, May 22, 2014 at 12:19 PM, Shawn Ritchie xritc...@gmail.com wrote:

 Hi guys,

 Kind of stuck with a fresh installation of an ElasticSearch cluster.
 everything is installed file descriptor limits are set yet when I run

 curl -XGET 
 http://10.0.8.62:9200/_nodes?os=trueprocess=truepretty=true; 
 stats.txt

 I get

   process : {
 refresh_interval : 1000,
 id : 1200,
 max_file_descriptors : 4096,
 mlockall : true
   },

 yet when I run

 ulimit -n I get

 65535

 ElasticSearch is installed as a service and running as root? Any idea
 why this is happening?





 --
 Thanks
 Shawn Ritchie




 --
 Thanks
 Shawn Ritchie







-- 
Thanks
Shawn Ritchie



how to get only aggregation values from elasticsearch

2014-05-22 Thread Subhadip Bagui
Hi,

I want to get the average value of the MEMORY field from my ES documents.
Below is the query I'm using. Here I'm getting the hits JSON along with the
aggregation. Is there any way to get the aggregation result only? Please
suggest.

POST /virtualmachines/_search
{
  "query" : {
    "filtered" : {
      "query" : { "match" : {
        "CLOUD_TYPE" : "CLOUDSTACK"
      }},
      "filter" : {
        "range" : { "NODE_CREATE_TIME" : { "from" : "2014-05-22 14:11:35", "to" : "2014-05-22 14:33:35" }}
      }
    }
  },
  "aggs" : {
    "memory_avg" : { "avg" : { "field" : "MEMORY" } }
  }
}

*response*:
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 3,
      "successful": 3,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "virtualmachines",
            "_type": "nodes",
            "_id": "102",
            "_score": 1,
            "_source": {
               "NODE_ID": 12235,
               "CLOUD_TYPE": "CLOUDSTACK",
               "NODE_GROUP_NAME": "JBOSS",
               "NODE_CPU": "4GHZ",
               "NODE_HOSTNAME": "cloud.aricent.com",
               "NODE_NAME": "cloudstack-node1",
               "NODE_PRIVATE_IP_ADDRESS": "10.123.124.125",
               "NODE_PUBLIC_IP_ADDRESS": "125.31.108.72",
               "NODE_INSTANCE_ID": "cloudstack112",
               "NODE_STATUS": "ACTIVE",
               "NODE_CATEGORY_ID": 13,
               "NODE_CREATE_TIME": "2014-05-22 14:23:04",
               "CPU_SPEED": 500,
               "MEMORY": 512,
               "CPU_USED": "0.03%"
            }
         }
      ]
   },
   "aggregations": {
      "memory_avg": {
         "value": 512
      }
   }
}
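For reference, a minimal sketch that returns only the aggregation by setting
"size" to 0 in the request body (same index and field as above; the filtered
query is omitted here for brevity):

```
POST /virtualmachines/_search
{
  "size" : 0,
  "aggs" : {
    "memory_avg" : { "avg" : { "field" : "MEMORY" } }
  }
}
```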



Nodes restarting automatically

2014-05-22 Thread Jorge Ferrando
Hello 

We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64 bits, and
elasticsearch v1.1.1.

It's been running flawlessly, but since last week some of the nodes restart
randomly and the cluster goes to red state, then yellow, then green, and it
happens again in a loop (sometimes it doesn't even reach green state).

I've tried to look at the logs but I can't find an obvious reason for what
could be going on.

I've found entries like these, but I don't know if they are in some way 
related to the crash:

[2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic ASIC 
nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end] 
returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
nodo 3] [logstash-2014.05.22] no index mapper found for field: 
[date_end.raw] returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start] 
returning default postings format
[2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC 
nodo 3] [logstash-2014.05.22] no index mapper found for field: 
[date_start.raw] returning default postings format


For instance, right now it was in yellow state, really close to reaching
green, and suddenly node 3 restarted on its own; now the cluster is red with
2000 shards initializing. The log on that node shows this:

[2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic ASIC 
nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s], 
total [735ms]/[1.1m], memory [6.5gb]-[6.1gb]/[19.9gb], all_pools {[young] 
[456mb]-[7.2mb]/[532.5mb]}{[survivor] [66.5mb]-[66.5mb]/[66.5mb]}{[old] 
[6gb]-[6gb]/[19.3gb]}
[2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC 
nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC 
nodo 3] initializing ...
[2014-05-22 14:03:44,839][INFO ][plugins  ] [elastic ASIC 
nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC 
nodo 3] initialized
[2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC 
nodo 3] starting ...

The crash happened exactly at 14:02.

Any idea what could be going on, or how I can trace what's happening?

After rebooting there are also DEBUG errors like this:

[2014-05-22 14:06:16,621][DEBUG][action.search.type   ] [elastic ASIC 
nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P], 
s[STARTED]: Failed to execute 
[org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC 
nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
at 
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at 
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at 
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at 
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
at 
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at 
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at 
org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at 
org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:98)
at 
org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at 
org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at 
org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at 
org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at 

Re: Nodes restarting automatically

2014-05-22 Thread Mark Walkom
How are you running the service, upstart, init or something else?

ES shouldn't just restart on its own; this could be something else, like
the kernel's OOM killer.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 May 2014 22:07, Jorge Ferrando jorfe...@gmail.com wrote:

 Hello

 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
 elasticsearch v1.1.1

 It's be running flawlessly but since the last weak some of the nodes
 restarts randomly and cluster gets to red state, then yellow, then green
 and it happens again in a loop (sometimes it even doesnt get green state)

 I've tried to look at the logs but i can't find and obvious reason of what
 can be going on

 I've found entries like these, but I don't know if they are in some way
 related to the crash:

 [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end]
 returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end.raw] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start]
 returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start.raw] returning default postings format


 For instance right now it was in yellow state, really close to get to the
 green state and suddenly node 3 autorestarted and now cluster is red with
 2000 shard initializing. The log in that node shows this:

 [2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic ASIC
 nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
 total [735ms]/[1.1m], memory [6.5gb]-[6.1gb]/[19.9gb], all_pools {[young]
 [456mb]-[7.2mb]/[532.5mb]}{[survivor] [66.5mb]-[66.5mb]/[66.5mb]}{[old]
 [6gb]-[6gb]/[19.3gb]}
 [2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC
 nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
 [2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC
 nodo 3] initializing ...
 [2014-05-22 14:03:44,839][INFO ][plugins  ] [elastic ASIC
 nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC
 nodo 3] initialized
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC
 nodo 3] starting ...

 The crash happened exactly at 14:02.

 Any Idea what can be going on or how can I trace what's happening?

 After rebooting there are also DEBUG errors like this:

 [2014-05-22 14:06:16,621][DEBUG][action.search.type   ] [elastic ASIC
 nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
 s[STARTED]: Failed to execute
 [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
 org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC
 nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
 at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
  at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
  at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
  at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
 at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
  at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
 at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
  at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
 at
 org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
  at
 

Re: Nodes restarting automatically

2014-05-22 Thread Jorge Ferrando
elasticsearch nodes are launched through /etc/init.d/elasticsearch


On Thu, May 22, 2014 at 2:13 PM, Mark Walkom ma...@campaignmonitor.com wrote:

 How are you running the service, upstart, init or something else?

 ES shouldn't just restart on it's own, this could be something else like
 the kernel's OOM killer.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 22:07, Jorge Ferrando jorfe...@gmail.com wrote:

 Hello

 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
 elasticsearch v1.1.1

 It's be running flawlessly but since the last weak some of the nodes
 restarts randomly and cluster gets to red state, then yellow, then green
 and it happens again in a loop (sometimes it even doesnt get green state)

 I've tried to look at the logs but i can't find and obvious reason of
 what can be going on

 I've found entries like these, but I don't know if they are in some way
 related to the crash:

 [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_end]
 returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end.raw] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field: [date_start]
 returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic ASIC
 nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start.raw] returning default postings format


 For instance right now it was in yellow state, really close to get to the
 green state and suddenly node 3 autorestarted and now cluster is red with
 2000 shard initializing. The log in that node shows this:

 [2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic ASIC
 nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
 total [735ms]/[1.1m], memory [6.5gb]-[6.1gb]/[19.9gb], all_pools {[young]
 [456mb]-[7.2mb]/[532.5mb]}{[survivor] [66.5mb]-[66.5mb]/[66.5mb]}{[old]
 [6gb]-[6gb]/[19.3gb]}
 [2014-05-22 14:03:44,825][INFO ][node ] [elastic ASIC
 nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
 [2014-05-22 14:03:44,826][INFO ][node ] [elastic ASIC
 nodo 3] initializing ...
 [2014-05-22 14:03:44,839][INFO ][plugins  ] [elastic ASIC
 nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC
 nodo 3] initialized
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic ASIC
 nodo 3] starting ...

 The crash happened exactly at 14:02.

 Any Idea what can be going on or how can I trace what's happening?

 After rebooting there are also DEBUG errors like this:

 [2014-05-22 14:06:16,621][DEBUG][action.search.type   ] [elastic ASIC
 nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
 s[STARTED]: Failed to execute
 [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard [true]
 org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC
 nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
 at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
  at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
  at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
  at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
 at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
  at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
 at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
  at 

Re: Nodes restarting automatically

2014-05-22 Thread Nikolas Everett
Like Mark said, check the OOM killer. It should log to syslog. It is evil.
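A quick sketch of how to check for OOM-killer activity on Ubuntu (log path
assumed):

```
grep -i "out of memory" /var/log/syslog
dmesg | grep -i "killed process"
```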

Nik


On Thu, May 22, 2014 at 2:14 PM, Jorge Ferrando jorfe...@gmail.com wrote:

 elasticsearch nodes are launched through /etc/init.d/elasticsearch


 On Thu, May 22, 2014 at 2:13 PM, Mark Walkom ma...@campaignmonitor.com wrote:

 How are you running the service, upstart, init or something else?

 ES shouldn't just restart on it's own, this could be something else like
 the kernel's OOM killer.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 22:07, Jorge Ferrando jorfe...@gmail.com wrote:

 Hello

 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
 elasticsearch v1.1.1

 It's be running flawlessly but since the last weak some of the nodes
 restarts randomly and cluster gets to red state, then yellow, then green
 and it happens again in a loop (sometimes it even doesnt get green state)

 I've tried to look at the logs but i can't find and obvious reason of
 what can be going on

 I've found entries like these, but I don't know if they are in some way
 related to the crash:

 [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end.raw] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start.raw] returning default postings format


 For instance, right now it was in yellow state, really close to getting to
 green state, when suddenly node 3 restarted itself and now the cluster is red
 with 2000 shards initializing. The log on that node shows this:

 [2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic
 ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
 total [735ms]/[1.1m], memory [6.5gb]-[6.1gb]/[19.9gb], all_pools {[young]
 [456mb]-[7.2mb]/[532.5mb]}{[survivor] [66.5mb]-[66.5mb]/[66.5mb]}{[old]
 [6gb]-[6gb]/[19.3gb]}
 [2014-05-22 14:03:44,825][INFO ][node ] [elastic
 ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
 [2014-05-22 14:03:44,826][INFO ][node ] [elastic
 ASIC nodo 3] initializing ...
 [2014-05-22 14:03:44,839][INFO ][plugins  ] [elastic
 ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic
 ASIC nodo 3] initialized
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic
 ASIC nodo 3] starting ...

 The crash happened exactly at 14:02.

 Any idea what can be going on, or how I can trace what's happening?

 After rebooting there are also DEBUG errors like this:

 [2014-05-22 14:06:16,621][DEBUG][action.search.type   ] [elastic
 ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
 s[STARTED]: Failed to execute
 [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
 [true]
 org.elasticsearch.transport.SendRequestTransportException: [elastic ASIC
 nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
 at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
  at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
  at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
  at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
 at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:108)
  at
 

Re: Number of characters in field

2014-05-22 Thread Dan Tuffery
You could use a script filter:

"filtered" : {
    "query" : {
        ...
    },
    "filter" : {
        "script" : {
            "script" : "doc['subject'].value.length() < 20"
        }
    }
}

Dan

On Thursday, May 22, 2014 8:45:41 AM UTC+1, David Nielsen wrote:

 Hi. 
 I am trying to find a way to express a character count filter in a 
 querystring, for instance: I need to find all documents with field 
 subject that holds less than 20 chars.
 How would I do that in a query string?

 /David


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9d7cff9d-1449-4635-b780-89ce6102ec77%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Nodes restarting automatically

2014-05-22 Thread Jorge Ferrando
I've been checking syslog on all of the nodes and I found no mention of
oom, process killed, out of memory or anything similar...

Just in case, I ran these commands on the 3 nodes and the problem persists:

echo 0 > /proc/sys/vm/oom-kill
echo 1 > /proc/sys/vm/overcommit_memory
echo 100 > /proc/sys/vm/overcommit_ratio


On Thu, May 22, 2014 at 2:16 PM, Nikolas Everett nik9...@gmail.com wrote:

 Like Mark said, check the OOM killer.  It should log to syslog.  It is
 evil.

 Nik


 On Thu, May 22, 2014 at 2:14 PM, Jorge Ferrando jorfe...@gmail.comwrote:

 elasticsearch nodes are launched through /etc/init.d/elasticsearch


 On Thu, May 22, 2014 at 2:13 PM, Mark Walkom 
 ma...@campaignmonitor.comwrote:

 How are you running the service, upstart, init or something else?

 ES shouldn't just restart on its own; this could be something else like
 the kernel's OOM killer.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 22:07, Jorge Ferrando jorfe...@gmail.com wrote:

 Hello

 We have a cluster of 3 nodes running Ubuntu 12.04.4 LTS 64bits, and
 elasticsearch v1.1.1

 It's been running flawlessly, but since last week some of the nodes
 restart randomly and the cluster gets to red state, then yellow, then green,
 and it happens again in a loop (sometimes it doesn't even get to green state)

 I've tried to look at the logs but I can't find an obvious reason for
 what is going on

 I've found entries like these, but I don't know if they are in some way
 related to the crash:

 [2014-05-22 13:55:16,150][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_end.raw] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start] returning default postings format
 [2014-05-22 13:55:16,151][WARN ][index.codec  ] [elastic
 ASIC nodo 3] [logstash-2014.05.22] no index mapper found for field:
 [date_start.raw] returning default postings format


 For instance, right now it was in yellow state, really close to getting to
 green state, when suddenly node 3 restarted itself and now the cluster is red
 with 2000 shards initializing. The log on that node shows this:

 [2014-05-22 13:59:48,498][INFO ][monitor.jvm  ] [elastic
 ASIC nodo 3] [gc][young][1181][222] duration [735ms], collections [1]/[1s],
 total [735ms]/[1.1m], memory [6.5gb]-[6.1gb]/[19.9gb], all_pools {[young]
 [456mb]-[7.2mb]/[532.5mb]}{[survivor] [66.5mb]-[66.5mb]/[66.5mb]}{[old]
 [6gb]-[6gb]/[19.3gb]}
 [2014-05-22 14:03:44,825][INFO ][node ] [elastic
 ASIC nodo 3] version[1.1.1], pid[7511], build[f1585f0/2014-04-16T14:27:12Z]
 [2014-05-22 14:03:44,826][INFO ][node ] [elastic
 ASIC nodo 3] initializing ...
 [2014-05-22 14:03:44,839][INFO ][plugins  ] [elastic
 ASIC nodo 3] loaded [], sites [paramedic, inquisitor, HQ, bigdesk, head]
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic
 ASIC nodo 3] initialized
 [2014-05-22 14:03:51,967][INFO ][node ] [elastic
 ASIC nodo 3] starting ...

 The crash happened exactly at 14:02.

 Any Idea what can be going on or how can I trace what's happening?

 After rebooting there are also DEBUG errors like this:

 [2014-05-22 14:06:16,621][DEBUG][action.search.type   ] [elastic
 ASIC nodo 3] [logstash-2014.05.21][1], node[jgwbxcBoTVa3JIIG5a_FJA], [P],
 s[STARTED]: Failed to execute
 [org.elasticsearch.action.search.SearchRequest@42b80f4a] lastShard
 [true]
 org.elasticsearch.transport.SendRequestTransportException: [elastic
 ASIC nodo 2][inet[/158.42.250.79:9301]][search/phase/query]
 at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
  at
 org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:208)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
  at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:143)
  at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
 at

Re: index paramerters

2014-05-22 Thread anass benjelloun
Hello,

I found some information which is not complete:

There is no “correct” number of actions to perform in a single bulk call. 
You should experiment with different settings to find the optimum size for 
your particular workload.

Every time you index a document, elasticsearch will decide which primary 
shard is supposed to hold that document and will index it there. Primary 
shards are not a copy of the data, they are the data! Having multiple shards 
does help take advantage of parallel processing on a single machine.

Another type of shard is the replica. The default is 1, meaning that every 
primary shard will be copied to another shard that will contain the same 
data. Replicas are used to increase search performance and for fail-over.
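
For illustration, a minimal sketch of how these two settings are usually supplied when an index is created (the index name myindex is just an example, not from the question):

curl -XPUT 'http://localhost:9200/myindex' -d '{
    "settings" : {
        "number_of_shards" : 5,
        "number_of_replicas" : 2
    }
}'

bulk_size, on the other hand, only controls how many documents are sent per bulk request, not how the index itself is laid out.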


regards,

Anass BENJELLOUN

On Thursday, May 22, 2014 09:26:35 UTC+2, anass benjelloun wrote:

 Hello,

 I need to index 100 000 documents of 1 MB.
 This is my configuration of ElasticSearch index:

 "index": {
     "type": "doc",
     "bulk_size": 100,
     "number_of_shards": 5,
     "number_of_replicas": 2
 }

 I need to know what effect each parameter has.



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/570e793b-4f7a-4021-a8a4-5453d59a154c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Number of characters in field

2014-05-22 Thread David Nielsen
Well yes, I know that one; is this really the only/best way to do it?
My application is forwarding an input field directly to a query string, and the 
user needs to be able to query something like this:

tags:h1 AND subject:length<20



On Thursday, May 22, 2014 2:30:30 PM UTC+2, Dan Tuffery wrote:

 You could use a script filter:

 "filtered" : {
     "query" : {
         ...
     },
     "filter" : {
         "script" : {
             "script" : "doc['subject'].value.length() < 20"
         }
     }
 }

 Dan

 On Thursday, May 22, 2014 8:45:41 AM UTC+1, David Nielsen wrote:

 Hi. 
 I am trying to find a way to express a character count filter in a 
 querystring, for instance: I need to find all documents with field 
 subject that holds less than 20 chars.
 How would i do that in a querystring ?

 /David



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2d4f588a-4655-4286-9e12-4d4b4d0015e6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: How to speed up indexing by using Python API

2014-05-22 Thread Honza Král
Hi,

what method are you using in your python script? Have you looked at
the bulk and streaming_bulk helpers in elasticsearch-py?

http://elasticsearch-py.readthedocs.org/en/master/helpers.html
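
For reference, a rough sketch of driving the bulk helper; the index name, doc type, file name and fields below are assumptions for illustration, not taken from your setup:

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["localhost:9200"])

def actions(lines):
    # one action dict per document; _index/_type/_source here are made-up examples
    for line in lines:
        yield {"_index": "logs-2014.05.22", "_type": "logline", "_source": {"message": line}}

with open("my.log") as f:
    success, errors = bulk(es, actions(f), chunk_size=1000)

Batching documents this way, instead of one index call per record, is usually where the speed difference against logstash comes from.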

Hope this helps,
Honza

On Thu, May 22, 2014 at 11:09 AM, 潘飞 cnwe...@gmail.com wrote:
 Hi all:

 Now, I am trying to index my logs using the elasticsearch Python API,
 but I only get about 600 records/s indexing speed.

 But, on the same ES cluster, with the same data, logstash (redis -> logstash
 -> elasticsearch) can index data at a speed of about 3000 records/s.

 any advice on how to speed up indexing speed by using the Python API?

 thanks very much.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/1b957d36-b7ad-4671-999a-de06aaa74407%40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CABfdDipXuL01-WNM0DbDE2Y%2BqBTs5G3wdRofMGN6C2eY2uwWrA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Terms lookup mechanism with multiple lookup docs

2014-05-22 Thread Valery Ayala
Is it possible to use this feature with a lookup on multiple documents 
(multiple IDs) to supply the terms?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html#_terms_lookup_mechanism

I tried this 
  "terms": {
    "user": {
      "index": "users",
      "type": "user",
      "id": [1,2,3,4],
      "path": "followers"
    }
  }
 

which was accepted as legit syntax, but only ONE of the ids was actually 
used within the terms filter (the last one in the list).

So far the only way I can come up with for doing this is using a bool 
filter and stuffing a bunch of these terms lookups into the should clause 
to accomplish the equivalent of multiple docs lookup.
Is there a more efficient way?
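
For reference, the bool/should workaround described above would look roughly like this (one lookup per id, reusing the index/type/path from the example above; a sketch, not tested against your data):

"filter" : {
    "bool" : {
        "should" : [
            { "terms" : { "user" : { "index" : "users", "type" : "user", "id" : "1", "path" : "followers" } } },
            { "terms" : { "user" : { "index" : "users", "type" : "user", "id" : "2", "path" : "followers" } } },
            { "terms" : { "user" : { "index" : "users", "type" : "user", "id" : "3", "path" : "followers" } } }
        ]
    }
}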

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/783c6825-c6db-4584-80ee-4dc710730c43%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is it wise to use ES for saving shopping Carts?

2014-05-22 Thread Matthias Feist
Hi Guys,

I'm working on an online shop. Currently we are storing the cart's content 
in a MySQL database so we can very easily access the amount of a certain 
product and determine the reserved quantity.
This is very important as the amount in the users' carts is reserved so 
other users may not buy them.

What do you think: Is it wise to implement such a system in elasticsearch? 
I'm mostly worried about the time between the add to cart (inserting a 
document) and being able to access the total value due to the flushing 
delay.

Thanks for your advice.

Kind regards
Matthias

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0c48d418-b8b9-4d9a-b659-5aca07e60eb4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch Facets + limit results

2014-05-22 Thread emeschitc


How is it possible that the count for term 2 is 3 in the first response, 
 but 2 in the second response?


From the docs:

The size parameter defines how many top terms should be returned out of the 
overall terms list. By default, the node coordinating the search process 
will ask each shard to provide its own top size terms and once all shards 
respond, it will reduce the results to the final list that will then be 
sent back to the client. This means that if the number of unique terms is 
greater than size, the returned list is slightly off and not accurate (it 
could be that the term counts are slightly off and it could even be that a 
term that should have been in the top size entries was not returned).
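
If you need tighter counts, one related knob is shard_size on the terms aggregation, which asks each shard for more candidate terms than the final size before reducing. A rough sketch; the field name my_field and the numbers are only examples:

"aggs" : {
    "top_terms" : {
        "terms" : {
            "field" : "my_field",
            "size" : 10,
            "shard_size" : 100
        }
    }
}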

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9a37d6e2-6a8b-47dd-9180-8ff6f720a41e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Setting up indices (mappings, settings etc.)

2014-05-22 Thread Peter Webber
Hi,

if anyone could comment on my code I would be very grateful. I'd like to 
know whether the way I set up the index is how it is intended to be done.

Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9891deb3-aa0b-4efe-8838-ed8eed5fe354%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [HADOOP] Elasticsearch and hive

2014-05-22 Thread Costin Leau

Hi,

It looks like you have two tables - one that uses the JSONSerDe from cloudera 
and another one using es-hadoop.
You configured your es-hadoop table to consider the input as JSON; however, it does not receive the proper format (as the 
exception indicates).

See this [1] section of the documentation for more information.

Let me restate again that you do not need the data sent to Elasticsearch and es-hadoop to be in JSON format - unless it 
actually is in JSON (as in, JSON files are stored in HDFS). Simply send whatever data you have in Hive to es-hadoop and it 
will do the JSON translation automatically, and typically much more efficiently.
Again the docs [2] discuss this in detail. Frankly, I'm not sure why you want 
to use the JSONSerDe.

To conclude: if the data is in raw format in HDFS, just feed it to es-hadoop with the input configured as JSON; if it's in 
a different format, load it in Hive as you would typically do and then insert it into the es-hadoop table.

Hope this helps,

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html#writing-json-hive
[2] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/hive.html#hive

On 5/22/14 12:13 PM, hanine haninne wrote:

Hello again,
I'm trying to get some tweets from HDFS and put them into elasticsearch (head) 
in order to search them. I did what
you said and I'm so thankful, but it starts reading and does not continue. I don't 
know why?!
Here is what I did:

hive -hiveconf 
hive.aux.jars.path=/ES_Hadoop/dist/elasticsearch-hadoop-2.0.0.RC1.jar

Logging initialized using configuration in
jar:file:/home/hduser/hadoop/hive-0.11.0/lib/hive-common-0.11.0.jar!/hive-log4j.properties
Hive history 
file=/tmp/hduser/hive_job_log_hduser_8166@mouna-pc_201405220958_2036150447.txt

hive ADD JAR /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar;
Added /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar to class 
path
Added resource: /home/hduser/hadoop/hive-0.11.0/lib/hive-serdes-1.0-SNAPSHOT.jar
hive INSERT OVERWRITE TABLE tweetsES SELECT * FROM tweets;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201405220937_0005, Tracking URL = 
http://master:50030/jobdetails.jsp?jobid=job_201405220937_0005
Kill Command = /home/hduser/hadoop/libexec/../bin/hadoop job  -kill 
job_201405220937_0005
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2014-05-22 10:01:39,184 Stage-0 map = 0%,  reduce = 0%
2014-05-22 10:01:59,320 Stage-0 map = 100%,  reduce = 100%
Ended Job = job_201405220937_0005 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://master:50030/jobdetails.jsp?jobid=job_201405220937_0005
Examining task ID: task_201405220937_0005_m_02 (and more) from job 
job_201405220937_0005

Task with the most failures(4):
-
Task ID:
   task_201405220937_0005_m_00

URL:
http://master:50030/taskdetails.jsp?jobid=job_201405220937_0005tipid=task_201405220937_0005_m_00
-
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row
{id:467649114269028352,created_at:Sat May 17 12:53:32 + 2014,source:a 
href=\http://www.apple.com\;
rel=\nofollow\iOS/a,favorited:false,retweet_count:0,retweeted_status:null,entities:{urls:[{expanded_url:https://www.google.co.uk/search?q=viera+lifting+fa+cup+for+arsenalclient=safarihl=en-gbsource=lnmstbm=ischsa=Xei=alp3U4LrK82UOufRgaAKved=0CAYQ_AUoAQbiw=320bih=460#facrc=_imgrc=-QK5IhwQT8AS8M%253A%3B2yfUWG7-cngS-M%3Bhttp%253A%252F%252Ffarm4.static.flickr.com%252F3777%252F10838930556_618b87d7cc_m.jpg%3Bhttp%253A%252F%252Fwhotalking.com%252Fflickr%252F2-0%252BChelsea%3B1024%3B669}],user_mentions:[{screen_name:jameswilliams46,name:James
Williams}],hashtags:[]},text:@jameswilliams46
https://t.co/CAmxdagAym,user:{screen_name:SSimpson7379,name:Stephen
Simpson,friends_count:258,followers_count:266,statuses_count:2609,verified:false,utc_offset:3600,time_zone:London},in_reply_to_screen_name:jameswilliams46}
 at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
{id:467649114269028352,created_at:Sat May 17 12:53:32 + 2014,source:a 
href=\http://www.apple.com\;

Analyzers and char_filters o_0 creepy outputs

2014-05-22 Thread georgi . mateev
Hi! This is a sample setup, close to what I am working with

https://gist.github.com/anonymous/6e1457321a8ad78c6af8

As you can see, I am trying to remove the hyphens from all words, so that 
words like hand-made are indexed as handmade. The goal is to make a 
search for handmade find all documents containing hand-made, and vice 
versa.
For some reason it doesn't work, though :(

I have also attached 3 sample queries. The expected result would be for all 
of them to return the same result set. 
1) Astonishingly, a search for Chemie-injenieur finds 2 results, but a 
search for Chemieingenieur finds none. This is pretty creepy to me, since 
the char_filter is supposed to strip the hyphens prior to tokenizing in the 
indexing process.
2) Another creepy fact is that if I specify the searchAnalyzer explicitly, 
I find no results (see query 3) from this document set
3) Moreover, the analyze API shows that the search term Chemie-ingenieur 
gets translated to Chemieingenieur using this analyzer

4) And the creepiest fact is that when I run these queries with the 
actual index data (800+ documents), I get 17 results for Chemie-ingenieur 
and 22 for Chemieingenieur, where NONE OF THEM OVERLAP. I.e. I have a 
total of 39 documents that should be matching either of the queries. Some 
of the documents that match Chemie-ingenieur actually don't contain the 
word with the hyphen. So I would expect these documents to be contained in 
both result sets, maybe with a different relevancy score. This is, however, 
not the case.

Please help me get over this, I have been struggling with it for a full 
week already. I would be very grateful for some explanation too, apart from 
a solution, since the output is much different from what I expect based on my 
understanding, and this means that I don't really understand the system.

P.S. Please focus on the actual problem and let's not discuss the mapping 
in detail. The version I have pasted is pretty different from what I 
started with initially, due to the trial-and-error approach I have been 
using for almost a week.

Thanks sincerely,
Georgi

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/417363d0-965f-4398-8174-9889db47d50b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Wait for yellow status

2014-05-22 Thread Ivan Brusic
While doing some tests, I thought I uncovered a bug in the
cluster-health/wait-for-yellow request. No matter what settings I tried,
the request would always return immediately with no timeout. I then
realized that the request is actually something like wait for AT LEAST
yellow state. In other words, a GREEN state would satisfy a wait for
YELLOW state.
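
(For reference, the request in question is the cluster health API with a wait_for_status parameter, e.g.:

curl -XGET 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s'

which returns as soon as the status is yellow or better, or once the timeout expires.)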

The code in question

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/action/admin/cluster/health/TransportClusterHealthAction.java#L132

*response.getStatus().value() = request.waitForStatus().value()*

Obviously few people are interested in waiting for when a cluster goes from
GREEN to YELLOW, but I am one of those few. I would love to propose a true
wait-for-yellow (and even red), but it would be difficult to come up with
an appropriately named call since the waitForYellow is taken and should be
renamed (breaking code in the process).

I am assuming the current code is a feature and not a bug, in which case
the documentation should reflect that thinking (willing to fix). But in the
meantime, how can one be notified if a cluster changes its state from GREEN
without constant polling?

Cheers,

Ivan

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQAfpmsHe%2BfiRZ6_OD%2BV9bGO_nsoPPZn8w%3DgWBvJdLGzFg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Unassigned Shards Problem

2014-05-22 Thread Brian Wilkins
I have five nodes : Two Master Nodes, One Balancer Node, One Workhorse 
Node, and One Coordinator Node.

I am shipping events from logstash, via redis, to elasticsearch.

At the moment, my cluster is RED. The shards are created but no index is 
created. I used to get an index like logstash.2014-05-22, but not anymore.

I deleted all my data, Cluster health goes GREEN.

However, as soon as data is sent from logstash -> redis -> elasticsearch, 
my cluster health goes RED. I end up with unassigned shards. In my 
/var/log/elasticsearch/logstash.log on my master, I see this in the log:

[2014-05-22 12:03:20,599][INFO ][cluster.metadata ] [Bora] 
[logstash-2014.05.22] creating index, cause [auto(bulk api)], shards 
[5]/[1], mappings [_default_]

On my master, this is the configuration:

cluster:
  name: logstash
  routing:
allocation:
  awareness:
attributes: rack
node:
  data: true
  master: true

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "logstash",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 10
}

Is there an incorrect setting? I also installed ElasticHQ. It tells me the 
same information.

Brian

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/03c5974b-ae50-4f1c-9ba3-4ef94b564323%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: how to get only aggregation values from elasticsearch

2014-05-22 Thread Ivan Brusic
You can set the size to 0.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-from-size.html

You will still get back the search metadata though.
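
For example, something like the following (reusing the query and aggregation from your message, trimmed here for brevity):

POST /virtualmachines/_search
{
    "size" : 0,
    "query" : { ... },
    "aggs" : {
        "memory_avg" : { "avg" : { "field" : "MEMORY" } }
    }
}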

-- 
Ivan


On Thu, May 22, 2014 at 4:46 AM, Subhadip Bagui i.ba...@gmail.com wrote:

 Hi,

 I want to get the average value of the MEMORY field from my ES documents. Below
 is the query I'm using for that. Here I'm getting the aggregation along
 with the hits JSON as well. Is there any way we can get the aggregation result
 only? Please suggest.

 POST /virtualmachines/_search
 {
     "query" : {
         "filtered" : {
             "query" : { "match" : {
                 "CLOUD_TYPE" : "CLOUDSTACK"
             }},
             "filter" : {
                 "range" : { "NODE_CREATE_TIME" : { "from" : "2014-05-22 14:11:35", "to" : "2014-05-22 14:33:35" }}
             }
         }
     },
     "aggs" : {
         "memory_avg" : { "avg" : { "field" : "MEMORY" } }
     }
 }

 *response* :
 {
took: 2,
timed_out: false,
_shards: {
   total: 3,
   successful: 3,
   failed: 0
},
hits: {
   total: 1,
   max_score: 1,
   hits: [
  {
 _index: virtualmachines,
 _type: nodes,
 _id: 102,
 _score: 1,
 _source: {
NODE_ID: 12235,
CLOUD_TYPE: CLOUDSTACK,
NODE_GROUP_NAME: JBOSS,
NODE_CPU: 4GHZ,
NODE_HOSTNAME: cloud.aricent.com,
NODE_NAME: cloudstack-node1,
NODE_PRIVATE_IP_ADDRESS: 10.123.124.125,
NODE_PUBLIC_IP_ADDRESS: 125.31.108.72,
NODE_INSTANCE_ID: cloudstack112,
NODE_STATUS: ACTIVE,
NODE_CATEGORY_ID: 13,
NODE_CREATE_TIME: 2014-05-22 14:23:04,
CPU_SPEED: 500,
MEMORY: 512,
CPU_USED: 0.03%
 }
  }
   ]
},
aggregations: {
   memory_avg: {
  value: 512
   }
}
 }

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/a8538769-7ba5-4ad5-b190-5f0b8d25a26b%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/a8538769-7ba5-4ad5-b190-5f0b8d25a26b%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQCfKh_qbSiEJtj_%3DS%3DZ5Q43Lw3V4egRVEdsvz46FvvqPw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Erik Rose
Martijn took a swing at it just now. He eliminated any scoring-based 
slowdown, like so (constant_score_filter)…

curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d '{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "and": [
                    {
                        "query": {
                            "match_phrase": {
                                "content_trg": "Children"
                            }
                        }
                    },
                    {
                        "query": {
                            "match_phrase": {
                                "content_trg": "Next"
                            }
                        }
                    },
                    {
                        "query": {
                            "wildcard": {
                                "content": {
                                    "wildcard": "*Children*Next*",
                                    "rewrite": "constant_score_filter"
                                }
                            }
                        }
                    }
                ]
            }
        }
    }
}'

…but it didn't make any difference. Somehow, the `and` pipeline isn't 
behaving as we expect. Since ES can't provide any more detailed timing 
output, I guess the next step is to go look at the source code for the `and` 
filter and the wildcard query and see what's what.

I think we'd both be fascinated to know what's going on, if anyone has 
anything to add.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3114f40c-0b15-4dd4-8a6b-fc8c13d43f23%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Robert Muir
On Wed, May 21, 2014 at 6:01 PM, Erik Rose grinche...@gmail.com wrote:
 I'm trying to move Mozilla's source code search engine (dxr.mozilla.org)
 from a custom-written SQLite trigram index to ES. In the current production
 incarnation, we support fast regex (and, by extension, wildcard) searches by
 extracting trigrams from the search pattern and paring down the documents to
 those containing said trigrams.

This is definitely a great approach for a database, but it won't work
exactly the same way for an inverted index because the datastructure
is totally different.

In the inverted index queries like wildcards are slow: they must
iterate and match all terms in the document collection, then intersect
those postings with the rest of your query. So because its inverted,
it works backwards from what you expect and thats why adding
additional intersections like 'AND' don't speed anything up, they
haven't happened yet.

N-grams can speed up partial matching in general, but the methods to
accomplish this are different: usually the best way to go about it is
to try to think about Analyzing the data in such a way that the
queries to accomplish what you need are as basic as possible.

The first question is if you really need partial matching at all: I
don't have much knowledge about your use case, but just going from
your example, I would look at wildcards like *Children*Next* and ask
if instead I'd want to ensure my analyzer split on case changes, and
try to see if I could get what I need with a sloppy phrase query.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAMUKNZUS40rsAjmzrL_YK6yjgjZRumeQKFVPhVu9bUcW4nN_KA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch 1.20 and 1.1.2

2014-05-22 Thread Ivan Brusic
Releases for some reason never get promoted on the mailing list, so here
goes:

http://www.elasticsearch.org/blog/elasticsearch-1-2-0-released/

The main reason why I posted about the release was because I tested out
cross-version cluster compatibility with 1.1.1 and 1.2.0 nodes and
everything seems to be working. As someone that had to endure full cluster
restarts in the past, this news is welcome. I do know that rolling upgrades
have been available since 1.0, but 1.1 was released so shortly after 1.0 that
they seemed like the same minor version.

Cheers,

Ivan

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQD5y0MpTScDq7XDWjs9M8FTGDx%3DiuYeySaf8UdD35wyUg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Matt Weber
Leading wildcards are really expensive.  Maybe you can try creating a copy
of your content field that reverses the tokens using reverse token filter
[1].  By doing this you turn those expensive leading wildcards into
trailing wildcards which should give you better performance.  I think your
query would look something like this:

{
  "query": {
    "constant_score": {
      "query": {
        "bool": {
          "should": [
            {"wildcard": {"content": "Children*Next*"}},
            {"wildcard": {"content_rev": "txeN*nerdlihC*"}}
          ]
        }
      }
    }
  }
}

Note that you will need to reverse your query string as the wildcard query
is not analyzed.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-reverse-tokenfilter.html#analysis-reverse-tokenfilter
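
A rough sketch of the index setup this implies; the analyzer name "reversed" and the copy_to wiring are assumptions for illustration, not something from the existing mapping:

"settings" : {
    "analysis" : {
        "analyzer" : {
            "reversed" : {
                "type" : "custom",
                "tokenizer" : "keyword",
                "filter" : [ "reverse" ]
            }
        }
    }
},
"mappings" : {
    "line" : {
        "properties" : {
            "content"     : { "type" : "string", "index" : "not_analyzed", "copy_to" : "content_rev" },
            "content_rev" : { "type" : "string", "analyzer" : "reversed" }
        }
    }
}

The keyword tokenizer keeps the whole line as one term, so the reverse filter stores the entire string reversed, which is what the reversed wildcard query above expects.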

Thanks,
Matt Weber


On Thu, May 22, 2014 at 11:09 AM, Erik Rose grinche...@gmail.com wrote:

 Martijn took a swing at it just now. He eliminated any scoring-based
 slowdown, like so (constant_score_filter)…

 curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d
 '{
 query: {
 filtered: {
 query: {
 match_all: {}
 },
 filter: {
 and: [
 {
 query: {
 match_phrase: {
 content_trg: Children
 }
 }
 },
 {
 query: {
 match_phrase: {
 content_trg: Next
 }
 }
 },
 {
 query: {
 wildcard: {
 content: {
 wildcard: *Children*Next*,
 rewrite: constant_score_filter
 }
 }
 }
 }
 ]
 }
 }
 }
 }'

 …but it didn't make any difference. Somehow, the `and` pipeline isn't
 behaving as we expect. Since ES can't provide any more detailed timing
 ouput, I guess the next step is to go look at the source code for the `and`
 filter and the wildcard query and see what's what.

 I think we'd both be fascinated to know what's going on, if anyone has
 anything to add.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/3114f40c-0b15-4dd4-8a6b-fc8c13d43f23%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/3114f40c-0b15-4dd4-8a6b-fc8c13d43f23%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
Thanks,
Matt Weber

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJ3KEoA1fQjbkygEBhxZdMcb%3D22JGDph65qNn1cvkE66NLRn3A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Erik Rose
Leading wildcards are really expensive.  Maybe you can try creating a copy 
of your content field that reverses the tokens using reverse token filter 
[1].

Good advice, typically, but notice I have wildcards on either side. 
Reversing just makes the trailing wildcard expensive. :-)

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6b982d49-5f3b-472a-84ea-cfd4f2c662d6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Itamar Syn-Hershko
Aye, and then you can use edit distance on single words (fuzzy query) to
cope with fast typers
On May 22, 2014 8:22 PM, Robert Muir robert.m...@elasticsearch.com
wrote:

 On Wed, May 21, 2014 at 6:01 PM, Erik Rose grinche...@gmail.com wrote:
  I'm trying to move Mozilla's source code search engine (dxr.mozilla.org)
  from a custom-written SQLite trigram index to ES. In the current
 production
  incarnation, we support fast regex (and, by extension, wildcard)
 searches by
  extracting trigrams from the search pattern and paring down the
 documents to
  those containing said trigrams.

 This is definitely a great approach for a database, but it won't work
 exactly the same way for an inverted index because the datastructure
 is totally different.

 In the inverted index queries like wildcards are slow: they must
 iterate and match all terms in the document collection, then intersect
 those postings with the rest of your query. So because its inverted,
 it works backwards from what you expect and thats why adding
 additional intersections like 'AND' don't speed anything up, they
 haven't happened yet.

 N-grams can speed up partial matching in general, but the methods to
 accomplish this are different: usually the best way to go about it is
 to try to think about Analyzing the data in such a way that the
 queries to accomplish what you need are as basic as possible.

 The first question is if you really need partial matching at all: I
 don't have much knowledge about your use case, but just going from
 your example, i would look at wildcards like *Children*Next* and ask
 if instead i'd want to ensure my analyzer split on case-changes, and
 try to see if i could get what i need with a sloppy phrase query.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAMUKNZUS40rsAjmzrL_YK6yjgjZRumeQKFVPhVu9bUcW4nN_KA%40mail.gmail.com
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZugwATMVFH%3DFziTPkX-dT6%3DRGfwhCud2S_aBcSDYmxZEA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Erik Rose
 This is definitely a great approach for a database, but it won't work
 exactly the same way for an inverted index because the datastructure
 is totally different.

Ah, I was afraid of that. I hoped, due to the field being unanalyzed (and 
the documentation's noted restriction that wildcard queries work only on 
unanalyzed fields) that the wildcard query would apply to the fetched field 
contents rather than crawling the whole index. But your theory certainly 
fits the timings I've been getting. Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3d523ff3-5292-4ac5-9ec8-7df8520de58b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Trigram-accelerated regex searches

2014-05-22 Thread Erik Rose
Alright, try this on for size. :-)

Since the built-in regex-ish filters want to be all clever and index-based, 
why not use the JS script plugin, which is happy to run as a 
post-processing phase?

curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d '{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "and": [
                    {
                        "query": {
                            "match_phrase": {
                                "content_trg": "Children"
                            }
                        }
                    },
                    {
                        "query": {
                            "match_phrase": {
                                "content_trg": "Next"
                            }
                        }
                    },
                    {
                        "script": {
                            "lang": "js",
                            "script": "(new RegExp(pattern)).test(doc[\"content\"].value)",
                            "params": {
                                "pattern": "Children.*Next"
                            }
                        }
                    }
                ]
            }
        }
    }
}'

That gets me through the whole 16M-doc corpus in 117ms. (Without the 
match_phrase queries, it takes forever, at 12s, so you can see the trigram 
acceleration working.) I am ecstatic.

Some of you might note that the pattern doesn't begin or end with a 
wildcard; that's because RegExp.test() serves as a search rather than a 
match, so wildcards are effectively assumed.

Cheers!

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6a7674d8-bf51-4be6-860e-589db3939ea8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Nested cardinality values way off with filter?

2014-05-22 Thread Phil Price
Hello,

I'm trying to produce the distribution of documents that match vs 
don't match a query, and get the cardinality of a field for both sets.  The 
idea is Users who did vs Users who did not. In reality I'm actually 
running another aggregation under did not (otherwise I'd just subtract 
one count from the total), but the query here illustrates the issue I'm 
having:

*Query*

"aggs": {
    "total_distinct_count": { "cardinality": { "field": "UserId" } },
    "has_thing": {
        "filter": { "term": { "State": "thing" } },
        "aggs": {
            "distinct_count": { "cardinality": { "field": "UserId" } }
        }
    },
    "does_not_have_thing": {
        "filter": {
            "not": { "term": { "State": "thing" } }
        },
        "aggs": {
            "distinct_count": { "cardinality": { "field": "UserId" } }
        }
    }
}

*Response*

   hits: {
  total: 3309709,
  max_score: 0,
  hits: []
   },
   aggregations: {
  total_distinct_count: {
 value: 654556
  },
  does_not_have_thing: {
 doc_count: 2575512,
 distinct_count: {
value: 563371
 }
  },
  has_thing: {
 doc_count: 734197,
 distinct_count: {
value: 223128
 }
  }
   }

I would expect (aggregations.has_thing.distinct_count.value + 
aggregations.does_not_have_thing.distinct_count.value) to be close to 
aggregations.total_distinct_count.value, but in reality it's pretty far off 
(~+20%). Note that the summation of doc_count adds up exactly to 
hits.total. So I don't think this is an issue with the query, but I could 
be wrong.

Any ideas what's up? Have I structured the query incorrectly, is this a bug, 
or is this just expected behavior?

Some notes:

   - UserId's data type is a *long*, but the values only fill up integer 
   space (510,539 to 418,346,844).
   - I'm running elasticsearch 1.1.0
   - I've tried playing around with the precision threshold (set as sketched 
   below), but it doesn't appear to make a difference.
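
(For reference, the precision_threshold knob goes directly on the cardinality aggregation; a sketch, the value is only an example:

"total_distinct_count": { "cardinality": { "field": "UserId", "precision_threshold": 40000 } }
)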

Thanks in advance,
Cheers
Phil 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cb558261-7865-491e-9bc5-e3f78b6390f3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Unassigned Shards Problem

2014-05-22 Thread Mark Walkom
It does create an index, it says so in the log - [logstash-2014.05.22]
creating index - it's just not assigning things.

You've set routing.allocation.awareness.attribute, but have you set the
node value, ie node.rack?
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness
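
Roughly, each node needs both pieces in its elasticsearch.yml (the rack name here is just an example):

# per-node attribute used by awareness
node.rack: rack_one

# awareness attribute the allocator looks at
cluster.routing.allocation.awareness.attributes: rack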

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 23 May 2014 02:22, Brian Wilkins bwilk...@gmail.com wrote:

 I have five nodes : Two Master Nodes, One Balancer Node, One Workhorse
 Node, and One Coordinator Node.

 I am shipping events from logstash, redis, to elasticsearch.

 At the moment, my cluster is RED. The shards are created but no index is
 created. I used to get an index like logstash.2014-05-22, but not anymore.

 I deleted all my data, Cluster health goes GREEN.

 However, as soon as data is sent from logstash - redis - elasticsearch,
 my cluster health goes RED. I end up with unassigned shards. In my
 /var/log/elasticsearch/logstash.log on my master, I see this in the log:

 [2014-05-22 12:03:20,599][INFO ][cluster.metadata ] [Bora]
 [logstash-2014.05.22] creating index, cause [auto(bulk api)], shards
 [5]/[1], mappings [_default_]

 On my master, this is the configuration:

 cluster:
   name: logstash
   routing:
 allocation:
   awareness:
 attributes: rack
 node:
   data: true
   master: true

 curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
 {
   cluster_name : logstash,
   status : red,
   timed_out : false,
   number_of_nodes : 5,
   number_of_data_nodes : 3,
   active_primary_shards : 0,
   active_shards : 0,
   relocating_shards : 0,
   initializing_shards : 0,
   unassigned_shards : 10
 }

 Is there an incorrect setting? I also installed ElasticHQ. It tells me the
 same information.

 Brian

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/03c5974b-ae50-4f1c-9ba3-4ef94b564323%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/03c5974b-ae50-4f1c-9ba3-4ef94b564323%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YGofrc55ND%3DXb3QW9cu799R0AKjB6_5Nmse_rMc9qcAw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Is it wise to use ES for saving shopping Carts?

2014-05-22 Thread Mark Walkom
ES is eventually consistent, so it may not make sense if your latency
requirements are very strict.
If you can introduce a delay then it should work.
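
(If you do go this route, the refresh API can make just-indexed cart documents searchable on demand instead of waiting for the next automatic refresh; a sketch with "carts" as an assumed index name, and note that refreshing very frequently is expensive:

curl -XPOST 'http://localhost:9200/carts/_refresh'
)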

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 22 May 2014 23:54, Matthias Feist matf...@gmail.com wrote:

 Hi Guys,

 I'm working on an online shop. Currently we are storing the cart's content
 in a MySQL Database so we can very easy access the amount of a certain
 product and determine the reserved quantity.
 This is very important as the amount in the user's carts is reserved so
 other users my not by them.

 What do you think: Is it wise to implement such a system in elasticsearch?
 I'm mostly worried about the time between the add to cart (inserting a
 document) and being able to access the total value due to the flushing
 delay.

 Thanks for your advice.

 Kind regards
 Matthias

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/0c48d418-b8b9-4d9a-b659-5aca07e60eb4%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/0c48d418-b8b9-4d9a-b659-5aca07e60eb4%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624ZM_fk9i49SpQtRmb2rr5Xy6WV3omwsCB8baBh4tic3Bw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch 1.20 and 1.1.2

2014-05-22 Thread joergpra...@gmail.com
Plugin developers should watch out for changes in classes, e.g.
XContentRestResponse (useful for REST actions) has gone, and there are some
internal API changes in IndexShard methods, also new deprecations
(IndicesStatusAction is now RecoveryAction) - maybe more I did not
recognize yet in my compile error logs...

Besides this, I just managed to switch the developer stage from 1.1.0 to 1.2.0
with green status :)  Tomorrow, heavy load test.

Jörg



On Thu, May 22, 2014 at 8:24 PM, Ivan Brusic i...@brusic.com wrote:

 Releases for some reason never get promoted on the mailing list, so here
 goes:

 http://www.elasticsearch.org/blog/elasticsearch-1-2-0-released/

 The main reason why I posted about the release was because I tested out
 cross-version cluster compatibility with 1.1.1 and 1.2.0 nodes and
 everything seems to be working. As someone that had to endure full cluster
 restarts in the past, this news is welcome. I do know that rolling upgrade
 have been available since 1.0, but 1.1 was released shortly after 1.0 that
 they seemed like the same minor version.

 Cheers,

 Ivan

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQD5y0MpTScDq7XDWjs9M8FTGDx%3DiuYeySaf8UdD35wyUg%40mail.gmail.comhttps://groups.google.com/d/msgid/elasticsearch/CALY%3DcQD5y0MpTScDq7XDWjs9M8FTGDx%3DiuYeySaf8UdD35wyUg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoESs5iNQ%3DkB%2BdyKiGuyC9GeVu607Kp%2BquS4RX6pn0ZLAQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Unassigned Shards Problem

2014-05-22 Thread Brian Wilkins
Thanks for your reply. I set the node.rack to rack_one on all the nodes as 
a test. In ElasticHQ, on the right it shows no indices. It is empty. In my 
master, I see that the nodes are identifying with rack_one (all of them). 

Any other clues?

Thanks

Brian

On Thursday, May 22, 2014 5:10:25 PM UTC-4, Mark Walkom wrote:

 It does create an index, it says so in the log - [logstash-2014.05.22] 
 creating index - it's jut not assigning things.

 You've set routing.allocation.awareness.attribute, but have you set the 
 node value, ie node.rack?
 See 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com javascript:
 web: www.campaignmonitor.com


 On 23 May 2014 02:22, Brian Wilkins bwil...@gmail.com javascript:wrote:

 I have five nodes : Two Master Nodes, One Balancer Node, One Workhorse 
 Node, and One Coordinator Node.

 I am shipping events from logstash, redis, to elasticsearch.

 At the moment, my cluster is RED. The shards are created but no index is 
 created. I used to get an index like logstash.2014-05-22, but not anymore.

 I deleted all my data, Cluster health goes GREEN.

 However, as soon as data is sent from logstash - redis - elasticsearch, 
 my cluster health goes RED. I end up with unassigned shards. In my 
 /var/log/elasticsearch/logstash.log on my master, I see this in the log:

 [2014-05-22 12:03:20,599][INFO ][cluster.metadata ] [Bora] 
 [logstash-2014.05.22] creating index, cause [auto(bulk api)], shards 
 [5]/[1], mappings [_default_]

 On my master, this is the configuration:

 cluster:
   name: logstash
   routing:
 allocation:
   awareness:
 attributes: rack
 node:
   data: true
   master: true

 curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
 {
   cluster_name : logstash,
   status : red,
   timed_out : false,
   number_of_nodes : 5,
   number_of_data_nodes : 3,
   active_primary_shards : 0,
   active_shards : 0,
   relocating_shards : 0,
   initializing_shards : 0,
   unassigned_shards : 10
 }

 Is there an incorrect setting? I also installed ElasticHQ. It tells me 
 the same information.

 Brian

  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/03c5974b-ae50-4f1c-9ba3-4ef94b564323%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/03c5974b-ae50-4f1c-9ba3-4ef94b564323%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.






Re: Elasticsearch 1.20 and 1.1.2

2014-05-22 Thread Mark Walkom
Hurray!

However they are still using the new release method of one repository path per
minor version, so if you want 1.2 you will need to point your sources to
http://packages.elasticsearch.org/elasticsearch/1.2/$OS
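
For example, on a Debian/Ubuntu system the sources entry would look roughly
like this (a sketch; the file name is arbitrary and the repository GPG key is
assumed to be installed already):

# /etc/apt/sources.list.d/elasticsearch.list
deb http://packages.elasticsearch.org/elasticsearch/1.2/debian stable main

After updating it, apt-get update and apt-get install elasticsearch as usual.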

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 23 May 2014 04:24, Ivan Brusic i...@brusic.com wrote:

 Releases for some reason never get promoted on the mailing list, so here
 goes:

 http://www.elasticsearch.org/blog/elasticsearch-1-2-0-released/

 The main reason why I posted about the release was because I tested out
 cross-version cluster compatibility with 1.1.1 and 1.2.0 nodes and
 everything seems to be working. As someone that had to endure full cluster
 restarts in the past, this news is welcome. I do know that rolling upgrades
 have been available since 1.0, but 1.1 was released so shortly after 1.0 that
 they seemed like the same minor version.

 Cheers,

 Ivan





Re: Unassigned Shards Problem

2014-05-22 Thread Brian Wilkins
Went back and read the page again. So, for testing, I made one master, the 
workhorse, and the balancer use a rackid of rack_two. One master shows a rackid 
of rack_one. All nodes were restarted. The shards are still unassigned. Also, the 
indices in ElasticHQ are empty.
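
A quick way to cross-check this (a sketch, assuming the default HTTP port and
that rack is the awareness attribute) is to ask every node for its attributes
and to list the shards with their allocation state:

# show each node's attributes, including node.rack
curl -XGET 'http://localhost:9200/_nodes?pretty'
# list every shard with its state (STARTED/UNASSIGNED) and assigned node
curl -XGET 'http://localhost:9200/_cat/shards?v'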



Re: Nested cardinality values way off with filter?

2014-05-22 Thread Adrien Grand
On Thu, May 22, 2014 at 10:34 PM, Phil Price 
philpr...@gmail.com wrote:

 I would expect (aggregations.has_thing.distinct_count.value +
 aggregations.does_not_have_thing.distinct_count.value) to be close to
 aggregations.total_distinct_count.value, but in reality it's pretty far off


I think this result is to be expected if you have some user IDs that match
both criteria? Eg. if your index has these two documents:

{
  "UserId" : 42,
  "State" : "thing"
}

{
  "UserId" : 42,
  "State" : "anything"
}

Then your aggregations would look like:

"aggregations": {
  "total_distinct_count": {
    "value": 1
  },
  "does_not_have_thing": {
    "doc_count": 1,
    "distinct_count": {
      "value": 1
    }
  },
  "has_thing": {
    "doc_count": 1,
    "distinct_count": {
      "value": 1
    }
  }
}


And the sum of the values of distinct_count per bucket is larger than the
global value for distinct_count.
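
For anyone reproducing this, a request along these lines (a sketch; the index,
field and filter values are taken from the example above and may not match the
real mapping) shows the effect:

curl -XGET 'http://localhost:9200/myindex/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "total_distinct_count": {
      "cardinality": { "field": "UserId" }
    },
    "has_thing": {
      "filter": { "term": { "State": "thing" } },
      "aggs": {
        "distinct_count": { "cardinality": { "field": "UserId" } }
      }
    },
    "does_not_have_thing": {
      "filter": { "not": { "term": { "State": "thing" } } },
      "aggs": {
        "distinct_count": { "cardinality": { "field": "UserId" } }
      }
    }
  }
}'

A user that appears in both buckets is counted once in each, so the two bucket
values can legitimately add up to more than total_distinct_count.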

-- 
Adrien Grand



Re: time taken by each stage for a query

2014-05-22 Thread Adrien Grand
That is not easy, and the reason is that Elasticsearch and Solr work in
quite different ways, e.g. when it comes to computing facets/aggregations:
Solr first computes the top hits, and if facets are required, it loads the
doc IDs of the matching documents into a bit set that is used in a
subsequent step to compute the facets. On the other hand,
Elasticsearch computes both top hits and facets/aggregations at the same
time (in the same Collector, if you are familiar with Lucene terminology),
which makes per-stage timings harder to track.


On Thu, May 22, 2014 at 6:25 AM, Srinivasan Ramaswamy ursva...@gmail.comwrote:

 In the past when I used Solr, I could look at the time taken by each component
 to understand where most of the time was spent for a particular query.
 Similarly, I am trying to understand the breakdown of time spent for one
 particular query. Can anyone tell me how I can investigate the performance of
 specific queries in Elasticsearch?

 Thanks
 Srini





-- 
Adrien Grand



Re: Is it wise to use ES for saving shopping Carts?

2014-05-22 Thread Adrien Grand
On Thu, May 22, 2014 at 3:54 PM, Matthias Feist matf...@gmail.com wrote:

 What do you think: Is it wise to implement such a system in elasticsearch?
 I'm mostly worried about the time between the add-to-cart action (inserting a
 document) and being able to access the total value, due to the flushing
 delay.


For your information, this flushing delay only exists for search
operations. We call search a near-realtime operation because of this delay
that you need to wait between indexing a document and being able to
search for it (1 second by default).

However, Elasticsearch doesn't only have a search API; it also has a GET API
that is realtime[1] and basically allows you to use Elasticsearch as a
key-value store.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html#realtime
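
A minimal sketch of that pattern (index name, type and document fields are made
up for illustration): write the cart with the index API and read it back
immediately with the GET API, without waiting for a refresh.

# store or overwrite the cart for user 42
curl -XPUT 'http://localhost:9200/shop/cart/42' -d '{
  "items": [ { "sku": "A-1", "qty": 2, "price": 9.95 } ],
  "total": 19.90
}'
# realtime read: returns the latest version even before the next refresh
curl -XGET 'http://localhost:9200/shop/cart/42?pretty'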

-- 
Adrien Grand



Re: Nested cardinality values way off with filter?

2014-05-22 Thread Phil Price
Doh! You are correct, my bad. I assumed the filter was an exclusive per-user 
property, but in fact it is not. 

Thanks for getting back to me
Cheers
Phil

On Thursday, May 22, 2014 4:36:02 PM UTC-7, Adrien Grand wrote:

 On Thu, May 22, 2014 at 10:34 PM, Phil Price 
 phil...@gmail.com wrote:

 I would expect (aggregations.has_thing.distinct_count.value + 
 aggregations.does_not_have_thing.distinct_count.value) to be close to 
 aggregations.total_distinct_count.value, but in reality it's pretty far off


 I think this result is to be expected if you have some user IDs that match 
 both criteria? Eg. if your index has these two documents:

 {
   "UserId" : 42,
   "State" : "thing"
 }

 {
   "UserId" : 42,
   "State" : "anything"
 }

 Then your aggregations would look like:

 "aggregations": {
   "total_distinct_count": {
     "value": 1
   },
   "does_not_have_thing": {
     "doc_count": 1,
     "distinct_count": {
       "value": 1
     }
   },
   "has_thing": {
     "doc_count": 1,
     "distinct_count": {
       "value": 1
     }
   }
 }


 And the sum of the values of distinct_count per bucket is larger than the 
 global value for distinct_count.

 -- 
 Adrien Grand
  



Re: why the special nested aggregation and query?

2014-05-22 Thread Adrien Grand
Although I would agree that being able to detect it automatically could
make things simpler, I think the fact that it is explicit is more
flexible. For example, it can make sense to copy field values into the root
document[1]. This can help speed up some queries that don't need to know
about the tree structure of your document. And in that case you have two
ways to search the same field name:
 - either through the root document: faster but less flexible
 - or through the nested documents: more flexible but slower

The fact that nested queries are explicit allows you to choose the way that
you want the field to be queried.

For aggregations, I think it is also nice to make it explicit so that
counts are not surprising: imagine that you have documents whose properties
are stored as nested documents, each property having a name. If you run a
terms aggregation on the property name from the root document, buckets will
count how many root documents have this property name. On the other hand,
if you run this terms aggregation through the nested documents, it will count
the number of _properties_ that have this name. Since each document can
have several properties, counts might be much higher.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html
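
A rough sketch of both points (index, type and field names are invented;
include_in_root is the nested-type mapping option referred to in [1]):

# mapping: properties are nested documents, and their values are also copied
# into the root document
curl -XPUT 'http://localhost:9200/catalog' -d '{
  "mappings": {
    "item": {
      "properties": {
        "props": {
          "type": "nested",
          "include_in_root": true,
          "properties": {
            "name":  { "type": "string", "index": "not_analyzed" },
            "value": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}'

# terms agg on the root copy: buckets count root documents per property name
curl -XGET 'http://localhost:9200/catalog/_search?search_type=count' -d '{
  "aggs": { "by_name_root": { "terms": { "field": "props.name" } } }
}'

# terms agg through the nested path: buckets count nested property documents
curl -XGET 'http://localhost:9200/catalog/_search?search_type=count' -d '{
  "aggs": {
    "props": {
      "nested": { "path": "props" },
      "aggs": { "by_name_nested": { "terms": { "field": "props.name" } } }
    }
  }
}'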




On Wed, May 21, 2014 at 9:55 PM, Jad jna...@gmail.com wrote:

 Hi All,

 I've been thinking about why nested fields need to be handled with a
 special nested query and aggregation type. Is it to handle the case where
 there are multiple nested levels, to be able to control whether a query
 involving two nested fields is within the same nested instance or across
 two nested instances?

 Thanks
 Jad.





-- 
Adrien Grand



Re: _search/scroll?search_type=scan bugs/inconsistencies

2014-05-22 Thread Adrien Grand
scan is mainly useful as a way to export data from the index. In the
context of a user interface, I think a regular scroll would make more sense[1].
On a side note, paging improved significantly for scroll requests in
Elasticsearch 1.2 (in terms of both speed and memory usage).

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html
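
A minimal sketch of a plain scroll (index name and page size are invented; the
body of the second request is the _scroll_id returned by the previous
response):

# first page: an ordinary search with a scroll timeout
curl -XGET 'http://localhost:9200/myindex/_search?scroll=1m&size=50' -d '{
  "query": { "match_all": {} }
}'

# subsequent pages: pass the _scroll_id from the previous response as the body
curl -XGET 'http://localhost:9200/_search/scroll?scroll=1m' -d 'c2Nhbjs1OzE6...'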


On Wed, May 21, 2014 at 8:28 PM, schmichael mich...@lytics.io wrote:

 On Wednesday, May 21, 2014 11:20:53 AM UTC-7, schmichael wrote:

 ...
 2. search_type=scan doesn't seem to honor size=N


 It seems I missed this in the guide:

 When scanning, the size is applied to each shard, so you will get back a
 maximum of size * number_of_primary_shards documents in each batch.

 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html

 ...but that only seems to be the case with the scan search_type. Do I just
 have to divide the user's requested page size by my number of shards (5 at
 this point)?





-- 
Adrien Grand



Re: indices.cache.filter.size limit not enforce ?

2014-05-22 Thread Adrien Grand
For those who come to this thread through a search engine, Dan found
the root cause of this issue:
https://github.com/elasticsearch/elasticsearch/issues/6268
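
For reference, a quick way to keep an eye on the filter cache against the
configured limit (a sketch; the 30% value is only an example of the setting
being discussed):

# static setting in elasticsearch.yml
indices.cache.filter.size: 30%

# compare against the live value reported by the nodes stats API
curl -XGET 'http://localhost:9200/_nodes/stats/indices?pretty' | grep -A 3 filter_cache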


On Wed, May 21, 2014 at 8:03 PM, Daniel Low dang...@gmail.com wrote:

 Hello,

 Have there been any updates to this? We are using nodes with 256GB of RAM
 and heap sizes of 96GB, and we are also seeing this exact same issue where the
 filter cache size grows above the limit. What I also discovered was that when I
 set the filter cache size to 31.9GB or lower the limit worked fine, but
 anything above that and it did not.

 Thanks,
 Daniel


 On Friday, October 25, 2013 5:55:37 AM UTC-7, Benoît wrote:

 Hi !

 On Friday, October 25, 2013 2:06:58 PM UTC+2, Clinton Gormley wrote:


 I've never seen the filter cache limit not being enforced. If you can
 provide supporting data, i.e. the filter cache size from the nodes_stats plus
 the settings you had in place at the time, that would be helpful.


 Output of _cluster/settings and _nodes/stats?all=true in the following
 gist: https://gist.github.com/benoit-intrw/7154161

 The value is not really high right now, but 44.5gb is over 30% of committed
 heap (127.8gb):

 "filter_cache": {
   "memory_size": "44.5gb",
   "memory_size_in_bytes": 47819287444,
   "evictions": 0
 },


 I support Ivan's comment about heap size: the bigger the heap, the
 longer GC takes.  And using a heap above 32GB means the JVM can't use
 compressed pointers.  So better to run multiple nodes on one machine, using
 shard awareness to ensure that you don't have copies of the same data on
 the same machine.


 OK, I will think about it, but the machines are in production...

 Regards

 Benoît





-- 
Adrien Grand



Re: _search/scroll?search_type=scan bugs/inconsistencies

2014-05-22 Thread Michael Schurter
Thanks for the response, Adrien. I'm excited to upgrade to 1.2, but it seems
strange to me that people refer to scan vs. scroll (you're not the first),
since scan is simply a search_type that, AFAIK, can be used for any type of
search (scroll or otherwise).

It just seems strange that setting search_type=scan changes the
behavior of _search/scroll significantly (the size=N semantics differ and no
first page is returned).

On Thu, May 22, 2014 at 5:09 PM, Adrien Grand 
adrien.gr...@elasticsearch.com wrote:

 scan is mainly useful as a way to export data from the index. In the
 context of a user interface, I think a regular scroll would make more sense[1].
 On a side note, paging improved significantly for scroll requests in
 Elasticsearch 1.2 (in terms of both speed and memory usage).

 [1]
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html


 On Wed, May 21, 2014 at 8:28 PM, schmichael mich...@lytics.io wrote:

 On Wednesday, May 21, 2014 11:20:53 AM UTC-7, schmichael wrote:

 ...
 2. search_type=scan doesn't seem to honor size=N


 It seems I missed this in the guide:

 When scanning, the size is applied to each shard, so you will get back a
 maximum of size * number_of_primary_shards documents in each batch.

 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html

 ...but that only seems to be the case with the scan search_type. Do I
 just have to divide the user's requested page size by my number of shards
 (5 at this point)?





 --
 Adrien Grand





IncompatibleClassChangeError[Implementing class]

2014-05-22 Thread Olivier B
Hey!

I'm using Elasticsearch 1.1.1 on ubuntu on java 7:

java version 1.7.0_55
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1) 
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)


It's working perfectly. But when I try to upgrade to 1.2.0, Elasticsearch 
won't start:
[2014-05-23 02:32:53,812][INFO ][node ] [Larry Bodine] version[1.2.0], pid[43], build[c82387f/2014-05-22T12:49:13Z]
[2014-05-23 02:32:53,813][INFO ][node ] [Larry Bodine] initializing ...
[2014-05-23 02:32:53,853][INFO ][plugins  ] [Larry Bodine] loaded [cloud-aws, jetty, river-couchdb, lang-javascript], sites [bigdesk]
[2014-05-23 02:32:56,655][INFO ][node ] [Larry Bodine] initialized
[2014-05-23 02:32:56,655][INFO ][node ] [Larry Bodine] starting ...
[2014-05-23 02:32:56,800][INFO ][transport] [Larry Bodine] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.17.0.2:9300]}
[2014-05-23 02:32:59,846][INFO ][cluster.service  ] [Larry Bodine] new_master [Larry Bodine][QBC3No2hQYKosA-cz9Pfog][7c414f6cd187][inet[/172.17.0.2:9300]], reason: zen-disco-join (elected_as_master)
[2014-05-23 02:32:59,895][INFO ][discovery] [Larry Bodine] es-cluster/QBC3No2hQYKosA-cz9Pfog
[2014-05-23 02:32:59,939][INFO ][gateway  ] [Larry Bodine] recovered [0] indices into cluster_state
{1.2.0}: Startup Failed ...
- IncompatibleClassChangeError[Implementing class]


Any idea of what's happening?



Reverse River?

2014-05-22 Thread Tim Uckun
I would like to have a river in reverse: every time a document is inserted 
or modified, I would like to push it to another destination, like a 
database. Ideally this would be async, or maybe even in batches.

Has anybody done anything like this before?



Re: Reverse River?

2014-05-22 Thread Ivan Brusic
Some relevant comments:
https://github.com/elasticsearch/elasticsearch/issues/1242

-- 
Ivan


On Thu, May 22, 2014 at 8:45 PM, Tim Uckun timuc...@gmail.com wrote:

 I would like to have a river in reverse. Every time a document is inserted
 or modified I would like to push that into another destination like a
 database.  Ideally this would be async or maybe even in batches.

 Has anybody done anything like this before?





Elastic search High threads, 100% utilization of non heap memory.

2014-05-22 Thread srikanth ramineni
Hi Team,

We are experiencing an issue with high usage of non-heap memory and a high 
thread count. Mostly we are seeing the GC process running. We have been 
watching threads using BigDesk for the past two days, and the thread count 
keeps reaching a peak. We are not sure why it is getting this high. Two days 
back we ran the optimizer manually, and we restarted the servers, but the 
threads still show a peak. Can you please suggest what the issues might be? 
Is there any way to debug this kind of issue? Most of the time we are getting 
node not reachable and timeout errors.

[7.9gb], all_pools {[young] [3.4mb]->[64.4mb]/[266.2mb]}{[survivor] 
[33.2mb]->[33.2mb]/[33.2mb]}{[old] [2.1gb]->[2.2gb]/[7.6gb]}
[2014-05-22 20:42:53,966][WARN ][monitor.jvm  ] [Echo] 
[gc][young][2864][37] duration [2s], collections [1]/[2.7s], total 
[2s]/[23.1s], memory [2.4gb]->[2.4gb]/[7.9gb], all_pools {[young] 
[104.9mb]->[3.6mb]/[266.2mb]}{[survivor] [33.2mb]->[33.2mb]/[33.2mb]}{[old] 
[2.2gb]->[2.4gb]/[7.6gb]}
[2014-05-22 20:43:09,921][INFO ][monitor.jvm  ] [Echo] 
[gc][young][2865][46] duration [6.6s], collections [9]/[15.9s], total 
[6.6s]/[29.7s], memory [2.4gb]->[2.6gb]/[7.9gb], all_pools {[young] 
[3.6mb]->[98mb]/[266.2mb]}{[survivor] [33.2mb]->[23.9kb]/[33.2mb]}{[old] 
[2.4gb]->[2.5gb]/[7.6gb]}



Below is our configuration:

We have a 4-node cluster.
Each node has 16GB RAM; Elasticsearch is allocated 10GB RAM on each 
node.
Total size of data indexed: 450 GB
Each node has 350GB of disk space.


Thanks,
Srikanth.



Connect through proxy

2014-05-22 Thread nilsga
Is it possible to connect with the TransportClient to an ElasticSearch 
cluster via a SOCKS proxy? If yes, how?
