date:20150115

A range filter on a date field, with something like from now/d-1d to now/d+1d,
might work, I think.
If you don’t have a date field (it could be the _timestamp field if you activated
it), I’m afraid you can’t do that.
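
A minimal sketch of such a range query, written as a Python dict that could be sent as the body of a _count or _search request. The field name created_at is a hypothetical stand-in for whatever date field the documents actually carry, and now-24h is just one way to express "the last 24 hours" with date math.

```python
import json


def last_24h_query(date_field="created_at"):
    """Build a request body matching documents from the last 24 hours.

    `date_field` is a hypothetical name; use the date field of your mapping.
    """
    return {
        "query": {
            "range": {
                date_field: {
                    "gte": "now-24h",  # date math: 24 hours before now
                    "lte": "now",
                }
            }
        }
    }


if __name__ == "__main__":
    # Body for a request such as POST /<index>/_count
    print(json.dumps(last_24h_query(), indent=2))
```

Sending this body to the _count endpoint should return only the number of matching documents, which is what the question asks for.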

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet
@elasticsearchfr https://twitter.com/elasticsearchfr
@scrutmydocs https://twitter.com/scrutmydocs


--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/0882E33E-371F-4B54-BAF9-CD0BABBD7E6F%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.

Filter index to last 24h (REST)

2015-01-15 Thread Matthew

Hi all,

Is there any way to only load the last 24 hours of indices? I am trying to 
apply a query to only show the number of documents created over the last 24 
hours (over the REST API), but I have not had too much luck.

Thanks!

Re: Aggregation - Blank and date aggregation

Then it means that you want to use a date_histogram aggregation with
interval=day. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
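
As a rough sketch, the request body for such an aggregation could be built like this in Python. The aggregation name per_day is made up for illustration; createdDateTime is the field named in the question, and interval is the parameter named above (later releases may spell it differently).

```python
def daily_counts_request(date_field="createdDateTime", agg_name="per_day"):
    """Build a search body that buckets documents by calendar day.

    `agg_name` is an arbitrary, hypothetical label for the aggregation.
    """
    return {
        "size": 0,  # only the buckets are of interest, not the hits
        "aggs": {
            agg_name: {
                "date_histogram": {
                    "field": date_field,
                    "interval": "day",  # one bucket per day
                }
            }
        },
    }
```

Each bucket in the response then carries a day and the number of documents whose date-time falls on that day, regardless of the time part.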

On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju budda08n...@gmail.com
wrote:

Hey Adrien, thank you. I have one more question, about aggregating on dates.

We store a date-time in a field called createdDateTime, but I
need to aggregate only on the date part of the date-time.

Any ideas? Or sample code that could help us?

Regards
Nagaraju
908 517 6981

On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand
adrien.gr...@elasticsearch.com wrote:

On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju
budda08n...@gmail.com wrote:

Does the terms aggregation count blank field values?

Yes, an empty value counts as a term. Note that you need the field to
be not analyzed for it to work (or to use an analyzer that emits empty
strings). Otherwise the standard analyzer would analyze "" as an empty
list of tokens, so a field value of "" would not actually count...

Is a terms aggregation enough for doing date aggregation? Or are there
any specific aggregations for that? All I need from the date aggregation
is the distinct dates and their counts.

A terms aggregation is enough, but a date_histogram aggregation is
generally more useful on dates as there are lots of unique values and it's
often more useful to group them based on the year, month or day.

--
Adrien Grand

--
Adrien Grand

Re: How to find all docs where field_a === val1 and field_b === val2?

2015-01-15 Thread David Reagan

Thanks! I was thinking a bool query was something specific to fields with 
boolean values. Which is why I didn't understand the bool query example in 
the docs. Your posts helped me get what I wanted. :)

On Wednesday, January 14, 2015 at 3:34:05 PM UTC-8, Brian wrote:

 By the way, David, the full query follows:

 {
   "from" : 0,
   "size" : 20,
   "timeout" : 6,
   "query" : {
     "bool" : {
       "must" : [ {
         "match" : {
           "field_a" : {
             "query" : "val1",
             "type" : "boolean"
           }
         }
       }, {
         "match" : {
           "field_b" : {
             "query" : "val2",
             "type" : "boolean"
           }
         }
       } ]
     }
   },
   "version" : true,
   "explain" : false,
   "fields" : [ "_ttl", "_source" ]
 }

 Also note that since the _ttl field is being requested (always), then the 
 _source must also be asked for explicitly. If you don't ask for any fields, 
 _source is returned by default. But if you ask for one or more fields 
 explicitly, then you must also ask for _source or it won't be returned.

 Brian

 On Wednesday, January 14, 2015 at 6:31:29 PM UTC-5, Brian wrote:

 David,

 This is what I use. I hope it helps.

 {
   "bool" : {
     "must" : [ {
       "match" : {
         "field_a" : {
           "query" : "val1",
           "type" : "boolean"
         }
       }
     }, {
       "match" : {
         "field_b" : {
           "query" : "val2",
           "type" : "boolean"
         }
       }
     } ]
   }
 }

 Brian
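
For readers who, like David, first read "bool" as something about boolean fields: a bool query simply combines other queries, and every clause under must has to match. A small sketch in Python that builds the same shape of request for any set of field/value pairs (the helper name is made up, and the plain short form of match is used instead of the longer form in Brian's messages):

```python
def all_fields_match(criteria):
    """Build a bool query where every field/value pair must match.

    `criteria` maps field names to the value each field has to match.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: value}}
                    for field, value in criteria.items()
                ]
            }
        }
    }


if __name__ == "__main__":
    print(all_fields_match({"field_a": "val1", "field_b": "val2"}))
```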



How can I add additional parameters in aggregation?

2015-01-15 Thread Eylon Steiner

I have documents with id, name and title.
I am aggregating by name, but how can I also get the name and title in
the results?

Re: Is ElasticSearch truly scalable for analytics?

2015-01-15 Thread Mark Harwood

 Regarding the accuracy of top-k lists

This is perhaps an over-simplification -  we deal with far more complex 
scenarios than a simple, single top-K list - we have whole aggregation 
trees with multiple layers of aggs: geo, time, nested, parent/child, 
percentiles, cardinalities etc etc which can embed multiple top K terms 
aggs, or be contained by one. Today all aggs work in one pass over local 
data to produce a merge-able summary output - if you introduce the idea of 
pausing all of this local computation mid-stream and then resuming it once 
you've centrally determined what top K is across a cluster and for 
various points in the agg tree then coordinating all of these updates gets 
impossibly complex.

I acknowledge it is a highly specialised use-case which not very many 
people run into, but it is a case I'm currently working on.

To be fair multi-level merging is a capability which might also apply to 
analytics in federated architectures where proxy servers might act as the 
front to nodes in remote clusters.

I was thinking to reduce the complete set of buckets locally

I'm unclear on your approach to the reduce:
1) Take the summary outputs of multiple agg pipelines computed in parallel 
and merge them in the same way coordinating nodes do or
2) Take the raw inputs (doc streams) from all shards held on a node and 
feed them through a single aggregation pipeline to get one combined output

The problems being 1) loses accuracy and 2) loses any parallelism because 
agg pipelines are single threaded and must process doc streams serially.
Because you claimed accuracy would be better I guess you mean option 2?
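
To make the accuracy point of option 1 concrete, here is a toy model in plain Python (not Elasticsearch code): each "shard" is a list of terms, each shard reports only its local top K, and the coordinator merges those summaries. The shard contents are invented for the illustration.

```python
from collections import Counter


def shard_top_k(terms, k):
    """Summary a single shard would report: its local top-k term counts."""
    return Counter(terms).most_common(k)


def merge_top_k(shard_summaries, k):
    """Option 1: merge the per-shard summaries, then take the top k."""
    merged = Counter()
    for summary in shard_summaries:
        for term, count in summary:
            merged[term] += count
    return merged.most_common(k)


def exact_top_k(shards, k):
    """Option 2: feed every raw term through one single counter."""
    merged = Counter()
    for terms in shards:
        merged.update(terms)
    return merged.most_common(k)


if __name__ == "__main__":
    shard_a = ["x"] * 7 + ["y"] * 5 + ["z"] * 4
    shard_b = ["w"] * 6 + ["v"] * 5 + ["z"] * 4
    summaries = [shard_top_k(s, 2) for s in (shard_a, shard_b)]
    print(merge_top_k(summaries, 2))
    print(exact_top_k([shard_a, shard_b], 2))
```

In this made-up data the term z is third on each shard, so it never reaches the merge step, although it is the most frequent term overall (8 occurrences). The merged result reports x and w, while the exact result reports z and x.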



Re: Questions about scaling elasticsearch with regard to the number of documents indexed per second

2015-01-15 Thread Chinch Pokli

Awesome! Great to know that. So as a conclusion the steps will be:
1) Stream tweets from twitter
2) Use the bulk API to make batches of 1000 (or more) tweets
3) Once the batch size is reached, spawn a new thread which will index the
data into ES, meanwhile my original thread will continue streaming tweets

Do these steps sound alright to you or did I miss something?
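
The three steps above can be sketched roughly as follows in Python. The index_batch callback is a hypothetical placeholder for whatever actually sends one bulk request; the class and parameter names are made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class BatchIndexer:
    """Collect documents and hand full batches to worker threads."""

    def __init__(self, index_batch, batch_size=1000, workers=2):
        # index_batch: hypothetical callable that indexes one list of docs
        self._index_batch = index_batch
        self._batch_size = batch_size
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = []
        self._futures = []

    def add(self, doc):
        """Called by the streaming thread for every incoming document."""
        self._pending.append(doc)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Submit the current batch to the pool and start a new one."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._futures.append(self._pool.submit(self._index_batch, batch))

    def close(self):
        """Send the last partial batch and wait for all batches to finish."""
        self.flush()
        for future in self._futures:
            future.result()  # re-raises an indexing error, if there was one
        self._pool.shutdown()
```

The streaming thread only calls add(), so it keeps reading tweets while earlier batches are being indexed. A fixed-size pool is used here instead of spawning one new thread per batch, so that a slow cluster cannot cause an unbounded number of threads.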

On Thursday, January 15, 2015 at 7:58:19 PM UTC+5:30, David Pilato wrote:

I can index on my laptop 1-12000 docs per second. SSD drives of course.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 15 Jan 2015, at 13:43, Chinch Pokli cpo...@gmail.com wrote:

No, so the whole point was that, will elasticsearch be able to index say
10,000 documents per second? If yes, I can simply hook up my twitter code
to es. If not, I would need to think of how to make that happen.
Typically I've seen es indexes just around 30 docs per second which is
pretty low.

I am hoping Redis/ Kafka/ Logstash/ etc. might help elasticsearch to get
some breathing room and enable it to index up to 10K docs per second.

On Thursday, January 15, 2015 at 10:47:31 AM UTC+5:30, David Pilato wrote:

You have a Twitter input so you can extract content from Twitter and send
to elasticsearch. No need to have Redis here.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 15 Jan 2015, at 00:02, Chinch Pokli cpo...@gmail.com wrote:

Thanks. I'll have a look at the raw option.
Regarding logstash, I don't fully understand its utility. It says that
it can take messages from a Redis server. But if I have to set up Redis, I
could simply use the Redis river to index into Elasticsearch. Is there any
additional benefit that Logstash would give me?

On Thursday, January 15, 2015 at 4:06:12 AM UTC+5:30, David Pilato wrote:

You should look at raw option or better look at Logstash.

My 2 cents.

David

On 14 Jan 2015, at 23:29, Chinch Pokli cpo...@gmail.com wrote:

Hi,

I am using elasticsearch to index a twitter stream. Until recently I was
using the official river, which was working great, but I realized that it
is throwing out much of the data (e.g. it does not store the number of
followers).

Is there a way to make the river store all the data? If not, I am
fine with writing streaming code which will stream and index. But I have a
concern. How many documents can elasticsearch index per second? I might
eventually need to index almost 10,000 documents (each document = 2 KB) per
second (the current requirement is 100 documents per second). Is this even
feasible? If yes, do I need to make any special modifications?

Thanks-in-advance!!

Re: Is ElasticSearch truly scalable for analytics?

2015-01-15 Thread AlexR

I would also be very interested in node-level reduction of shard results, not 
for scalability but for precision reasons. I would like to have an option for a 
node to do complete aggregations on its shards so the results are exact rather 
than approximate. There are many use cases where the corpus of data is small 
enough to fit on one powerful node and exactness is a MUST. With 48-core servers 
and SSD drives, such a node can process a good deal of data and produce exact 
results, which is a must for traditional datamart-like apps. Having this option 
would allow this class of apps to be built. And in a multi-node setup it would 
provide better precision too.

Re: Questions about scaling elasticsearch with regard to the number of documents indexed per second

Sounds good.
If you are using Java, you could also look at the river code.
Note that you should use BulkProcessor class which is super handy.

BTW I said 1/s but not for tweets. I have less fields (20) than Twitter
(100).
With more fields, I guess it would take more time. Though with better machines,
it could work. I'd say that you need to test on the production cluster.
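
BulkProcessor belongs to the Java client. For readers not on Java, the following Python sketch only shows the newline-delimited body that a bulk request carries: one action line followed by one source line per document. The index and type names in the usage lines are hypothetical.

```python
import json


def bulk_body(docs, index, doc_type):
    """Serialize documents into the newline-delimited bulk request format.

    Every document becomes two lines: an "index" action line naming the
    target, then the document source. Each line ends with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # "tweets" and "tweet" are made-up names for the illustration
    print(bulk_body([{"text": "hello"}, {"text": "world"}], "tweets", "tweet"))
```

The resulting string would be the body of a single POST to the _bulk endpoint, so a batch of 1000 tweets costs one HTTP round trip instead of 1000.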

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: need help for search

No worries about your English.
Sorry, I missed your gist.

Based on your examples, it sounds like you are French. Are you aware of the 
French mailing list? 
https://groups.google.com/forum/?hl=fr&fromgroups#!forum/elasticsearch-fr

It would help a lot if you could show, with some sample data and small queries, 
what you are trying to do and what does not work.
So remove all analyzers, as I guess they are not really your concern at this 
stage.
Try with only two or three fields.


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet
@elasticsearchfr https://twitter.com/elasticsearchfr
@scrutmydocs https://twitter.com/scrutmydocs



Re: need help for search

Could you reproduce this with a full test case so we understand exactly what 
you are doing?
Maybe simplify your test.

See elasticsearch.org/help


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On 15 Jan 2015, at 16:01, Thibaut Owczarz thib...@1001pharmacies.com wrote:
 
 I agree, but my search input does not say whether it is a sku, an internal_code or another field.
 
 If I do this, it works:
 {
   "query": {
     "bool": {
       "must": [
         { "term": { "sku": "01b3ae496c0142f993cf131c607fe003" } }
       ],
       "must_not": [],
       "should": [
         { "term": { "internal_code": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "firstname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "lastname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "address": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "city": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "localized_description": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "localized_keywords": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "service.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "medias.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "services.localized_label": "01b3ae496c0142f993cf131c607fe003" } }
       ]
     }
   }
 }
 
 But if I now search with an internal_code:
 {
   "query": {
     "bool": {
       "must": [
         { "term": { "sku": 3401598272746 } }
       ],
       "must_not": [],
       "should": [
         { "term": { "internal_code": 3401598272746 } },
         { "match": { "firstname": 3401598272746 } },
         { "match": { "lastname": 3401598272746 } },
         { "match": { "address": 3401598272746 } },
         { "match": { "city": 3401598272746 } },
         { "match": { "localized_description": 3401598272746 } },
         { "match": { "localized_keywords": 3401598272746 } },
         { "match": { "service.localized_label": 3401598272746 } },
         { "match": { "medias.localized_label": 3401598272746 } },
         { "match": { "services.localized_label": 3401598272746 } }
       ]
     }
   }
 }
 my request does not work.
 
 
 On Thursday, 15 January 2015 15:49:56 UTC+1, David Pilato wrote:
 
 I guess it's most likely because you added all your filters in should clause 
 instead of must?
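
 A sketch of the difference, in Python: a clause under must has to match for every hit, while clauses under should are alternatives. When one search input may be a sku, an internal_code or free text, putting everything under should with at least one required match is one possible shape. The helper name and its parameters are made up for this illustration.

```python
def search_any_field(value, exact_fields, text_fields):
    """Build a bool query where the input may match any one of the fields.

    exact_fields are queried with term (whole value only),
    text_fields are queried with match (analyzed text).
    """
    should = [{"term": {field: value}} for field in exact_fields]
    should += [{"match": {field: value}} for field in text_fields]
    return {
        "query": {
            "bool": {
                "should": should,
                # at least one alternative has to match for a hit
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(search_any_field("cheval", ["sku", "internal_code"],
                           ["firstname", "localized_keywords"]))
```

 Moving one of the clauses into must, as in the sku example, would restrict the hits to documents that match that clause, whatever the other clauses say.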
 
 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
 
 On 15 Jan 2015, at 15:36, Thibaut Owczarz thi...@1001pharmacies.com wrote:
 
 I found my first error: there is no need for user, because I am already searching in user.
 But why, when I search for a specific sku, do I not get just one result?
 
 
 curl -XPOST 'http://localhost:9200/test_fr/user/_search' -d '{
   "query": {
     "bool": {
       "must": [],
       "must_not": [],
       "should": [
         { "term": { "sku": "01b3ae496c0142f993cf131c607fe003" } },
         { "term": { "internal_code": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "firstname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "lastname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "address": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "city": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "localized_description": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": {

Re: Aggregation - Blank and date aggregation

2015-01-15 Thread buddarapu nagaraju

Hey Adrien, thank you. I have one more question, about aggregating on dates.

We store a date-time in a field called createdDateTime, but I need
to aggregate only on the date part of the date-time.

Any ideas? Or sample code that could help us?

Regards
Nagaraju
908 517 6981

Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?

This is because the score takes two factors into account: the document
frequency and the edit distance. Quite likely in your case, even though
Boss is closer than Bose, Bose has a much lower document frequency which
helped it eventually get a better score. I guess we should have another
rewrite method that would not take freqs into account (or somehow merge
them) to avoid that issue.

On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner eylon.stei...@gmail.com
wrote:

Any ideas?

--
Adrien Grand

elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-15 Thread Julien Naour

Hi,

I work on a complex workflow using Spark (parsing, cleaning, machine 
learning).
At the end of the workflow I want to send aggregated results to 
elasticsearch so my portal can query the data.
There will be two types of processing: streaming, and the possibility to 
relaunch the workflow on all available data.

Right now I use elasticsearch-hadoop, and particularly its Spark part, to 
send documents to elasticsearch with the saveJsonToEs(myindex, mytype) 
method.
The target is to have one index per day, using the proper template that we 
built.
AFAIK elasticsearch-hadoop cannot use a feature of a document to send it 
to the proper index.

What is the proper way to implement this feature? 
Have a special step using Spark and bulk requests, so that each executor sends 
documents to the proper index based on the feature of each line?
Is there something that I missed in elasticsearch-hadoop?
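
To make the "special step" idea concrete, here is a sketch in plain Python (not Spark, and not the elasticsearch-hadoop API) of routing each document to a per-day index such as myindex-2014-01-01. It assumes every document has a date field holding an ISO-8601 string; the field name timestamp and both helper names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def index_for(doc, prefix="myindex", date_field="timestamp"):
    """Derive the per-day index name from the document's date field."""
    # Only the leading YYYY-MM-DD part of the ISO-8601 string is used.
    day = datetime.strptime(doc[date_field][:10], "%Y-%m-%d")
    return "{}-{}".format(prefix, day.strftime("%Y-%m-%d"))


def group_by_index(docs, prefix="myindex", date_field="timestamp"):
    """Group documents by target index, ready for one bulk call per index."""
    groups = defaultdict(list)
    for doc in docs:
        groups[index_for(doc, prefix, date_field)].append(doc)
    return dict(groups)
```

In a Spark job the same grouping could run inside each partition, with every group then sent to its own index by a bulk request. Whether that is preferable to a feature of elasticsearch-hadoop itself is exactly the open question above.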

Julien

Re: need help for search

hi,

With the structure I sent in my gist,
my question is just this:

I have a single search field, and nothing says what kind of value I type into it.
But I need one request like this.
{
  "query": {
    "bool": {
      "must": [],
      "must_not": [],
      "should": [
        { "term": { "sku": "$datasearch" } },
        { "term": { "internal_code": "$datasearch" } },
        { "match": { "firstname": "$datasearch" } },
        { "match": { "lastname": "$datasearch" } },
        { "match": { "address": "$datasearch" } },
        { "match": { "city": "$datasearch" } },
        { "match": { "localized_description": "$datasearch" } },
        { "match": { "localized_keywords": "$datasearch" } },
        { "match": { "service.localized_label": "$datasearch" } },
        { "match": { "medias.localized_label": "$datasearch" } },
        { "match": { "services.localized_label": "$datasearch" } }
      ]
    }
  }
}

Example:

- if $datasearch is a sku, I get exactly the one user with this sku
- if $datasearch is a first name, I get the list of users who have this 
first name
- if $datasearch is a keyword, I get the list of users who have this keyword

- I use term for sku and internal_code because these must not match on a 
partial value (if my sku = 1234, no result should be found if I type 123)

- And finally, in my data I have these users: 
[1 - charles martin, who has localized_keywords=moto, licorne, cheval, 
course ] 
[2 - henry martin, who has localized_keywords=pétanque, chevaux, basket, 
parieur]
I want my request to return both users if $datasearch = cheval.

I hope I am making myself understood; my English may be bad.


thanks


On Thursday, 15 January 2015 at 16:17:08 UTC+1, David Pilato wrote:

 Could you reproduce this with a full test case so we understand exactly
 what you are doing?
 Maybe simplify your test.

 See elasticsearch.org/help


 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-15 Thread Julien Naour

My previous idea doesn't seem to work: documents cannot be sent directly to
_bulk, only to an index/type pattern.
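
One way to picture the per-day routing discussed in this thread is to derive the target index name from a date field in each document, then group the documents by that name before writing each group to its own index. The sketch below is plain Python, not elasticsearch-hadoop code; the field name `timestamp`, the prefix `myindex` and both helper names are assumptions made for illustration. elasticsearch-hadoop also documents dynamic resource patterns of the form `myindex-{field}/mytype`, but whether that is available depends on the version in use, so it is worth checking its configuration documentation.

```python
from collections import defaultdict
from datetime import datetime


def index_for(doc, prefix="myindex", date_field="timestamp"):
    """Return the daily index name for one document, e.g. myindex-2014-01-01."""
    # Assumes the date field holds an ISO-8601 string such as 2014-01-01T12:30:00.
    # Parsing (rather than slicing only) rejects malformed dates early.
    day = datetime.strptime(doc[date_field][:10], "%Y-%m-%d")
    return "{}-{}".format(prefix, day.strftime("%Y-%m-%d"))


def group_by_index(docs, prefix="myindex", date_field="timestamp"):
    """Group documents by target daily index so each group can be written separately."""
    groups = defaultdict(list)
    for doc in docs:
        groups[index_for(doc, prefix, date_field)].append(doc)
    return dict(groups)
```

Each group could then be handed to whatever writer is in use, one call per index name.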


Re: Just initialize shards when problems but no rebalance

2015-01-15 Thread Kimbro Staken

Kimbro

On Thu, Jan 15, 2015 at 6:14 AM, Matías Waisgold mwaisg...@gmail.com
wrote:

Another problem we are having is that in the file storage we see data from
shards that are not assigned to that node, so it can't allocate anything in
this dirty state.

2015-01-15 0:09 GMT-03:00 Mark Walkom markwal...@gmail.com:

You could do this, but it's a lot of manual overhead to have to deal with.
However ES does have some disk space awareness during allocation, take a
look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index-modules-allocation.html#disk

On 15 January 2015 at 10:57, Matías Waisgold mwaisg...@gmail.com wrote:

Is it OK to combine cluster.routing.allocation.allow_rebalance set to none
with cluster.routing.allocation.enable set to all?

Kind regards


Re: need help for search

I guess it's most likely because you put all your filters in the should clause
instead of must?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
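
To make the should/must distinction concrete, here is a small Python sketch that builds the request body as a dictionary. The helper name `build_search` and the split into exact and analyzed fields are assumptions for illustration; the field names come from the queries in this thread. In a bool query every must clause has to match, while should clauses are alternatives, and `minimum_should_match` states explicitly how many of the alternatives are required.

```python
EXACT_FIELDS = ["sku", "internal_code"]          # looked up with term (no partial match)
TEXT_FIELDS = ["firstname", "lastname", "city"]  # looked up with match (analyzed)


def build_search(value, exact_fields=EXACT_FIELDS, text_fields=TEXT_FIELDS):
    """Build a bool query where the value may match any one of the listed fields."""
    should = [{"term": {field: value}} for field in exact_fields]
    should += [{"match": {field: value}} for field in text_fields]
    return {
        "query": {
            "bool": {
                "should": should,
                # Require at least one of the alternatives to match.
                "minimum_should_match": 1,
            }
        }
    }
```

Serializing `build_search("henry")` with `json.dumps` gives a body that can be posted to the `_search` endpoint. Whether a given value matches still depends on how each field is analyzed in the mapping.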

 On 15 Jan 2015, at 15:36, Thibaut Owczarz thib...@1001pharmacies.com
 wrote:
 
 I found my first error: there is no need for the user. prefix, because I am already searching in user.
 But why, when I search for a specific SKU, do I not get only one result?
 
 
 curl -XPOST 'http://localhost:9200/test_fr/user/_search' -d '{
   "query": {
     "bool": {
       "must": [],
       "must_not": [],
       "should": [
         { "term": { "sku": "01b3ae496c0142f993cf131c607fe003" } },
         { "term": { "internal_code": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "firstname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "lastname": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "address": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "city": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "localized_description": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "localized_keywords": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "service.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "medias.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
         { "match": { "services.localized_label": "01b3ae496c0142f993cf131c607fe003" } }
       ]
     }
   }
 }'
 
 It returns all my users.
 
 Thanks
 
 On Thursday, 15 January 2015 at 14:58:16 UTC+1, Thibaut Owczarz wrote:
 
 Hello,
 I am starting to learn Elasticsearch, and I have trouble understanding how
 search works. Could anyone help me?
 
 My gist with my whole structure and my data is here:
 https://gist.github.com/thibaut1001/7a3000c3ff371be3a52d
 
 My problem is in part 4 only:
 searching for a value across multiple fields, like this
 ## We need to search for henry in the selected fields
 curl -XPOST 'http://localhost:9200/test_fr/user/_search' -d '{
   "query": {
     "bool": {
       "must": [],
       "must_not": [],
       "should": [
         { "term": { "user.sku": "henry" } },
         { "term": { "user.internal_code": "henry" } },
         { "term": { "user.firstname": "henry" } },
         { "term": { "user.lastname": "henry" } },
         { "term": { "user.address": "henry" } },
         { "term": { "user.city": "henry" } },
         { "term": { "user.localized_description": "henry" } },
         { "term": { "user.localized_keywords": "henry" } },
         { "term": { "user.service.localized_label": "henry" } },
         { "term": { "user.medias.localized_label": "henry" } },
         { "term": { "user.services.localized_label": "henry" } }
       ]
     }
   }
 }'
 ## Returns no results. Why?
 
 I have many questions.
 Could you help me, please?
 Thanks
 

Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-15 Thread Julien Naour

I think I have a solution:
build the JSON payloads so that I can send them directly to _bulk:
saveJsonToEs(_bulk)

Not sure whether it will be efficient or even work; I'll try.


Re: need help for search

I agree, but my search input does not say whether it is a SKU, an internal code, or some other field.

If I do this, it works:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sku": "01b3ae496c0142f993cf131c607fe003" } }
      ],
      "must_not": [],
      "should": [
        { "term": { "internal_code": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "firstname": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "lastname": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "address": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "city": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "localized_description": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "localized_keywords": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "service.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "medias.localized_label": "01b3ae496c0142f993cf131c607fe003" } },
        { "match": { "services.localized_label": "01b3ae496c0142f993cf131c607fe003" } }
      ]
    }
  }
}

But if I now search with an internal_code value:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "sku": "3401598272746" } }
      ],
      "must_not": [],
      "should": [
        { "term": { "internal_code": "3401598272746" } },
        { "match": { "firstname": "3401598272746" } },
        { "match": { "lastname": "3401598272746" } },
        { "match": { "address": "3401598272746" } },
        { "match": { "city": "3401598272746" } },
        { "match": { "localized_description": "3401598272746" } },
        { "match": { "localized_keywords": "3401598272746" } },
        { "match": { "service.localized_label": "3401598272746" } },
        { "match": { "medias.localized_label": "3401598272746" } },
        { "match": { "services.localized_label": "3401598272746" } }
      ]
    }
  }
}
my request no longer works.



Re: need help for search

Thanks for pointing me to the elasticsearch-fr mailing list.

Tomorrow I will prepare a small, simple data set
and post the request I want to run and the result I need.

Thanks


On Thursday, 15 January 2015 at 17:31:28 UTC+1, David Pilato wrote:

 No worries about your English.
 Sorry, I missed your gist.

 Based on your examples, it sounds like you are French. Are you aware of
 the French mailing list?
 https://groups.google.com/forum/?hl=fr&fromgroups#!forum/elasticsearch-fr

 It would help a lot if you could reduce what you are trying to do, and what
 does not work, to some sample data and a few small queries.
 So remove all the analyzers, as I guess they are not really your concern at
 this stage.
 Try with only two or three fields.


 -- 
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr
 https://twitter.com/elasticsearchfr | @scrutmydocs
 https://twitter.com/scrutmydocs


  

Re: Just initialize shards when problems but no rebalance

2015-01-15 Thread Kimbro Staken

So is this still happening with 1.4.2?

Here's the ticket. Looks like the fix was supposed to be in 1.4.1

https://github.com/elasticsearch/elasticsearch/issues/8538

On Thu, Jan 15, 2015 at 10:55 AM, Matías Waisgold mwaisg...@gmail.com
wrote:

Great, thank you. We are creating another cluster with more disk space to
avoid these situations.
By any chance do you have the link to the issue?

2015-01-15 13:26 GMT-03:00 Kimbro Staken ksta...@kstaken.com:

Kimbro

On Thu, Jan 15, 2015 at 6:14 AM, Matías Waisgold mwaisg...@gmail.com
wrote:

Yes, I've seen that, but the problem is that when the threshold is
reached it removes all shards from the server instead of removing just one
and rebalancing. When that happens, the cluster starts moving shards
everywhere and never stops.

Another problem we are having is that in the file storage we see data
from shards that are not assigned to that node, so it can't allocate anything
in this dirty state.

2015-01-15 0:09 GMT-03:00 Mark Walkom markwal...@gmail.com:

You could do this, but it's a lot of manual overhead to have to deal
with.
However ES does have some disk space awareness during allocation, take
a look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index-modules-allocation.html#disk

On 15 January 2015 at 10:57, Matías Waisgold mwaisg...@gmail.com
wrote:

Hi, is there any setting I can give ES so that it automatically
assigns unassigned shards but never rebalances the cluster?
I've found several issues when rebalancing and prefer to do it
manually.
If I set cluster.routing.allocation.enable to none, nothing happens.
If I set it to all, it starts rebalancing.

Is it OK to combine cluster.routing.allocation.allow_rebalance set to
none with cluster.routing.allocation.enable set to all?

Kind regards
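
For reference, cluster-wide settings like these are sent as a JSON body to the `_cluster/settings` endpoint with PUT. The Python sketch below only builds that body. As far as I know, `cluster.routing.allocation.allow_rebalance` accepts `always`, `indices_primaries_active` and `indices_all_active` rather than `none`, while a separate setting, `cluster.routing.rebalance.enable`, accepts `none` in versions that provide it. Treat the setting names and values here as assumptions to verify against the documentation for your version; the helper name is hypothetical.

```python
def allocation_settings(allocation="all", rebalance="none", transient=True):
    """Build the body for PUT /_cluster/settings.

    allocation -> cluster.routing.allocation.enable
    rebalance  -> cluster.routing.rebalance.enable (assumed to exist in your version)
    """
    # Transient settings are lost on a full cluster restart; persistent ones survive it.
    scope = "transient" if transient else "persistent"
    return {
        scope: {
            "cluster.routing.allocation.enable": allocation,
            "cluster.routing.rebalance.enable": rebalance,
        }
    }
```

With the defaults, unassigned shards may still be allocated while rebalancing of already-assigned shards is switched off, which is the combination asked about above.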


Re: real time match analysis

2015-01-15 Thread Ed Kim

I was able to identify which field matched via explain, but couldn't see
any information on which token filter was the reason for the match. I've
tried specifying the analyzer name that the field uses as well as not
specifying. If the explain is supposed to provide this data, I will give it
another go and set up a test index with simpler analyzer setups.

Also, in order to do this, I will need to run the explain separate from the
search itself. My ultimate goal is to be able to do this within
milliseconds (less than 10). Is this feasible with explain?
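
As a rough illustration of the explain call mentioned above, the Python sketch below builds the path and body for a single-document explain request. The index, type and document id are hypothetical placeholders, and the endpoint layout `/{index}/{type}/{id}/_explain` is the one I recall from the 1.x documentation. The response describes how each query term contributed to the score; I am not aware of it reporting which token filter produced a term.

```python
def explain_request(index, doc_type, doc_id, field, text):
    """Return (path, body) for an explain call on one document."""
    path = "/{}/{}/{}/_explain".format(index, doc_type, doc_id)
    # The body carries the same query that would be sent to _search.
    body = {"query": {"match": {field: text}}}
    return path, body
```

Because explain runs against one document at a time, it would have to be issued per hit, separately from the search itself.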

On Wednesday, January 14, 2015 at 12:51:15 PM UTC-8, Nikolas Everett wrote:

What about explain?

On Wed, Jan 14, 2015 at 3:24 PM, Ed Kim edk...@gmail.com
wrote:

Just a friendly bump to see if anyone has any feedback. :)

On Saturday, January 10, 2015 at 10:38:34 PM UTC-8, Ed Kim wrote:

Hello all, I was wondering if anyone could offer some feedback on
whether there is a way to determine how a document matched in real time. I
currently use custom analyzers at index time to allow a broad array of
matches for a given text field. I try to match based on phrases, synonyms,
substrings, stemming, etc of a given phrase, and I would like to be able to
figure out at search time, which analyzer was attributed to causing the
match.

Currently, I've gotten around this by creating child documents where the
fields are fanned out to their respective analyzer types. So I have a child
document where the field only applies stemming, another that uses only
synonyms, etc. However, due to the growing number of fields that require
analysis and the growth of my data set, I'd much prefer to have fewer
documents (and less complex ones too). I was hoping there would be a way to tag
tokens at the analysis phase that could be used at the search phase to
quickly determine my match level, but I was not able to find anything like
this.

Having said that, has anyone else ever tried to figure this out, or have
any thoughts on how to leverage ES at a lower level to determine the match?


Re: Seeking a Director of Data Engineering in Austin TX

2015-01-15 Thread Mark Walkom

Hi Traci,
This is a community based technical list. We'd greatly appreciate it if you
didn't post job ads.

On 16 January 2015 at 03:38, Traci Martin traci@gmail.com wrote:

Hello All!

I am a recruiter in Austin, TX trying to fill a Director of Data
Engineering position for my client, also in Austin. They are ELK stack
evangelists and would prefer someone with at least some knowledge of Lucene
or Hadoop. This is really a great company to work for and probably the
nicest client I have had the pleasure of working with.

It is a permanent position offering great benefits, a laid back
atmosphere, and very competitive salary with options. If you are interested
please feel free to contact me.

*No relocation assistance and no sponsorship will be provided at this time.

Traci Martin
512-640-3656
tmar...@intersysconsulting.com

*Director of Data Engineering*

*Who we are: *
*Intersys Consulting* is a leading Business Intelligence, Data
Management, and Application Development professional services organization
focused on providing solutions with real business value. We provide a
customer-focused approach to building authentic partnerships with our
clients with objective counsel from concept to deployment for a consistent
voice through the dynamic IT environment.

*What we look for: *
*Intersys Consulting* is focused on finding and cultivating talent across
the IT space. We have over 100 developers, project managers, business
analysts, and data management professionals, most with over ten years of
experience in their respective fields. In new hires we look for
authenticity; be proud of who you are and what you bring to the table, as
well as those candidates who consistently deliver the highest quality
product and have a deep desire to improve not just themselves, but the
organization as a whole.

*The Position:*
Intersys Consulting is seeking a Director of Data Engineering to work at
our client site in Austin, Texas.

*Primary Responsibilities:*

- Build and optimize each component of our data pipeline
- Work with our data scientists to provide data in the optimal format
- Work with our DevOps team to ensure the data infrastructure is
  reliable and scalable
- Integrate with our data partners to enrich our first-party data with
  third-party sources
- Stay on top of cutting-edge technologies to constantly improve and
  streamline our data systems

*Qualifications:*

- Experience with high-performance, high-traffic web systems
- Experience with monitoring systems: New Relic, ELK stack, etc.
- Experience with either Hadoop or Elasticsearch/Lucene and a desire
  and willingness to learn the other


Re: ElasticSearch: access document nested value in groovy script

2015-01-15 Thread Anil Kumar

I found the answer.

I had to use _source.medals to access the nested documents, which are stored
on disk and not in memory.
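
To illustrate the shape of the data involved, here is the same traversal sketched in Python rather than Groovy, applied to the sample document from the original question. It only shows what a script sees through `_source`: `medals` is a list of objects, so it has to be iterated. The function name `medal_count` is an assumption for illustration.

```python
def medal_count(source):
    """Sum the count of every entry in the medals array of a _source document."""
    return sum(entry["count"] for entry in source.get("medals", []))


# The sample document from the original question.
sample = {
    "firstname": "John",
    "lastname": "Smith",
    "medals": [
        {"bucket": 100, "count": 1},
        {"bucket": 150, "count": 2},
    ],
}
```

In a Groovy script the equivalent loop would run over `_source.medals`, as described above.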

Thanks

On Wednesday, January 14, 2015 at 10:55:15 AM UTC-8, Anil Kumar wrote:

 I have a document stored in ElasticSearch as below. _source:

 {
   "firstname": "John",
   "lastname": "Smith",
   "medals": [
     { "bucket": 100, "count": 1 },
     { "bucket": 150, "count": 2 }
   ]
 }

 I can access a string value inside a document using doc.firstname in a
 scripted metric aggregation:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-scripted-metric-aggregation.html

 But I am not able to get the field value using doc.medals[0].bucket.

 Can you please help me out and let me know how to access the values inside 
 nested fields?



Re: Using icu_collation plugin in Unit Tests

2015-01-15 Thread Kumar S

Thanks David!

Sorry, I am new to the ES world. Where would I download the
JAR file from, and what class should I be using for icu_collation?

Thank you very much,
Kumar Subramanian,

On Thursday, January 15, 2015 at 12:52:12 PM UTC-8, David Pilato wrote:

You most likely just need to add it as a dependency. Which is easy if you
are using maven.

David

On 15 Jan 2015, at 21:03, Kumar S krsku...@gmail.com wrote:

Hi,
I am new to ES. I am using NodeBuilder in my unit test to run a local
instance of ES. I would like to use the icu_collation plugin. How can I
install and run the plugin from within this local instance? Is there an API
that I should use? If not, what are the different ways I can do this?

Thank you very much,
Kumar Subramanian.


Excluding Terms Using a Minus Sign

2015-01-15 Thread Cindy Conway

Is there a way to exclude a term if the user precedes it with a minus sign,
the way Google does? For example, if I want to search for the word louvre
but don't want results about the museum in France, I can search for
*louvre -museum* as my search terms. Does ES support this? I am not finding
anything like that in the documentation.
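
For what it is worth, both `query_string` and `simple_query_string` accept a leading `-` to negate a term. The Python sketch below builds a `simple_query_string` body; the field name `title` and the helper name `minus_query` are assumptions for illustration. The default operator is set to `and` because, with the default `or`, a negated term is combined as one more alternative and can match far more documents than intended.

```python
def minus_query(text, fields=("title",)):
    """Build a simple_query_string search where -term excludes documents."""
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                "fields": list(fields),
                # With "and", every positive term must match and negated terms must not.
                "default_operator": "and",
            }
        }
    }
```

Calling `minus_query("louvre -museum")` gives a body that can be posted to the `_search` endpoint.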

Thanks All!


Using icu_collation plugin in Unit Tests

2015-01-15 Thread Kumar S

Hi,
I am new to ES. I am using NodeBuilder in my unit test to run a local
instance of ES. I would like to use the icu_collation plugin. How can I
install and run the plugin from within this local instance? Is there an API
that I should use? If not, what are the different ways I can do this?

Thank you very much,
Kumar Subramanian.


Re: Just initialize shards when problems but no rebalance

2015-01-15 Thread Matías Waisgold

I'm on 1.4.1 and still seeing the same behavior.
There should be a better approach than removing all shards at the same time
and then trying to move a few.
We are going to apply the same solution you mentioned: add more disk.
Thanks for your help.

2015-01-15 16:09 GMT-03:00 Kimbro Staken ksta...@kstaken.com:

So is this still happening with 1.4.2?

Here's the ticket. Looks like the fix was supposed to be in 1.4.1

https://github.com/elasticsearch/elasticsearch/issues/8538

On Thu, Jan 15, 2015 at 10:55 AM, Matías Waisgold mwaisg...@gmail.com
wrote:

Great, thank you. We are creating another cluster with more disk space to
avoid this situations.
By any chance do you have the link to the issue?

2015-01-15 13:26 GMT-03:00 Kimbro Staken ksta...@kstaken.com:

Kimbro

On Thu, Jan 15, 2015 at 6:14 AM, Matías Waisgold mwaisg...@gmail.com
wrote:

Yes, I've seen that but the problem is that when the threshold is
reached it removes all shards from the server instead of just removing 1
and balance. And when that happens the cluster starts to move shards over
everywhere and it never stops.

Another problem we are having is that in the file storage we see data
from shards that are not assigned to itself so it can´t allocate anything
in this dirty state.

2015-01-15 0:09 GMT-03:00 Mark Walkom markwal...@gmail.com:

You could do this, but it's a lot of manual overhead to have to deal
with.
However ES does have some disk space awareness during allocation, take
a look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index-modules-allocation.html#disk

On 15 January 2015 at 10:57, Matías Waisgold mwaisg...@gmail.com
wrote:

Hi, is there any setting I can give ES so that it automatically
assigns shards that are unassigned but never rebalances the cluster?
I've found several issues when rebalancing and prefer to do it
manually.
If I set cluster.routing.allocation.enable to none, nothing happens.
If I set it to all, it starts rebalancing.

Is it OK to combine cluster.routing.allocation.allow_rebalance set to
none with cluster.routing.allocation.enable set to all?

Kind regards
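
A hedged sketch of one way to approach the question above: as far as I know, allow_rebalance only accepts always, indices_primaries_active and indices_all_active, so none is not a valid value for it. A commonly used alternative is to keep allocation enabled and set cluster.routing.allocation.cluster_concurrent_rebalance to 0. Whether this suppresses every kind of shard movement on a given version (for example, moves forced by the disk threshold) is an assumption to verify, and the helper name is hypothetical.

```python
import json

# Values accepted by cluster.routing.allocation.allow_rebalance (to my knowledge).
ALLOW_REBALANCE_VALUES = {"always", "indices_primaries_active", "indices_all_active"}


def no_auto_rebalance_settings(allow_rebalance="indices_all_active"):
    """Build a settings body that allocates unassigned shards but allows
    zero concurrent rebalance operations (hypothetical helper)."""
    if allow_rebalance not in ALLOW_REBALANCE_VALUES:
        raise ValueError("unsupported allow_rebalance value: %r" % allow_rebalance)
    return {
        "transient": {
            "cluster.routing.allocation.enable": "all",
            "cluster.routing.allocation.allow_rebalance": allow_rebalance,
            # Assumption: 0 means no shard may be rebalanced at any time.
            "cluster.routing.allocation.cluster_concurrent_rebalance": 0,
        }
    }


if __name__ == "__main__":
    print(json.dumps(no_auto_rebalance_settings(), indent=2))
```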


filtering/querying on script field

2015-01-15 Thread samatha kankipati



Is it possible to filter or query on script_fields?
If so, can you provide an example?


Re: Using icu_collation plugin in Unit Tests

You most likely just need to add it as a dependency, which is easy if you are
using Maven.

David

 On 15 Jan 2015, at 21:03, Kumar S krskumar...@gmail.com wrote:
 
 Hi,
 I am new to ES. I am using NodeBuilder in my unit test to run a local
 instance of ES. I would like to use the icu_collation plugin. How can I
 install and run the plugin from within this local instance? Is there an API that
 I should use? If not, what are the different ways I can do this?
 
 Thank you very much,
 Kumar Subramanian.

Re: Help creating a near real time streaming plugin to perform replication between clusters

2015-01-15 Thread joergpra...@gmail.com

While it seems quite easy to attach listeners to an ES node to capture
operations in translog-style and push out index/delete operations on shard
level somehow, there will be more to consider for a reliable solution.

The Couchbase developers have added a data replication protocol to their
product which is meant for transporting changes over long distances with
latency for in-memory processing.

To learn about the most important features, see

https://github.com/couchbaselabs/dcp-documentation

and

http://docs.couchbase.com/admin/admin/Concepts/dcp.html

I think bringing such a concept of an inter cluster protocol into ES could
be a good starting point, to sketch the complete path for such an ambitious
project beforehand.

Most challenging could be dealing with back pressure when receiving
nodes/clusters are becoming slow. For a solution to this, reactive Java /
reactive streams look like a viable possibility.

Kibana and nested documents -- include_in_parent

2015-01-15 Thread Phil

Hello,

I am new to Elasticsearch and I have a very specific question. We have
implemented our Elasticsearch cluster with a nested document structure.
Each document is made of one ID, a key element, and one field containing
several nested records that are inserted by the script API and the bulk
update function.

My question is: is it possible to view nested documents in Kibana without
using include_in_parent? From preliminary testing it seems to use
more disk space when include_in_parent is in the mappings.
When include_in_parent is not in the mappings, the documents are not
viewable within Kibana 4.0.0.

Also, is there a function or way to display which documents have the most
nested records, using the size of the nested records in the
document? I would like to have a pie chart that displays them by
the size of their nested attribute.

Thank you in advance.



Re: How to remove a cluster setting?

2015-01-15 Thread Mark Walkom

This is a known issue, see
https://github.com/elasticsearch/elasticsearch/issues/6732
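
Until settings can be removed outright, a commonly suggested workaround is to overwrite the value instead of deleting it. The sketch below builds such a body for the exclusion filter from the original question in this thread. It assumes that an empty string is treated as "no filter", which should be verified on your version; transient settings are also discarded by a full cluster restart. The helper name is hypothetical.

```python
import json


def clear_ip_exclusion(scope="transient"):
    """Build a body for PUT /_cluster/settings that blanks the IP exclusion
    filter instead of removing the key (hypothetical helper)."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    # Assumption: an empty pattern excludes no nodes.
    return {scope: {"cluster.routing.allocation.exclude._ip": ""}}


if __name__ == "__main__":
    print(json.dumps(clear_ip_exclusion(), indent=2))
```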

On 15 January 2015 at 22:01, Gary Gao garygaow...@gmail.com wrote:

 Why didn't this work on my ES?

 GET /_cluster/settings
 {
    "persistent": {
       "discovery": {
          "zen": {
             "minimum_master_nodes": 2
          }
       }
    },
    "transient": {
       "indices": {
          "recovery": {
             "translog_size": "1024kb",
             "concurrent_streams": 3,
             "translog_ops": 2000,
             "max_bytes_per_sec": "400mb",
             "file_chunk_size": "1024kb"
          }
       }
    }
 }

 PUT _cluster/settings
 {
    "transient": {
       "indices.recovery.translog_size": ""
    }
 }

 response:
 {
    "acknowledged": true,
    "persistent": {},
    "transient": {}
 }

 When I do GET again, this setting still exists.


 On Tuesday, July 22, 2014 at 8:50:10 AM UTC+8, Jeffrey Zhou wrote:

 I applied the following setting to my Elasticsearch cluster in order to
 decommission some old nodes in the cluster. After removing these old nodes,
 I now need to re-enable the cluster to allocate shards on those '10.0.6.*'
 nodes. Does anyone know how to remove this setting?

 PUT /_cluster/settings
 {
    "transient": {
       "cluster.routing.allocation.exclude._ip": "10.0.6.*"
    }
 }

 Thanks in advance for any help!


Best pratices for index , search and updates

2015-01-15 Thread bvnr

I am new to Elasticsearch.

Can somebody share ideas about the best practices one should follow to
get good performance for indexing, searching, and updates?


Re: filtering/querying on script field

2015-01-15 Thread Masaru Hasegawa

Hi Samatha,

I don’t think so, because script fields are computed from the fields of the
documents that were hit, that is, from the results of the query/filter.
You can use a script filter instead:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html#query-dsl-script-filter
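
To make the suggestion concrete, here is a minimal sketch of a search body that uses the 1.x script filter inside a filtered query. The helper name, the field name price, and the parameter names are all assumptions for illustration; the cluster must also permit the scripting language used.

```python
import json


def script_filter_query(script, params=None):
    """Build a search body that keeps only documents for which the script
    evaluates to true (hypothetical helper, 1.x-style filtered query)."""
    script_filter = {"script": script}
    if params:
        script_filter["params"] = params
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"script": script_filter},
            }
        }
    }


if __name__ == "__main__":
    # 'price', 'factor' and 'threshold' are made-up names for this example.
    body = script_filter_query(
        "doc['price'].value * factor > threshold",
        {"factor": 2, "threshold": 100},
    )
    print(json.dumps(body, indent=2))
```

The same expression that would have been used in script_fields goes into the filter, so the computed value decides which documents match rather than being attached to hits afterwards.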


Masaru

On January 16, 2015 at 04:40:49, samatha kankipati 
(samatha.kankip...@gmail.com) wrote:
  
  
 Is it possible to filter or query on script_fields?
 If so, can you provide an example?
  

Re: Slow Commands with 1.2.4 to 1.4.2 Upgrade

2015-01-15 Thread pskieu

Just added 2 more nodes with the same specs, and still seeing the same
slowness. These commands no longer return anything, because it's taking too
long to return.

On Tuesday, December 30, 2014 at 3:54:34 PM UTC-8, Mark Walkom wrote:

How slow?
Is the load on your system high?

On 31 December 2014 at 05:04, psk...@gmail.com wrote:

I have about 50 GB of data (1 mil docs) in a single node--8 cores with 32
GB (24 GB heap). I just upgraded from 1.2.4 to 1.4.2, and I noticed that a
few commands take a long time to return, and marvel doesn't work as well as
it used to.

Some of the commands that are slow for me are _cat/indices and _nodes.


questions regarding elasticsearch-spark

2015-01-15 Thread Seungjin Lee

Hi all,

I'm quite familiar with ElasticSearch but new to spark, and
elasticsearch-spark.

My idea at this moment is that by using Spark together with Elasticsearch,
it might be possible to increase search performance when the time interval is
fixed.

My question is: does Hadoop need to be set up first to use elasticsearch-spark?
Does it depend on Hadoop in any way?

Sincerely,


Re: questions regarding elasticsearch-spark

2015-01-15 Thread Ravi Kiran

Hi Lee,

No, Hadoop isn't required. You can use Spark's standalone mode
(https://spark.apache.org/docs/1.2.0/spark-standalone.html) when using
Elasticsearch with Spark.

Regards
Ravi

On Thu, Jan 15, 2015 at 10:15 PM, Seungjin Lee sweetest0...@gmail.com
wrote:

Hi all,

I'm quite familiar with ElasticSearch but new to spark, and
elasticsearch-spark.

My idea at this moment is that by using Spark together with Elasticsearch,
it might be possible to increase search performance when the time interval is
fixed.

My question is: does Hadoop need to be set up first to use elasticsearch-spark?
Does it depend on Hadoop in any way?

Sincerely,


Changing the Axis (X - Y) Label | Naming Legend

2015-01-15 Thread Ravi Prakash

Hi,

1. Is there any way we can change the labels of the X and Y axes?
2. In Kibana 3 it was possible to name the legends. Is there any way we can do this
in Kibana 4?


Perl client: Cannot combine params and body?

2015-01-15 Thread Andrew Walker

I have a remote node that I am attempting to connect to that requires an 
api key as a URL parameter in addition to the body in order to get it to 
work.

The code is as follows:

#!/usr/bin/perl
use v5.14;
use warnings;
use Search::Elasticsearch;
use Data::Dumper;

my $API_KEY = 'API_KEY';

my $ES = Search::Elasticsearch->new(
    cxn_pool => 'Static::NoPing',
    nodes    => [{
        scheme => 'https',
        host   => 'service.host.com',
        port   => 443,
        path   => '/api/es/a_path',
    }],
    #send_get_body_as => 'POST',
    trace_to => 'Stdout',
    log_to   => 'Stdout',
);

my $res = $ES->search(
    params => {
        api_key => $API_KEY,
    },
    body => {
        query => {
            bool => {
                must => {
                    query_string => {
                        default_field    => "_all",
                        query            => "thisisasitethatdoesntexist.com",
                        default_operator => "AND"
                    }
                }
            }
        }
    }
);

print Dumper($res);


The generated curl is:

# Request to: https://service.host.com:443/api/es/a_path
curl -XGET 'http://localhost:9200/_search?api_key=API_KEY&pretty=1' -d '
{
   "query" : {
      "bool" : {
         "must" : {
            "query_string" : {
               "query" : "thisisasitethatdoesntexist.com",
               "default_field" : "_all",
               "default_operator" : "AND"
            }
         }
      }
   }
}
'

When I replace localhost and the path with the proper host and path and run 
the curl command directly from the command line, I get zero hits back, 
which is what I expect.  If I run the above perl, however, I get many 
millions of results back, which is exactly the same as what I get when I 
remove the body from the curl query (-d ''). So it seems that the 
combination of params and body causes body to get eaten?  I looked at the 
code, but I couldn't find where this might be happening.  Any help?


Re: Sorting on nested object collections

2015-01-15 Thread Russ Cam

I've run the query with the smallest possible subset and the query is 
returning the results in the expected order so it appears to be correct. 

The biggest question that I have is does the second sort condition know to 
run on the *first* projected valuation that had the max date from the first 
sort condition?


Re: Complex search

2015-01-15 Thread Russ Cam

Take a look at highlighting
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html)
for highlighting the relevant parts of matches, and at multifield search
(http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-match-query.html#_boosting_individual_fields)
queries with boosting on individual fields.
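
A minimal sketch of how the two suggestions might be combined in one search body: a multi_match query with a boost on the title field, plus a highlight section. The helper name, the lowercase field names, and the boost value are assumptions for illustration. Note that matching "car" inside "autocar" or "carradio" would additionally need a suitable analyzer (for example n-grams), which this sketch does not cover.

```python
import json


def boosted_search(term, title_field="title", description_field="description",
                   title_boost=2):
    """Build a search body that scores title matches higher than description
    matches and asks for highlighted fragments (hypothetical helper)."""
    return {
        "query": {
            "multi_match": {
                "query": term,
                # "field^N" multiplies the score contribution of that field.
                "fields": ["%s^%d" % (title_field, title_boost), description_field],
            }
        },
        "highlight": {
            "fields": {title_field: {}, description_field: {}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(boosted_search("car"), indent=2))
```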

On Friday, 16 January 2015 08:19:08 UTC+11, Serge Schumacher wrote:

 Hi,
 I'm looking to create a search behaviour like Amazon's.

 I have an index with 3 fields: Title, Description and Category.

 I want to search in the Title and Description fields for the word *car*
 and I would like to get scored results like this:

 car      -- score: 1 in category vehicles
 autocar  -- score: 0.5 in category vehicles, where the part car
 should be highlighted, e.g. auto*car*
 carradio -- score: 0.5 in category vehicles, where the part car should
 be highlighted, e.g. *car*radio

 If the word is found in the Title field, the score should be
 higher than if the word is found only in the Description field.

 Is there anybody out there who could help me with this topic, or at least
 point me in the right direction?

 Thanks,
 Serge



How can we achieve an equivalent of this SQL a query in Elasticsearch?

2015-01-15 Thread Lokesh Gupta

What would be the equivalent of the following query in the Elasticsearch world?

select myDate, col1, col2 from myTable
where myDate = (select max(myDate) from myTable)


Re: How can we achieve an equivalent of this SQL a query in Elasticsearch?

I think you need to run two queries for now. One is a max aggregation. The
other uses the result of this aggregation to search for documents.
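
The two-step approach can be sketched as follows. The first function builds the max aggregation request; the second takes the response to that request and builds the follow-up search. The helper names and the aggregation name max_date are assumptions; the field names come from the SQL in the question. Converting the aggregated value to an integer assumes the date is returned as epoch milliseconds.

```python
import json


def max_date_request(field="myDate"):
    """Step 1: ask only for the maximum value of the date field."""
    return {"size": 0, "aggs": {"max_date": {"max": {"field": field}}}}


def latest_docs_request(max_response, field="myDate",
                        columns=("myDate", "col1", "col2")):
    """Step 2: search for documents whose date equals the maximum found."""
    # Assumption: the max aggregation reports the date as epoch millis.
    max_value = int(max_response["aggregations"]["max_date"]["value"])
    return {
        "_source": list(columns),
        "query": {
            "filtered": {
                "filter": {"range": {field: {"gte": max_value, "lte": max_value}}}
            }
        },
    }


if __name__ == "__main__":
    # A made-up response standing in for the result of step 1.
    fake_response = {"aggregations": {"max_date": {"value": 1421280000000.0}}}
    print(json.dumps(max_date_request(), indent=2))
    print(json.dumps(latest_docs_request(fake_response), indent=2))
```

Because the two requests are not atomic, a document indexed between them could change the maximum; that trade-off is inherent to running two queries.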

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr
https://twitter.com/elasticsearchfr | @scrutmydocs
https://twitter.com/scrutmydocs

On 15 Jan 2015, at 09:13, Lokesh Gupta lgup...@gmail.com wrote:

What would be the equivalent of the following query in the Elasticsearch world?

select myDate, col1, col2 from myTable
where myDate = (select max(myDate) from myTable)


Re: How to remove a cluster setting?

2015-01-15 Thread Gary Gao

Why didn't this work on my ES?

GET /_cluster/settings
{
   "persistent": {
      "discovery": {
         "zen": {
            "minimum_master_nodes": 2
         }
      }
   },
   "transient": {
      "indices": {
         "recovery": {
            "translog_size": "1024kb",
            "concurrent_streams": 3,
            "translog_ops": 2000,
            "max_bytes_per_sec": "400mb",
            "file_chunk_size": "1024kb"
         }
      }
   }
}

PUT _cluster/settings
{
   "transient": {
      "indices.recovery.translog_size": ""
   }
}

response:
{
   "acknowledged": true,
   "persistent": {},
   "transient": {}
}

When I do GET again, this setting still exists.


On Tuesday, July 22, 2014 at 8:50:10 AM UTC+8, Jeffrey Zhou wrote:

 I applied the following setting to my Elasticsearch cluster in order to
 decommission some old nodes in the cluster. After removing these old nodes,
 I now need to re-enable the cluster to allocate shards on those '10.0.6.*'
 nodes. Does anyone know how to remove this setting?

 PUT /_cluster/settings
 {
    "transient": {
       "cluster.routing.allocation.exclude._ip": "10.0.6.*"
    }
 }

 Thanks in advance for any help! 



Re: How can I sort results by _id?

Making it index:not_analyzed should work, what is the issue with the
results?

Note that loading the _id in fielddata is typically very costly since the
_id field is typically unique per document.
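
One possible explanation for the "partly unordered" results in the quoted question (an assumption, not confirmed in this thread) is that _id is handled as a string, so the sort is lexicographic rather than numeric. The small sketch below shows the difference using made-up ids.

```python
# Made-up ids, stored as strings the way _id values are.
ids = ["9", "10", "123", "2"]

# Descending string sort compares character by character.
as_strings = sorted(ids, reverse=True)

# Descending numeric sort compares the integer values.
as_numbers = sorted(ids, key=int, reverse=True)

if __name__ == "__main__":
    print(as_strings)  # ['9', '2', '123', '10']
    print(as_numbers)  # ['123', '10', '9', '2']
```

If this is the cause, copying the id into a separate numeric field at index time and sorting on that field would give numeric order; that is a suggestion beyond what the thread states.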

On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

I use a query DSL like:

{
    "filter": {
        "exists": { "field": "info" }
    },
    "sort": { "_id": "desc" }
}

And the _id here is an integer like '123'.

But the result is like:

{
    "took": 50,
    ...
    "hits": {
        ...
        "hits": [
        {
            ...
            "sort": [ null ]
        }]
    }
}

Also, I've tried adding "_id": { "index": "not_analyzed" } to the
_mapping.
This time the sort section returns values, but the results are
still partly unordered.

Can I sort results by _id? How?

Thank you.


--
Adrien Grand


Re: how to combine aggregations

I believe you could run a terms aggregation on the city field, and under
this terms aggregation put two sum aggregations, one for clicks and one for
displays. Finally, you could derive the click rate from the sums of
clicks and displays on the client side. If you are just starting to play with
aggregations, I would recommend reading this blog post by Zachary Tong:
http://www.elasticsearch.org/blog/intro-to-aggregations/
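
The suggestion above can be sketched as a request builder plus a small client-side step that divides the two sums. The helper names, the aggregation names, and the field names city, clicks and displays are assumptions based on the question; the response shape follows the usual terms/sum aggregation output.

```python
import json


def click_rate_request(group_field="city", clicks_field="clicks",
                       displays_field="displays"):
    """Terms aggregation with two sum sub-aggregations per bucket."""
    sums = {
        "clicks": {"sum": {"field": clicks_field}},
        "displays": {"sum": {"field": displays_field}},
    }
    return {
        "size": 0,
        "aggs": {"by_group": {"terms": {"field": group_field}, "aggs": sums}},
    }


def click_rates(response):
    """Derive clicks/displays per bucket on the client side."""
    rates = {}
    for bucket in response["aggregations"]["by_group"]["buckets"]:
        clicks = bucket["clicks"]["value"]
        displays = bucket["displays"]["value"]
        # Avoid dividing by zero when a group has no displays.
        rates[bucket["key"]] = clicks / displays if displays else None
    return rates


if __name__ == "__main__":
    print(json.dumps(click_rate_request(), indent=2))
```

The same builder can be reused with a country field for the per-country rate; a global rate would come from two top-level sum aggregations divided the same way.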

On Wed, Jan 14, 2015 at 10:43 PM, Yan Georget y...@ogury.co wrote:

Hello,

Let's imagine I am logging displays and clicks, say by cities.
I can aggregate those by countries and I can also compute grand totals.

Now I would like to compute click rates (clicks/displays) by cities,
countries and I would also like to get a global click rate.
How can I do this?

It seems that I could use a scripted metric (I have not tried yet) but I
would also like to expose these rates in Kibana.

Is it possible?

Thanks in advance,
Yan Georget


--
Adrien Grand


Out of memory on start with 38GB index

2015-01-15 Thread Thomas Cataldo

Hi,

I am doing all my tests on a 38GB production index copy, with ES 1.4.2. I 
tried several memory settings and virtual machine sizes, but ES fails to 
start on a linux system with 48GB memory and 32GB for ES heap.

Searching for similar issues, I 
encountered https://github.com/elasticsearch/elasticsearch/issues/8394 
which is still open and looks fairly similar to my problem.


The debug output at startup looks like this:

[2015-01-14 12:00:48,710][DEBUG][indices.cluster  ] [Saint Elmo] [mailspool][1] creating shard
[2015-01-14 12:00:48,710][DEBUG][index.service] [Saint Elmo] [mailspool] creating shard_id [1]
[2015-01-14 12:00:48,791][DEBUG][index.deletionpolicy ] [Saint Elmo] [mailspool][1] Using [keep_only_last] deletion policy
[2015-01-14 12:00:48,793][DEBUG][index.merge.policy   ] [Saint Elmo] [mailspool][1] using [tiered] merge mergePolicy with expunge_deletes_allowed[10.0], floor_segment[2mb], max_merge_at_once[10], max_merge_at_once_explicit[30], max_merged_segment[5gb], segments_per_tier[10.0], reclaim_deletes_weight[2.0]
[2015-01-14 12:00:48,794][DEBUG][index.merge.scheduler] [Saint Elmo] [mailspool][1] using [concurrent] merge scheduler with max_thread_count[2], max_merge_count[4]
[2015-01-14 12:00:48,797][DEBUG][index.shard.service  ] [Saint Elmo] [mailspool][1] state: [CREATED]
[2015-01-14 12:00:48,797][DEBUG][index.translog   ] [Saint Elmo] [mailspool][1] interval [5s], flush_threshold_ops [2147483647], flush_threshold_size [200mb], flush_threshold_period [30m]
[2015-01-14 12:00:48,801][DEBUG][index.shard.service  ] [Saint Elmo] [mailspool][1] state: [CREATED]->[RECOVERING], reason [from gateway]
[2015-01-14 12:00:48,801][DEBUG][index.gateway] [Saint Elmo] [mailspool][1] starting recovery from local ...
[2015-01-14 12:00:48,805][DEBUG][river.cluster] [Saint Elmo] processing [reroute_rivers_node_changed]: execute
[2015-01-14 12:00:48,805][DEBUG][river.cluster] [Saint Elmo] processing [reroute_rivers_node_changed]: no change in cluster_state
[2015-01-14 12:00:48,814][INFO ][gateway  ] [Saint Elmo] recovered [1] indices into cluster_state
[2015-01-14 12:00:48,814][DEBUG][cluster.service  ] [Saint Elmo] processing [local-gateway-elected-state]: done applying updated cluster_state (version: 2)
[2015-01-14 12:00:48,840][DEBUG][index.engine.internal] [Saint Elmo] [mailspool][1] starting engine
[2015-01-14 12:00:58,406][DEBUG][cluster.service  ] [Saint Elmo] processing [routing-table-updater]: execute
[2015-01-14 12:00:58,407][DEBUG][gateway.local] [Saint Elmo] [mailspool][4]: throttling allocation [[mailspool][4], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300 on primary allocation
[2015-01-14 12:00:58,407][DEBUG][gateway.local] [Saint Elmo] [mailspool][2]: throttling allocation [[mailspool][2], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300 on primary allocation
[2015-01-14 12:00:58,407][DEBUG][gateway.local] [Saint Elmo] [mailspool][3]: throttling allocation [[mailspool][3], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300 on primary allocation
[2015-01-14 12:00:58,408][DEBUG][gateway.local] [Saint Elmo] [mailspool][0]: throttling allocation [[mailspool][0], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300 on primary allocation
[2015-01-14 12:00:58,408][DEBUG][cluster.service  ] [Saint Elmo] processing [routing-table-updater]: no change in cluster_state
[2015-01-14 12:01:31,619][WARN ][index.engine.internal] [Saint Elmo] [mailspool][1] failed engine [refresh failed]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:187)
    at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:104)
    at org.elasticsearch.index.cache.filter.weighted.WeightedFilterCache$FilterCacheFilterWrapper.getDocIdSet(WeightedFilterCache.java:177)
    at org.elasticsearch.common.lucene.search.OrFilter.getDocIdSet(OrFilter.java:55)
    at org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:46)
    at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:130)
    at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
    at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:136)
    at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
    at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:554)

at

Grandchild is not getting fetched by parent id

2015-01-15 Thread Iv Igi

I am experiencing an issue while trying to retrieve a grandchild record by 
its parent ID (a child-grandchild relationship).
The number of hits in the result is always zero, 
although the same request works fine for the parent-child relationship.

My records are organized like this:

Account --(one to one)-- User --(one to one)-- Address

My execution environment is:
 - Fedora 21 CE
 - openjdk 1.8.0_25
 - ES 1.4.2

Here is a script that shows the problem:

# index creation
curl -XPUT 'localhost:9200/the_index/' -d '{
    "mappings": {
        "account" : {},
        "user" : {
            "_parent" : {
                "type" : "account"
            }
        },
        "address" : {
            "_parent" : {
                "type" : "user"
            }
        }
    }
}';

# mrsmith account creation
curl -XPUT 'localhost:9200/the_index/account/mrsmith' -d '{
    "foo" : "foo"
}';

# john user creation
curl -XPUT 'localhost:9200/the_index/user/john?parent=mrsmith' -d '{
    "bar" : "bar"
}';

# smithshouse address creation
curl -XPUT 'localhost:9200/the_index/address/smithshouse?parent=john' -d '{
    "baz" : "baz"
}';

# Here I am trying to retrieve a record. Getting zero hits.
curl -XGET 'localhost:9200/the_index/address/_search?pretty' -d '{
    "query" : { "bool" : { "must" : { "term" : { "_parent" : "john" } } } }
}';

# Another approach with has_parent query type. Still getting zero hits.
curl -XGET 'localhost:9200/the_index/address/_search?pretty' -d '{
    "query" : {
        "has_parent" : {
            "parent_type" : "user",
            "query" : {
                "term" : {
                    "_id" : "john"
                }
            }
        }
    }
}';

# OK, let's try a routed search. Nope
curl -XGET 'localhost:9200/the_index/address/_search?routing=john&pretty' -d '{
    "query" : { "bool" : { "must" : { "term" : { "_parent" : "john" } } } }
}';

# Routed has_parent query. Same
curl -XGET 'localhost:9200/the_index/address/_search?routing=john&pretty' -d '{
    "query" : {
        "has_parent" : {
            "parent_type" : "user",
            "query" : {
                "term" : {
                    "_id" : "john"
                }
            }
        }
    }
}';

# Retrieving a record by itself. Going just fine.
curl -XGET 'localhost:9200/the_index/address/smithshouse?parent=john';

# Querying for user record with the same query. Got a hit.
curl -XGET 'localhost:9200/the_index/user/_search?pretty' -d '{
    "query" : { "bool" : { "must" : { "term" : { "_parent" : "mrsmith" } } } }
}';



The output:

{"acknowledged":true}
{"_index":"the_index","_type":"account","_id":"mrsmith","_version":1,"created":true}{"_index":"the_index","_type":"user","_id":"john","_version":1,"created":true}{"_index":"the_index","_type":"address","_id":"smithshouse","_version":1,"created":true}
{
  "took" : 54,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
{
  "took" : 221,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
{
  "took" : 35,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
{
  "took" : 481,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
{"_index":"the_index","_type":"address","_id":"smithshouse","_version":1,"found":true,"_source":{
    "baz" : "baz"
}}
{
  "took" : 65,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "the_index",
      "_type" : "user",
      "_id" : "john",
      "_score" : 1.0,
      "_source":{
        "bar" : "bar"
      }
    } ]
  }
}

You can see from the results that ES reached the required shard, but no records 
were fetched.
I am probably doing this the wrong way; if so, please correct me.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/bbaebc65-a87f-4857-a2a4-577b0b487c6b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Logstash output to Elastic search is not working

2015-01-15 Thread Marc

What do you mean by 
"Can't see anything from the following command output:
#curl http://localhost:9200/_search?pretty"
from your first post?

On Wednesday, January 14, 2015 at 3:27:57 AM UTC+1, zal...@gmail.com wrote:

 Hi Marc,

 I didn't find any .sincedb file on the file system. The problem is still there. 



 On Tuesday, January 13, 2015 at 8:39:57 PM UTC+8, Marc wrote:

 It all looks ok to me, since one can see that the logstash process is 
 added as a node.
 However, you should try to remove the .sincedb files in your home 
 directory.
 If sincedb files exist and you are trying to analyze identical log files, 
 Logstash will know that it has already read them and will wait for new log 
 entries in the file, so nothing will happen.
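
 The behaviour described above can be illustrated with a small sketch. It is a
 simplified stand-in for the sincedb idea, not Logstash's real file format or
 code: the function name read_new_lines, the JSON state file and the file names
 are all assumptions made for the example.

```python
import json
import os
import tempfile

def read_new_lines(log_path, state_path):
    """Return lines appended since the last call, tracking the offset on disk."""
    offsets = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            offsets = json.load(f)
    start = offsets.get(log_path, 0)
    with open(log_path, "rb") as f:
        f.seek(start)                  # resume where the previous run stopped
        data = f.read()
        offsets[log_path] = f.tell()   # remember the new end position
    with open(state_path, "w") as f:
        json.dump(offsets, f)
    return data.decode("utf-8").splitlines()

# Hypothetical file names, created in a throwaway directory.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "app.log")
state_path = os.path.join(workdir, "offsets.json")

with open(log_path, "w") as f:
    f.write("first\nsecond\n")

run1 = read_new_lines(log_path, state_path)  # reads both lines
run2 = read_new_lines(log_path, state_path)  # same file again: nothing new
os.remove(state_path)                        # comparable to deleting the sincedb file
run3 = read_new_lines(log_path, state_path)  # starts from the beginning again
print(run1, run2, run3)
```

 The second run returns nothing because the stored offset already points at the
 end of the file; removing the state file makes the reader start over.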


 On Tuesday, January 13, 2015 at 10:05:10 AM UTC+1, zal...@gmail.com 
 wrote:

 Hi all,

 I started experimenting with ELK today, unfortunately without success. 
 Everything installed properly and runs without any errors. When I start 
 Logstash with the following command, output to STDOUT is fine, but nothing 
 shows up in Elasticsearch:

 #./logstash agent -e 'input { stdin {} } output { elasticsearch { host => localhost } stdout { codec => rubydebug } }'

 What should I do?

 Elasticsearch's console output is:

 [2015-01-13 15:55:48,072][INFO ][node ] [Apollo] started
 [2015-01-13 15:55:51,392][INFO ][gateway  ] [Apollo] recovered [1] indices into cluster_state
 [2015-01-13 15:55:51,422][INFO ][cluster.service  ] [Apollo] added {[logstash-0.0.0.0-21484-2010][O0emX_s0SmauCfqAC_YaTA][inet[/172.16.4.88:9302]]{client=true, data=false},[logstash-suricata-3299-4010][cKVoEM8zT8KPVIAelpMSsg][suricata][inet[/172.16.4.88:9301]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-suricata-3299-4010][cKVoEM8zT8KPVIAelpMSsg][suricata][inet[/172.16.4.88:9301]]{client=true, data=false}])
 [2015-01-13 15:57:44,028][INFO ][cluster.service  ] [Apollo] removed {[logstash-0.0.0.0-21484-2010][O0emX_s0SmauCfqAC_YaTA][inet[/172.16.4.88:9302]]{client=true, data=false},}, reason: zen-disco-node_failed([logstash-0.0.0.0-21484-2010][O0emX_s0SmauCfqAC_YaTA][inet[/172.16.4.88:9302]]{client=true, data=false}), reason transport disconnected
 [2015-01-13 16:01:29,656][INFO ][cluster.service  ] [Apollo] added {[logstash-0.0.0.0-22435-2010][LdUiD4llTY6S7eiN8Z97ag][inet[/172.16.4.88:9302]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-0.0.0.0-22435-2010][LdUiD4llTY6S7eiN8Z97ag][inet[/172.16.4.88:9302]]{client=true, data=false}])
 [2015-01-13 16:21:07,373][INFO ][cluster.service  ] [Apollo] removed {[logstash-0.0.0.0-22435-2010][LdUiD4llTY6S7eiN8Z97ag][inet[/172.16.4.88:9302]]{client=true, data=false},}, reason: zen-disco-node_failed([logstash-0.0.0.0-22435-2010][LdUiD4llTY6S7eiN8Z97ag][inet[/172.16.4.88:9302]]{client=true, data=false}), reason transport disconnected
 [2015-01-13 16:25:07,143][INFO ][cluster.service  ] [Apollo] added {[logstash-0.0.0.0-24108-2010][k2ToeYbPRtW_LH4PLBcL-A][inet[/172.16.4.88:9302]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-0.0.0.0-24108-2010][k2ToeYbPRtW_LH4PLBcL-A][inet[/172.16.4.88:9302]]{client=true, data=false}])

 Can't see anything from the following command output:
 #curl http://localhost:9200/_search?pretty

 Please help me on this. 





Re: How can I sort results by _id?

This is because the _id is a string field, so comparison is based on the
lexicographical order, not numeric.
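
The effect is easy to reproduce outside Elasticsearch. The following minimal
Python sketch sorts the same ids as strings and as integers, and shows
zero-padding as one possible workaround when ids must remain strings. The
sample ids and the padding width of 10 digits are illustrative assumptions,
not values from this thread.

```python
# Sample ids are hypothetical; they only illustrate string vs numeric ordering.
ids = ["9", "123", "1000", "20", "199989"]

# Lexicographic order compares character by character, as a string field does.
as_strings = sorted(ids, reverse=True)

# Numeric order requires converting to integers first.
as_numbers = sorted(ids, key=int, reverse=True)

# Zero-padding to a fixed width (assumed: 10 digits) makes both orders agree.
padded = sorted((i.zfill(10) for i in ids), reverse=True)
unpadded = [p.lstrip("0") for p in padded]

print(as_strings)
print(as_numbers)
print(unpadded)
```

Sorted as strings, "9" comes before "199989" in descending order because only
the first differing character decides. Indexing a numeric copy of the id in
its own field and sorting on that field is another commonly used approach.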

On Thu, Jan 15, 2015 at 11:04 AM, Jason Zhang moc...@gmail.com wrote:

What confuses me is that the 'sorted' results are still partly unordered.

Also, if I query:

{ range: {
_id: {
gt: 1,
lt: 1}}}

the results contain _id: 199989.

On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote:

Making it index:not_analyzed should work, what is the issue with the
results?

Note that loading the _id in fielddata is typically very costly since the
_id field is typically unique per document.

On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

I use a query dsl like:

{
filter: {
exists: { field: info }
},
sort: { _id: desc }
}

And the _id here is an integer like '123'.

But the result is like:

{
took: 50,
...
hits: {
...
hits: [
{
...
sort: [ null ]
}]
}
}

Also, I've tried to add "_id": { "index": "not_analyzed" } in the
_mapping.
This time the sort section returns values. But I find the results are
still partly unordered.

Can I sort results by _id? How?

Thank you.


--
Adrien Grand


--
Adrien Grand


Re: How can I sort results by _id?

2015-01-15 Thread Itamar Syn-Hershko

No, an ID has to be a string

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Jan 15, 2015 at 12:12 PM, Jason Zhang moc...@gmail.com wrote:

Can I specify its type as integer in _mapping? Because the _id I use is
rewritten.

On Thursday, January 15, 2015 at 6:07:22 PM UTC+8, Adrien Grand wrote:

This is because the _id is a string field, so comparison is based on the
lexicographical order, not numeric.

On Thu, Jan 15, 2015 at 11:04 AM, Jason Zhang moc...@gmail.com wrote:

What confuses me is that the 'sorted' results are still partly unordered.

Also, if I query:

{ range: {
_id: {
gt: 1,
lt: 1}}}

the results contain _id: 199989.

On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote:

Making it index:not_analyzed should work, what is the issue with the
results?

Note that loading the _id in fielddata is typically very costly since
the _id field is typically unique per document.

On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

I use a query dsl like:

{
filter: {
exists: { field: info }
},
sort: { _id: desc }
}

And the _id here is an integer like '123'.

But the result is like:

{
took: 50,
...
hits: {
...
hits: [
{
...
sort: [ null ]
}]
}
}

Also, I've tried to add "_id": { "index": "not_analyzed" } in the
_mapping.
This time the sort section returns values. But I find the results
are still partly unordered.

Can I sort results by _id? How?

Thank you.


--
Adrien Grand


--
Adrien Grand



How can I sort results by _id?

2015-01-15 Thread Jason Zhang

I use a query DSL like:

{
  "filter": {
    "exists": { "field": "info" }
  },
  "sort": { "_id": "desc" }
}

And the _id here is an integer like '123'.

But the result is like:

{
  "took": 50,
  ...
  "hits": {
    ...
    "hits": [
      {
        ...
        "sort": [ null ]
      }]
  }
}

Also, I've tried to add "_id": { "index": "not_analyzed" } in the _mapping.
This time the sort section returns values, but I find the results are 
still partly unordered.

Can I sort results by _id? How?

Thank you.


Re: How can I sort results by _id?

2015-01-15 Thread Jason Zhang

What confuses me is that the 'sorted' results are still partly unordered.

Also, if I query:

{ range: {
_id: {
gt: 1,
lt: 1}}}

the results contain _id: 199989.

On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote:

Making it index:not_analyzed should work, what is the issue with the
results?

Note that loading the _id in fielddata is typically very costly since the
_id field is typically unique per document.

On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

I use a query dsl like:

{
filter: {
exists: { field: info }
},
sort: { _id: desc }
}

And the _id here is an integer like '123'.

But the result is like:

{
took: 50,
...
hits: {
...
hits: [
{
...
sort: [ null ]
}]
}
}

Also, I've tried to add "_id": { "index": "not_analyzed" } in the
_mapping.
This time the sort section returns values. But I find the results are
still partly unordered.

Can I sort results by _id? How?

Thank you.


--
Adrien Grand


Re: How can I sort results by _id?

2015-01-15 Thread Jason Zhang

Can I specify its type as integer in _mapping? Because the _id I use is
rewritten.

On Thursday, January 15, 2015 at 6:07:22 PM UTC+8, Adrien Grand wrote:

This is because the _id is a string field, so comparison is based on the
lexicographical order, not numeric.

On Thu, Jan 15, 2015 at 11:04 AM, Jason Zhang moc...@gmail.com wrote:

What confuses me is that the 'sorted' results are still partly unordered.

Also, if I query:

{ range: {
_id: {
gt: 1,
lt: 1}}}

the results contain _id: 199989.

On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote:

Making it index:not_analyzed should work, what is the issue with the
results?

Note that loading the _id in fielddata is typically very costly since
the _id field is typically unique per document.

On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

I use a query dsl like:

{
filter: {
exists: { field: info }
},
sort: { _id: desc }
}

And the _id here is an integer like '123'.

But the result is like:

{
took: 50,
...
hits: {
...
hits: [
{
...
sort: [ null ]
}]
}
}

Also, I've tried to add "_id": { "index": "not_analyzed" } in the
_mapping.
This time the sort section returns values. But I find the results are
still partly unordered.

Can I sort results by _id? How?

Thank you.


--
Adrien Grand


--
Adrien Grand


Re: Questions about scaling elasticsearch with regard to the number of documents indexed per second

2015-01-15 Thread Chinch Pokli

I am hoping Redis, Kafka, Logstash, etc. might give Elasticsearch some
breathing room and enable it to index up to 10K docs per second.

On Thursday, January 15, 2015 at 10:47:31 AM UTC+5:30, David Pilato wrote:

You have a Twitter input so you can extract content from Twitter and send
to elasticsearch. No need to have Redis here.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 15 Jan 2015, at 00:02, Chinch Pokli cpo...@gmail.com wrote:

Thanks. I'll have a look at the raw option.
Regarding Logstash, I don't fully understand its utility. It says that it
can take messages from a Redis server. But if I have to set up Redis, I
could simply use the Redis river to index into Elasticsearch. Is there any
additional benefit that Logstash would give me?

On Thursday, January 15, 2015 at 4:06:12 AM UTC+5:30, David Pilato wrote:

You should look at raw option or better look at Logstash.

My 2 cents.

David

On 14 Jan 2015, at 23:29, Chinch Pokli cpo...@gmail.com wrote:

Hi,

Is there a way to make the river store all the data? If not, I am fine
with writing streaming code that will stream and index, but I have a
concern: how many documents can Elasticsearch index per second? I might
eventually need to index almost 10,000 documents (each document = 2 KB) per
second (the current requirement is 100 documents per second). Is this even
feasible? If yes, do I need to make any special modifications?

Thanks in advance!



Re: Just initialize shards when problems but no rebalance

2015-01-15 Thread Matías Waisgold

Another problem we are having is that in a node's file storage we see data
from shards that are not assigned to that node, so it can't allocate anything
in this dirty state.

2015-01-15 0:09 GMT-03:00 Mark Walkom markwal...@gmail.com:

You could do this, but it's a lot of manual overhead to have to deal with.
However ES does have some disk space awareness during allocation, take a
look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index-modules-allocation.html#disk

On 15 January 2015 at 10:57, Matías Waisgold mwaisg...@gmail.com wrote:

Is it OK to combine cluster.routing.allocation.allow_rebalance set to none
with cluster.routing.allocation.enable set to all?

Kind regards



Re: How can we achieve an equivalent of this SQL a query in Elasticsearch?

2015-01-15 Thread Lokesh Gupta

Thanks.. Any other creative solutions?

On Thursday, January 15, 2015 at 1:54:10 PM UTC+5:30, David Pilato wrote:

I think you need to run two queries for now. One is an aggregation (max).
The other one uses the result of this aggregation to search for documents.
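
The two-step approach can be sketched on the client side as follows. This is
only an illustration under assumptions: the aggregation name max_date, the
helper names, the use of a filtered query with a term filter (ES 1.x style)
and the sample response fragment are all made up for the example, and the
sketch only builds the request bodies rather than sending them.

```python
import json

def max_date_request(field):
    """First request body: a max aggregation, with no hits returned."""
    return {"size": 0, "aggs": {"max_date": {"max": {"field": field}}}}

def docs_for_max_request(field, first_response, columns):
    """Second request body: keep only documents whose field equals the max."""
    # The max aggregation reports a number; it is assumed here to be epoch millis.
    max_value = int(first_response["aggregations"]["max_date"]["value"])
    return {
        "query": {"filtered": {"filter": {"term": {field: max_value}}}},
        "_source": columns,
    }

# Hypothetical fragment of the first response (an arbitrary epoch-millis value).
sample_response = {"aggregations": {"max_date": {"value": 1421280000000.0}}}

first = max_date_request("myDate")
second = docs_for_max_request("myDate", sample_response, ["myDate", "col1", "col2"])
print(json.dumps(first))
print(json.dumps(second))
```

Each body would be sent to the index's _search endpoint in turn, the second
one only after the first response has arrived.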

My 2 cents

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr
https://twitter.com/elasticsearchfr | @scrutmydocs
https://twitter.com/scrutmydocs

On 15 Jan 2015, at 09:13, Lokesh Gupta lgu...@gmail.com wrote:

What would be the equivalent of the following query in the Elasticsearch world?

select myDate, col1, col2 from myTable
where myDate = (select max(myDate) from myTable)



Deploy Elasticsearch in live

2015-01-15 Thread BizEcho Jmr

Hi all,

I use Elasticsearch locally on my PC as the search engine for a content 
website developed with the Django framework.

I would like your opinion on the choice of a production hosting offer, 
ideally a scalable one.

I looked at the offers from DigitalOcean, Amazon EC2, and OVH (OVH VPC, 
RunAbove, ...).
Amazon EC2 offers a free first year, but I do not know whether that offer is 
suitable for my application.
The entry-level DigitalOcean offer is $5/month, but it has only 512 MB of 
memory. 
I also just learned from an email that it is now possible to deploy 
Elasticsearch on Google Compute Engine.

What would the impact on this configuration be in production if I also 
planned to use Logstash and Kibana?

Thank you in advance for your advice on production hosting offers.


Re: Http Cors Setting

2015-01-15 Thread Raffaele Garofalo

In my case I faced the same issue because my web tier is hosted on a 
different domain.
My configuration works quite well: I can see the pre-flight (OPTIONS) 
call returning 200 and the subsequent POST or GET succeeding.

I have used the following configuration:

http.cors.enabled: true
http.cors.allow-origin: my regex for my domains
http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-credentials: true
http.cors.allow-headers: X-Requested-With, Content-Type, Content-Length, accept, authorization

You can open the Chrome developer tools (F12), check which headers your 
application sends in the pre-flight request, and add them to the 
http.cors.allow-headers parameter.
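
As a rough illustration of that check, the sketch below compares the headers a
browser asks for in the pre-flight Access-Control-Request-Headers header with
the configured allow list. The function name and the requested value
(including x-custom-token) are hypothetical; only the allow list comes from
the configuration above.

```python
def missing_cors_headers(requested, allowed):
    """Return requested header names that are absent from the allow list.

    Both arguments are comma-separated strings; header names are compared
    case-insensitively.
    """
    allowed_set = {h.strip().lower() for h in allowed.split(",") if h.strip()}
    return [h.strip() for h in requested.split(",")
            if h.strip() and h.strip().lower() not in allowed_set]

allowed = "X-Requested-With, Content-Type, Content-Length, accept, authorization"
# Hypothetical value of Access-Control-Request-Headers copied from the browser.
requested = "content-type, authorization, x-custom-token"

missing = missing_cors_headers(requested, allowed)
print(missing)
```

Any name reported as missing would be a candidate to append to
http.cors.allow-headers.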

On Tuesday, November 11, 2014 at 1:21:05 PM UTC+1, Reza Samee wrote:

 Hello to all!

 Note: I'm new to ELK :)

 I'm using Elasticsearch 1.4.0 and I'm trying to enable the http.cors feature. 
 When I set http.cors.enabled: true and http.cors.allow-origin: * in the 
 config file and then restart, the http.cors feature is still not enabled 
 and I can no longer use Kibana. 
 What's wrong with my config file?

 elasticsearch.conf:

 http.cors.enabled: true
 http.cors.allow-origin: *




Re: How can we achieve an equivalent of this SQL a query in Elasticsearch?

2015-01-15 Thread Mark Harwood

Sorted query?

GET /myIndex/_search
{
    "query": { "match_all": {} },
    "fields": ["myDate", "col1"],
    "sort": [
        {
            "myDate": {
                "order": "desc"
            }
        }
    ]
}

On Thursday, January 15, 2015 at 1:05:22 PM UTC, Lokesh Gupta wrote:

Thanks.. Any other creative solutions?

On Thursday, January 15, 2015 at 1:54:10 PM UTC+5:30, David Pilato wrote:

I think you need to run two queries for now. One is an aggregation (max).
The other one uses the result of this aggregation to search for documents.

My 2 cents

On 15 Jan 2015, at 09:13, Lokesh Gupta lgu...@gmail.com wrote:

What would be the equivalent of the following query in the Elasticsearch
world?

select myDate, col1, col2 from myTable
where myDate = (select max(myDate) from myTable)



Re: How can we achieve an equivalent of this SQL a query in Elasticsearch?

2015-01-15 Thread Lokesh Gupta

Thanks for the suggestion. Sorted query would work if I am okay with
getting data for dates other than the max(date). But in the use case I have
I need to restrict the results to be only for max(date).

Is there a way to chain the output of a query as an input to another query?

On Thursday, January 15, 2015 at 7:10:51 PM UTC+5:30, Mark Harwood wrote:

Sorted query?

GET /myIndex/_search
{
    "query": { "match_all": {} },
    "fields": ["myDate", "col1"],
    "sort": [
        {
            "myDate": {
                "order": "desc"
            }
        }
    ]
}

On Thursday, January 15, 2015 at 1:05:22 PM UTC, Lokesh Gupta wrote:

Thanks.. Any other creative solutions?

On Thursday, January 15, 2015 at 1:54:10 PM UTC+5:30, David Pilato wrote:

I think you need to run two queries for now. One is an aggregation
(max). The other one uses the result of this aggregation to search for
documents.

My 2 cents

On 15 Jan 2015, at 09:13, Lokesh Gupta lgu...@gmail.com wrote:

What would be the equivalent of the following query in the Elasticsearch
world?

select myDate, col1, col2 from myTable
where myDate = (select max(myDate) from myTable)



Suggestion for an Elasticsearch plugin that forwards documents at indexing time

2015-01-15 Thread Stefano Ruggiero

Hi all,

I would like to know if anyone has played around with an Elasticsearch plugin 
that can forward documents to an external source at indexing time. I don't 
want to do it through Logstash, but only when a document is indexed.
My goal is to use that plugin as an example for my own custom one: I would 
like a plugin that receives a copy of each indexed document, so that we 
can manipulate it in real time and then send it to an external database or 
interface.

Thanks for any suggestions.

Regards

need help for search

Hello,
I have started learning Elasticsearch, and I have trouble understanding how
search works. Could anyone help me?

My gist with my whole structure and my data is here:
https://gist.github.com/thibaut1001/7a3000c3ff371be3a52d

My problem is only in part 4:
searching across multiple fields for a value, like this


## We need to search for henry in the selected fields
curl -XPOST 'http://localhost:9200/test_fr/user/_search' -d '{
  "query": {
    "bool": {
      "must": [],
      "must_not": [],
      "should": [
        { "term": { "user.sku": "henry" } },
        { "term": { "user.internal_code": "henry" } },
        { "term": { "user.firstname": "henry" } },
        { "term": { "user.lastname": "henry" } },
        { "term": { "user.address": "henry" } },
        { "term": { "user.city": "henry" } },
        { "term": { "user.localized_description": "henry" } },
        { "term": { "user.localized_keywords": "henry" } },
        { "term": { "user.service.localized_label": "henry" } },
        { "term": { "user.medias.localized_label": "henry" } },
        { "term": { "user.services.localized_label": "henry" } }
      ]
    }
  }
}';

## Returns no results. Why?

I have many questions.
Could you help me, please?
Thanks
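
For comparison, the eleven term clauses in the request above could also be written as one multi_match query over the same field list. The sketch below only builds the request body. It is not a diagnosis of why the original request returns nothing: multi_match analyzes the query text while term does not, so whether the two behave differently depends on the mapping in the gist, which is not reproduced here. The function name is made up for the example.

```python
import json

# Field names copied as-is from the original request.
SEARCH_FIELDS = [
    "user.sku",
    "user.internal_code",
    "user.firstname",
    "user.lastname",
    "user.address",
    "user.city",
    "user.localized_description",
    "user.localized_keywords",
    "user.service.localized_label",
    "user.medias.localized_label",
    "user.services.localized_label",
]


def build_multi_field_query(text, fields=SEARCH_FIELDS):
    """Build one multi_match query instead of one term clause per field."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


if __name__ == "__main__":
    print(json.dumps(build_multi_field_query("henry"), indent=2))
```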

Re: Excluding Terms Using a Minus Sign

Yes, the simple query string query supports this.
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html#query-dsl-simple-query-string-query

David
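
A minimal sketch of such a request body, built as a Python dict. The field name text and the function name are assumptions for the example. One thing to watch: with the default OR operator, a negated term is OR-ed with the rest of the query, so the sketch sets default_operator to "and" to get the Google-like behaviour.

```python
import json


def build_exclusion_query(user_input, fields):
    """Pass the user's raw input through; a leading '-' negates a term."""
    return {
        "query": {
            "simple_query_string": {
                "query": user_input,
                "fields": list(fields),
                # "and" makes 'louvre -museum' mean: louvre AND NOT museum
                "default_operator": "and",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_exclusion_query("louvre -museum", ["text"]), indent=2))
```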

On 15 Jan 2015, at 20:37, Cindy Conway cindyanncon...@gmail.com wrote:

Is there a way to exclude a term if the user precedes it with a minus sign,
the way Google does? For example, if I want to search for the word louvre, but
I don't want the museum in France, I can search for
louvre -museum as my search terms. Does ES support this? I am not finding
anything like that in the documentation.

Thanks All!

Complex search

2015-01-15 Thread Serge Schumacher

Hi,
I'm looking to create a search behaviour like Amazon's.

I have an index with 3 fields: Title, Description and Category.

I want to search the Title and Description fields for the word *car*
and I would like to get scored results like this:

car      -- score: 1 in category vehicles
autocar  -- score: 0.5 in category vehicles, where the part car should be
highlighted, e.g. auto*car*
carradio -- score: 0.5 in category vehicles, where the part car should be
highlighted, e.g. *car*radio

If the word is found in the title field, the score should be
higher than if the word is only found in the description field.

Is there anybody out there who could help me with this topic, or at least point
me in the right direction?

Thanks,
Serge
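
One part of the request above, ranking title matches above description matches and highlighting the hit, can be sketched with a multi_match query using a per-field boost. This is only a starting point under assumptions: the lowercase field names title and description and the boost value 2 are made up, and matching car inside autocar or carradio would additionally need substring-style analysis (for example n-grams) in the mapping, which is not shown here.

```python
import json


def build_boosted_search(text, title_boost=2):
    """Search title and description, weighting title matches higher."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "field^N" multiplies the score contribution of that field
                "fields": ["title^%d" % title_boost, "description"],
            }
        },
        # Ask for highlighted fragments from both searched fields.
        "highlight": {"fields": {"title": {}, "description": {}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_boosted_search("car"), indent=2))
```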

Re: Is ElasticSearch truly scalable for analytics?

2015-01-15 Thread Nils Dijk

Adding a 'node reduce phase' to aggregations is something I'm very
interested in, and also investigating for the project I'm currently working
on.

If you introduce an extra reduction phase (for multiple shards on the same
node) you introduce further potential for inaccuracies in the final
results.

This is true if you only reduce the top-k items per shard, but I was
thinking to reduce the complete set of buckets locally. This takes a bit
more cpu, and memory, but my guess is that this is negligible compared to
the work already being done by the aggregation framework. If you reduce the
buckets on the node before sending it to the coordinator it will actually
increase the accuracy for aggregations!
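
A toy illustration of that argument, in plain Python with made-up term counts (this is not Elasticsearch code). Two shards live on the same node; a term that ranks third on each shard is dropped by a per-shard top-2 cut, but wins once the complete bucket sets are merged on the node first.

```python
from collections import Counter


def top_k(counts, k):
    """Keep only the k buckets with the highest counts."""
    return dict(Counter(counts).most_common(k))


def merge(partials):
    """Sum bucket counts from several partial results."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def shard_level_top_k(shards, k):
    """Each shard sends only its own top k; the receiver merges those."""
    return top_k(merge(top_k(shard, k) for shard in shards), k)


def node_level_reduce(shards, k):
    """The node merges the complete bucket sets first, then cuts to k."""
    return top_k(merge(shards), k)


if __name__ == "__main__":
    # Hypothetical per-shard term counts for two shards on one node.
    shards = [{"a": 10, "b": 9, "x": 8}, {"c": 11, "d": 9, "x": 8}]
    print("per-shard top-2:", shard_level_top_k(shards, 2))
    print("node-level reduce:", node_level_reduce(shards, 2))
```

The sketch covers a single node only. If each node still sends just its top k to the coordinator, the merge across nodes can lose terms in the same way.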

how many of these sorts of use cases generate sufficiently large trees of
results where a node-level merging would be beneficial

It is primarily beneficial for bigger installations with lots of shards per
machine. Say 40 machines with ~100 shards per machine. In the current
strategy where every node is sending 100 results there is a lot of
bandwidth used on the coordinating node, since it receives 4000 responses,
while it could do with 40 responses (1 per machine).
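
The arithmetic behind those numbers, as a small sketch. It assumes one copy of every shard answers each search; the function name is made up for the example.

```python
def coordinator_responses(machines, shards_per_machine, node_reduce=False):
    """Number of partial results the coordinating node has to receive."""
    if node_reduce:
        return machines  # one pre-reduced response per machine
    return machines * shards_per_machine  # one response per shard


if __name__ == "__main__":
    print(coordinator_responses(40, 100))
    print(coordinator_responses(40, 100, node_reduce=True))
```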

I acknowledge it is a highly specialised use-case which not very many
people run into, but it is a case I'm currently working on.

How hard would it be to implement such a feature?

I have been looking into this, and it is not trivial. This needs to be
implemented in/around the SearchService. This is the place I found to be
implementing the different search strategies, e.g. DFS. Unlike the rest of
Elasticsearch, it does not seem to consist of modules that implement
different search strategies.

Regarding the accuracy of top-k lists: I think the above, both the 'node
reduce phase' and making the search strategy pluggable, will be the
groundwork to start working on implementations of TJA or TPUT strategies as
discussed in an old issue[1] about the accuracy of facets.

The order of steps to take before reaching the ultimate goal would be:
1) Make search strategies (eg. query then fetch, dfs query then fetch) more
modularized.
2) Make a search strategy with a 'node reduce phase' for the aggregations.
Start with a complete reduce on the node. If that takes too much memory/time
you can use TJA or TPUT locally on the node to get a reliable top-k list.
3a) Make a search strategy that executes TJA on the cluster coordinated by
the coordinating node
3b) Make a separate strategy that executes TPUT on the cluster coordinated
by the coordinating node

I would say that 3a and 3b are 'easy' if doing a complete reduce in step 2
does not consume too many resources.

Adding strategies for both TJA and TPUT gives ultimate control to the user,
as TPUT is not suited for reliably sorting on sums where the field might
contain a negative value. But TPUT has better latency than
TJA.

I would love to get an opinion from Adrien concerning the feasibility of
such an approach.

-- Nils

[1] https://github.com/elasticsearch/elasticsearch/issues/1305

On Wednesday, January 14, 2015 at 7:47:07 PM UTC+1, Elliott Bradshaw wrote:

How hard would it be to implement such a feature? Even if there are
only a handful of use cases, it could prove very helpful in these.
Particularly since very large trees are the ones that will struggle the
most with bandwidth issues.

On Wednesday, January 14, 2015 at 1:36:53 PM UTC-5, Mark Harwood wrote:

Understood, but what about cases where size is set to unlimited?
Inaccuracies are not a concern in that case, correct?

Correct. But if we only consider the scenarios where the key sets are
complete and accuracy is not put at risk by merging (i.e. there is no top
N type filtering in play), how many of these sorts of use cases generate
sufficiently large trees of results where a node-level merging would be
beneficial?

On Wednesday, January 14, 2015 at 1:09:48 PM UTC-5, Mark Harwood wrote:

If you introduce an extra reduction phase (for multiple shards on the
same node) you introduce further potential for inaccuracies in the final
results.
Consider the role of 'size' and 'shard_size' in the terms aggregation
[1] and the effects they have on accuracy. You'd arguably need a
'node_size' setting to also control the size of this new intermediate
collection. All stages that reduce the volumes of data processed can
introduce an approximation with the potential for inaccuracies upstream
when merging.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size
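
For reference, a terms aggregation that sets both parameters could look like the sketch below, built as a Python dict. The field name category, the aggregation name top_terms and the function name are assumptions for the example. The hypothetical node_size setting discussed above does not exist and is therefore not included.

```python
import json


def build_terms_agg(field, size, shard_size):
    """Terms aggregation returning `size` buckets, with each shard
    contributing its top `shard_size` candidates before the final merge."""
    if shard_size < size:
        raise ValueError("shard_size should not be smaller than size")
    return {
        "size": 0,  # only the aggregation is of interest, not the hits
        "aggs": {
            "top_terms": {
                "terms": {"field": field, "size": size, "shard_size": shard_size}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_terms_agg("category", 10, 50), indent=2))
```

A larger shard_size trades more data per shard response for a lower chance that a globally frequent term is cut off on an individual shard.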

On Wednesday, January 14, 2015 at 5:44:47 PM UTC, Elliott Bradshaw
wrote:

Adrien,

I get the feeling that you're a pretty heavy contributor to the
aggregation module. In your experience, would a shard per cpu core
strategy be an effective performance solution in a pure aggregation use
case? If this could proportionally

Re: Questions about scaling elasticsearch with regard to the number of documents indexed per second

I can index on my laptop 1-12000 docs per second. SSD drives of course.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 15 Jan 2015, at 13:43, Chinch Pokli cpo...@gmail.com wrote:

No, the whole point was: will Elasticsearch be able to index, say,
10,000 documents per second? If yes, I can simply hook up my Twitter code to
ES. If not, I would need to think of how to make that happen.
Typically I've seen ES index only around 30 docs per second, which is pretty
low.

I am hoping Redis/ Kafka/ Logstash/ etc. might help elasticsearch to get some
breathing room and enable it to index up to 10K docs per second.

On Thursday, January 15, 2015 at 10:47:31 AM UTC+5:30, David Pilato wrote:
You have a Twitter input so you can extract content from Twitter and send to
elasticsearch. No need to have Redis here.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 15 Jan 2015, at 00:02, Chinch Pokli cpo...@gmail.com wrote:

Thanks. I'll have a look at the raw option.
Regarding Logstash, I don't fully understand its utility. It says that it
can take messages from a Redis server. But if I have to set up Redis, I
could simply use the Redis river to index into Elasticsearch. Is there any
additional benefit that Logstash would give me?

On Thursday, January 15, 2015 at 4:06:12 AM UTC+5:30, David Pilato wrote:
You should look at raw option or better look at Logstash.

My 2 cents.

David

On 14 Jan 2015, at 23:29, Chinch Pokli cpo...@gmail.com wrote:

Hi,

Is there a way to make the river store all the data? If not, I am fine
with writing streaming code which will stream and index. But I have a
concern: how many documents can Elasticsearch index per second? I might
eventually need to index almost 10,000 documents (each document = 2 KB)
per second (the current requirement is 100 documents per second). Is this
even feasible? If yes, do I need to make any special modifications?

Thanks-in-advance!!
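
To put the numbers in the question above into perspective, here is the raw payload rate they imply. This is a back-of-the-envelope sketch that assumes 1 KB = 1000 bytes and counts only the document bodies, not indexing overhead or replicas; the function name is made up for the example.

```python
def ingest_rate_mb_per_s(docs_per_second, doc_size_kb):
    """Raw document payload per second, in MB (decimal units)."""
    bytes_per_second = docs_per_second * doc_size_kb * 1000
    return bytes_per_second / 1_000_000


if __name__ == "__main__":
    print("target:", ingest_rate_mb_per_s(10_000, 2), "MB/s")
    print("current:", ingest_rate_mb_per_s(100, 2), "MB/s")
```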

Re: need help for search