Re: IllegalStateException[field "DISPLAY_NAME" was indexed without position data]

2014-04-30 Thread chee hoo lum
Hi Ivan,

Running the following query returns the records below:

{
  "query" : { "match" : { "DISPLAY_NAME" : "Happy People" } }
}


Result :
https://gist.github.com/cheehoo/073ab926baa123b18224




but running the suggested span query:

{
  "from" : 100,
  "size" : 100,
  "query" : {
    "span_first" : {
      "match" : {
        "span_near" : {
          "clauses" : [
            { "span_term" : { "DISPLAY_NAME" : "happy" } },
            { "span_term" : { "DISPLAY_NAME" : "people" } }
          ],
          "slop" : 1,
          "in_order" : true
        }
      },
      "end" : 2
    }
  }
}

returns no results.

Any clues :)


Thanks.








On Wed, Apr 30, 2014 at 12:04 PM, Ivan Brusic i...@brusic.com wrote:

 Do you have any documents that start with "happy people"?

 --
 Ivan


 On Tue, Apr 29, 2014 at 7:21 PM, chee hoo lum cheeho...@gmail.com wrote:

 Hi Ivan,

 Tried with 2 and 3 with no luck.

 {
   "from" : 100,
   "size" : 100,
   "query" : {
     "span_first" : {
       "match" : {
         "span_near" : {
           "clauses" : [
             { "span_term" : { "DISPLAY_NAME" : "happy" } },
             { "span_term" : { "DISPLAY_NAME" : "people" } }
           ],
           "slop" : 1,
           "in_order" : true
         }
       },
       "end" : 2
     }
   }
 }


 The field uses the standard analyzer with stopwords set to _none_:

  "DISPLAY_NAME" : {
    "type" : "string",
    "analyzer" : "standard"
  },

 "index.analysis.analyzer.standard.type" : "standard",
 "index.analysis.analyzer.standard.stopwords" : "_none_"


 Any clue on this ? :) Thanks




 On Wed, Apr 30, 2014 at 12:37 AM, Ivan Brusic i...@brusic.com wrote:

 The end parameter is too low. It needs to be at a minimum the number of
 clauses in the span_near query.

 --
 Ivan


 On Mon, Apr 28, 2014 at 7:05 PM, chee hoo lum cheeho...@gmail.com wrote:

  Hi Ivan,

 I'm not able to get any results with the following query:

 {
   "from" : 100,
   "size" : 100,
   "query" : {
     "span_first" : {
       "match" : {
         "span_near" : {
           "clauses" : [
             { "span_term" : { "DISPLAY_NAME" : "happy" } },
             { "span_term" : { "DISPLAY_NAME" : "people" } }
           ],
           "slop" : 1,
           "in_order" : true
         }
       },
       "end" : 1
     }
   }
 }


 Meanwhile, I tried:

 {
   "from" : 100,
   "size" : 100,
   "query" : {
     "span_first" : {
       "match" : {
         "span_term" : { "DISPLAY_NAME" : "happy" }
       },
       "end" : 1
     }
   }
 }

 and it returns:

   "_index" : "jdbc_dev",
   "_type" : "media",
   "_id" : "9556",
   "_score" : 4.612431,
   "_source" : {
     "DISPLAY_NAME" : "Happy People",


 Anything wrong with my first query ?

 Thanks



 On Tue, Apr 29, 2014 at 12:16 AM, Ivan Brusic i...@brusic.com wrote:

 The main limitation of span queries is that they only operate on
 analyzed terms. The terms used in span_term must match the terms in the
 index. In your case, there is no single term "happy holiday" in your
 index, because the original document was tokenized into individual terms
 ("happy", "birthday", "to", "you").

 You would need to do a span_near query of the two terms with a slop of 1,
 in order. This span_near query will then be the argument to the
 span_first.

 Here is a good explanation of span queries in Lucene:
 http://searchhub.org/2009/07/18/the-spanquery/

 --
 Ivan


  On Sun, Apr 27, 2014 at 11:24 PM, cyrilforce cheeho...@gmail.com wrote:

  Hi Ivan,

 I recreated the mapping and re-indexed the documents, and it's now working
 fine. Thanks.

 Btw, I would like to ask how I could search for two or more words in the
 span_first query, as I need it to support the following searches:
 1) "happy"
 2) "happy holiday"
 3) "happy birthday to you"

 {
   "from" : 100,
   "size" : 100,
   "query" : {
     "span_first" : {
       "match" : {
         "span_term" : { "DISPLAY_NAME" : "happy holiday" }
       },
       "end" : 1
     }
   }
 }


 returns an empty list even though we have documents whose DISPLAY_NAME
 starts with "happy holiday".

 Thanks.


 On Sunday, April 27, 2014 2:55:37 AM UTC+8, cyrilforce wrote:

 Hi Ivan,

 I am using version elasticsearch-0.90.1. No, we don't have any templates.
 I'm not sure whether you are referring to the full index mapping, but here
 are the gists:

 media mapping
 https://gist.github.com/cheehoo/11327970

 full index mapping
 https://gist.github.com/cheehoo/11327996

 Thanks in advance.





  On Sat, Apr 26, 2014 at 8:31 AM, Ivan Brusic i...@brusic.com wrote:

 Your mapping looks correct. Which version are you running? Do you
 have any templates?

 Just to be on the safe side, can you provide the mapping that
 Elasticsearch is actually using (not the one you provided):

 http://localhost:9200/jdbc_dev/media/_mapping

 --
 Ivan




 On Fri, Apr 25, 2014 at 3:24 AM, cyrilforce cheeho...@gmail.com wrote:

 Hi,

 I am trying to query some records via the span_first query as below:

 {
   "from" : 100,
   "size" : 100,
   "query" : {
     "span_first" : {
       "match" : {
         "span_term" : { "DISPLAY_NAME" : "happy" }
       },

Truncating scores

2014-04-30 Thread Loïc Wenkin
Hello everybody,

I am using the function_score query to compute a custom score for items I
am indexing into ElasticSearch. I use a native script (written in Java) to
compute my score, based on a date (Date.getTime()). When I log what my
native script returns, I get what I want, but when I look at the score of
the items returned by the query (I use the "replace" mode), I get a
truncated number: e.g. a score computed in the native script as
1 392 028 423 243 comes back as 1 392 028 420 000 on the returned items.
The problem is that I am losing milliseconds and seconds (I only keep the
tens-of-seconds part). Losing milliseconds would be acceptable, but I
can't lose seconds.

Is this a limitation of ElasticSearch? Is there any way to work around
this problem?

Thanks in advance for your replies.

Regards,
Loïc Wenkin

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ccf7c19e-aa70-42ac-a4a4-d7174ab0de49%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: performance issue with script scoring with fields having a large array

2014-04-30 Thread Radu Gheorghe
Hello,

Using _source in scripts is typically slow, because ES has to go to each
stored document and extract fields from there. A faster approach is to use
something like doc['field3'].values[12], which will use the field data
cache (already loaded in memory, at least after the first run):
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_document_fields

More details about field data can be found here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.htm

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr  Elasticsearch Support * http://sematext.com/
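
As a sketch of that suggestion applied to the query from this thread (note that field-data values for a multi-valued field come back sorted and deduplicated, so indices into them may not match the original array order):

```json
GET /test/document/_search
{
  "query" : {
    "function_score" : {
      "script_score" : {
        "script" : "doc['field3'].values[12] * doc['field3'].values[11]"
      }
    }
  }
}
```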


On Wed, Apr 30, 2014 at 12:27 PM, NM n.maisonne...@gmail.com wrote:

 I have documents with a field containing a large array.

 I would like to score according to the value of the nth element of such an
 array, but I get very slow responses (5 s) with only 10K documents indexed.

 My mapping:
 document {
   id: value,
   field2: string,
   field3: [ int_1, int_2, ..., int_10k ]   // large array of 10K integers
 }

 Assume I generated and indexed 10K documents with 1K random integer values
 in the field 'field3'.

 I then use the following search query:

 GET /test/document/_search
 {
   "query" : {
     "function_score" : {
       "script_score" : {
         "script" : "_source.fields3[12] * _source.fields3[11]"
       }
     }
   }
 }

 => took 5000 ms

 However, with plain Java objects and a simple nested loop:

 - for all documents:
     score[i] = doc[i].fields[12] * doc[i].fields[11]
 - sort by score

 => took 50 ms

 ES is 100× slower than a simple loop.

 How can I get similar performance with ES?



Re: Elasticsearch Deployment architecture

2014-04-30 Thread Mark Walkom
It will work, but if you want to maintain HA then it'd make sense to keep
your inputs separate from your outputs. At least, that's my take :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
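
For reference, the role split usually comes down to two flags per node in elasticsearch.yml (a sketch; the exact convention for the client/SLB role can vary by version):

```yaml
# Dedicated master-eligible node
node.master: true
node.data: false

# Data node
#   node.master: false
#   node.data: true

# Search load balancer ("client" node): holds no data, is never master,
# and only routes and aggregates search requests
#   node.master: false
#   node.data: false
```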


On 30 April 2014 19:48, Norberto Meijome num...@gmail.com wrote:

 Sending indexing requests to the SLB - is this less optimal, or would it
 outright fail?
 On 30/04/2014 9:04 am, Mark Walkom ma...@campaignmonitor.com wrote:

 For searches, yes. You'd want the indexing to go to the masters.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 30 April 2014 09:02, Norberto Meijome num...@gmail.com wrote:

 On a related note, if you have separate slb and master, your main LB
 (say, haproxy) would be pointing to the slb , not the master , right?
  On 29/04/2014 8:40 pm, Dinesh Chandra shadow.on.f...@gmail.com
 wrote:

 Hi,

 I am very new to Elasticsearch and am trying to deploy it in my dev
 environment. While there are many ways in which Elasticsearch can be
 deployed, my team and I have arrived at this architecture:

 4 Data Nodes
 3 Master Nodes
 2 Search Load Balancers (SLB)

 Now my questions are:
  - Does it make sense to have SLBs at all?
  - Can I just have the master nodes perform the job of the SLBs too?

 Please enlighten me on a sensible Elasticsearch Architecture!



Re: index binary files

2014-04-30 Thread Radu Gheorghe
Hello,

Normally, you would send indexing requests to the REST API with the stuff
you want Elasticsearch to index:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html

If you want Elasticsearch to automatically fetch files from the file system
for you, have a look at David's FileSystem River:
https://github.com/dadoonet/fsriver

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr  Elasticsearch Support * http://sematext.com/
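
With mapper-attachments you don't point ES at a folder; you send the file's bytes base64-encoded inside the JSON document. A minimal sketch (the "file" field and "docs" index are hypothetical names for illustration):

```python
import base64
import json

# Create a stand-in for a real PDF; with a real file, just open it directly.
with open('/tmp/sample.pdf', 'wb') as f:
    f.write(b'hello pdf')

# mapper-attachments expects the raw bytes base64-encoded in the field value.
with open('/tmp/sample.pdf', 'rb') as f:
    b64 = base64.b64encode(f.read()).decode('ascii')

payload = json.dumps({'file': b64})
print(payload)

# Index it with, e.g.:
#   curl -XPUT 'http://localhost:9200/docs/doc/1' -d '<payload>'
```

Looping this over a folder of PDF/DOCX files (or using the fsriver plugin linked above) covers the bulk case.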


On Tue, Apr 29, 2014 at 6:40 PM, anass benjelloun anass@gmail.com wrote:

 Hello,

 I installed Elasticsearch and it works well: I can index and search XML
 and JSON content using Dev HTTP Client.
 I need your help to index binary files in Elasticsearch and then search
 them by content.
 I added mapper-attachments to Elasticsearch, but what I don't know is how
 to specify the folder of PDF or DOCX files to index - whether I need
 something like base64, I don't know.
 Thanks for helping me.



Specify metadata per word/term in a string

2014-04-30 Thread Neeraj Makam
Hi

Given a text, say "hello elastic search world",
is there a way I can associate a field or some metadata with each word in 
the text, which I can later query on? 
For example: give a code number to each word, and be able to search like
text = "hello" AND code = 25,
i.e. return all "hello" words which have 25 in their "code" metadata. 



Re: Specify metadata per word/term in a string

2014-04-30 Thread vineeth mohan
Hello Neeraj,

First of all, you can't get back just "hello" from Elasticsearch:
Elasticsearch works at the document level. That means if you search for
"hello", you will get the document with the text "hello elasticsearch
search world", not just "hello".

The only way I can think of is to create a separate document for each
word. Such a document would look like:

{
  "word" : "hello",
  "code" : 25
}

With that, you can get it to work.

If you want to retrieve the full text as well, store it as follows:

{
  "text" : "hello from Elasticsearch",
  "words" : [
    { "word" : "hello", "code" : 25 },
    { "word" : "from", "code" : 22 }
  ]
}

where the "words" field is of nested type.

Thanks
Vineeth
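
To query such nested word/code pairs together (so that the word and its code must occur in the same array element), a nested query along these lines would be needed (a sketch, assuming the "words" field from the example above is mapped as nested):

```json
{
  "query" : {
    "nested" : {
      "path" : "words",
      "query" : {
        "bool" : {
          "must" : [
            { "term" : { "words.word" : "hello" } },
            { "term" : { "words.code" : 25 } }
          ]
        }
      }
    }
  }
}
```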







On Wed, Apr 30, 2014 at 4:36 PM, Neeraj Makam neeraj23...@gmail.com wrote:

 Hi

 Given a text,say hello elastic search world,
 is there a way i can associate a field or some metadata per word in the
 text on which i can later query?
 for eg: give code number to each word, and should be able to search like
 text = hello AND code = 25
 i.e return all hello words which have 25 in their code metadata.



Re: Specify metadata per word/term in a string

2014-04-30 Thread joergpra...@gmail.com
There is a related feature called payloads for terms. In Elasticsearch
you can assign payloads to terms, e.g. numbers for custom scoring.

See also

https://github.com/elasticsearch/elasticsearch/issues/3772
https://github.com/elasticsearch/elasticsearch/pull/4161

It uses DelimitedPayloadTokenFilter

http://lucene.apache.org/core/4_7_0/analyzers-common/org/apache/lucene/analysis/payloads/DelimitedPayloadTokenFilter.html

Jörg
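
A sketch of index settings that attach such payloads at analysis time (the field and analyzer names are illustrative; reading the payloads back still requires custom scoring, e.g. the function described in the pull request above):

```json
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "payload_analyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : [ "delimited_payload_filter" ]
        }
      }
    }
  },
  "mappings" : {
    "doc" : {
      "properties" : {
        "text" : {
          "type" : "string",
          "analyzer" : "payload_analyzer",
          "term_vector" : "with_positions_offsets_payloads"
        }
      }
    }
  }
}
```

Indexing "text" : "hello|25 elastic|7 search|3 world|1" then stores 25 as a payload on the term "hello" - the part after the | delimiter becomes the payload (encoded as a float by default).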



On Wed, Apr 30, 2014 at 1:06 PM, Neeraj Makam neeraj23...@gmail.com wrote:

 Hi

 Given a text,say hello elastic search world,
 is there a way i can associate a field or some metadata per word in the
 text on which i can later query?
 for eg: give code number to each word, and should be able to search like
 text = hello AND code = 25
 i.e return all hello words which have 25 in their code metadata.



Re: Truncating scores

2014-04-30 Thread Nikolas Everett
Scores are Java floats, so I'd expect them to be less precise than the long
that getTime returns.  I believe you could look at sorting rather than
scoring, or at reducing the precision of the top bits of your long.
You know, y2k bug style.

The reason the score is a float is that for text scoring it's exact enough.
Also, some of the Lucene data structures are actually more lossy than
float.  The field norm, iirc, is a floating point number packed into 8 bits
rather than the float's 32.

Nik
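
The loss is easy to reproduce by round-tripping the timestamp from this thread through a 32-bit float (a Python sketch of the arithmetic only, nothing Elasticsearch-specific):

```python
import struct

def as_float32(x):
    """Round-trip a number through an IEEE-754 32-bit float."""
    return struct.unpack('f', struct.pack('f', x))[0]

millis = 1392028423243            # epoch-millisecond score from the thread
stored = int(as_float32(millis))  # what survives in a float-valued score

# A float32 mantissa holds 24 significant bits, so near 2**40 the
# representable values are about 2**17 ms (~131 s) apart: whole seconds
# cannot survive, only roughly two-minute granularity.
print(millis - stored)
```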


On Wed, Apr 30, 2014 at 5:56 AM, Loïc Wenkin loic.wen...@gmail.com wrote:

 Hello everybody,

 I am using the function_score query in order to compute a custom score for
 items I am indexing into ElasticSearch. I am using a native script (written
 in Java) in order to compute my score. This score is computed based on a
 date (Date.getTime()). When I use a logger and look what is returned by my
 native script, I get what I want, but when I look at the score of items
 returned by query (I use the replace mode), I get a truncated number (e.g.
 if a computed score displayed in the native script with the value 1 392 028
 423 243, it is returned with the value 1 392 028 420 000 as score of
 returned items). The problem here is that I am loosing milliseconds and
 seconds (I only get the decade part of seconds). Loose milliseconds can be
 acceptable, but I can't loose seconds.

 Is this problem a limitation of ElasticSearch ? Is there any way to
 workaround this problem ?

 Thanks in advance for your replies.

 Regards,
 Loïc Wenkin



Re: SearchParseExceptions in Marvel monitoring cluster

2014-04-30 Thread Boaz Leskes
Hi Mihir,

This type of error typically occurs when the marvel index doesn't contain 
the right data. I'm intrigued by the ClusterBlockException on your 
monitoring cluster.

Can you gist the output of curl SERVER:9200/_cat/shards/?v for both nodes 
of your marvel cluster?

Thx,
Boaz

On Monday, April 28, 2014 2:43:30 PM UTC+2, Mihir M wrote:

 Hi, 

 We have 2 Elasticsearch clusters in our development environment. 
 One of them is our development cluster with 9 nodes having 
  - 4 Data nodes (with 4 GB heap) 
  - 3 Master eligible nodes (default heap) 
  - 2 Search Load Balancers (default heap) 

 The second is our monitoring cluster for storing Marvel data of the 
 development cluster. This cluster has 2 nodes running with default 
 configuration. 
 All the above nodes are running the latest ES version 1.1.1 and the latest 
 Marvel version which is 1.1.0. 

 Of late we have been seeing issues in the Marvel cluster. One of the nodes 
 in the Marvel cluster throws the following exception continuously: 
 [.marvel-2014.04.25][0], node[dA2UtjgdQ1S55zgvQHOHYQ], [P], s[STARTED]: 
 Failed to execute [org.elasticsearch.action.search.SearchRequest@24de815] 
 org.elasticsearch.search.SearchParseException: [.marvel-2014.04.25][0]: 
 from[-1],size[-1]: Parse Failure [Failed to parse source 
 [{facets:{0:{date_histogram:{key_field:@timestamp,value_field:total.search.query_total,interval:1m},global:true,facet_filter:{fquery:{query:{filtered:{query:{query_string:{query:_type:indices_stats}},filter:{bool:{must:[{range:{@timestamp:{from:1398434986844,to:now}}}],size:50,query:{filtered:{query:{query_string:{query:_type:cluster_event
  

 OR 
 _type:node_event}},filter:{bool:{must:[{range:{@timestamp:{from:1398434986844,to:now}}}],sort:[{@timestamp:{order:desc}},{@timestamp:{order:desc}}]}]]
  

 at 
 org.elasticsearch.search.SearchService.parseSource(SearchService.java:634) 
 at 
 org.elasticsearch.search.SearchService.createContext(SearchService.java:507) 

 at 
 org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:480)
  

 at 
 org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:324)
  

 at 
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:304)
  

 at 
 org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
  

 at 
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
  

 at 
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$4.run(TransportSearchTypeAction.java:296)
  

 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  

 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  

 at java.lang.Thread.run(Thread.java:744) 
 Caused by: org.elasticsearch.search.facet.FacetPhaseExecutionException: 
 Facet [0]: (value) field [total.search.query_total] not found 
 at 
 org.elasticsearch.search.facet.datehistogram.DateHistogramFacetParser.parse(DateHistogramFacetParser.java:186)
  

 at 
 org.elasticsearch.search.facet.FacetParseElement.parse(FacetParseElement.java:93)
  

 at 
 org.elasticsearch.search.SearchService.parseSource(SearchService.java:622) 
 ... 10 more 

 It keeps repeating at regular intervals. Also this is observed in only one 
 of the 2 nodes of the monitoring cluster. Usually it is the master which 
 shows this exception. 
 Similar exceptions are observed in the Marvel dashboard - Cluster Overview 
 page. 

 Also in the development cluster in one of the Master nodes, we see 
 ClusterBlockException [shard state 0 not initialized or recovered] for the 
 monitoring cluster. 

 Please explain why this is happening. One more thing to add, we are facing 
 this problem ever since we migrated to ES 1.1.0. Before that while running 
 1.0.0, no such things were observed. 

 Looking forward to your reply. 




 - 
 Regards 
 -- 
 View this message in context: 
 http://elasticsearch-users.115913.n3.nabble.com/SearchParseExceptions-in-Marvel-monitoring-cluster-tp4054926.html
  
 Sent from the ElasticSearch Users mailing list archive at Nabble.com. 




Re: Truncating scores

2014-04-30 Thread Loïc Wenkin
Hello Nikolas,

Thanks for your reply. I have done something like what you just explained:
I divide the score by 5000 before returning it. Doing this, I drop the 
milliseconds and keep a precision of 5 seconds, which I expect to be 
enough. If it's still a problem, I may subtract some years from the date 
in order to get a smaller number.

I think that sorting would be hard work, since I have something like this 
in my documents:

{
  "a" : {
    "b" : {
      "objectsSortableByDate" : [ ... ]
    },
    "c" : {
      "objectsSortableByDate" : [ ... ]
    }
  }
}

I want to filter my entities according to the smallest (or highest) date 
of any "objectsSortableByDate" (whether they are in "b" or in "c"), and 
sometimes I may have more than two such nested objects, so I think that 
the easiest way to sort is using a computed score. If you have a better 
idea, I will take it :)

Loïc

Le mercredi 30 avril 2014 14:48:37 UTC+2, Nikolas Everett a écrit :

 Scores are Java floats so I'd expect them to be less precise then the long 
 that getTime returns.  I believe you could look at sorting rather then 
 scoring or look at reducing the precision on the top bits of your long.  
 You know, y2k bug style.

 The reason the score is a float is that for text scoring its exact 
 enough.  Also, some of the lucene data structures are actually more lossy 
 then float.  The field norm, iirc, is a floating point number packet into 8 
 bits rather the float's 32.

 Nik


 On Wed, Apr 30, 2014 at 5:56 AM, Loïc Wenkin loic@gmail.com wrote:

 Hello everybody,

 I am using the function_score query in order to compute a custom score 
 for items I am indexing into ElasticSearch. I am using a native script 
 (written in Java) in order to compute my score. This score is computed 
 based on a date (Date.getTime()). When I use a logger and look what is 
 returned by my native script, I get what I want, but when I look at the 
 score of items returned by query (I use the replace mode), I get a 
 truncated number (e.g. if a computed score displayed in the native script 
 with the value 1 392 028 423 243, it is returned with the value 1 392 028 
 420 000 as score of returned items). The problem here is that I am loosing 
 milliseconds and seconds (I only get the decade part of seconds). Loose 
 milliseconds can be acceptable, but I can't loose seconds.

 Is this problem a limitation of ElasticSearch ? Is there any way to 
 workaround this problem ?

 Thanks in advance for your replies.

 Regards,
 Loïc Wenkin
  


Re: Aggregation bug? Or user error?

2014-04-30 Thread Adrien Grand
This looks wrong indeed. By any chance, would you have a curl recreation of
this issue?


On Tue, Apr 29, 2014 at 7:35 PM, mooky nick.minute...@gmail.com wrote:

 It looks like a bug to me - but if its user error, then obviously I can
 fix it a lot quicker :)


 On Tuesday, 29 April 2014 13:04:53 UTC+1, mooky wrote:

 I am seeing some very odd aggregation results - where the sum of the
 sub-aggregations is more than the parent bucket.

 Results:
  "CSSX" : {
    "doc_count" : *24*,
    "intentDate" : {
      "buckets" : [ {
        "key" : "Overdue",
        "to" : 1.3981248E12,
        "to_as_string" : "2014-04-22",
        "doc_count" : *1*,
        "ME" : {
          "doc_count" : *0*
        },
        "NOT_ME" : {
          "doc_count" : *24*
        }
      }, {
        "key" : "May",
        "from" : 1.3981248E12,
        "from_as_string" : "2014-04-22",
        "to" : 1.4006304E12,
        "to_as_string" : "2014-05-21",
        "doc_count" : *23*,
        "ME" : {
          "doc_count" : 0
        },
        "NOT_ME" : {
          "doc_count" : *24*
        }
      }, {
        "key" : "June",
        "from" : 1.4006304E12,
        "from_as_string" : "2014-05-21",
        "to" : 1.4033088E12,
        "to_as_string" : "2014-06-21",
        "doc_count" : *0*,
        "ME" : {
          "doc_count" : *0*
        },
        "NOT_ME" : {
          "doc_count" : *24*
        }
      } ]
    }
  },


 I wouldn't have thought that to be possible at all.
 Here is the request that generated the dodgy results.


  "CSSX" : {
    "filter" : {
      "and" : {
        "filters" : [ {
          "type" : {
            "value" : "inventory"
          }
        }, {
          "term" : {
            "isAllocated" : false
          }
        }, {
          "term" : {
            "intentMarketCode" : "CSSX"
          }
        }, {
          "terms" : {
            "groupCompanyId" : [ "0D13EF2D0E114D43BFE362F5024D8873",
              "0D593DE0CFBE49BEA3BF5AD7CD965782", "1E9C36CC45C64FCAACDEE0AF4FB91FBA",
              "33A946DC2B0E494EB371993D345F52E4", "6471AA50DFCF4192B8DD1C2E72A032C7",
              "9FB2FFDC0FF0797FE04014AC6F0616B6", "9FB2FFDC0FF1797FE04014AC6F0616B6",
              "9FB2FFDC0FF2797FE04014AC6F0616B6", "9FB2FFDC0FF3797FE04014AC6F0616B6",
              "9FB2FFDC0FF5797FE04014AC6F0616B6", "9FB2FFDC0FF6797FE04014AC6F0616B6",
              "AFE0FED33F06AFB6E04015AC5E060AA3" ]
          }
        }, {
          "not" : {
            "filter" : {
              "terms" : {
                "status" : [ "Cancelled", "Completed" ]
              }
            }
          }
        } ]
      }
    },
    "aggregations" : {
      "intentDate" : {
        "date_range" : {
          "field" : "intentDate",
          "ranges" : [ {
            "key" : "Overdue",
            "to" : "2014-04-22"
          }, {
            "key" : "May",
            "from" : "2014-04-22",
            "to" : "2014-05-21"
          }, {
            "key" : "June",
            "from" : "2014-05-21",
            "to" : "2014-06-21"
          } ]
        },
        "aggregations" : {
          "ME" : {
            "filter" : {
              "term" : {
                "trafficOperatorSid" : "S-1-5-21-20xx
  ...




-- 
Adrien Grand



Date range query ignore month

2014-04-30 Thread Fatih Karatana
Hi guys,

I've been using Elasticsearch as my data store and I have lots of documents 
in it. My problem is that Elasticsearch seems to ignore the month field of 
the date format defined in my mapping, so I can't get correct search results.

Here is what I have in my index and my query, please tell me if I'm wrong:

curl -XPUT 'http://localhost:9200/tt6/' -d '{}'
curl -XPUT 'http://localhost:9200/tt6/tweet/_mapping' -d '{"tweet" : 
{"properties" : {"date" : {"type" : "date", "format": "yyyy-MM-DD HH:mm:ss"
}}}}'
curl -XPUT 'http://localhost:9200/tt6/tweet/1' -d '{"date": "2014-02-14 
04:00:45"}'

curl -XGET 'http://localhost:9200/tt6/_search' -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "tweet.date": {
              "from": "2014-12-01 00:00:00",
              "to": "2014-12-30 00:00:00"
            }
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "facets": {}
}'

And my response is
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "tt6",
        "_type": "tweet",
        "_id": "1",
        "_score": 1,
        "_source": {
          "date": "2014-02-14 04:00:45",
          "name": "test"
        }
      }
    ]
  }
}

Given this date range (1st of December 2014 to 30th of December 2014) there 
should be no hits, but the document is returned anyway.
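One likely culprit worth checking here is the date pattern itself: in the Joda-Time patterns that Elasticsearch mappings use, lowercase dd is day-of-month while uppercase DD is day-of-year, and a day-of-year token is commonly reported to override the parsed month. Python's strptime has the same pitfall with its %j (day-of-year) directive, which makes it easy to demonstrate:

```python
from datetime import datetime

s = "2014-02-14"

# Day-of-month directive: the month is honoured.
with_dd = datetime.strptime(s, "%Y-%m-%d")

# Day-of-year directive (analogous to Joda's uppercase DD): the value 14
# is taken as the 14th day of the year, so the parsed month is discarded.
with_DD = datetime.strptime(s, "%Y-%m-%j")

print(with_dd.date())  # 2014-02-14
print(with_DD.date())  # 2014-01-14 (February has vanished)
```

If the mapping really does use DD, switching the format to "yyyy-MM-dd HH:mm:ss" and reindexing should make the range query behave.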

Any help will be appreciated.

Regards.

Fatih.



Re: Registering node event listeners

2014-04-30 Thread Ivan Brusic
Would the DiscoveryService solve my initial problem, or only get around
constructing a DiscoveryNodesProvider? DiscoveryService only uses
the InitialStateDiscoveryListener, which doesn't publish interesting events.

I won't be near a computer in the next few days to test.

-- 
Ivan


On Wed, Apr 30, 2014 at 4:40 AM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Have you looked at InternalNode.java?

 From my understanding, you could try to implement your own DiscoveryModule
 with DiscoveryService and start it like this:

 DiscoveryService discoService =
 injector.getInstance(DiscoveryService.class).start();

 Jörg



 On Wed, Apr 30, 2014 at 12:17 AM, Ivan Brusic i...@brusic.com wrote:

 I am looking to transition a piece of my search infrastructure from
 polling the cluster's health status to hopefully receiving notifications
 whenever an event occurs. Using the TransportService, I registered various
 relevant listeners, but none of them are triggered.

 Here is the gist of the code:

 https://gist.github.com/brusic/2dcced28e0ed753b6632

 Most of it I stole^H^H^H^H^Hborrowed from ZenDiscovery. I am assuming
 something is not quite right with the TransportService. I tried using both
 a node client and a master-less/data-less client. I also suspect that
 the DiscoveryNodesProvider might not have been initialized correctly, but I
 am primarily after the events from NodesFaultDetection, which does not use
 the DiscoveryNodesProvider.

 I know I am missing something obvious, but I cannot quite spot it. Is
 there perhaps a different route using the TransportClient?

 Cheers,

 Ivan







Substring match in search term order using Elasticsearch

2014-04-30 Thread Kruti Shukla


I posted the same question on Stack Overflow 
(http://stackoverflow.com/questions/23244796/substring-match-in-search-term-order-using-elasticsearch)
but am still looking for an answer.


I'm new to Elasticsearch.

I want to perform substring/partial word matching using Elasticsearch, with 
the results returned in a particular order. To explain my problem I will 
show how I create my index and mappings and what records I use.

*Creating Index and mappings:*

PUT /my_index1
{
    "settings": {
        "analysis": {
            "filter": {
                "trigrams_filter": {
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 3
                }
            },
            "analyzer": {
                "trigrams": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "trigrams_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "my_type1": {
            "properties": {
                "text": {
                    "type": "string",
                    "analyzer": "trigrams"
                }
            }
        }
    }
}

*Bulk record insert:*

POST /my_index1/my_type1/_bulk
{ "index": { "_id": 1 }}
{ "text": "men's shaver" }
{ "index": { "_id": 2 }}
{ "text": "men's foil shaver" }
{ "index": { "_id": 3 }}
{ "text": "men's foil advanced shaver" }
{ "index": { "_id": 4 }}
{ "text": "norelco men's foil advanced shaver" }
{ "index": { "_id": 5 }}
{ "text": "men's shavers" }
{ "index": { "_id": 6 }}
{ "text": "women's shaver" }
{ "index": { "_id": 7 }}
{ "text": "women's foil shaver" }
{ "index": { "_id": 8 }}
{ "text": "women's foil advanced shaver" }
{ "index": { "_id": 9 }}
{ "text": "norelco women's foil advanced shaver" }
{ "index": { "_id": 10 }}
{ "text": "women's shavers" }

*Now, I want to search for "en's shaver". I'm searching using the 
following query:*

POST /my_index1/my_type1/_search
{
    "query": {
        "match": {
            "text": {
                "query": "en's shaver",
                "minimum_should_match": "100%"
            }
        }
    }
}

I want results to be in the following sequence:

   1. men's shaver -- closest match with following same search keyword 
   order en's shaver
   2. women's shaver -- closest match with following same search keyword 
   order en's shaver
   3. men's foil shaver -- increased distance by 1
   4. women's foil shaver -- increased distance by 1
   5. men's foil advanced shaver -- increased distance by 2
   6. women's foil advanced shaver -- increased distance by 2
   7. men's shavers -- substring match for shavers
   8. women's shavers -- substring match for shavers

I'm performing the following query. It is not giving me results in the 
order I want:

POST /my_index1/my_type1/_search
{
    "query": {
        "query_string": {
            "default_field": "text",
            "query": "men's shaver",
            "minimum_should_match": "90%"
        }
    }
}
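One way to see why the trigram approach matches but does not rank by term order: the analyzer reduces both query and documents to bags of 3-grams, and scoring is driven by how many grams overlap, not by where they occur. A rough Python sketch of the analysis chain (lowercase plus a whitespace split standing in for the standard tokenizer, then per-token 3-grams; an approximation, not the exact Lucene implementation) shows the overlaps the query sees:

```python
def trigrams(text):
    # Approximation of the custom analyzer above: lowercase, tokenize,
    # then emit 3-grams per token.
    grams = []
    for token in text.lower().split():
        grams.extend(token[i:i + 3] for i in range(len(token) - 2))
    return grams

query = set(trigrams("en's shaver"))
for doc in ["men's shaver", "men's foil shaver", "men's foil advanced shaver"]:
    shared = query & set(trigrams(doc))
    print(doc, "shares", len(shared), "grams with the query")
```

All three documents contain every one of the six query grams, so the desired ordering cannot come from gram overlap alone; it would have to come from field length norms or from rescoring the top hits with something order-aware such as a match_phrase query with slop.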



Using snapshotrestore to separate indexing from searching

2014-04-30 Thread JoeZ99
As I posted before, our system does not fit very well in a cluster structure 
because we have many small indices (about 1k indices with an average of 6k 
records each). We guessed that with so many small indices, the cluster spent 
too much time and resources deciding which nodes should be master, where to 
locate absurdly small shards, etc. The bottom line is that the cluster always 
ended up not working right. BTW, I suspect that with a few advanced tuning 
options (shard routing and the like) we may be able to put it back on, but 
unfortunately we can't find that kind of knowledge in the standard docs. If 
any of you have a hint on this, it would be greatly appreciated!

Anyway, we need to scale the system somehow, and this is what we've come up 
with:

  - Our indices can have configuration variations that make a reindex 
necessary at any time. It doesn't happen a lot, but it happens, and with 1k 
indices it's bound to happen.
  - Indexing data is regenerated every day, so every day the whole set of 
indices is re-created (we figured it's much faster to recreate an index 
than to update an existing one by replacing every one of its records).

We would like the machines used for serving search results to be used only 
for that, and never for indexing/reindexing ops, because we don't want the 
user experience to suffer when searching against a server that is already 
loaded doing some heavy indexing.

In our ideal scenario, indexing/reindexing would be done in devoted 
machines, which can be as many as needed, and searching would be done in 
different machines. We plan to use the snapshot/restore feature for that. 

Any time an index/reindex is needed, it would be done on one of these 
indexing machines, and then the fresh index would be snapshotted, to be 
restored to the search machines afterwards. We would need some client-side 
control to make sure only one snapshot process runs at a time; it's my 
understanding that this is not needed for the restore process (i.e. you 
can have more than one restore process running on a cluster).

Indexing an individual item can happen occasionally, but I figure when that 
happens we can just index to both the searching machines and the indexing 
machines, because it's never going to be big.

Please read "cluster" wherever I wrote "machine".

How crazy does this whole thing sound? Is there any other way we can get 
some scalability?
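As a sketch of what the proposed workflow looks like in API terms (the repository location, snapshot names and index name below are placeholders; the ES 1.x snapshot/restore endpoints are assumed):

```python
import json

# 1) Register a shared-filesystem repository visible to both clusters:
#    PUT /_snapshot/reindex_repo
repo_body = {
    "type": "fs",
    "settings": {"location": "/mnt/es_snapshots", "compress": True},
}

# 2) After a reindex on the indexing cluster, snapshot just that index:
#    PUT /_snapshot/reindex_repo/customer_42_v2?wait_for_completion=true
snapshot_body = {"indices": "customer_42_v2", "include_global_state": False}

# 3) On the search cluster, restore it:
#    POST /_snapshot/reindex_repo/customer_42_v2/_restore
restore_body = {"indices": "customer_42_v2"}

for name, body in [("repo", repo_body), ("snapshot", snapshot_body),
                   ("restore", restore_body)]:
    print(name, json.dumps(body))
```

Restricting each snapshot to a single index keeps the snapshots small and lets the client-side "one snapshot at a time" lock be held only briefly.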



Lucene Date Range Query in Kibana

2014-04-30 Thread Uli Bethke
Is there a way in Kibana or Lucene to define a date range query such as 
"today minus 60 days"?

Something along the logical lines of visit_date: [*-60 TO *] 



Re: Security of ES

2014-04-30 Thread Adrien Grand
Hi,

Elasticsearch doesn't support any form of authentication or authorization
at the moment. The way users usually deal with this is by exposing
Elasticsearch through a proxy that handles security based on the path of
the URL.


On Wed, Apr 30, 2014 at 5:56 PM, Patrick Proniewski 
elasticsea...@patpro.net wrote:

 Hello,

 As a BOfH, I'm quite used to provide auth-based access to IT resources. As
 CISO I must guaranty that users get only what they need, especially about
 sensitive content. Unfortunately I can't find anything about
 authentication, and security in ES documentation. It looks like the product
 is designed like memcached: it's there and free to use.

 Is there any way to provide some partitioning inside an ES cluster, so
 that we can share the cluster without sharing the data?

 thanks,
 Patrick





-- 
Adrien Grand



Re: Security of ES

2014-04-30 Thread David Pilato
Yes. For now, you have to deal with security yourself.

So, secure the URLs using Nginx for example, and use aliases, which let you 
expose an alias URL and not the direct index URL.
Use filters in aliases.

Example:

Let's say you have a groupid field in your documents and you have a doc index.
Doc A belongs to groupid "marketing".
Doc B belongs to groupid "finances".

Create an alias "marketing" which uses the doc index with a prebuilt filter 
on groupid = "marketing".
Same for "finances".

Then secure your URLs using Nginx and let users access only the right URLs 
(aliases) they should see.

My 2 cents.
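A sketch of the alias half of this advice (index, alias and field names follow the example above; the standard _aliases API is assumed):

```python
import json

# One filtered alias per group: queries sent to an alias only ever see
# documents whose groupid matches that alias's prebuilt filter.
actions = {
    "actions": [
        {"add": {"index": "doc", "alias": group,
                 "filter": {"term": {"groupid": group}}}}
        for group in ("marketing", "finances")
    ]
}

# POST /_aliases with this body, then have Nginx route each user only to
# /marketing/_search or /finances/_search as appropriate.
print(json.dumps(actions))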

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 30 April 2014 at 17:56:10, Patrick Proniewski (elasticsea...@patpro.net) 
wrote:


Hello,  

As a BOfH, I'm quite used to provide auth-based access to IT resources. As CISO 
I must guaranty that users get only what they need, especially about sensitive 
content. Unfortunately I can't find anything about authentication, and security 
in ES documentation. It looks like the product is designed like memcached: it's 
there and free to use.  

Is there any way to provide some partitioning inside an ES cluster, so that we 
can share the cluster without sharing the data?  

thanks,  
Patrick  




Re: ES and SAN storage

2014-04-30 Thread Mohit Anchlia
I think anyone will find it difficult to answer such questions, simply
because several factors drive the decision: latency requirements, high
availability requirements, how shared the SAN storage is, the impact of
somebody stealing IO under the hood, etc. The best way is to develop a test
model and test it out. Look at the cluster settings for how to
disable/enable shard allocation.

On Wed, Apr 30, 2014 at 8:47 AM, Patrick Proniewski 
elasticsea...@patpro.net wrote:

 Hello,

 I'm still testing ES at a very small scale (1 node on a multipurpose
 server), but I would like to extend its use at work as a backend for
 logstash. It means that the LS+ES cluster would have to eat a few GB of
 data every day, up to 15 or 20GB later if things go well.
 I'm doing all this as a side project: no investment apart from work hours.
 I will recycle blades and storage we plan to decommission from our
 virtualization farm.
 So I'm likely to end up with 2 or 3 dual-Xeon blades, but no real internal
 storage (an SD card), and a LUN on a SAN.

 How does ES behave in shared storage conditions? What are the best
 practices about nodes/shards/replicas/...?
 The intended audience is the operations team, so fewer than 10 people: no
 big search concurrency, but probably mostly deep searches and ill-designed
 queries :)

 thanks,
 Patrick





Multiple or per field highlight type

2014-04-30 Thread Shmullus
I have a mapping where I set one field's term_vector to 
with_positions_offsets.
I would then like to search with highlights on all the fields. Is that 
possible?
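Yes; the highlighter can be chosen per field in the highlight section of the search request. A sketch of such a body (the field names "body" and "title" are hypothetical): the field mapped with term_vector set to with_positions_offsets can use the fast vector highlighter ("fvh"), while the others use the plain highlighter, which needs no term vectors.

```python
import json

search_body = {
    "query": {"match": {"_all": "some words"}},  # hypothetical query
    "highlight": {
        "fields": {
            "body":  {"type": "fvh"},    # has with_positions_offsets term vectors
            "title": {"type": "plain"},  # plain highlighter, no term vectors needed
        }
    },
}

print(json.dumps(search_body))
```
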



Re: ES and SAN storage

2014-04-30 Thread Patrick Proniewski
Well, then maybe my questions were not precise enough.
My first goal was to make sure ES works when all nodes share a single 
storage backend.
My second goal was to learn whether each node requires its own dedicated 
file tree, or whether you can put all the files together as if there were 
only one ES node.
Does it make sense to have replicas when filesystem IOs are ultimately shared?
Does moving a shard from one node to another make the data pass through the 
CPU, or is ES smart enough to just pass a pointer to the file?


On 30 avr. 2014, at 18:33, Mohit Anchlia wrote:

 I think anyone will find it difficult to answer such questions just because 
 there are several factors that derive the decision like latency requirements, 
 high availability requirements, how shared SAN storage is and impact of 
 somebody stealing IO under the hood etc. The best way is to develop a test 
 model and test it out. Look at cluster settings on how to disable/enable 
 shard allocation.
 
 On Wed, Apr 30, 2014 at 8:47 AM, Patrick Proniewski 
 elasticsea...@patpro.net wrote:
 Hello,
 
 I'm still testing ES at a very small scale (1 node on a multipurpose server), 
 but I would like to extend it's use at work as a backend for logstash. It 
 means that the LS+ES cluster would have to eat few GB of data every day, up 
 to 15 or 20GB later if things go well.
 I'm doing all this as a side project: no investment apart from work hours. I 
 will recycle blades and storage we plan to decommission from our 
 virtualization farm.
 So I'm likely to end with 2 or 3 dual-xeon blades, but no real internal 
 storage (an SD-card), and a LUN on a SAN.
 
 How does ES behave is shared storage condition? What are the best practices 
 about nodes/shards/replicas/...?
 Intended audience is Operation team, so less than 10 persons. So no big 
 search concurrency but probably mostly deep search and ill-designed queries 
 :)
 
 thanks,
 Patrick



Re: Security of ES

2014-04-30 Thread Patrick Proniewski
Thanks Adrien.

On 30 avr. 2014, at 18:02, Adrien Grand wrote:

 Hi,
 
 Elasticsearch doesn't support any form of authentification or authorization 
 at the moment. The way users deal with this issue is usually by giving access 
 to Elasticsearch through a proxy that would handle security based on the path 
 of the URL.
 
 
 On Wed, Apr 30, 2014 at 5:56 PM, Patrick Proniewski 
 elasticsea...@patpro.net wrote:
 Hello,
 
 As a BOfH, I'm quite used to provide auth-based access to IT resources. As 
 CISO I must guaranty that users get only what they need, especially about 
 sensitive content. Unfortunately I can't find anything about authentication, 
 and security in ES documentation. It looks like the product is designed like 
 memcached: it's there and free to use.
 
 Is there any way to provide some partitioning inside an ES cluster, so that 
 we can share the cluster without sharing the data?



Re: Security of ES

2014-04-30 Thread Patrick Proniewski
Hmmm, ok.
I'll have to think about this. I get the proxy part, very easy, I've been 
doing this kind of stuff for eons. Now you write that I can discriminate 
URLs by injecting an arbitrary field into my data and creating an alias 
that names a prebuilt filter. I discovered aliases just 2 hours ago, so 
I'll have to dive into this to understand exactly how it works, and in 
particular how it can be used in a logstash install.

Thanks for the tip.

On 30 avr. 2014, at 18:04, David Pilato wrote:

 Yes. By now, you have to deal with security yourself.
 
 So, secure URL using Ngnix for example, use aliases which will expose alias 
 URL and not direct index URL.
 Use filters in aliases.
 
 Example:
 
 Let's say you have a groupid field in your documents and you have a doc 
 index.
 A doc A belongs to groupid marketing.
 Doc B belongs to groupid finances.
 
 Create an alias marketing which uses doc index with a prebuilt filter on 
 groupid with marketing.
 Same for finances.
 
 Then secure your URLs using Nginx and let users only access to the right URLs 
 (aliases) they should see.
 
 My 2 cents.
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr
 
 
 On 30 April 2014 at 17:56:10, Patrick Proniewski (elasticsea...@patpro.net) 
 wrote:
 
 Hello, 
 
 As a BOfH, I'm quite used to provide auth-based access to IT resources. As 
 CISO I must guaranty that users get only what they need, especially about 
 sensitive content. Unfortunately I can't find anything about authentication, 
 and security in ES documentation. It looks like the product is designed like 
 memcached: it's there and free to use. 
 
 Is there any way to provide some partitioning inside an ES cluster, so that 
 we can share the cluster without sharing the data? 
 
 thanks, 
 Patrick 
 
 



Re: ES and SAN storage

2014-04-30 Thread Mohit Anchlia
I'll try to answer as much as I know:

ES shouldn't have any issues working with SAN, NFS or EBS. Yes, each node
needs its own unique file path; nodes don't share files with other nodes.
Replicas in this setup only make sense if you are solving for a VM or node
failure per se, or if you have SAN storage coming from a different array.

I don't follow your last question.

On Wed, Apr 30, 2014 at 10:04 AM, Patrick Proniewski 
elasticsea...@patpro.net wrote:

 Well, then maybe my questions were not precise enough.
 My first goal was to make sure ES does work sharing a unique storage for
 all nodes.
 My second gaol was to learn if each node requires to have its dedicated
 file tree, or if you can put every files together as if there's only one ES
 node.
 Does-it make sense to have replicas when eventually filesystem IOs are
 shared?
 Does moving a shard from a node to another makes data passing through the
 CPU, or is ES smart enough to just pass the pointer to the file?


 On 30 avr. 2014, at 18:33, Mohit Anchlia wrote:

  I think anyone will find it difficult to answer such questions just
 because there are several factors that derive the decision like latency
 requirements, high availability requirements, how shared SAN storage is and
 impact of somebody stealing IO under the hood etc. The best way is to
 develop a test model and test it out. Look at cluster settings on how to
 disable/enable shard allocation.
 
  On Wed, Apr 30, 2014 at 8:47 AM, Patrick Proniewski 
 elasticsea...@patpro.net wrote:
  Hello,
 
  I'm still testing ES at a very small scale (1 node on a multipurpose
 server), but I would like to extend it's use at work as a backend for
 logstash. It means that the LS+ES cluster would have to eat few GB of data
 every day, up to 15 or 20GB later if things go well.
  I'm doing all this as a side project: no investment apart from work
 hours. I will recycle blades and storage we plan to decommission from our
 virtualization farm.
  So I'm likely to end with 2 or 3 dual-xeon blades, but no real internal
 storage (an SD-card), and a LUN on a SAN.
 
  How does ES behave is shared storage condition? What are the best
 practices about nodes/shards/replicas/...?
  Intended audience is Operation team, so less than 10 persons. So no big
 search concurrency but probably mostly deep search and ill-designed
 queries :)
 
  thanks,
  Patrick





Significant Term aggregation

2014-04-30 Thread Ramdev Wudali
Hi:
   I have been trying to use (successfully) the significant terms 
aggregation in release 1.1.0. The blog post about this feature 
(http://www.elasticsearch.org/blog/significant-terms-aggregation/) was 
extremely helpful. Since this feature is in the experimental stage and the 
authors requested feedback, and since I don't know how else to provide 
feedback on specific features, I am resorting to posting on this group.

I had posted on a different thread about accessing the TF-IDF scores for 
terms so that I could investigate ways to enhance my queries. This led me 
to look at the experimental significant terms aggregation. It does what it 
says quite well, and I am glad this functionality exists. However, I would 
like to suggest some possible enhancements:

What I noticed in my aggregation results is a lot of stopwords (a, an, 
the, at, and, etc.) being included as significant terms. Perhaps there could 
be an option to supply stopword lists so that these words are excluded from 
the significant term calculations. (The significance is calculated based on 
how many times a term appears in the query result vs. how many times it 
appears in the whole index.) For common stopwords this calculation is going 
to make them look very significant.
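
For what it's worth, a possible workaround might be to filter the buckets 
with an exclude pattern, assuming significant_terms honors the same 
include/exclude support as the terms aggregation (index and field names 
here are made up):

```json
POST /my_index/_search
{
  "query": { "match": { "text": "bird flu" } },
  "aggregations": {
    "interesting_terms": {
      "significant_terms": {
        "field": "text",
        "exclude": "a|an|and|at|the|of|to|in"
      }
    }
  }
}
```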

Another possible enhancement would be phrase significance: computing the 
significance of a multi-term phrase instead of a single term.

In the blog post, a similar effect is obtained by highlighting the terms 
that are identified as significant. But it would be nice to just look at the 
buckets and determine that.


Cheers and Thanks for all the fish


Ramdev



Re: Lucene Date Range Query in Kibana

2014-04-30 Thread Ramdev Wudali
Lucene, and hence Elasticsearch, and hence Kibana, allows a date range to 
be queried as [NOW-60DAY TO NOW], similar to what you said.
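
For example, in the Kibana query box (or any Lucene query string), 
something like the following should match the last 60 days, assuming 
visit_date is mapped as a date field:

```
visit_date:[NOW-60DAY TO NOW]
```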





On Wednesday, 30 April 2014 10:37:33 UTC-5, Uli Bethke wrote:

 Is there a way in Kibana or Lucene to define a date range query as 
 Today-60 days.

 Something along the logical lines of visit_date: [*-60 TO *] 




Re: Substring match in search term order using Elasticsearch

2014-04-30 Thread Ramdev Wudali
What happens when you query as you indicated?

Did you try a wildcard query? Also, perhaps an analyzer with the shingle 
token filter 
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-shingle-tokenfilter.html#analysis-shingle-tokenfilter) 
will work better for your purposes?
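
A minimal sketch of such an analyzer might look like this (the index, 
analyzer and filter names are made up; adjust the shingle sizes to taste):

```json
PUT /my_shingle_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 3
        }
      },
      "analyzer": {
        "my_shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_shingle_filter"]
        }
      }
    }
  }
}
```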

Ramdev


On Wednesday, 30 April 2014 09:15:35 UTC-5, Kruti Shukla wrote:

 Posted the same question on Stack Overflow 
 (http://stackoverflow.com/questions/23244796/substring-match-in-search-term-order-using-elasticsearch) 
 but still looking for an answer.


 I'm new to elasticsearch

 I want to perform substring/partial word matches using Elasticsearch, and 
 I want the results to be returned in a particular order. To explain my 
 problem I will show you how I create my index and mappings and what 
 records I use.

 *Creating Index and mappings:*

 PUT /my_index1
 {
   "settings": {
     "analysis": {
       "filter": {
         "trigrams_filter": {
           "type": "ngram",
           "min_gram": 3,
           "max_gram": 3
         }
       },
       "analyzer": {
         "trigrams": {
           "type": "custom",
           "tokenizer": "standard",
           "filter": [
             "lowercase",
             "trigrams_filter"
           ]
         }
       }
     }
   },
   "mappings": {
     "my_type1": {
       "properties": {
         "text": {
           "type": "string",
           "analyzer": "trigrams"
         }
       }
     }
   }
 }

 *Bulk record insert:*

 POST /my_index1/my_type1/_bulk
 { "index": { "_id": 1 }}
 { "text": "men's shaver" }
 { "index": { "_id": 2 }}
 { "text": "men's foil shaver" }
 { "index": { "_id": 3 }}
 { "text": "men's foil advanced shaver" }
 { "index": { "_id": 4 }}
 { "text": "norelco men's foil advanced shaver" }
 { "index": { "_id": 5 }}
 { "text": "men's shavers" }
 { "index": { "_id": 6 }}
 { "text": "women's shaver" }
 { "index": { "_id": 7 }}
 { "text": "women's foil shaver" }
 { "index": { "_id": 8 }}
 { "text": "women's foil advanced shaver" }
 { "index": { "_id": 9 }}
 { "text": "norelco women's foil advanced shaver" }
 { "index": { "_id": 10 }}
 { "text": "women's shavers" }

 *Now, I want to perform a search for "en's shaver". I'm searching using 
 the following query:*

 POST /my_index1/my_type1/_search
 {
   "query": {
     "match": {
       "text": {
         "query": "en's shaver",
         "minimum_should_match": "100%"
       }
     }
   }
 }

 I want the results to be in the following sequence:

1. men's shaver -- closest match with following same search keyword 
order en's shaver
2. women's shaver -- closest match with following same search keyword 
order en's shaver
3. men's foil shaver -- increased distance by 1
4. women's foil shaver -- increased distance by 1
5. men's foil advanced shaver -- increased distance by 2
6. women's foil advanced shaver -- increased distance by 2
7. men's shavers -- substring match for shavers
8. women's shavers -- substring match for shavers

 I'm performing the following query. It is not giving me results in the 
 order I want:

 POST /my_index1/my_type1/_search
 {
   "query": {
     "query_string": {
       "default_field": "text",
       "query": "men's shaver",
       "minimum_should_match": "90%"
     }
   }
 }





Cannot asynchronously update replica settings over many tables (ES 0.90.7)

2014-04-30 Thread Michael D. Moffitt
Hi all,

I am trying to grow my replicas from 0 to 2 across about 300 tables. I'm 
doing this by asynchronously issuing an UpdateSettingsRequest (through the 
Java client) for each table.

The first 100 go through fine (responding with a UpdateSettingsResponse), 
but the final ~200 fail with this exception:

 Failure is org.elasticsearch.transport.RemoteTransportException: 
[my-cluster][inet[/w.x.y.z:9300]][indices/settings/update]

We're using ES version 0.90.7.  Any ideas what might be clogging the pipes?
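
For comparison, it might be worth trying the same change through the REST 
API, which accepts a comma-separated list of indices (the index names here 
are placeholders), to see whether the failures are specific to the Java 
client:

```json
PUT /index_001,index_002/_settings
{
  "index": { "number_of_replicas": 2 }
}
```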



help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
I can't seem to figure out how to fully set up my type mappings while 
using the JDBC river and SQL Server.

here's an example.

PUT /_river/mytest_river/_meta
{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
    "user": "myuser",
    "password": "xxx",
    "sql": "select * from dbo.musicalbum (nolock)",
    "strategy": "oneshot",
    "index": "myindex",
    "type": "album",
    "bulk_size": 100,
    "max_retries": 5,
    "max_retries_wait": "30s",
    "max_bulk_requests": 5,
    "bulk_flush_interval": "5s",
    "type_mapping": {
      "album": {
        "properties": {
          "AlbumDescription": {"type": "string"},
          "AlbumID": {"type": "string"},
          "Artist": {"type": "string"},
          "Genre": {"type": "string", "index": "not_analyzed"},
          "Label": {"type": "string"},
          "Title": {"type": "string"},
          "_id": {"path": "AlbumID"}
        }
      }
    }
  }
}

So you can see I've specified both a select statement (which normally would 
dynamically produce the mapping for me) and also a type mapping. In the 
type mapping I've tried to specify that I want the _id to be the same as 
AlbumID, and also that I want the Genre to be not_analyzed. It ends up 
throwing multiple errors, only indexing one document, and not creating my 
full mapping.

Here's what the mapping ends up looking like (skipping some of the columns 
altogether!):

{
  "myindex": {
    "mappings": {
      "album": {
        "properties": {
          "AlbumDescription": {"type": "string"},
          "AlbumID": {"type": "string"},
          "Artist": {"type": "string"},
          "Genre": {"type": "string"},
          "Title": {"type": "string"}
        }
      }
    }
  }
}

Any assistance would be helpful. It's driving me nuts.



Re: Sense on github abandoned?

2014-04-30 Thread @mromagnoli
Agree 100%. Sense must return to Chrome Store! 

On Tuesday, April 29, 2014 at 11:52:49 UTC-3, Joshua Worden wrote:

 Would love to see this return to the chrome store. Was rather surprised to 
 see it gone when getting another developer started working with 
 elasticsearch. Even if it was buggy, it was the best way to get started.




Re: ES and SAN storage

2014-04-30 Thread Patrick Proniewski
On 30 Apr 2014, at 19:34, Mohit Anchlia wrote:

 I'll try and answer as much I know:
  
 ES shouldn't have any issues working with SAN, NFS or EBS. Yes each node need 
 its own unique file path, they don't share files from other nodes.

ok.


  Replicas in this only make sense if you are solving for a VM or a node 
 failure per se. Or it also makes sense if you have SAN storage coming from a 
 different array.


ok.


 I don't follow your last question.


My English is limited, sorry. As far as I understand ES, some shard 
balancing occurs in the background: when some shards are created or 
deleted, others will move from node to node so that the number of shards is 
even across nodes. When storage is isolated per node, moving a shard to 
another node requires the files to go through the source node's CPU/RAM, 
then the network, then the destination node's CPU/RAM, then storage. It 
would be very nice in a shared-storage scenario if the shard were not moved 
through fs-cpu-ram-network-cpu-ram-fs but through a simple rename-and-tell 
action.
Does that make sense?


 
 On Wed, Apr 30, 2014 at 10:04 AM, Patrick Proniewski 
 elasticsea...@patpro.net wrote:
 Well, then maybe my questions were not precise enough.
 My first goal was to make sure ES does work sharing a unique storage for all 
 nodes.
 My second goal was to learn whether each node requires its own dedicated file 
 tree, or whether you can put all files together as if there were only one ES 
 node.
 Does it make sense to have replicas when filesystem IOs are ultimately shared?
 Does moving a shard from one node to another make the data pass through the 
 CPU, or is ES smart enough to just pass a pointer to the file?




Re: help with jdbc rivers and type mapping

2014-04-30 Thread joergpra...@gmail.com
Thanks for the report.

Does it work if you create the index with the custom mapping beforehand,
with a tool like curl?

The JDBC river will then use the existing index.

Jörg



On Wed, Apr 30, 2014 at 9:56 PM, Eric Sims eric.sims.aent@gmail.comwrote:

 i can't seem to understand how to fully set up my type mappings while
 using jdbc rivers and sql server.

 here's an example.

 PUT /_river/mytest_river/_meta
 {
 type: jdbc,
 jdbc: {
   url:jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase,
   user:myuser,
   password:xxx,
   sql:select * from dbo.musicalbum (nolock),
   strategy : oneshot,
   index : myindex,
   type : album,
   bulk_size : 100,
   max_retries: 5,
   max_retries_wait:30s,
   max_bulk_requests : 5,
   bulk_flush_interval : 5s,
   type_mapping: {
   album: {properties: {
AlbumDescription: {type: string},
AlbumID: {type: string},
Artist: {type: string},
Genre: {type: string,index : not_analyzed},
Label: {type: string},
Title: {type: string},
_id : {path : AlbumID}
 }
   }
}
 }
 }

 so you can see i've specified both a select statement (which normally
 would dynamically produce the mapping for me) and also a type mapping. in
 the type mapping i've tried to specify that i want the _id to be the same
 as AlbumID, and also that i want the Genre to be not_analyzed. it ends up
 throwing multiple errors, only indexing one document, and not creating my
 full mapping.

 here's what the mapping ends up looking like: (skipping some of the
 columns altogether!)

 {
myindex: {
   mappings: {
  album: {
 properties: {
AlbumDescription: {
   type: string
},
AlbumID: {
   type: string
},
Artist: {
   type: string
},
Genre: {
   type: string
},
Title: {
   type: string
}
 }
  }
   }
}
 }

 any assistance would be helpful. it's driving me nuts.





Re: The effect of multi-fields and copy_to on storage size

2014-04-30 Thread Jeremy McLain
Ideas anyone?



Re: Sense on github abandoned?

2014-04-30 Thread Ivan Brusic
"Must" is a strong word. I highlighted some alternatives earlier.
On Apr 30, 2014 1:01 PM, @mromagnoli marce.romagn...@gmail.com wrote:

 Agree 100%. Sense must return to Chrome Store!

 On Tuesday, April 29, 2014 at 11:52:49 UTC-3, Joshua Worden wrote:

 Would love to see this return to the chrome store. Was rather surprised
 to see it gone when getting another developer started working with
 elasticsearch. Even if it was buggy, it was the best way to get started.





Re: Sense on github abandoned?

2014-04-30 Thread @mromagnoli
Yeah, maybe you are right. Anyway, I have installed Marvel and made a 
bookmark in Chrome with the URL to Sense.

Perhaps I cried in advance ;P

On Wednesday, April 30, 2014 at 17:19:36 UTC-3, Ivan Brusic wrote:

 Must is a strong word. I highlighted some alternatives earlier.
 On Apr 30, 2014 1:01 PM, @mromagnoli marce.r...@gmail.com wrote:

 Agree 100%. Sense must return to Chrome Store! 

 On Tuesday, April 29, 2014 at 11:52:49 UTC-3, Joshua Worden wrote:

 Would love to see this return to the chrome store. Was rather surprised 
 to see it gone when getting another developer started working with 
 elasticsearch. Even if it was buggy, it was the best way to get started.






Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
No. I just tried deleting all indexes, then I did:

PUT /myindex

then 

PUT /myindex/album/_mapping
{
  "myindex": {
    "mappings": {
      "album": {
        "properties": {
          "AlbumDescription": {"type": "string"},
          "AlbumID": {"type": "string"},
          "Artist": {"type": "string"},
          "Genre": {"type": "string", "index": "not_analyzed"},
          "Label": {"type": "string"},
          "Title": {"type": "string"},
          "_id": {"path": "AlbumID"}
        }
      }
    }
  }
}

Then I ran the PUT statement from my previous post.

It still treats it as dynamic mappings.

On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:

 i can't seem to understand how to fully set up my type mappings while 
 using jdbc rivers and sql server.

 here's an example.

 PUT /_river/mytest_river/_meta
 {
 type: jdbc,
 jdbc: {
   url:jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase,
   user:myuser,
   password:xxx,
   sql:select * from dbo.musicalbum (nolock),
   strategy : oneshot,
   index : myindex,
   type : album,
   bulk_size : 100,
   max_retries: 5,
   max_retries_wait:30s,
   max_bulk_requests : 5,
   bulk_flush_interval : 5s,
   type_mapping: {
   album: {properties: {
AlbumDescription: {type: string},
AlbumID: {type: string},
Artist: {type: string},
Genre: {type: string,index : not_analyzed},
Label: {type: string},
Title: {type: string},
_id : {path : AlbumID}
 }
   }
}
 }
 }

 so you can see i've specified both a select statement (which normally 
 would dynamically produce the mapping for me) and also a type mapping. in 
 the type mapping i've tried to specify that i want the _id to be the same 
 as AlbumID, and also that i want the Genre to be not_analyzed. it ends up 
 throwing multiple errors, only indexing one document, and not creating my 
 full mapping.

 here's what the mapping ends up looking like: (skipping some of the 
 columns altogether!)

 {
myindex: {
   mappings: {
  album: {
 properties: {
AlbumDescription: {
   type: string
},
AlbumID: {
   type: string
},
Artist: {
   type: string
},
Genre: {
   type: string
},
Title: {
   type: string
}
 }
  }
   }
}
 }

 any assistance would be helpful. it's driving me nuts.




Re: ES and SAN storage

2014-04-30 Thread Mohit Anchlia
It would make sense if it were just that simple :) The reason shards need to
move through the higher levels of the stack is that every node maintains its
own indexes (Lucene segments), and they can't just be switched over. I think
that is primarily because of how internal structures are maintained in
Lucene. You might be able to develop a workaround using one or more of these
settings:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-update-settings.html

On Wed, Apr 30, 2014 at 1:05 PM, Patrick Proniewski 
elasticsea...@patpro.net wrote:

 On 30 avr. 2014, at 19:34, Mohit Anchlia wrote:

  I'll try and answer as much I know:
 
  ES shouldn't have any issues working with SAN, NFS or EBS. Yes each node
 need its own unique file path, they don't share files from other nodes.

 ok.


   Replicas in this only make sense if you are solving for a VM or a node
 failure per se. Or it also makes sense if you have SAN storage coming from
 a different array.


 ok.


  I don't follow your last question.


 My english is limited, sorry. As far as I understand ES, some shard
 balancing occurs in the background, when some are created or deleted,
 others will move from node to node so the number of shards is even between
 nodes. When storage is isolated for each node, moving a shard to another
 node requires the file to go through the node CPU/RAM, then network, then
 CPU/RAM of remote node, then storage. It would be very nice in a
 shared-storage scenario that the shard would not be moved through
 fs-cpu-ram-network-cpu-ram-fs but through a simple rename-and-tell action.
 Does it make sense?


 
  On Wed, Apr 30, 2014 at 10:04 AM, Patrick Proniewski 
 elasticsea...@patpro.net wrote:
  Well, then maybe my questions were not precise enough.
  My first goal was to make sure ES does work sharing a unique storage for
 all nodes.
  My second gaol was to learn if each node requires to have its dedicated
 file tree, or if you can put every files together as if there's only one ES
 node.
  Does-it make sense to have replicas when eventually filesystem IOs are
 shared?
  Does moving a shard from a node to another makes data passing through
 the CPU, or is ES smart enough to just pass the pointer to the file?






Re: help with jdbc rivers and type mapping

2014-04-30 Thread joergpra...@gmail.com
The mapping has errors. Something like this might work better:

DELETE /myindex

PUT /myindex

PUT /myindex/album/_mapping
{
  "album": {
    "properties": {
      "AlbumDescription": {"type": "string"},
      "AlbumID": {"type": "string"},
      "Artist": {"type": "string"},
      "Genre": {"type": "string", "index": "not_analyzed"},
      "Label": {"type": "string"},
      "Title": {"type": "string"},
      "_id": {
        "index_name": "album.AlbumID",
        "path": "full",
        "type": "string"
      }
    }
  }
}

GET /myindex/album/_mapping

Jörg



On Wed, Apr 30, 2014 at 10:34 PM, Eric Sims eric.sims.aent@gmail.comwrote:

 no. i just tried deleting all indexes, then i did:

 PUT /myindex

 then

 PUT /myindex/album/_mapping
 {
   myindex: {
 mappings: {
album: {
   properties: {
AlbumDescription: {type: string},
AlbumID: {type: string},
Artist: {type: string},
Genre: {type: string,index : not_analyzed},
Label: {type: string},
Title: {type: string},
_id : {path : AlbumID}
 }
}
 }
   }
 }

 then i ran the PUT statement in my previous post.

 it still treats it as dynamic mappings

 On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:

 i can't seem to understand how to fully set up my type mappings while
 using jdbc rivers and sql server.

 here's an example.

 PUT /_river/mytest_river/_meta
 {
 type: jdbc,
 jdbc: {
   url:jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase,
   user:myuser,
   password:xxx,
   sql:select * from dbo.musicalbum (nolock),
   strategy : oneshot,
   index : myindex,
   type : album,
   bulk_size : 100,
   max_retries: 5,
   max_retries_wait:30s,
   max_bulk_requests : 5,
   bulk_flush_interval : 5s,
   type_mapping: {
   album: {properties: {
AlbumDescription: {type: string},
AlbumID: {type: string},
Artist: {type: string},
Genre: {type: string,index : not_analyzed},
Label: {type: string},
Title: {type: string},
_id : {path : AlbumID}
 }
   }
}
 }
 }

 so you can see i've specified both a select statement (which normally
 would dynamically produce the mapping for me) and also a type mapping. in
 the type mapping i've tried to specify that i want the _id to be the same
 as AlbumID, and also that i want the Genre to be not_analyzed. it ends up
 throwing multiple errors, only indexing one document, and not creating my
 full mapping.

 here's what the mapping ends up looking like: (skipping some of the
 columns altogether!)

 {
myindex: {
   mappings: {
  album: {
 properties: {
AlbumDescription: {
   type: string
},
AlbumID: {
   type: string
},
Artist: {
   type: string
},
Genre: {
   type: string
},
Title: {
   type: string
}
 }
  }
   }
}
 }

 any assistance would be helpful. it's driving me nuts.





Limit the amount of data generated by Marvel with marvel.agent.interval ?

2014-04-30 Thread Logan Hardy
I'm managing a pretty badass 11 node Elasticsearch cluster that is powering 
a customer facing dashboard reporting platform. 20 cores per node, 64GB 
RAM, SSDs, Dual 10 GbE of awesome. I evaluated Marvel while we were still 
in development on the new platform and I found it to be a very valuable 
tool. At first Marvel was indexing to the same cluster we were monitoring 
and this was okay while we were in development as there were plenty of 
extra cycles in the cluster to handle the load but now that we are in 
production, it doesn't make sense to burden the cluster with this. The 
nature of our reporting system requires us to have an index for each 
customer, so we're currently at 328 indexes and over 10,000 shards total. 
The amount of data indexed by Marvel increases dramatically as the number 
of indices increases, so once we got over 300 indices in the system the 
daily Marvel index ended up at around 400 GB replicated and was indexing 
around 2,000 documents a second by itself. 

What I want to do is have Marvel index to a not as awesome 2 node 
Elasticsearch monitoring cluster. 12 cores, 64 GB RAM and spinning disks. 
But in practice these 2 nodes are unable to keep up with the load and get 
completely bogged down. I'm thinking I can sacrifice redundancy and buy 
myself some cycles by not using any replicas on the Marvel index. My other 
idea is to set marvel.agent.interval from the default 10s to something like 
30s, on the assumption that this will cut the amount of data generated to a 
third. Does this sound sane, or does anyone have other ideas on what I can 
try to limit the load?  

marvel.agent.interval

Controls the interval between data samples. Defaults to 10s. Set to -1 to 
temporarily disable exporting.

This setting is update-able via the Cluster Update Settings API.
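
Based on that quoted documentation, the interval could presumably be 
changed on the fly with something like:

```json
PUT /_cluster/settings
{
  "transient": { "marvel.agent.interval": "30s" }
}
```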


Thanks -Logan-



Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
Here's another weird bit: it doesn't seem to show the mappings right after 
I set them:

PUT /myindex/album/_mapping
{
  "myindex": {
    "mappings": {
      "album": {
        "properties": {
          "albumdescription": {"type": "string"},
          "albumid": {"type": "string"},
          "artist": {"type": "string"},
          "genre": {"type": "string", "index": "not_analyzed"},
          "label": {"type": "string", "analyzer": "whitespace"},
          "title": {"type": "string"},
          "time": {"type": "string"},
          "_id": {
            "index_name": "album.AlbumID",
            "path": "full",
            "type": "string"
          }
        }
      }
    }
  }
}


GET /myindex/album/_mapping

returns this:

{
  "myindex": {
    "mappings": {
      "album": {
        "properties": {}
      }
    }
  }
}



Re: Limit the amount of data generated by Marvel with marvel.agent.interval ?

2014-04-30 Thread Mark Walkom
That's pretty sane. I believe the newest version of Marvel increased the
default from 5s to 10s.

But be aware, you are breaking the license for Marvel with that number of
nodes - http://www.elasticsearch.org/overview/marvel/

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 1 May 2014 06:52, Logan Hardy loganbha...@gmail.com wrote:

 I'm managing a pretty badass 11 node Elasticsearch cluster that is
 powering a customer facing dashboard reporting platform. 20 cores per node,
 64GB RAM, SSDs, Dual 10 GbE of awesome. I evaluated Marvel while we were
 still in development on the new platform and I found it to be a very
 valuable tool. At first Marvel was indexing to the same cluster we were
 monitoring and this was okay while we were in development as there were
 plenty of extra cycles in the cluster to handle the load but now that we
 are in production it doesn't make sense to burden the cluster with this.
 The nature of our reporting system requires us to have an index for each
 customer so we're currently at 328 indexes and over 10,000 shards total.
 The amount of data indexed by Marvel increases dramatically as the number
 of indices increases so once we got over 300 indices in the system the
 daily marvel index ended up at around 400 GB replicated and was indexing
 around 2,000 documents a second by itself.

 What I want to do is have Marvel index to a not as awesome 2 node
 Elasticsearch monitoring cluster. 12 cores, 64 GB RAM and spinning disks.
 But in practice these 2 nodes are unable to keep up with the load and get
 completely bogged down. I'm thinking I can sacrifice redundancy and buy
 myself some cycles by not using any replicas on the Marvel index. My other
 idea is to set marvel.agent.interval from the default 10s to something like
 30s on the assumption that this will cut the amount of data generated by a
 third. Does this sound sane, or does anyone have other ideas on what
 I can try to limit the load?

 marvel.agent.interval

 Controls the interval between data samples. Defaults to 10s. Set to -1 to
 temporarily disable exporting.

 This setting is update-able via the Cluster Update Settings API.


 Thanks -Logan-



-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624b_tp-8afb-okJSkWQ76KKbzFf9gaa97RJheLCx8-Zg0Q%40mail.gmail.com.


Re: Performance of Indexed-Shape Queries Vs Geoshape Queries

2014-04-30 Thread Ilya Paripsa
Hi Alex,

Thanks for your response.

Does this mean that the shape I query by does not need to be indexed
by Elasticsearch on the fly? Or does it mean that the indexing of the
shape is so quick that it does not affect the query latency?

Thank you,
Ilya.


On 21 April 2014 22:46, Alexander Reelsen a...@spinscale.de wrote:

 Hey,

 the main difference is basically the network overhead. What happens behind
 the curtains is that a GET request for the shape is being executed if you
 specify it in the request and then this shape is used instead of the
 provided one.

 Makes sense?


 --Alex


 On Tue, Apr 15, 2014 at 6:50 AM, ipari...@thoughtworks.com wrote:

 Hi,

 We ran tests comparing the performance of Indexed-Shape Queries to custom
 Geoshape Queries. We found that Elasticsearch yielded roughly the same results
 in both cases. We expected Indexed-Shape queries to be faster than custom
 Geoshape queries: our understanding is that Elasticsearch has to convert
 the custom geoshapes to a quadtree on the fly, as opposed to having it
 pre-generated. I was wondering if anyone could let us know why there is
 no difference in performance between these two query types.

 *Experiment Design*

 We indexed suburb boundary geometries into one doctype, and geocoded
 points of interest (POIs) into another. We picked the top 20 suburbs whose
 geometries have the most vertices, and ran the two following queries for each
 suburb geometry.

 Geoshape Query

 GET /spike_index/doc_type_pois/_search
 {
   "query": {
     "geo_shape": {
       "field_geocode": {
         "shape": {
           "type": "polygon",
           "coordinates": [ suburb multipolygon ]
         }
       }
     }
   }
 }

 Indexed-Shape Query

 GET /spike_index/doc_type_pois/_search
 {
   "query": {
     "geo_shape": {
       "field_geocode": {
         "indexed_shape": {
           "id": "pre-indexed-geometry-id",
           "type": "doc_type_suburb_quadtree",
           "index": "spike_index",
           "path": "field_geometry"
         }
       }
     }
   }
 }

 The test was carried out using Siege from a box located within the same
 VPC as the Elasticsearch instances. Please find the results below.

 *Indexed-Shape Query Results*

 Transactions:              749559 hits
 Availability:              100.00 %
 Elapsed time:              602.80 secs
 Data transferred:          10342.97 MB
 Response time:             0.01 secs
 Transaction rate:          1243.46 trans/sec
 Throughput:                17.16 MB/sec
 Concurrency:               14.92
 Successful transactions:   749559
 Failed transactions:       0
 Longest transaction:       5.01
 Shortest transaction:      0.00

 *Geoshape Query Results*

 Transactions:              723894 hits
 Availability:              100.00 %
 Elapsed time:              599.16 secs
 Data transferred:          9988.83 MB
 Response time:             0.01 secs
 Transaction rate:          1208.18 trans/sec
 Throughput:                16.67 MB/sec
 Concurrency:               14.92
 Successful transactions:   723894
 Failed transactions:       0
 Longest transaction:       1.02
 Shortest transaction:      0.00

 If anyone could shed some light on why the results of these queries are
 the same that would be very helpful.




  --
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/bfebad47-fd6d-45fe-8bca-97eb14199dad%40googlegroups.com.


  --
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAGCwEM8LaaFdzazyaNrfWV8wRydduNX57kFU2w_6pw5-O2Gabg%40mail.gmail.com.



Re: how to aggregate by metadata (types/field names)?

2014-04-30 Thread 'almineev .' via elasticsearch


 bump

I'm new to Elasticsearch and considering a move from a proprietary system.
I'm blocked on the fact that I can't get a list of the fields hit per document
as part of the search results. Any help, any clue?

 

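One possible angle, sketched on the assumption of a 1.x release (index name and query are placeholders): the highlight section of a search request reports, per hit, which of the requested fields produced matches, which can act as a per-document list of field hits:

```json
GET /myindex/_search
{
  "query": { "query_string": { "query": "happy people" } },
  "highlight": {
    "fields": { "*": {} }
  }
}
```

Each hit then carries a highlight object keyed by the names of the fields that matched.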

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/19e12a97-9db5-4d54-b8dd-91662c82a22a%40googlegroups.com.