date:20140326

I think you can use a dynamic template for that:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_dynamic_templates
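
A minimal sketch of what such a mapping body might look like for the "fruits" document in the question, built as a Python dict so it can be serialized and sent with any HTTP client. The type name "doc", the template names, and the choice of not_analyzed strings are assumptions for illustration, not something the thread specifies; path_match patterns with two wildcards should be checked against your Elasticsearch version.

```python
import json

# Hypothetical type name ("doc") and template names; only "fruits" and the
# attribute names (sweet, color, seed, flesh) come from the question.
mapping = {
    "mappings": {
        "doc": {
            "dynamic_templates": [
                {
                    # Any fruits.<name>.sweet field is mapped as a boolean.
                    "fruit_sweet": {
                        "path_match": "fruits.*.sweet",
                        "mapping": {"type": "boolean"},
                    }
                },
                {
                    # Any other string attribute under fruits.<name> is kept
                    # as a single un-analyzed term (an assumption).
                    "fruit_strings": {
                        "path_match": "fruits.*.*",
                        "match_mapping_type": "string",
                        "mapping": {"type": "string", "index": "not_analyzed"},
                    }
                },
            ]
        }
    }
}

# Serialize the body that would be sent when creating the index.
body = json.dumps(mapping, indent=2)
print(body)
```

With templates like these, a new fruit such as "banana" would not need its own explicit mapping, since its fields match the same path patterns.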


Regards,
Kevin

On Thursday, March 27, 2014 9:03:55 AM UTC+11, Parag Shah wrote:
>
> Hi all,
>
> {
> "fruits" : {
> "apple" : {
> "sweet" : true,
> "color" : "red",
> "seed" : "red",
> "flesh" : "white"
> },
> "orange" : {
> "sweet" : true,
> "color" : "orange",
> "seed" : "white",
> "flesh" : "orange"
> } 
> }
> }
>
> The above is what I want to generate mappings for.
>
> fruits is the container for different kinds of fruits.
>
> I could add any other fruit (unknown) to "fruits" (known) like:
>
> "banana" : {
>  "sweet" : true,
>  "color" : "yellow",
>  "seed" : "black",
>  "flesh" : "white"
> }
>
> banana is a user-generated value and, like any fruit here, it has 
> attributes such as sweet, color, seed and flesh.
>
> I am not sure how I would do the mapping for this kind of a structure. Any 
> help will be appreciated.
>
> Regards
> Parag
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b1250ddd-cb40-440b-b0a0-fff909041486%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mapping question

2014-03-26 Thread Parag Shah

Hi all,

{
"fruits" : {
"apple" : {
"sweet" : true,
"color" : "red",
"seed" : "red",
"flesh" : "white"
},
"orange" : {
"sweet" : true,
"color" : "orange",
"seed" : "white",
"flesh" : "orange"
} 
}
}

The above is what I want to generate mappings for.

fruits is the container for different kinds of fruits.

I could add any other fruit (unknown) to "fruits" (known) like:

"banana" : {
 "sweet" : true,
 "color" : "yellow",
 "seed" : "black",
 "flesh" : "white"
}

banana is a user-generated value and, like any fruit here, it has 
attributes such as sweet, color, seed and flesh.

I am not sure how I would do the mapping for this kind of a structure. Any 
help will be appreciated.

Regards
Parag


Re: fielddata breaker question

2014-03-26 Thread Lee Hinman

On 3/24/14, 7:46 AM, Dunaeth wrote:
> Hi,
> 
> Here's a small sample of data (atm, the march index is near 6M docs for
> 1.37GB size).

I haven't been able to reproduce the issue locally, but Simon noticed
that this may be caused by
https://issues.apache.org/jira/browse/LUCENE-5553 , which is fixed in
the upcoming 4.7.1 release (which will be incorporated into ES shortly
after).

;; Lee


Re: Is it possible to combine filters and nested filters?

Hmmm, not sure. I just tried this and it works for me (ES 1.1):

1) PUT http://localhost:9200/n
{
  "mappings": {
    "doc": {
      "properties": {
        "n1": {
          "type": "nested",
          "properties": {
          }
        }
      }
    }
  }
}

2) POST http://localhost:9200/n/doc
{ "company_id": 123, "n1": { "name": "engineering" } }

3) POST http://localhost:9200/n/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "company_id": 123
              }
            },
            {
              "nested": {
                "filter": {
                  "term": {
                    "n1.name": {
                      "value": "engineering"
                    }
                  }
                },
                "path": "n1"
              }
            }
          ]
        }
      }
    }
  }
}

And it returned the document fine. Maybe you can duplicate my steps and see 
if it works for you.
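
To make the steps above easier to replay with different field names, here is a small sketch that builds the same search body as a Python dict. The helper name and its parameters are hypothetical, and it uses the short form of the term filter ("field": value) rather than the {"value": ...} form shown above.

```python
import json

def term_and_nested_filter(root_field, root_value, path, nested_field, nested_value):
    """Build a filtered query whose bool filter requires both a root-level
    term filter and a term filter on a field inside a nested object."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": [
                            # Filter on a field of the root document.
                            {"term": {root_field: root_value}},
                            # Filter on a field of the nested document.
                            {
                                "nested": {
                                    "path": path,
                                    "filter": {
                                        "term": {path + "." + nested_field: nested_value}
                                    },
                                }
                            },
                        ]
                    }
                },
            }
        }
    }

body = term_and_nested_filter("company_id", 123, "n1", "name", "engineering")
print(json.dumps(body, indent=2))
```

The printed JSON can be posted to the _search endpoint from step 3.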


Re: What is wrong with this query

There is a skeleton example at the very bottom:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_relation_to_literal_custom_boost_literal_literal_custom_score_literal_and_literal_custom_filters_score_literal
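
As a rough illustration of that skeleton, the sketch below rewrites the query quoted further down as a function_score query, built as a Python dict. It assumes the 1.x syntax in which each entry of "functions" carries a "filter" and a "boost_factor"; the helper name is hypothetical, and "score_mode": "first" is an assumption meant to mirror the old custom_filters_score behaviour. Note also that the error message in the original post shows the key as " filters" with a leading space.

```python
import json

# Filters copied from the query quoted in this thread.
sen_filter = {
    "geo_distance": {
        "distance": "20km",
        "senGeoPointLst": {"lat": 31.8655875846, "lon": 65.8461840664},
    }
}
fc_filter = {
    "geo_distance": {
        "distance": "20km",
        "fcGeoPointLst": {"lat": 88.1, "lon": -120.1},
    }
}

# The filtered query that already worked on its own.
inner_query = {
    "filtered": {
        "query": {"match_all": {}},
        "filter": {"or": [sen_filter, fc_filter]},
    }
}

def to_function_score(query, boosted_filters):
    """Wrap `query` in function_score with one function per (filter, boost) pair."""
    functions = [
        {"filter": flt, "boost_factor": boost} for flt, boost in boosted_filters
    ]
    return {
        "query": {
            "function_score": {
                "query": query,
                "functions": functions,
                # Assumption: use the first matching function's score.
                "score_mode": "first",
            }
        }
    }

body = to_function_score(inner_query, [(fc_filter, 3)])
print(json.dumps(body, indent=2))
```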

-- 
Ivan


On Wed, Mar 26, 2014 at 12:43 PM, Kina Shah  wrote:

> Ivan,
>
> Thanks for your reply. Yes, I am using 1.0. I went over the Function Score
> documentation, but it's a bit confusing. Do you have any examples of Function
> Score queries?
>
> Thanks!
>
>
> On Wednesday, March 26, 2014 2:05:09 PM UTC-4, Ivan Brusic wrote:
>
>> Which version of Elasticsearch are you using? The custom filters score
>> query was deprecated in 0.90.4 and removed in 1.0.
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-custom-filters-score-query.html
>>
>> Look into using function score queries.
>>
>> --
>> Ivan
>>
>>
>> On Wed, Mar 26, 2014 at 8:24 AM, Kina Shah  wrote:
>>
>>> Hi,
>>>
>>> I am trying to apply the custom filters score to a filtered query. The
>>> filtered query works fine, but when I wrap it in a custom filters score
>>> with filters, I get an error. Listed below is the query. Can
>>> someone tell me what's wrong with it?
>>>
>>> POST _search
>>> {
>>>   "query": {
>>>     "custom_filters_score": {
>>>       "query": {
>>>         "filtered": {
>>>           "query": {
>>>             "match_all": {}
>>>           },
>>>           "filter": {
>>>             "or": [
>>>               {
>>>                 "geo_distance": {
>>>                   "distance": "20km",
>>>                   "senGeoPointLst": {
>>>                     "lat": 31.8655875846,
>>>                     "lon": 65.8461840664
>>>                   }
>>>                 }
>>>               },
>>>               {
>>>                 "geo_distance": {
>>>                   "distance": "20km",
>>>                   "fcGeoPointLst": {
>>>                     "lat": 88.1,
>>>                     "lon": -120.1
>>>                   }
>>>                 }
>>>               }
>>>             ]
>>>           }
>>>         }
>>>       },
>>>       " filters": [
>>>         {
>>>           "geo_distance": {
>>>             "distance": "20km",
>>>             "fcGeoPointLst": {
>>>               "lat": 88.1,
>>>               "lon": -120.1
>>>             }
>>>           },
>>>           "boost": 3
>>>         }
>>>       ]
>>>     }
>>>   }
>>> }
>>>
>>> The error I get is :QueryParsingException[[hws] [custom_filters_score]
>>> query does not support [ filters]]; }]",
>>>
>>> Thanks!
>>>

Re: how to modify term frequency formula?

I updated my gist to illustrate the SimilarityProvider that goes along with
it. Similarities are easier to add to Elasticsearch than most plugins. You
just need to compile the two files into a jar and then add that jar into
Elasticsearch's classpath ($ES_HOME/lib most likely). The code will scan
for every SimilarityProvider defined and load it.

You then map the similarity to a field:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_configuring_similarity_per_field

Note that you cannot change the similarity of a field dynamically.
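
For reference, the change discussed in this thread (capping the term frequency component) can be sketched numerically. This is only an illustration of the formula tf = min(10, sqrt(n)) mentioned in the thread, written in Python rather than as the Java Similarity from the gist; the function names are hypothetical.

```python
import math

def default_tf(freq):
    """Term-frequency component as described in the thread: sqrt of the raw frequency."""
    return math.sqrt(freq)

def capped_tf(freq, cap=10.0):
    """Same component, but never larger than `cap`."""
    return min(cap, default_tf(freq))

# Compare both variants for a few raw term frequencies.
for freq in (1, 4, 100, 400):
    print(freq, default_tf(freq), capped_tf(freq))
```

A term that occurs 400 times would contribute 20 without the cap and 10 with it, which is the effect the custom tf() override is meant to achieve.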

Ivan




On Wed, Mar 26, 2014 at 12:49 PM, geantbrun  wrote:

> Britta is looping over words that are passed as parameters. It's easy to
> implement her script for a simple query but what about boolean queries? In
> my understanding (but I could be wrong of course), I would have to parse
> the query to call the script with each sub-clause, am I wrong?
>
> I prefer your custom similarity alternative. Again, sorry for the silly
> question (newbie!) but where do you put your java file? Is it the only
> thing that is needed (except for the modification in the mapping)?
> cheers,
> Patrick
>
> Le mercredi 26 mars 2014 11:58:52 UTC-4, Ivan Brusic a écrit :
>>
>> I am still on a version of Elasticsearch that does not have access to the
>> new scoring capabilities, so I cannot test out any scripts. The non
>> normalized term frequency should be the line:
>> tf = _index[field][word].tf()
>>
>> If that is the case, you could substitute that line with something like:
>> tf = Math.min(10, _index[field][word].tf())
>>
>> As I stated before, I am used to using Similarities, so I find the
>> example easier. Here is a custom similarity that I used in Elasticsearch
>> (removes any norms that are indexed):
>> https://gist.github.com/brusic/9786587
>>
>> The second part would be the tf() method you would need to implement
>> instead of decodeNormValue I used.
>>
>> Cheers,
>>
>> Ivan
>>
>


Re: Is it possible to combine filters and nested filters?

2014-03-26 Thread Greg Marzouka

Hey Binh,

The field is indexed the same way I am searching on it, and should be an 
exact match.  The term filter matches when I remove the company_id filter, 
and the company_id filter works when I remove the categories.name nested 
filter.  However, when combined, it does not return results.

On Wednesday, March 26, 2014 4:49:52 PM UTC-4, Binh Ly wrote:
>
> Should work fine. Since you are using a term filter, which is an exact, 
> case-sensitive match, is it possible that your categories.name field is 
> analyzed at index time (for example with the standard analyzer), so that 
> the term filter does not match?
>


[ANN] Websocket transport plugin 1.1.0.0 released

Hi,

I have released Websocket transport plugin 1.1.0.0 to make it
compatible with Elasticsearch 1.1.0.

Besides compatibility adjustments, nothing has changed in functionality.

Hopefully, it can serve as a useful basis for your experiments and
developments with WebSockets.

https://github.com/jprante/elasticsearch-transport-websocket/

Patches, comments, bug reports, improvements are most welcome.

Best,

Jörg


Re: Is it possible to combine filters and nested filters?

Should work fine. Since you are using a term filter, which is an exact, 
case-sensitive match, is it possible that your categories.name field is 
analyzed at index time (for example with the standard analyzer), so that 
the term filter does not match?


Is it possible to combine filters and nested filters?

2014-03-26 Thread Greg Marzouka

I have a document that contains nested documents.  I know it is possible to 
combine multiple filters on fields in the root document, but I haven't 
figured out how to combine those filters with nested filters.

Take this query for example:


{
  "from": 0,
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "company_id": 123
              }
            },
            {
              "nested": {
                "filter": {
                  "term": {
                    "categories.name": {
                      "value": "Engineering"
                    }
                  }
                },
                "path": "categories"
              }
            }
          ]
        }
      }
    }
  }
}


company_id is a field at the root of the document.  categories.name is a 
field at the root of a nested category document.  The query above 
returns 0 results, although there are documents that match these criteria. 
Is it not possible to combine filters like this, or am I just doing it 
wrong?


Re: IndexOutOfBoundsException at IndexShardRoutingTable class

2014-03-26 Thread Shinsuke Sugaya

Thank you for the investigation and filing it.

Regards,
 shinsuke
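
The overflow described in the quoted report below can be reproduced with a small simulation. This is a sketch in Python that emulates Java's 32-bit int arithmetic; the helper names are hypothetical, the list size of 3 is an assumption (the report only says the cluster has 3 nodes), and the guard shown is one possible approach, not necessarily the fix applied upstream.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def java_int(x):
    """Wrap an arbitrary integer into Java's signed 32-bit int range."""
    return (x - INT_MIN) % 2**32 + INT_MIN

def java_mod(a, b):
    """Emulate Java's % operator: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r

def pick_loc(index, i, size):
    """Mirror of `int loc = (index + i) % activeShards.size();`."""
    return java_mod(java_int(index + i), size)

def safe_loc(index, i, size):
    """One possible guard: clear the sign bit before taking the remainder."""
    return (java_int(index + i) & INT_MAX) % size

# Incrementing the counter past Integer.MAX_VALUE wraps it to a negative value.
counter = java_int(INT_MAX + 1)
print(counter)                  # Integer.MIN_VALUE
print(pick_loc(counter, 0, 3))  # negative index, as in the stack trace
print(safe_loc(counter, 0, 3))  # non-negative index
```

With three entries in the list, the wrapped counter yields -2, which matches the "index (-2) must not be negative" message in the stack trace.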

On Thursday, March 27, 2014 12:55:56 AM UTC+9, simonw wrote:
>
> I opened an issue for this: 
> https://github.com/elasticsearch/elasticsearch/issues/5559
>
> On Wednesday, March 26, 2014 5:53:15 AM UTC+1, Shinsuke Sugaya wrote:
>>
>> Hi
>>
>> I encountered the following problem:
>>
>> Caused by: java.lang.IndexOutOfBoundsException: index (-2) must not be 
>> negative
>> at 
>> org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
>> at 
>> org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
>> at 
>> org.elasticsearch.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
>> at 
>> org.elasticsearch.cluster.routing.IndexShardRoutingTable.preferNodeActiveInitializingShardsIt(IndexShardRoutingTable.java:378)
>> at 
>> org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.preferenceActiveShardIterator(PlainOperationRouting.java:210)
>> at 
>> org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.getShards(PlainOperationRouting.java:80)
>> at 
>> org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:80)
>> at 
>> org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:42)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:121)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:97)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:74)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:49)
>> at 
>> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
>> at 
>> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
>> at 
>> org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
>> at 
>> org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:174)
>> ... 9 more
>>
>> My environment is:
>>
>>  - Elasticsearch 0.90.7
>>  - 3 nodes in a cluster
>>  - Send GET request with preference=_local
>>
>> Looking into the IndexShardRoutingTable class, it seems that "loc" takes 
>> an unexpected negative value in the following code. The pickIndex method 
>> returns the value of "counter" (an incrementing value). Once "counter" 
>> passes Integer.MAX_VALUE it wraps to a negative number, so I think "loc" 
>> becomes negative and activeShards.get(loc) throws the exception.
>>
>> int index = pickIndex();
>> for (int i = 0; i < activeShards.size(); i++) {
>> int loc = (index + i) % activeShards.size();
>>
>> If it's a bug, I'll file an issue.
>>
>> Best regards,
>>  shinsuke
>>
>>


Re: Street Address Queries

You probably want to "upgrade" to the match query - "text" queries are 
older and no longer exist in 1.x. But anyway when you query:

"match": { "f": "S Fun St" }

You are effectively doing (roughly):

f=S or f=fun or f=St

You could make it do AND if you want (in which case a match is only found 
if the document/field value contains all terms):

{
"match" : {
"f" : {
"query" : "S Fun St",
"operator" : "and"
}
}
}


You could also do OR with a minimum_should_match parameter to specify how 
many of the individual terms should match the document/field value:

{
"match" : {
"f" : {
"query" : "S Fun St",
"minimum_should_match" : 2
}
}
}
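
The three behaviours above (default OR, "operator": "and", and minimum_should_match) can be sketched with a toy matcher. This is plain Python, not Elasticsearch code; the analyze function is an assumption that only lowercases and splits on whitespace, and the function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()

def matches(field_value, query, operator="or", minimum_should_match=1):
    """Return True if the field value satisfies the match semantics sketched above."""
    doc_terms = set(analyze(field_value))
    query_terms = analyze(query)
    # Count how many query terms occur in the field value.
    hits = sum(1 for term in query_terms if term in doc_terms)
    if operator == "and":
        return hits == len(query_terms)
    return hits >= minimum_should_match

# OR: a single shared term ("st") is enough.
print(matches("200 N Main St", "S Fun St"))
# AND: every query term must be present.
print(matches("200 N Main St", "S Fun St", operator="and"))
# At least two of the three terms must be present.
print(matches("5 S Main St", "S Fun St", minimum_should_match=2))
```

This shows why a plain match on "S Fun St" also returns addresses that only share "St", and how the other two forms narrow the results.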



Re: How many available disk space need for normal system running?

You should have about 300G free if your index is 300G, so there is room to
copy data over while new segments are created.

Jörg


On Wed, Mar 26, 2014 at 12:01 PM, Ivan Ji  wrote:

> Hi Mark,
>
> My index size is about 300GB, the free disk space is about 5G.
>
>
>


Re: Bulk indexing with EC2 cluster?

2014-03-26 Thread IronMan2014

Thanks, I was thinking the same about network latency, so I am going to try 
to run right from the instance itself.
I will check on other issues and update my post.

On Wednesday, March 26, 2014 4:06:33 PM UTC-4, Binh Ly wrote:
>
> 1. There is network latency going up to EC2. :)
>
> 2/3. Not sure, can you show your bulk code? Or did you check the logs on 
> your EC2 instances to see if there were any errors?
>
> On Tuesday, March 25, 2014 1:52:57 PM UTC-4, IronMan2014 wrote:
>>
>> I am having some issues and I would like some feedback:
>>
>> #1 - I run a test with 250 MB worth of documents against my local machine 
>> which is an i7, it takes total of 130 secs to index. I run it against a 
>> cluster of 2 i2x4 large EC2 instances, much more powerful than my local 
>> machine, yet it takes about 200 secs for the same test. 
>>
>> #2, When I index against local machine, it shows 1500 docs indexed total, 
>> however on the 2 instances, I see 1150 docs, why is it different.
>>
>> #3, Aside from the above, a separate test, if I run smaller # of docs say 
>> 500, The bulk gets called but never executes, the bulk and index.close() 
>> exit without the Bulk.execute, I look at the index, it is empty, no docs 
>> were actually indexed, but against my local machine this doesn't happen.
>>
>> Some settings:
>>
>> BulkSize: 1000 docs & 5 MB 
>>
>>
>> Settings settings = ImmutableSettings.settingsBuilder()
>>         .put("client.transport.sniff", true)
>>         .put("refresh_interval", "-1")
>>         .put("number_of_shards", 1)
>>         .put("number_of_replicas", "0")
>>         .put("cluster.name", this.CLUSTER_NAME)
>>         .build();
>>
>>
>>


Re: type filtering

Should be the same. They both do the filter method.


Re: Elasticsearch, port binding, and snapshots to S3

The snapshot does not store any info about your host ip, if that matters. 
Are you asking if you can snapshot from dev and then restore to prod 
through s3? I don't see why not.


Re: Bulk indexing with EC2 cluster?

1. There is network latency going up to EC2. :)

2/3. Not sure, can you show your bulk code? Or did you check the logs on 
your EC2 instances to see if there were any errors?

On Tuesday, March 25, 2014 1:52:57 PM UTC-4, IronMan2014 wrote:
>
> I am having some issues and I would like some feedback:
>
> #1 - I run a test with 250 MB worth of documents against my local machine 
> which is an i7, it takes total of 130 secs to index. I run it against a 
> cluster of 2 i2x4 large EC2 instances, much more powerful than my local 
> machine, yet it takes about 200 secs for the same test. 
>
> #2, When I index against local machine, it shows 1500 docs indexed total, 
> however on the 2 instances, I see 1150 docs, why is it different.
>
> #3, Aside from the above, a separate test, if I run smaller # of docs say 
> 500, The bulk gets called but never executes, the bulk and index.close() 
> exit without the Bulk.execute, I look at the index, it is empty, no docs 
> were actually indexed, but against my local machine this doesn't happen.
>
> Some settings:
>
> BulkSize: 1000 docs & 5 MB 
>
>
> Settings settings = ImmutableSettings.settingsBuilder()
>         .put("client.transport.sniff", true)
>         .put("refresh_interval", "-1")
>         .put("number_of_shards", 1)
>         .put("number_of_replicas", "0")
>         .put("cluster.name", this.CLUSTER_NAME)
>         .build();
>
>
>


discovery.zen.minimum_master_nodes and gateway.recover_after_nodes does not work after upgrading to ES 1.0.1 ?

2014-03-26 Thread David Smith

We have a 16-node cluster on 0.90.5. We built a new cluster for 1.0.1 (yes, 
we will upgrade to 1.1.0 soon), but we are running into a problem that I would 
like help with.

In our 0.90.5 cluster, we had it configured as:

  discovery.zen.minimum_master_nodes: 9
  gateway.expected_nodes: 16
  gateway.recover_after_nodes: 12


But when we built the 1.0.1 cluster with the same settings, all the nodes 
return 503 on startup when we check their status at http://localhost:9200.

We had to turn the settings down as below to get it working and returning 
the usual 200 status:

  discovery.zen.minimum_master_nodes: 1
  gateway.expected_nodes: 16
  gateway.recover_after_nodes: 1

Full elasticsearch yaml configs for both clusters 
are: https://gist.github.com/ppat/9791741

Am I doing anything wrong? Or is this an issue?
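
For context, the value 9 used in the 0.90.5 configuration matches the commonly recommended majority quorum for minimum_master_nodes, (master-eligible nodes / 2) + 1. A minimal sketch of that arithmetic, assuming all 16 nodes are master-eligible (the post does not say so explicitly); the function name is hypothetical:

```python
def quorum(master_eligible_nodes):
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible_nodes // 2 + 1

# Values taken from the 0.90.5 configuration in this post.
expected_nodes = 16
recover_after_nodes = 12

print(quorum(expected_nodes))
# recover_after_nodes only makes sense if it does not exceed expected_nodes.
print(recover_after_nodes <= expected_nodes)
```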


Re: Choking of Elasticsearch.

This metric/warning should have no effect on indexing. The query_total is 
the number of shard-level searches executed so far, and filter cache evictions 
count the number of times something in the filter cache was 
removed/replaced because the cache reached its size limit; this is driven by 
the filters in your queries. Filter cache evictions are a normal part of 
running your queries, although the filter cache size can be increased (or 
decreased) as needed. But before you do that, I would first find out 
why search is slow by analyzing your queries.


Re: how to modify term frequency formula?

2014-03-26 Thread geantbrun

Britta is looping over words that are passed as parameters. It's easy to 
implement her script for a simple query but what about boolean queries? In 
my understanding (but I could be wrong of course), I would have to parse 
the query to call the script with each sub-clause, am I wrong?

I prefer your custom similarity alternative. Again, sorry for the silly 
question (newbie!) but where do you put your java file? Is it the only 
thing that is needed (except for the modification in the mapping)?
cheers,
Patrick

Le mercredi 26 mars 2014 11:58:52 UTC-4, Ivan Brusic a écrit :
>
> I am still on a version of Elasticsearch that does not have access to the 
> new scoring capabilities, so I cannot test out any scripts. The non 
> normalized term frequency should be the line:
> tf = _index[field][word].tf()
>
> If that is the case, you could substitute that line with something like:
> tf = Math.min(10, _index[field][word].tf())
>
> As I stated before, I am used to using Similarities, so I find the example 
> easier. Here is a custom similarity that I used in Elasticsearch (removes 
> any norms that are indexed):
> https://gist.github.com/brusic/9786587
>
> The second part would be the tf() method you would need to implement 
> instead of decodeNormValue I used.
>
> Cheers,
>
> Ivan
>
>
>
>
>
>
> On Tue, Mar 25, 2014 at 1:42 PM, geantbrun wrote:
>
>> Yes I saw Britta's slides but I find it difficult to implement my own 
>> scoring for complex queries (ex: with AND and OR).
>> Do you have a concrete example or a link to share to explain with more 
>> details the override alternative?
>> Thanks again Ivan,
>> Patrick
>>
>> Le mardi 25 mars 2014 12:04:26 UTC-4, Ivan Brusic a écrit :
>>>
>>> Did you see Britta's slides? She has a slide called "Cosine similarity 
>>> as script" which mimics the Lucene scoring as a script. You can replace the 
>>> call to _index[field][word].tf() with your own implementation. You can 
>>> deploy the script as a native Java script (note: not Javascript) for 
>>> performance.
>>>
>>> I find it easier to understand to just change the Similarity. Simply 
>>> extend DefaultSimilarity, override "public float tf(float freq)", and then 
>>> reference this similarity in your field mapping.
>>>
>>> -- 
>>> Ivan
>>>
>>>
>>> On Tue, Mar 25, 2014 at 6:57 AM, geantbrun  wrote:
>>>
 Thanks again for the answer Ivan. Would it be simpler to modify 
 directly in the source code the way tf is calculated? I mean replacing 
 somewhere something like tf = sqrt(n) by tf = min(10,sqrt(n)).
 Cheers,
 Patrick

 Le vendredi 21 mars 2014 18:01:51 UTC-4, Ivan Brusic a écrit :
>
> Term frequencies are stored within Lucene, so there is no calculating 
> of the value, just a lookup in the data structure. You can disable term 
> frequencies and then create your own in the script, but it would be 
> easier to calculate that value at index time so that you can access it 
> within your custom score and not have to iterate through all the terms 
> yourself. Britta has posted on the mailing list in the past, so hopefully 
> she will reply with some more authoritative answers, especially ones 
> regarding performance.
>
> -- 
> Ivan
>
>
> On Fri, Mar 21, 2014 at 11:54 AM, geantbrun wrote:
>
>> Thanks a lot Ivan, great answer. 
>>
>> Suppose I use in my script my own formula for tf (with 
>> _index[field][term].tf()) and set the boost_mode to "replace", does 
>> elasticsearch calculate the tf two times or once only? In other words, 
>> is it computationally efficient to calculate my own tf? Should I turn off 
>> other calculations made by es somewhere else to avoid double calculations?
>>
>> Cheers,
>> Patrick
>>
>> Le jeudi 20 mars 2014 17:44:53 UTC-4, Ivan Brusic a écrit :
>>>
>>> You can provide your own similarity to be used at the field level, 
>>> but recent version of elasticsearch allows you to access the tf-idf 
>>> values 
>>> in order to do custom scoring [1]. Also look at Britta's recent talk on 
>>> the 
>>> subject [2].
>>>
>>> That said, either your custom similarity or custom scoring would 
>>> need access to what exactly are the terms which are repeated many 
>>> times. 
>>> Have you looked into omitting term frequencies? It would completely 
>>> bypass 
>>> using term frequencies, which might be an overkill in your case. Look 
>>> into 
>>> the index options [3].
>>>
>>> Finally, perhaps the common terms query can help [4].
>>>
>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
>>>
>>> [2] https://speakerdeck.com/elasticsearch/scoring-for-human-beings
>>>
>>> [3] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
>

[ANN] Elasticsearch Google Compute Engine cloud plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Google Compute 
Engine cloud plugin, version 2.1.0.

The Google Compute Engine (GCE) Cloud plugin allows you to use the GCE API for 
the unicast discovery mechanism.

https://github.com/elasticsearch/elasticsearch-cloud-gce/

Release Notes - elasticsearch-cloud-gce - Version 2.1.0



Update:
 * [18] - Update to elasticsearch 1.1.0 
(https://github.com/elasticsearch/elasticsearch-cloud-gce/issues/18)




Issues, pull requests and feature requests are warmly welcome on the 
elasticsearch-cloud-gce project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-gce/
For questions or comments about this plugin, feel free to use the Elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/20140326194601.EFD4C9400E6%40smtp1-g21.free.fr.
For more options, visit https://groups.google.com/d/optout.

Re: What is wrong with this query

2014-03-26 Thread Kina Shah

Ivan,

Thanks for your reply. Yes, I am using 1.0. I went over the Function Score 
documentation, but it's a bit confusing. Do you have any examples of Function 
Score queries?

Thanks!

On Wednesday, March 26, 2014 2:05:09 PM UTC-4, Ivan Brusic wrote:
>
> Which version of Elasticsearch are you using? The custom filters score 
> query was deprecated in 0.90.4 and removed in 1.0.
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-custom-filters-score-query.html
>
> Look into using function score queries.
>
> -- 
> Ivan
>
>
> On Wed, Mar 26, 2014 at 8:24 AM, Kina Shah 
> > wrote:
>
>> Hi,
>>
>> I am trying to apply the custom filters score to a filtered query. The 
>> filtered query works fine, but when I wrap it in the custom filters score 
>> with filters, it gives me an error. Listed below is the query. Can 
>> someone tell me what's wrong with it?
>>
>> POST _search
>> {
>>   "query": {
>>     "custom_filters_score": {
>>       "query": {
>>         "filtered": {
>>           "query": {
>>             "match_all": {}
>>           },
>>           "filter": {
>>             "or": [
>>               {
>>                 "geo_distance": {
>>                   "distance": "20km",
>>                   "senGeoPointLst": {
>>                     "lat": 31.8655875846,
>>                     "lon": 65.8461840664
>>                   }
>>                 }
>>               },
>>               {
>>                 "geo_distance": {
>>                   "distance": "20km",
>>                   "fcGeoPointLst": {
>>                     "lat": 88.1,
>>                     "lon": -120.1
>>                   }
>>                 }
>>               }
>>             ]
>>           }
>>         }
>>       },
>>       " filters": [
>>         {
>>           "geo_distance": {
>>             "distance": "20km",
>>             "fcGeoPointLst": {
>>               "lat": 88.1,
>>               "lon": -120.1
>>             }
>>           },
>>           "boost": 3
>>         }
>>       ]
>>     }
>>   }
>> }
>>
>> The error I get is :QueryParsingException[[hws] [custom_filters_score] 
>> query does not support [ filters]]; }]",
>>
>> Thanks!
>>
>
>


[ANN] Elasticsearch Azure cloud plugin 2.2.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Azure cloud plugin, 
version 2.2.0.

The Azure Cloud plugin allows you to use the Azure API for the unicast 
discovery mechanism and to add Azure storage repositories.

https://github.com/elasticsearch/elasticsearch-cloud-azure/

Release Notes - elasticsearch-cloud-azure - Version 2.2.0



Update:
 * [11] - Update to elasticsearch 1.1.0 
(https://github.com/elasticsearch/elasticsearch-cloud-azure/issues/11)




Issues, pull requests and feature requests are warmly welcome on the 
elasticsearch-cloud-azure project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-azure/
For questions or comments about this plugin, feel free to use the Elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: Java API or REST API for client development ?

The rule of using the same version on all nodes of a cluster still holds.
Otherwise you will get unpredictable behavior, especially when using new
features.

Jörg


On Wed, Mar 26, 2014 at 1:12 PM, Graham Tackley  wrote:

> Not true anymore: the java client has been compatible between minor
> versions since 0.90 as far as I remember. 1.0.0 client is currently working
> just fine against my 1.0.1 cluster, and my experimentation today shows that
> it also works fine against 1.1.0.  So this used to be a nightmare requiring
> synchronised upgrades, but hasn't been for a while.
>
> FWIW we use the java client (in transport client mode) extensively from
> our scala apps, and it works brilliantly. I'd definitely recommend.
>
>
> On Wednesday, 26 March 2014 11:45:19 UTC, Martin Forssen wrote:
>>
>> The Java API is said to have better performance (and I believe that). The
>> drawbacks are that you must use the exact same version of the java API
>> library on the client as the server runs, as well as the same version of
>> Java. So upgrades suck.
>>


Re: elasticsearch 1.1.0 compatibility? full cluster restart?

Do not mix 1.0 and 1.1.

Always use a 1.1 client for a 1.1 cluster.

Jörg



On Wed, Mar 26, 2014 at 11:50 AM, Graham Tackley  wrote:

> The release notes for elasticsearch 1.1.0 don't say anything about
> compatibility with 1.0 (or at least I didn't see it).
>
> - can I mix 1.0.1 and 1.1.0 in the same cluster, i.e. do a rolling
> upgrade?
> - does the java 1.0.1 client library talk ok to a 1.1.0 cluster?
>
> I'm really excited about some of the new stuff in 1.1.0...
>
> g
>


[ANN] Elasticsearch AWS cloud plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch AWS cloud plugin, 
version 2.1.0.

The Amazon Web Services (AWS) Cloud plugin allows you to use the AWS API for the 
unicast discovery mechanism and to add S3 repositories.

https://github.com/elasticsearch/elasticsearch-cloud-aws/

Release Notes - elasticsearch-cloud-aws - Version 2.1.0



Update:
 * [60] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/60)




Issues, pull requests and feature requests are warmly welcome on the 
elasticsearch-cloud-aws project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-aws/
For questions or comments about this plugin, feel free to use the Elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: What is wrong with this query

Which version of Elasticsearch are you using? The custom filters score
query was deprecated in 0.90.4 and removed in 1.0.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-custom-filters-score-query.html

Look into using function score queries.
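
For anyone landing on this thread later, here is a rough sketch of how the query quoted below might be rewritten as a function_score query. It is written as a Python dict so it can be dumped to JSON. The helper names (geo_distance_filter, build_function_score_query) are made up for this example, and boost_factor is the function name I recall from the 1.x function_score documentation, so please check it against your version. Also note that the error text quotes the key as [ filters], with a leading space, which matches the " filters" key in the quoted query.

```python
import json


def geo_distance_filter(field, lat, lon, distance="20km"):
    """Build a geo_distance filter for one geo_point field."""
    return {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}


def build_function_score_query():
    """Assumed function_score equivalent of the custom_filters_score query."""
    sen_filter = geo_distance_filter("senGeoPointLst", 31.8655875846, 65.8461840664)
    fc_filter = geo_distance_filter("fcGeoPointLst", 88.1, -120.1)
    return {
        "query": {
            "function_score": {
                # Same filtered query as in the original post.
                "query": {
                    "filtered": {
                        "query": {"match_all": {}},
                        "filter": {"or": [sen_filter, fc_filter]},
                    }
                },
                # Each function applies only to documents matching its filter.
                # "boost_factor" is an assumption taken from the 1.x docs.
                "functions": [
                    {"filter": fc_filter, "boost_factor": 3}
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the request body to send with POST _search.
    print(json.dumps(build_function_score_query(), indent=2))
```

The "functions" list takes the place of the old "filters" list, with one entry per filter and its boost.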

-- 
Ivan


On Wed, Mar 26, 2014 at 8:24 AM, Kina Shah  wrote:

> Hi,
>
> I am trying to apply the custom filters score to a filtered query. The
> filtered query works fine, but when I wrap it in the custom filters score
> with filters, it gives me an error. Listed below is the query. Can
> someone tell me what's wrong with it?
>
> POST _search
> {
>   "query": {
>     "custom_filters_score": {
>       "query": {
>         "filtered": {
>           "query": {
>             "match_all": {}
>           },
>           "filter": {
>             "or": [
>               {
>                 "geo_distance": {
>                   "distance": "20km",
>                   "senGeoPointLst": {
>                     "lat": 31.8655875846,
>                     "lon": 65.8461840664
>                   }
>                 }
>               },
>               {
>                 "geo_distance": {
>                   "distance": "20km",
>                   "fcGeoPointLst": {
>                     "lat": 88.1,
>                     "lon": -120.1
>                   }
>                 }
>               }
>             ]
>           }
>         }
>       },
>       " filters": [
>         {
>           "geo_distance": {
>             "distance": "20km",
>             "fcGeoPointLst": {
>               "lat": 88.1,
>               "lon": -120.1
>             }
>           },
>           "boost": 3
>         }
>       ]
>     }
>   }
> }
>
> The error I get is :QueryParsingException[[hws] [custom_filters_score]
> query does not support [ filters]]; }]",
>
> Thanks!
>


[ANN] Elasticsearch AWS cloud plugin 2.0.0 released


Heya,


We are pleased to announce the release of the Elasticsearch AWS cloud plugin, 
version 2.0.0.

The Amazon Web Services (AWS) Cloud plugin allows you to use the AWS API for the 
unicast discovery mechanism and to add S3 repositories.

https://github.com/elasticsearch/elasticsearch-cloud-aws/

Release Notes - elasticsearch-cloud-aws - Version 2.0.0



Update:
 * [62] - Update to AWS SDK 1.7.3 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/62)
 * [59] - Add plugin version in es-plugin.properties 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/59)
 * [57] - Update to elasticsearch 1.0.0 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/57)
 * [56] - Custom credentials per repository 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/pull/56)
 * [48] - The dependency on Elasticsearch should be in provided and not compile 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/48)

New:
 * [58] - Add plugin release semi-automatic script 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/58)
 * [52] - Support ap-northeast region for EC2 and S3 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/pull/52)
 * [49] - Support group id in addition to group names 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/49)



Issues, pull requests and feature requests are warmly welcome on the 
elasticsearch-cloud-aws project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-aws/
For questions or comments about this plugin, feel free to use the Elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


how to correctly store the parent/child structure in the ES

2014-03-26 Thread Павел Поляков

Hi,

I have the following structure in my MySQL database.

What I want to do is copy the content of the database to the
Elasticsearch server. The main goal is to search through the transactions
and use facets.

Something like this (currently implemented using MySQL):

The issue is how I should store the documents in Elasticsearch so that
they can be reindexed quickly.
In the future, I will need to update the ES index as soon as transaction
or bank_account data is updated.

I've checked all the available options described at
http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/
and decided to use the parent/child one.
I've created one index and two types, transaction and bank_account, where
transaction is the child of bank_account.

*But there are open questions:*

*1.* How can I query ES using the "has_parent" option so that it returns
not only the children but also the information about the parent?
In the results I need to receive objects where the fields country,
currency and name are available.
So far I've only managed to receive the _source fields from the
transaction and the _parent field, which is the id of the bank account.

*2.* How should I query ES to get facets on the fields country,
currency and name?

*3.* Which other structure could I use so that reindexing happens
quickly? The case is that I have 16 transactions and 20 banks. If I
stored the information about the bank directly in the transaction
object, then changing the country of a bank would mean reindexing
nearly 4 documents (for example), which is not acceptable.

I'm also not able to use the "nested object" concept (storing the
transactions inside the bank account), because that way I would not be able
to add transactions to a bank dynamically: each new insert would
also cause the whole document to be reindexed.

Any thoughts?

Regards,


Re: create panel to display latency time using two datetime fields

On this page:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-date-histogram-facet.html#search-facets-date-histogram-facet

If you scroll down to the bottom, there is an example for value_script. In 
your case, the value_script would be something like (the result is in 
milliseconds, btw, so just convert accordingly if needed):

doc['outDate'].value - doc['inDate'].value

Again this is not supported by the Histogram Panel out of the box, but you 
can implement it yourself using a new panel, or by hacking the Histogram 
Panel.
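
To make the value_script idea a bit more concrete, here is a small sketch of the facet request body, built as a Python dict. The field names inDate and outDate come from the script above; the facet name "latency", the choice of inDate as key_field, the "hour" interval and the helper names are only assumptions for illustration.

```python
import json


def latency_facet_request(key_field="inDate", interval="hour"):
    """Build a date_histogram facet whose value is outDate - inDate (in ms)."""
    return {
        "query": {"match_all": {}},
        "facets": {
            "latency": {  # hypothetical facet name
                "date_histogram": {
                    "key_field": key_field,
                    "value_script": "doc['outDate'].value - doc['inDate'].value",
                    "interval": interval,
                }
            }
        },
    }


def millis_to_seconds(millis):
    """Convert the script result from milliseconds to seconds."""
    return millis / 1000.0


if __name__ == "__main__":
    print(json.dumps(latency_facet_request(), indent=2))
```

Each histogram bucket should then carry statistics over the latency values that fall into that interval, which a custom panel could plot.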


Re: ES query returning unusual results

If indexing was live/active, it was probably just your replicas not having 
caught up with the primaries for a split second.


Re: _parent missing during mapping - error inserting objects with parent

Assuming you are talking about trying to index a child document into 
elasticsearch_tests_utilityobjects_zeepextag? If so, you just need to 
specify the parent id in the url when you index:

curl -XPOST 'localhost:9200/index/elasticsearch_tests_utilityobjects_zeepextag?parent=parent_id'

You might find this useful:

http://www.elasticsearch.org/blog/managing-relations-inside-elasticsearch/
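
If you are building the request in code rather than with curl, the same URL can be put together like this. This is only a sketch: the helper name child_index_url is made up, and "index" and "parent_id" are the same placeholders as in the curl line above.

```python
from urllib.parse import urlencode


def child_index_url(host, index, doc_type, parent_id):
    """Return the URL for indexing a child document under the given parent id."""
    # urlencode escapes characters in the parent id that are unsafe in a URL.
    query = urlencode({"parent": parent_id})
    return "http://%s/%s/%s?%s" % (host, index, doc_type, query)


if __name__ == "__main__":
    print(child_index_url(
        "localhost:9200",
        "index",
        "elasticsearch_tests_utilityobjects_zeepextag",
        "parent_id",
    ))
```

The child document itself goes in the POST body as usual; only the parent query parameter is new.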


Re: Update Existing Mapping

Unfortunately you can't update index_options once the field has been mapped. 
You'll need to delete your index and redefine the mapping.


Re: Query response time does NOT improve by adding additional nodes

2014-03-26 Thread Herbert Bodner

Thanks, Jörg, for your response.
I guess the bottleneck is the I/O (as you suggested).
So it does not matter if I add additional memory or CPU power to the 
Elasticsearch cluster (by adding additional virtual machines), because 
all nodes run on the same physical server with limited I/O capacity. If one 
node already uses the whole I/O capacity, then adding additional nodes (on 
the same physical server) does not help.

cheers


_parent missing during mapping - error inserting objects with parent

2014-03-26 Thread Shawn Ritchie

Hi,

I'm trying to add a _parent with the following mappings, but for some 
reason the _parent field gets ignored, so when I go to insert a new object 
into Elasticsearch I get an exception that no parent field was specified. 
Does anyone know why this is happening?

Regards
Shawn

"elasticsearch_tests_utilityobjects_searchableentity": {
"properties": {

...

},
  "_all": {
  "enabled": true
  }
  },
  "elasticsearch_tests_utilityobjects_zeepextag": {
  "properties": {
  "UID": {
  "type": "long"
  },
  "Type": {
  "type": "integer"
  },
  "Tag": {
  "fields": {
  "Tag": {
  "index_options": "offsets",
  "type": "string",
  "store": "yes"
  }
  },
  "type": "multi_field"
  }
  },
  "_parent": {
  "type": "elasticsearch_tests_utilityobjects_searchableentity"
  },
  "_all": {
  "enabled": true
  }
  }


_parent missing during mapping

2014-03-26 Thread Shawn Ritchie

Hi,

I'm trying to add a _parent with the following mappings, but for some 
reason the _parent field gets ignored, so when I go to insert a new object 
into Elasticsearch I get an exception that no parent field was specified. 
Does anyone know why this is happening?

Regards
Shawn

"elasticsearch_tests_utilityobjects_searchableentity": {
"properties": {

...

},
  "_all": {
  "enabled": true
  }
  },
  "elasticsearch_tests_utilityobjects_zeepextag": {
  "properties": {
  "UID": {
  "type": "long"
  },
  "Type": {
  "type": "integer"
  },
  "Tag": {
  "fields": {
  "Tag": {
  "index_options": "offsets",
  "type": "string",
  "store": "yes"
  }
  },
  "type": "multi_field"
  }
  },
  "_parent": {
  "type": "search_tests_utilityobjects_searchableentity"
  },
  "_all": {
  "enabled": true
  }
  }





Re: How many available disk space need for normal system running?

Lucene segments are immutable, so while segments are merged, the originals
remain in place. You can increase the number of segments you have so that
less merging needs to occur. Mike McCandless has lots of good tips about
merges:

http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
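
As a back-of-the-envelope check, something like the following could be used to see whether a node has enough headroom before a large merge. It is only a sketch under a pessimistic assumption taken from this thread (a merge may need as much free space as the data being merged, because the original segments stay in place until the merge finishes); the function names and the safety_factor parameter are made up for the example.

```python
import shutil

GB = 1024 ** 3


def free_bytes(path):
    """Free space, in bytes, on the filesystem that holds the given path."""
    return shutil.disk_usage(path).free


def merge_headroom_ok(free, merge_bytes, safety_factor=1.0):
    """True if free space covers the data to be merged times a safety factor."""
    return free >= merge_bytes * safety_factor


if __name__ == "__main__":
    # Figures reported earlier in this thread: about 300GB of index, 5GB free.
    print(merge_headroom_ok(5 * GB, 300 * GB))
    # Check the current working directory's filesystem against a 10GB merge.
    print(merge_headroom_ok(free_bytes("."), 10 * GB))
```

With the numbers from this thread the first check comes out false, which fits the "No space left on device" failure in the log.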

-- 
Ivan


On Wed, Mar 26, 2014 at 4:11 AM, Mark Walkom wrote:

> There's not much you can do, you either need to delete some data or
> increase your disk space.
>
> Maybe someone can clarify how much space is needed for a merge, but I
> imagine it'd be twice the size of a shard.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 26 March 2014 22:01, Ivan Ji  wrote:
>
>> Hi Mark,
>>
>> My index size is about 300GB, the free disk space is about 5G.
>>
>>
>>
>> Mark Walkom於 2014年3月26日星期三UTC+8下午6時18分12秒寫道：
>>>
>>> Depends on how much data you have.
>>>
>>> How much disk space is on the machine? How much data is in ES?
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 26 March 2014 21:06, Ivan Ji  wrote:
>>>
 I just looked at Lucene's documentation. During a merge, at least
 double the index size is needed.
 But does ES tune anything here, or does it just follow Lucene's rule,
 that is, 2x the index size?

 Ivan Ji於 2014年3月26日星期三UTC+8下午5時53分33秒寫道：

> Hi all,
>
> I am using ES 1.0.1. I am wondering how much unused disk space is needed
> for ES to run normally?
>
> Because I ran into the error:
>
>> [2014-03-26 03:30:52,713][WARN ][index.merge.scheduler] [Rick Jones] [qusion][1] failed to merge
>> java.io.IOException: No space left on device
>> at java.io.RandomAccessFile.writeBytes0(Native Method)
>> at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java:520)
>> at java.io.RandomAccessFile.write(RandomAccessFile.java:550)
>> at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:458)
>> at org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput.flushBuffer(RateLimitedFSDirectory.java:102)
>> at org.apache.lucene.store.BufferedChecksumIndexOutput.flushBuffer(BufferedChecksumIndexOutput.java:71)
>> at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:113)
>> at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:102)
>> at org.apache.lucene.store.BufferedChecksumIndexOutput.flush(BufferedChecksumIndexOutput.java:86)
>> at org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:126)
>> at org.apache.lucene.store.BufferedChecksumIndexOutput.close(BufferedChecksumIndexOutput.java:61)
>> at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:602)
>> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsIndexWriter.close(CompressingStoredFieldsIndexWriter.java:205)
>> at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
>> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.close(CompressingStoredFieldsWriter.java:138)
>> at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:318)
>> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:94)
>> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
>> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
>> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>> at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
>> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>> [2014-03-26 03:30:53,382][WARN ][index.engine.internal] [Rick Jones] [qusion][1] failed engine
>
>
>
> Obviously, some amount of disk space is needed during the merge.
> And I think the larger the index, the more disk space is needed for the
> merge operation.
>
>  Does anyone have an idea how much that can be?
>
> Ivan
>

[ANN] Elasticsearch ICU Analysis plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch ICU Analysis 
plugin, version 2.1.0.

The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, 
adding ICU-related analysis components.

https://github.com/elasticsearch/elasticsearch-analysis-icu/

Release Notes - elasticsearch-analysis-icu - Version 2.1.0



Update:
 * [23] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-analysis-icu/issues/23)
 * [22] - Add plugin version in es-plugin.properties 
(https://github.com/elasticsearch/elasticsearch-analysis-icu/issues/22)




Issues, pull requests and feature requests are warmly welcome on the 
elasticsearch-analysis-icu project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-icu/
For questions or comments about this plugin, feel free to use the Elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: how to modify term frequency formula?

I am still on a version of Elasticsearch that does not have access to the
new scoring capabilities, so I cannot test out any scripts. The
non-normalized term frequency should be the line:
tf = _index[field][word].tf()

If that is the case, you could replace that line with something like:
tf = Math.min(10, _index[field][word].tf())

As I stated before, I am used to using Similarities, so I find that approach
easier. Here is a custom similarity that I used in Elasticsearch (it removes
any norms that are indexed):
https://gist.github.com/brusic/9786587

The second part would be the tf() method, which you would need to implement
instead of the decodeNormValue I used.
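
To show what the cap does numerically, here is a tiny sketch in plain Python, outside of Elasticsearch. It assumes the default Lucene tf of sqrt(freq) that was mentioned earlier in the thread; the function names are made up for the example, and the cap of 10 is just the value discussed here.

```python
import math


def default_tf(freq):
    """Default tf as discussed in this thread: square root of the raw count."""
    return math.sqrt(freq)


def capped_sqrt_tf(freq, cap=10.0):
    """The variant asked about: tf = min(10, sqrt(n))."""
    return min(cap, math.sqrt(freq))


def capped_raw_tf(freq, cap=10):
    """The script variant above: cap the raw count before any other math."""
    return min(cap, freq)


if __name__ == "__main__":
    for n in (1, 4, 100, 400):
        print(n, default_tf(n), capped_sqrt_tf(n), capped_raw_tf(n))
```

Note that the two capped variants are not equivalent: capping sqrt(n) at 10 only kicks in above 100 occurrences, while capping the raw count kicks in above 10.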

Cheers,

Ivan






On Tue, Mar 25, 2014 at 1:42 PM, geantbrun  wrote:

> Yes I saw Britta's slides but I find it difficult to implement my own
> scoring for complex queries (ex: with AND and OR).
> Do you have a concrete example or a link to share to explain with more
> details the override alternative?
> Thanks again Ivan,
> Patrick
>
> Le mardi 25 mars 2014 12:04:26 UTC-4, Ivan Brusic a écrit :
>>
>> Did you see Britta's slides? She has a slide called "Cosine similarity as
>> script" which mimics the Lucene scoring as a script. You can replace the
>> call to _index[field][word].tf() with your own implementation. You can
>> deploy the script as a native Java script (note: not Javascript) for
>> performance.
>>
>> I find it easier to just change the Similarity. Simply extend
>> DefaultSimilarity, override "public float tf(float freq)", and then
>> reference this similarity in your field mapping.
>>
>> --
>> Ivan
>>
>>
>> On Tue, Mar 25, 2014 at 6:57 AM, geantbrun  wrote:
>>
>>> Thanks again for the answer Ivan. Would it be simpler to modify directly
>>> in the source code the way tf is calculated? I mean replacing somewhere
>>> something like tf = sqrt(n) by tf = min(10,sqrt(n)).
>>> Cheers,
>>> Patrick
>>>
>>> On Friday, March 21, 2014 6:01:51 PM UTC-4, Ivan Brusic wrote:

 Term frequencies are stored within Lucene, so there is no calculating
 of the value, just a lookup in the data structure. You can disable term
 frequencies and then create your own in the script, but it would be easier
 to calculate that value at index time so that you can access it within your
 custom score and not have to iterate through all the terms yourself. Britta
 has posted on the mailing list in the past, so hopefully she will reply
 with some more authoritative answers, especially ones regarding performance.

 --
 Ivan


 On Fri, Mar 21, 2014 at 11:54 AM, geantbrun wrote:

> Thanks a lot Ivan, great answer.
>
> Suppose I use in my script my own formula for tf (with
> _index[field][term].tf()) and set the boost_mode to "replace", does
> elasticsearch calculate the tf two times or once only? In other words, is
> it computationally efficient to calculate my own tf? Should I turn off other
> calculations made by es somewhere else to avoid double calculations?
>
> Cheers,
> Patrick
>
> On Thursday, March 20, 2014 5:44:53 PM UTC-4, Ivan Brusic wrote:
>>
>> You can provide your own similarity to be used at the field level,
>> but recent versions of elasticsearch allow you to access the tf-idf values
>> in order to do custom scoring [1]. Also look at Britta's recent talk on the
>> subject [2].
>>
>> That said, either your custom similarity or your custom scoring would need
>> to know exactly which terms are repeated many times. Have you looked into
>> omitting term frequencies? It would completely bypass the use of term
>> frequencies, which might be overkill in your case. Look into the
>> index options [3].
>>
>> Finally, perhaps the common terms query can help [4].
>>
>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
>>
>> [2] https://speakerdeck.com/elasticsearch/scoring-for-human-beings
>>
>> [3] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
>>
>> [4] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-common-terms-query.html
>>
>> Cheers,
>>
>> Ivan
>>
>>
>> On Thu, Mar 20, 2014 at 8:08 AM, geantbrun wrote:
>>
>>> Hi,
>>> If I understand correctly, the formula used for the term frequency part
>>> in the default similarity module is the square root of the actual
>>> frequency. Is it possible to modify that formula to include something like
>>> a min(my_max_value,sqrt(frequency))? I would like to avoid huge
>>> tf's for documents that have the same term repeated many times. It seems
>>> that BM25 similarity has a parameter to control saturation but I would
>>> prefer to stick with the simple tf/idf similarity mod

Re: IndexOutOfBoundsException at IndexShardRoutingTable class

2014-03-26 Thread simonw

I opened an issue for this:
https://github.com/elasticsearch/elasticsearch/issues/5559

On Wednesday, March 26, 2014 5:53:15 AM UTC+1, Shinsuke Sugaya wrote:
>
> Hi
>
> I encountered the following problem:
>
> Caused by: java.lang.IndexOutOfBoundsException: index (-2) must not be negative
>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
>     at org.elasticsearch.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
>     at org.elasticsearch.cluster.routing.IndexShardRoutingTable.preferNodeActiveInitializingShardsIt(IndexShardRoutingTable.java:378)
>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.preferenceActiveShardIterator(PlainOperationRouting.java:210)
>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.getShards(PlainOperationRouting.java:80)
>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:80)
>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:42)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:121)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:97)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:74)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:49)
>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
>     at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
>     at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:174)
>     ... 9 more
>
> My environment is:
>
>  - Elasticsearch 0.90.7
>  - 3 nodes in a cluster
>  - Send GET request with preference=_local
>
> Looking into the IndexShardRoutingTable class, it seems that "loc" takes
> an unexpected negative value in the following code. The pickIndex method
> returns the value of "counter" (an incrementing value). If "counter" reaches
> Integer.MAX_VALUE, I think that "loc" becomes negative and then
> activeShards.get(loc) throws the exception.
>
> int index = pickIndex();
> for (int i = 0; i < activeShards.size(); i++) {
> int loc = (index + i) % activeShards.size();
>
> If it's a bug, I'll file an issue.
>
> Best regards,
>  shinsuke
>
>


Update Existing Mapping

2014-03-26 Thread Praveenkumar Arepalli

How do I update an existing mapping in an index?

"firstName" : {
"type" : "multi_field",
"fields" : {
  "firstName" : {
"type" : "string"
  }
"index_options" : "docs",
"include_in_all" : false
  }
}

Here I want to update "index_options" : "docs" to "index_options" :
"positions".


Re: ES operations and memory usage

- Fielddata (sorting, facets, some scripting), filter cache (filters), and 
id_cache (parent-child) are probably the ones that would affect memory 
usage significantly.
- Fielddata can be loaded in different ways depending on data type and 
other config (details here: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html#_fielddata_loading)
- OS memory would generally be file system caches for the underlying Lucene 
indexes

About your other question, the default query execution is called
query_then_fetch
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#query-then-fetch).

The basic idea is that the search (query) is scattered in parallel to all shards
(a single replica set) of the index being searched. Then a small set of
results is gathered back, reduced, and sorted. Finally, for whatever is in the
final reduced list, ES goes back to the relevant shards and pulls out (fetches)
the additional data required to return the final hits to the caller.
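
To make the two phases concrete, here is a toy in-memory sketch in Python. It is not how Elasticsearch is implemented: the shard layout, the precomputed "score" field, and all function names are assumptions made purely for illustration.

```python
import heapq


def query_phase(shard_docs, matches, size):
    """Per shard: return only lightweight (score, doc_id) pairs for the top hits."""
    hits = [(doc["score"], doc_id)
            for doc_id, doc in shard_docs.items() if matches(doc)]
    return heapq.nlargest(size, hits)


def query_then_fetch(shards, matches, size):
    # Query phase: scatter to every shard and gather the small per-shard lists.
    per_shard = {name: query_phase(docs, matches, size)
                 for name, docs in shards.items()}
    # Reduce: merge everything and keep the global top `size` entries.
    merged = [(score, doc_id, name)
              for name, hits in per_shard.items()
              for score, doc_id in hits]
    top = heapq.nlargest(size, merged)
    # Fetch phase: go back only to the shards that own the final hits.
    return [shards[name][doc_id]["body"] for _score, doc_id, name in top]


if __name__ == "__main__":
    shards = {
        "shard0": {"a": {"score": 1.0, "body": "doc A"},
                   "b": {"score": 3.0, "body": "doc B"}},
        "shard1": {"c": {"score": 2.0, "body": "doc C"}},
    }
    print(query_then_fetch(shards, lambda doc: True, size=2))
```

The point of the split is that full document bodies are only read for the hits that survive the reduce step, not for every candidate on every shard.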

On Wednesday, March 26, 2014 10:41:44 AM UTC-4, Uli Bethke wrote:
>
> I am struggling a bit to understand which ES operations use up which part 
> of memory
>
> - From reading the documentation I know that faceting and sorting take up 
> heap memory and also that some filters are cached in JVM heap. However, 
> what other operations take up JVM heap memory?
> - When faceting on a particular field are all of the values of that field 
> loaded into JVM heap or only the distinct values of the field. I would 
> believe it is the latter as fields of high cardinality require more JVM 
> heap.
> - What operations use OS memory. I believe documents retrieved and ES 
> index are kept in OS memory. Can anyone confirm this? What other operations 
> go to OS memory?
>
>
> Finally, I would also be interested in any posts or tutorials
> that describe the complete lifecycle of a query, from submission of the query to
> return of the result set.
>
> Thanks
> Uli
>


Re: IndexOutOfBoundsException at IndexShardRoutingTable class

2014-03-26 Thread simonw

That is actually a bug IMO - Math.abs() can return a negative value if it hits
Integer.MIN_VALUE.

This code is just broken - can you open an issue?
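
The failure mode can be reproduced outside Elasticsearch. Below is a minimal Python sketch that emulates Java's 32-bit int arithmetic. The helper names are hypothetical, and the assumption that the counter has wrapped around to Integer.MIN_VALUE comes from the discussion in this thread, not from the Elasticsearch source. The last function shows one possible way to keep the index non-negative; it is not necessarily the fix that was applied.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def to_int32(x):
    """Wrap a Python int into Java's 32-bit two's-complement range."""
    return (x + 2**31) % 2**32 - 2**31


def java_abs(x):
    """Emulate Math.abs(int): negating INT_MIN overflows back to INT_MIN."""
    return to_int32(-x) if x < 0 else x


def java_rem(a, b):
    """Emulate Java's % on ints: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def loc_buggy(counter, i, size):
    """Mirror of: index = abs(counter); loc = (index + i) % size."""
    index = java_abs(counter)
    return java_rem(to_int32(index + i), size)


def loc_floor_mod(counter, i, size):
    """Possible alternative: a floor modulo is never negative for size > 0."""
    return to_int32(counter + i) % size


if __name__ == "__main__":
    print(java_abs(INT_MIN))             # still negative
    print(loc_buggy(INT_MIN, 0, 3))      # negative list index
    print(loc_floor_mod(INT_MIN, 0, 3))  # stays within 0..size-1
```

With 3 entries in the list, the buggy variant yields -2 for the first iteration, which matches the index in the exception message quoted below.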

On Wednesday, March 26, 2014 11:43:20 AM UTC+1, Kevin Wang wrote:
>
> pickIndex() will return the absolute value of the count, so it won't 
> return a negative value. Can you provide more details?
>
>
> Kevin
>
>
> On Wednesday, March 26, 2014 3:53:15 PM UTC+11, Shinsuke Sugaya wrote:
>>
>> Hi
>>
>> I encountered the following problem:
>>
>> Caused by: java.lang.IndexOutOfBoundsException: index (-2) must not be negative
>>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
>>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
>>     at org.elasticsearch.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
>>     at org.elasticsearch.cluster.routing.IndexShardRoutingTable.preferNodeActiveInitializingShardsIt(IndexShardRoutingTable.java:378)
>>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.preferenceActiveShardIterator(PlainOperationRouting.java:210)
>>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.getShards(PlainOperationRouting.java:80)
>>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:80)
>>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:42)
>>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:121)
>>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:97)
>>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:74)
>>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:49)
>>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
>>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
>>     at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
>>     at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:174)
>>     ... 9 more
>>
>> My environment is:
>>
>>  - Elasticsearch 0.90.7
>>  - 3 nodes in a cluster
>>  - Send GET request with preference=_local
>>
>> Looking into the IndexShardRoutingTable class, it seems that "loc" takes
>> an unexpected negative value in the following code. The pickIndex method
>> returns the value of "counter" (an incrementing value). If "counter" reaches
>> Integer.MAX_VALUE, I think that "loc" becomes negative and then
>> activeShards.get(loc) throws the exception.
>>
>> int index = pickIndex();
>> for (int i = 0; i < activeShards.size(); i++) {
>> int loc = (index + i) % activeShards.size();
>>
>> If it's a bug, I'll file an issue.
>>
>> Best regards,
>>  shinsuke
>>
>>


Re: Searhch Request

2014-03-26 Thread Praveenkumar Arepalli

Thank you, Michael.

On Wednesday, 26 March 2014 02:28:05 UTC+5:30, mkleen wrote:
>
> Don't query the "untouched" field with a matchPhrasePrefixQuery, because
> no positions are indexed for it. Also make sure that positions are switched
> on for all the fields you query.
>
> "companyName" : {
>   "type" : "string",
>   "term_vector" : "with_positions_offsets_payloads",
>   "analyzer" : "test_whitespace_wdf_lcf_analyzer"
> },
>
>
> On 25 March 2014 10:06, Praveenkumar Arepalli wrote:
>
>> {
>>   "index.analysis.analyzer.test_whitespace_wdf_lcf_analyzer.filter.1" : "test_lowercase_filter",
>>   "index.analysis.analyzer.test_whitespace_wdf_lcf_analyzer.filter.0" : "test_word_delimiter_filter",
>>   "index.analysis.analyzer.test_whitespace_wdf_lcf_analyzer.tokenizer" : "test_whitespace_tokenizer",
>>   "index.analysis.filter.test_lowercase_filter.type" : "lowercase",
>>   "index.analysis.filter.test_word_delimiter_filter.catenate_words" : "true",
>>   "index.analysis.filter.test_word_delimiter_filter.type" : "word_delimiter",
>>   "index.analysis.filter.test_word_delimiter_filter.preserve_original" : "true",
>>   "index.analysis.filter.test_word_delimiter_filter.catenate_numbers" : "true",
>>   "index.analysis.filter.test_word_delimiter_filter.catenate_all" : "true",
>>   "index.analysis.analyzer.test_sort_analyzer.filter" : "lowercase",
>>   "index.analysis.analyzer.test_sort_analyzer.tokenizer" : "keyword"
>> }
>>
>>
>> "companyName" : {
>>   "type" : "multi_field",
>>   "fields" : {
>>     "companyName" : {
>>       "type" : "string",
>>       "analyzer" : "test_whitespace_wdf_lcf_analyzer"
>>     },
>>     "sortable" : {
>>       "type" : "string",
>>       "analyzer" : "test_sort_analyzer",
>>       "include_in_all" : false
>>     },
>>     "untouched" : {
>>       "type" : "string",
>>       "index" : "not_analyzed",
>>       "norms" : {
>>         "enabled" : false
>>       },
>>       "index_options" : "docs",
>>       "include_in_all" : false
>>     }
>>   }
>> }
>>
>> My input data: group & Company
>>
>> It was stored successfully.
>>
>> Now I am searching for "group & Company" using matchPhrasePrefixQuery.
>>
>>
>> On Tue, Mar 25, 2014 at 1:33 PM, Michael Kleen wrote:
>>
>>>  Hi Praveenkumar,
>>>
>>> What is your index layout, what is your input data, and what is your query
>>> request? Can you post your setup as a working example using curl commands,
>>> similar to https://gist.github.com/mkleen/4739479 ? That way it's easy to
>>> help you here.
>>>
>>> Regards,
>>>
>>> Michael 
>>>
>>>
>>> On 25 March 2014 08:29, Praveenkumar Arepalli wrote:
>>>
 "companyName" : {
 "type" : "multi_field",
 "fields" : {
   "companyName" : {
 "type" : "string",
 "analyzer" : "apptivo_whitespace_wdf_lcf_analyzer"
   },
   "sortable" : {
 "type" : "string",
 "analyzer" : "apptivo_sort_analyzer",
 "include_in_all" : false
   },
   "untouched" : {
 "type" : "string",
 "index" : "not_analyzed",
 "norms" : {
   "enabled" : false
 },
 "index_options" : "docs",
 "include_in_all" : false
   }
 }
   }

 When I search the companyName field, I get:
 nested: IllegalStateException[field "companyName" was indexed without
 position data; cannot run PhraseQuery (term=spectrum)]; }

 How do I resolve this?



 On Tue, Mar 25, 2014 at 12:54 PM, Praveenkumar Arepalli <arepalli.p...@gmail.com> wrote:

> Your help is appreciated. 
>
>
> On Tue, Mar 25, 2014 at 12:52 PM, David Pilato wrote:
>
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html#search-request-explain
>>
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>>
>> On 25 March 2014 at 08:18, Praveenkumar Arepalli <arepalli.p...@gmail.com> wrote:
>>
>> How do I use explain, David?
>>
>>
>> On Tue, Mar 25, 2014 at 12:44 PM, David Pilato wrote:
>>
>>> I understand that you as a developer want to know it.
>>> My question is what are you going to do with that information?
>>>
>>> If it's for debugging purpose then explain is fine.
>>>
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>>
>>> On 25 March 2014 at 08:07, Praveenkumar Arepalli <arepalli.p...@gmail.com> wrote:
>>>
>>> Hi Dav

What is wrong with this query

2014-03-26 Thread Kina Shah

Hi,

I am trying to apply a custom filters score to a filtered query. The
filtered query works fine on its own, but when I wrap it in
custom_filters_score with filters, I get an error. The query is listed
below. Can someone tell me what's wrong with it?

POST _search
{
  "query": {
    "custom_filters_score": {
      "query": {
        "filtered": {
          "query": {
            "match_all": {}
          },
          "filter": {
            "or": [
              {
                "geo_distance" : {
                  "distance" : "20km",
                  "senGeoPointLst" : {
                    "lat" : 31.8655875846,
                    "lon" : 65.8461840664
                  }
                }
              },
              {
                "geo_distance" : {
                  "distance" : "20km",
                  "fcGeoPointLst" : {
                    "lat" : 88.1,
                    "lon" : -120.1
                  }
                }
              }
            ]
          }
        }
      },
      " filters": [
        {
          "geo_distance" : {
            "distance" : "20km",
            "fcGeoPointLst" : {
              "lat" : 88.1,
              "lon" : -120.1
            }
          },
          "boost": 3
        }
      ]
    }
  }
}

The error I get is :QueryParsingException[[hws] [custom_filters_score] 
query does not support [ filters]]; }]",

Thanks!


Re: Is it possible to use filtered query with custom filters score?

2014-03-26 Thread Kina Shah

Can you please show me the final query that works? I am trying to do 
something similar using a filtered query and then applying custom filter 
score to the filtered query.

Thanks!


On Thursday, September 29, 2011 10:41:29 AM UTC-4, BlueZero wrote:
>
> I need to search documents that are matching some filters. But i want 
> to sort them by custom score script. 
> FI: 
> filter by gender = 1 
>
> and calculate score: 
> if ( age > 10 ) 
> _score = age * 2 
> else 
> _score = age * 1 
>
> Thats just an example do not try to understand the score function :D 
>
> Is it possible? 
>
> Thanx


ES operations and memory usage

2014-03-26 Thread Uli Bethke

I am struggling a bit to understand which ES operations use up which part 
of memory

- From reading the documentation I know that faceting and sorting take up 
heap memory and also that some filters are cached in JVM heap. However, 
what other operations take up JVM heap memory?
- When faceting on a particular field are all of the values of that field 
loaded into JVM heap or only the distinct values of the field. I would 
believe it is the latter as fields of high cardinality require more JVM 
heap.
- What operations use OS memory. I believe documents retrieved and ES index 
are kept in OS memory. Can anyone confirm this? What other operations go to 
OS memory?


Finally, I would also be interested in any posts or tutorials
that describe the complete lifecycle of a query, from submission of the query to
return of the result set.

Thanks
Uli


[ANN] Elasticsearch Smart Chinese Analysis plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Smart Chinese 
Analysis plugin, version 2.1.0.

The Smart Chinese Analysis plugin integrates the Lucene Smart Chinese analysis
module into Elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-smartcn/

Release Notes - elasticsearch-analysis-smartcn - Version 2.1.0



Update:
 * [17] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-analysis-smartcn/issues/17)
 * [16] - Add plugin version in es-plugin.properties 
(https://github.com/elasticsearch/elasticsearch-analysis-smartcn/issues/16)
 * [12] - Register smartcn analyzer, tokenizer and tokenfilter 
(https://github.com/elasticsearch/elasticsearch-analysis-smartcn/pull/12)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-smartcn project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-smartcn/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: [BUG?] match_phrase_prefix failing on v1.1.0, working on 1.0.2

2014-03-26 Thread simonw

It's actually unrelated to the issue mentioned above. It got broken by
https://github.com/elasticsearch/elasticsearch/pull/5005
There is an issue open here:
https://github.com/elasticsearch/elasticsearch/issues/5551
and a fix is here:
https://github.com/elasticsearch/elasticsearch/issues/5553

simon
On Wednesday, March 26, 2014 3:00:28 PM UTC+1, Michal Barla wrote:
>
> OK, thanks. I had found #5437 (it is linked from the release notes), but
> since it discusses slop I did not consider it relevant to my case.
>
> On Wednesday, March 26, 2014 2:49:35 PM UTC+1, Binh Ly wrote:
>>
>> Yes, it's a bug and will be fixed shortly. Related to this:
>>
>> https://github.com/elasticsearch/elasticsearch/issues/5437
>>
>


using logstash/elastic search for firewall syslog indexing: syslogs are lost on the way

2014-03-26 Thread Antoine Brun

Hello,

I'm really new to elasticsearch and I'm running some evaluation 
(ElasticSearch vs SOLR) in order to decide which one I should use for 
handling high log volume indexing.
I have a very basic setup:
*syslogInjector -> logstash listening on port 514 -> elasticsearch.*

input {
  tcp {
    port => 514
    type => syslog
  }
  udp {
    port => 514
    type => syslog
  }
}
filter {
  grok {
    match => { "message" => "%{GREEDYDATA:syslog_message}" }
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{host}" ]
  }
}
output {
  elasticsearch { host => localhost }
}

My test is:
*inject 1 1 ko syslog, at the rate 2000/sec.*

I did a tcpdump on the server that receives the logs and all 1 logs are 
received.

BUT, when I look at the number of documents indexed in Lucene, I almost
never get 1 document; sometimes I do, but most of the time I lose up to
40% of the logs.

Is there anything I can do to make sure that logstash doesn't lose data? (I
assume it is logstash, but I am not sure.)
Is logstash doing some buffering?

Thank you for your response

Antoine Brun


Re: [BUG?] match_phrase_prefix failing on v1.1.0, working on 1.0.2

I will open another issue for this shortly. :)


Re: Tutorial, how to, guide, standard procedure to upgrade ES from 0.9 to 1.0 ?

You'll need to do a full cluster restart to go from 0.90.x to 1.x (i.e. you
need to bring down all nodes). There is some information here to help reduce
recovery/startup churn:

http://www.elasticsearch.org/webinars/elasticsearch-pre-flight-checklist/

But in general:

1) Stop all queries/indexing jobs

2) Flush your indexes: curl -XPOST "localhost:9200/_flush"

3) Shutdown your cluster: curl -XPOST "localhost:9200/_shutdown"

4) Back up all your data/ and config/ folders (from all nodes if you have
multiple data folders/nodes) - This is important in case the upgrade fails!

5) Per node, uninstall your current ES. Be careful not to delete the data
folders and your config/ files

6) Per node, install new ES and pull in your config/* files from previous 
install

7) Verify all config files in all nodes and all data folders

8) Start all nodes one by one


Re: Real Time Python Update on a River

2014-03-26 Thread Honza Král

Hi,

I am afraid the easiest solution here is to not use rivers but instead
do the loading yourself - using a dedicated python process to consume
the twitter stream, enrich the data and load it into
elasticsearch using the streaming_bulk helper in the official python
client (0).

0 - 
http://elasticsearch-py.readthedocs.org/en/latest/helpers.html#elasticsearch.helpers.streaming_bulk
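
A minimal sketch of what such a process could look like. The index name, type name, tweet fields, and the enrich() logic are all placeholders, and the action format with "_type" reflects the client of that era as far as I know; check the helper documentation linked above for the version you run.

```python
def enrich(tweet):
    """Hypothetical analysis step: replace with the real script's logic."""
    text = tweet.get("text", "")
    enriched = dict(tweet)
    enriched["word_count"] = len(text.split())
    enriched["has_link"] = "http" in text
    return enriched


def to_actions(tweets, index="tweets", doc_type="tweet"):
    """Lazily turn raw tweets into bulk actions with the extra fields added."""
    for tweet in tweets:
        yield {
            "_index": index,
            "_type": doc_type,
            "_id": tweet["id"],
            "_source": enrich(tweet),
        }


def load(client, tweets):
    """Feed the actions to streaming_bulk and count items reported as not ok."""
    # Imported here so the rest of the sketch runs without the client installed.
    from elasticsearch.helpers import streaming_bulk

    failed = 0
    for ok, _item in streaming_bulk(client, to_actions(tweets)):
        if not ok:
            failed += 1
    return failed


if __name__ == "__main__":
    sample = [{"id": 1, "text": "hello http://example.org world"}]
    for action in to_actions(sample):
        print(action)
```

Because to_actions() is a generator, tweets can be consumed from the stream, enriched, and indexed as they arrive, without a second pass to update documents afterwards.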

Hope this helps,
Honza

On Tue, Mar 25, 2014 at 5:38 PM, Thibaut Lapierre
 wrote:
> Hi,
> I use the twitter river, which uses bulk indexing.
>
> I have a Python script that analyses tweets and returns some data.
> So I want to analyse each tweet and add two fields to the river with the
> returned data.
>
> Maybe I can build a second scheme with the id and treatment status in order to
> run the script separately and update each doc, but I'm pretty sure there is
> a cleaner/easier solution.
>
> Thanks for helping
>
> Thibaut
>
>

Re: [BUG?] match_phrase_prefix failing on v1.1.0, working on 1.0.2

2014-03-26 Thread Michal Barla

OK, thanks. I had found #5437 (it is linked from the release notes), but
since it discusses slop I did not consider it relevant to my case.

On Wednesday, March 26, 2014 2:49:35 PM UTC+1, Binh Ly wrote:
>
> Yes, it's a bug and will be fixed shortly. Related to this:
>
> https://github.com/elasticsearch/elasticsearch/issues/5437
>


Re: How to apply score/boost factor in ElasticSearch1.0.0 on filters

The function_score query will allow you to define multiple filters, to each of
which you can apply a boost_factor or any custom score you want:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_using_function_score
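
As a hedged sketch of the request body shape, built here as a Python dict: the field names "category" and "featured" and the helper name are made up for illustration, and the "boost_factor" key follows the 1.0-era documentation linked above, so newer releases may name it differently.

```python
import json


def boosted_filters_query(base_query, boosts):
    """Build a function_score body with one function per (filter, factor) pair."""
    functions = [{"filter": flt, "boost_factor": factor}
                 for flt, factor in boosts]
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": functions,
            }
        }
    }


if __name__ == "__main__":
    body = boosted_filters_query(
        {"match_all": {}},
        [
            ({"term": {"category": "news"}}, 3),   # hypothetical field
            ({"term": {"featured": True}}, 2),     # hypothetical field
        ],
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request; documents matching a filter have their score adjusted by the corresponding factor.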


Re: [BUG?] match_phrase_prefix failing on v1.1.0, working on 1.0.2

Yes, it's a bug and will be fixed shortly. Related to this:

https://github.com/elasticsearch/elasticsearch/issues/5437


Re: Problem connecting to elastic search V-0.90.5 with "elasticsearch-0.90.12" jar in Java.

2014-03-26 Thread David Pilato

No you can't. 

But your error is something else.
Do you send direct JSON with the transport client?

If so, don't use post_filter which does not exist before 1.0.
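
To make the point about post_filter concrete, here is a minimal sketch that picks the top-level filter element from the server version before building a raw JSON search body. It assumes, as stated above, that servers before 1.0 do not parse post_filter and expect the older top-level filter element instead. The class and method names are hypothetical, and the regexp filter is the one from the failing request quoted below.

```java
// Hypothetical helper: choose the top-level filter element for a raw search body.
final class TopLevelFilterKey {

    // Assumption taken from this thread: servers before 1.0 only know "filter".
    static String keyFor(String serverVersion) {
        int major = Integer.parseInt(serverVersion.split("\\.")[0]);
        return major >= 1 ? "post_filter" : "filter";
    }

    // Builds a search body whose filter element matches the server version.
    static String searchBody(String serverVersion, int from, int size, String filterJson) {
        return "{\"from\":" + from + ",\"size\":" + size + ","
             + "\"" + keyFor(serverVersion) + "\":" + filterJson + "}";
    }

    public static void main(String[] args) {
        String filter = "{\"regexp\":{\"FirstName.firstname\":\"michael\"}}";
        System.out.println(searchBody("0.90.5", 0, 100, filter));
        System.out.println(searchBody("1.0.0", 0, 100, filter));
    }
}
```

This only matters when you write the JSON yourself. If the client library generates the body, keeping the client jar at the same version as the cluster avoids the mismatch.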

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 26 March 2014 at 14:39:29, Raj (rajagopal@gmail.com) wrote:

Hi,
    I am working with Elasticsearch version 0.90.5. I was able to query 
Elasticsearch with client version 0.90.5, but after upgrading the client 
jar to 0.90.12 queries fail with the following error:

Exception in thread "main" 
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to 
execute phase [query], all shards failed; shardFailures 
{[U-TIXYeuTRCkwaODFI5hpg][guest][1]: SearchParseException[[guest][1]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 nested: SearchParseException[[guest][1]: from[0],size[100]: Parse Failure [No 
parser for element [post_filter]]]; }{[U-TIXYeuTRCkwaODFI5hpg][guest][0]: 
SearchParseException[[guest][0]: from[0],size[100]: Parse Failure [Failed to 
parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 nested: SearchParseException[[guest][0]: from[0],size[100]: Parse Failure [No 
parser for element [post_filter]]]; }{[U-TIXYeuTRCkwaODFI5hpg][guest][3]: 
SearchParseException[[guest][3]: from[0],size[100]: Parse Failure [Failed to 
parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 nested: SearchParseException[[guest][3]: from[0],size[100]: Parse Failure [No 
parser for element [post_filter]]]; }{[U-TIXYeuTRCkwaODFI5hpg][guest][2]: 
SearchParseException[[guest][2]: from[0],size[100]: Parse Failure [Failed to 
parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 nested: SearchParseException[[guest][2]: from[0],size[100]: Parse Failure [No 
parser for element [post_filter]]]; }{[U-TIXYeuTRCkwaODFI5hpg][guest][4]: 
SearchParseException[[guest][4]: from[0],size[100]: Parse Failure [Failed to 
parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 nested: SearchParseException[[guest][4]: from[0],size[100]: Parse Failure [No 
parser for element [post_filter]]]; }
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:272)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:224)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:205)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

My question is simple:
Can I use elasticsearch-0.90.12.jar to query an Elasticsearch v0.90.5 cluster?

Problem connecting to elastic search V-0.90.5 with "elasticsearch-0.90.12" jar in Java.

2014-03-26 Thread Raj

Hi,
I am working with Elasticsearch version 0.90.5. I was able to query 
Elasticsearch with client version 0.90.5, but after upgrading the 
client jar to 0.90.12 queries fail with the following error:

Exception in thread "main" 
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to 
execute phase [query], all shards failed; shardFailures 
{[U-TIXYeuTRCkwaODFI5hpg][guest][1]: SearchParseException[[guest][1]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 
nested: SearchParseException[[guest][1]: from[0],size[100]: Parse Failure 
[No parser for element [post_filter]]]; 
}{[U-TIXYeuTRCkwaODFI5hpg][guest][0]: SearchParseException[[guest][0]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 
nested: SearchParseException[[guest][0]: from[0],size[100]: Parse Failure 
[No parser for element [post_filter]]]; 
}{[U-TIXYeuTRCkwaODFI5hpg][guest][3]: SearchParseException[[guest][3]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 
nested: SearchParseException[[guest][3]: from[0],size[100]: Parse Failure 
[No parser for element [post_filter]]]; 
}{[U-TIXYeuTRCkwaODFI5hpg][guest][2]: SearchParseException[[guest][2]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 
nested: SearchParseException[[guest][2]: from[0],size[100]: Parse Failure 
[No parser for element [post_filter]]]; 
}{[U-TIXYeuTRCkwaODFI5hpg][guest][4]: SearchParseException[[guest][4]: 
from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"post_filter":{"or":{"filters":[{"regexp":{"FirstName.firstname":"michael"}}]}}}]]];
 
nested: SearchParseException[[guest][4]: from[0],size[100]: Parse Failure 
[No parser for element [post_filter]]]; }
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:272)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:224)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:205)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

My question is simple:
Can I use elasticsearch-0.90.12.jar to query an Elasticsearch v0.90.5 cluster?


[ANN] Elasticsearch Stempel (Polish) Analysis plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Stempel (Polish) 
Analysis plugin, version 2.1.0.

The Stempel (Polish) Analysis plugin integrates the Lucene Stempel (Polish) 
analysis module into Elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-stempel/

Release Notes - elasticsearch-analysis-stempel - Version 2.1.0



Update:
 * [22] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-analysis-stempel/issues/22)




Issues, pull requests, and feature requests are warmly welcome on the 
elasticsearch-analysis-stempel project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-stempel/
For questions or comments about this plugin, feel free to use the elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: IndexOutOfBoundsException at IndexShardRoutingTable class

2014-03-26 Thread Shinsuke Sugaya

> int loc = (index + i) % activeShards.size();

index is NOT negative, but (index + i) can overflow and become negative.
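
A small standalone sketch of the arithmetic, not the Elasticsearch code itself: naiveLoc repeats the expression quoted above, and safeLoc shows one possible guard, which is my assumption and not necessarily how the project would fix it. The class name and the choice of three shard copies are made up for the demonstration.

```java
// Demonstrates how (index + i) can go negative even when index is not.
final class RoutingIndexOverflow {

    // Same arithmetic as the quoted line: the int addition may wrap around.
    static int naiveLoc(int index, int i, int size) {
        return (index + i) % size;
    }

    // One possible guard (an assumption, not the project's fix): reduce index
    // first so the addition cannot overflow for index >= 0 and 0 <= i < size.
    static int safeLoc(int index, int i, int size) {
        return (index % size + i) % size;
    }

    public static void main(String[] args) {
        int index = Integer.MAX_VALUE; // non-negative, as pickIndex() promises
        int size = 3;                  // assumed number of active shard copies
        for (int i = 0; i < size; i++) {
            System.out.println("i=" + i
                    + " naive=" + naiveLoc(index, i, size)
                    + " safe=" + safeLoc(index, i, size));
        }
    }
}
```

With index = Integer.MAX_VALUE, i = 1 and three copies, naiveLoc returns -2, which is the same value that appears in the exception message quoted below, while safeLoc stays within the range 0..2.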

Regards,
 shinsuke

On Wednesday, March 26, 2014 at 19:43:20 UTC+9, Kevin Wang wrote:
>
> pickIndex() will return the absolute value of the count, so it won't 
> return a negative value. Can you provide more details?
>
>
> Kevin
>
>
> On Wednesday, March 26, 2014 3:53:15 PM UTC+11, Shinsuke Sugaya wrote:
>>
>> Hi
>>
>> I encountered the following problem:
>>
>> Caused by: java.lang.IndexOutOfBoundsException: index (-2) must not be 
>> negative
>> at 
>> org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
>> at 
>> org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
>> at 
>> org.elasticsearch.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
>> at 
>> org.elasticsearch.cluster.routing.IndexShardRoutingTable.preferNodeActiveInitializingShardsIt(IndexShardRoutingTable.java:378)
>> at 
>> org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.preferenceActiveShardIterator(PlainOperationRouting.java:210)
>> at 
>> org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.getShards(PlainOperationRouting.java:80)
>> at 
>> org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:80)
>> at 
>> org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:42)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:121)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:97)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:74)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:49)
>> at 
>> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
>> at 
>> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
>> at 
>> org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
>> at 
>> org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:174)
>> ... 9 more
>>
>> My environment is:
>>
>>  - Elasticsearch 0.90.7
>>  - 3 nodes in a cluster
>>  - Send GET request with preference=_local
>>
>> Looking into the IndexShardRoutingTable class, it seems that "loc" becomes 
>> an unexpected negative value in the following code. The pickIndex method 
>> returns the value of "counter" (an incrementing value). If "counter" reaches 
>> Integer.MAX_VALUE, I think "loc" becomes negative and 
>> activeShards.get(loc) then throws the exception.
>>
>> int index = pickIndex();
>> for (int i = 0; i < activeShards.size(); i++) {
>> int loc = (index + i) % activeShards.size();
>>
>> If it's a bug, I'll file an issue.
>>
>> Best regards,
>>  shinsuke
>>
>>


[BUG?] match_phrase_prefix failing on v1.1.0, working on 1.0.2

2014-03-26 Thread Michal Barla

Hello everyone,

I am using the match_phrase_prefix query and was surprised to find that my 
tests broke after updating to 1.1.0.

Here is a small test case which works on 1.0.2 (returns a hit) and fails (0 
hits) on 1.1.0.

https://gist.github.com/crutch/04cd5acbd84eee04d683

I did not find any mention of breaking changes concerning 
match_phrase_prefix on the website...

Is there something I have missed or is it a bug?

-- 
Michal Barla


Re: Java API or REST API for client development ?

2014-03-26 Thread Graham Tackley

Not true anymore: the Java client has been compatible between minor 
versions since 0.90, as far as I remember. A 1.0.0 client is currently working 
just fine against my 1.0.1 cluster, and my experimentation today shows that 
it also works fine against 1.1.0. So this used to be a nightmare requiring 
synchronised upgrades, but hasn't been for a while.

FWIW we use the Java client (in transport client mode) extensively from our 
Scala apps, and it works brilliantly. I'd definitely recommend it.

On Wednesday, 26 March 2014 11:45:19 UTC, Martin Forssen wrote:
>
> The Java API is said to have better performance (and I believe that). The 
> drawbacks are that you must use the exact same version of the java API 
> library on the client as the server runs, as well as the same version of 
> Java. So upgrades suck.
>


[ANN] Elasticsearch Phonetic Analysis plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Phonetic Analysis 
plugin, version 2.1.0.

The Phonetic Analysis plugin integrates phonetic token filter analysis with 
Elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-phonetic/

Release Notes - elasticsearch-analysis-phonetic - Version 2.1.0



Update:
 * [23] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-analysis-phonetic/issues/23)
 * [21] - Add plugin version in es-plugin.properties 
(https://github.com/elasticsearch/elasticsearch-analysis-phonetic/issues/21)




Issues, pull requests, and feature requests are warmly welcome on the 
elasticsearch-analysis-phonetic project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-phonetic/
For questions or comments about this plugin, feel free to use the elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: Benchmarks (again)

2014-03-26 Thread Robin Clarke

In that case, *7.4*

-Robin-


On 26 March 2014 12:43, Steinar Bang  wrote:

> > Robin Clarke :
>
> > I just pasted output from uname there - that's kernel 3.2.54-2 you're
> > reading there.
> > I think the Debian release is Wheezy (7.0)
>
> OK.  The OS version can be found in the /etc/debian_version file.



-- 
Best winds,
-Robin-
~:)


Re: Shards/routing documents imbalance problem

2014-03-26 Thread Han JU

Thanks a lot Kevin.

That DJB_HASH result makes it clear for us. I think we'll just use the id 
value as the hash.
Do you guys know how to plug in a custom hash function?
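
For anyone following along, here is a rough sketch of what we ran to see the effect. It re-implements the classic DJB2 string hash and is not copied from the Elasticsearch sources, so treat the details as assumptions: the routing value is hashed as a string, and Math.floorMod is only my choice for folding a negative hash into the shard range. The ids 1..167 are placeholders, not our real values.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch of DJB2-style routing; not copied from the Elasticsearch sources.
final class DjbRoutingSketch {

    // Classic DJB2 string hash: hash = hash * 33 + c, starting from 5381.
    static int djbHash(String value) {
        long hash = 5381;
        for (int i = 0; i < value.length(); i++) {
            hash = ((hash << 5) + hash) + value.charAt(i);
        }
        return (int) hash;
    }

    // Maps a routing value to a shard number in [0, numShards).
    static int shardFor(String routing, int numShards) {
        return Math.floorMod(djbHash(routing), numShards);
    }

    public static void main(String[] args) {
        int numShards = 128;
        Set<Integer> used = new TreeSet<>();
        for (int id = 1; id <= 167; id++) {      // placeholder ids, not real data
            used.add(shardFor(Integer.toString(id), numShards));
        }
        System.out.println(used.size() + " of " + numShards + " shards receive data");
    }
}
```

Because the id is hashed as a string of digits and then reduced modulo the shard count, nearby numeric ids do not necessarily land on different shards, which would be consistent with the empty shards we saw.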

On Wednesday, March 26, 2014 at 11:58:36 AM UTC+1, Kevin Wang wrote:
>
> There are two hash function implementations, 
> org.elasticsearch.cluster.routing.operation.hash.djb.DjbHashFunction 
> and 
> org.elasticsearch.cluster.routing.operation.hash.simple.SimpleHashFunction; 
> the default is DjbHashFunction. You can try getting the hash by 
> calling DjbHashFunction.DJB_HASH(your id).
>
>
>
>
> On Wednesday, March 26, 2014 9:49:10 PM UTC+11, Han JU wrote:
>>
>> Thanks for your reply.
>>
>> As far as I know, in Java the basic hash value of a positive int/long is 
>> just the value itself (our ids are small values like 1125, 345 etc).
>> So I calculated some_id % 128 and got 116 distinct values. But in 
>> reality far fewer shards are in use. 
>>
>> Does ElasticSearch use some special hash function?
>>
>> On Wednesday, March 26, 2014 at 11:39:15 AM UTC+1, Kevin Wang wrote:
>>>
>>> ES computes the shard id as hash(routing) % number of shards. In your case 
>>> there are only 167 distinct values for 128 shards, so it is highly 
>>> likely that they produce fewer than 128 distinct shard ids. Some of the 
>>> shards will therefore not receive any data.
>>>
>>>
>>> Kevin
>>>
>>> On Wednesday, March 26, 2014 9:30:36 PM UTC+11, Han JU wrote:

>>>> Hi,
>>>>
>>>> We've indexed 25M documents into a single index of 128 shards with 1 
>>>> replica. 
>>>> The `routing` parameter is set to a path in the document, which is an 
>>>> int value:
>>>>
>>>> _routing: {
>>>>   path: "some_id",
>>>>   required: true
>>>> }
>>>>
>>>> In our 25M documents there are 167 distinct values of this "some_id", and 
>>>> we expected ElasticSearch to route these documents evenly across 
>>>> all shards.
>>>> But we've found that, out of 128 shards, 53 are empty 
>>>> (with 0 documents inside); that is, about 40% of the shards are not used at all.
>>>>
>>>> My questions: 
>>>>
>>>> - Is this normal? Did we miss something in configuring routing? 
>>>> - Does this imbalanced shard utilization affect indexing speed?
>>>>
>>>> We can confirm that all documents are correctly indexed and that routing 
>>>> works (when searching with routing, only 1 shard responds with the correct 
>>>> answer).
>>>> The ElasticSearch version is v1.0.1.
>>>>
>>>> Thanks!

>>>


Re: Java API or REST API for client development ?

2014-03-26 Thread Martin Forssen

The Java API is said to have better performance (and I believe that). The 
drawbacks are that you must use the exact same version of the java API 
library on the client as the server runs, as well as the same version of 
Java. So upgrades suck.


Re: Benchmarks (again)

2014-03-26 Thread Steinar Bang

> Robin Clarke :

> I just pasted output from uname there - that's kernel 3.2.54-2 you're
> reading there.
> I think the Debian release is Wheezy (7.0)

OK.  The OS version can be found in the /etc/debian_version file.


[ANN] Elasticsearch Japanese (kuromoji) Analysis plugin 2.1.0 released


Heya,


We are pleased to announce the release of the Elasticsearch Japanese (kuromoji) 
Analysis plugin, version 2.1.0.

The Japanese (kuromoji) Analysis plugin integrates the Lucene kuromoji analysis 
module into Elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/

Release Notes - elasticsearch-analysis-kuromoji - Version 2.1.0



Update:
 * [28] - Update to elasticsearch 1.1.0 / Lucene 4.7.0 
(https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/issues/28)
 * [26] - Add plugin version in es-plugin.properties 
(https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/issues/26)


Doc:
 * [24] - fix typos in README.md 
(https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/pull/24)


Issues, pull requests, and feature requests are warmly welcome on the 
elasticsearch-analysis-kuromoji project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/
For questions or comments about this plugin, feel free to use the elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: Benchmarks (again)

2014-03-26 Thread Robin Clarke

I just pasted output from uname there - that's kernel 3.2.54-2 you're
reading there.
I think the Debian release is Wheezy (7.0)

-Robin-


On 26 March 2014 12:17, Steinar Bang  wrote:

> > Robin Clarke :
>
> > Thanks for the tip with the number of masters!
> > java version "1.6.0_45" on Debian 3.2.54-2
>
> Debian 3.2?  Are you sure...?
>  http://www.debian.org/releases/woody/
>  http://www.debian.org/releases/sarge/
>
> AFAIK there is no 3.2.  Debian went from 3.1 "sarge" to 4.0 "etch".
>
> (I have a computer still running, that was installed as debian potato in
> 2001, taken to debian testing, and went to woody as the first stable
> release in 2002. Both woody and sarge were fine releases in their time,
> but I wouldn't pick either for a new high performance cluster...:-) )



-- 
Best winds,
-Robin-
~:)


Re: Benchmarks (again)

2014-03-26 Thread Steinar Bang

> Robin Clarke :

> Thanks for the tip with the number of masters!
> java version "1.6.0_45" on Debian 3.2.54-2

Debian 3.2?  Are you sure...?
 http://www.debian.org/releases/woody/
 http://www.debian.org/releases/sarge/

AFAIK there is no 3.2.  Debian went from 3.1 "sarge" to 4.0 "etch".

(I have a computer still running, that was installed as debian potato in
2001, taken to debian testing, and went to woody as the first stable
release in 2002. Both woody and sarge were fine releases in their time,
but I wouldn't pick either for a new high performance cluster...:-) )


Re: How many available disk space need for normal system running?

2014-03-26 Thread Mark Walkom

There's not much you can do: you either need to delete some data or
increase your disk space.

Maybe someone can clarify how much space is needed for a merge, but I
imagine it'd be twice the size of a shard.
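
If you want to check this before the disk fills up, something along these lines could work. It is only a sketch of the guess above: the factor is a parameter because I don't know the real requirement, and the class name, the data path and the 60 GB shard size are placeholders.

```java
import java.io.File;

// Rough pre-merge check based on the rule of thumb discussed in this thread.
final class MergeHeadroomCheck {

    // True if freeBytes covers `factor` times the largest shard. The factor is
    // an assumption (the thread guesses 2x); it is not a documented guarantee.
    static boolean hasHeadroom(long freeBytes, long largestShardBytes, double factor) {
        return freeBytes >= (long) (largestShardBytes * factor);
    }

    public static void main(String[] args) {
        // Hypothetical data path; pass the real one as the first argument.
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        long free = dataDir.getUsableSpace();
        long largestShard = 60L * 1024 * 1024 * 1024; // assumed 60 GB shard
        System.out.println("free bytes: " + free
                + ", enough for a merge: " + hasHeadroom(free, largestShard, 2.0));
    }
}
```

Run it against the directory that holds the Elasticsearch data and plug in the size of your largest shard.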

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 26 March 2014 22:01, Ivan Ji  wrote:

> Hi Mark,
>
> My index size is about 300GB, and the free disk space is about 5GB.
>
>
>
> On Wednesday, March 26, 2014 at 6:18:12 PM UTC+8, Mark Walkom wrote:
>>
>> Depends on how much data you have.
>>
>> How much disk space is on the machine? How much data is in ES?
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 26 March 2014 21:06, Ivan Ji  wrote:
>>
>>> I just looked at Lucene's documentation. During a merge, at least
>>> double the index size is needed.
>>> But does ES tune anything here, or does it just follow Lucene's rule
>>> of 2x the index size?
>>>
>>> On Wednesday, March 26, 2014 at 5:53:33 PM UTC+8, Ivan Ji wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am using ES 1.0.1. I am wondering how much unused disk space is needed
>>>> for ES to run normally?
>>>>
>>>> Because I ran into this error:
>>>>
>>>> [2014-03-26 03:30:52,713][WARN ][index.merge.scheduler] [Rick Jones] [qusion][1] failed to merge
>>>> java.io.IOException: No space left on device
>>>> at java.io.RandomAccessFile.writeBytes0(Native Method)
>>>> at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java:520)
>>>> at java.io.RandomAccessFile.write(RandomAccessFile.java:550)
>>>> at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:458)
>>>> at org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput.flushBuffer(RateLimitedFSDirectory.java:102)
>>>> at org.apache.lucene.store.BufferedChecksumIndexOutput.flushBuffer(BufferedChecksumIndexOutput.java:71)
>>>> at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:113)
>>>> at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:102)
>>>> at org.apache.lucene.store.BufferedChecksumIndexOutput.flush(BufferedChecksumIndexOutput.java:86)
>>>> at org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:126)
>>>> at org.apache.lucene.store.BufferedChecksumIndexOutput.close(BufferedChecksumIndexOutput.java:61)
>>>> at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:602)
>>>> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsIndexWriter.close(CompressingStoredFieldsIndexWriter.java:205)
>>>> at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
>>>> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.close(CompressingStoredFieldsWriter.java:138)
>>>> at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:318)
>>>> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:94)
>>>> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
>>>> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
>>>> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>>>> at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
>>>> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>>>> [2014-03-26 03:30:53,382][WARN ][index.engine.internal] [Rick Jones] [qusion][1] failed engine
>>>>
>>>> Obviously, some amount of disk space is needed during a merge.
>>>> And I think the larger the index, the more disk space is needed for the
>>>> merge operation.
>>>>
>>>> Does anyone have an idea of how much that can be?
>>>>
>>>> Ivan


Re: How many available disk space need for normal system running?

2014-03-26 Thread Ivan Ji

Hi Mark,

My index size is about 300 GB, and the free disk space is about 5 GB.
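
To put those two numbers side by side, here is a minimal Java sketch. It assumes the "2x the index size" rule quoted below, i.e. that a worst-case merge can temporarily need about as much extra space as the index itself. The class name, the method name and the factor of 1.0 are hypothetical choices for this example, not anything taken from ES or Lucene.

```java
import java.io.File;

// Sketch: does a data path have enough free space for a worst-case merge?
// Assumption: a worst-case merge may temporarily need roughly as much extra
// space as the index itself (the "2x the index size" rule quoted in this thread).
final class MergeHeadroomSketch {

    static final long GB = 1024L * 1024L * 1024L;

    // True when freeBytes covers indexBytes * extraFactor.
    static boolean hasMergeHeadroom(long indexBytes, long freeBytes, double extraFactor) {
        return freeBytes >= (long) (indexBytes * extraFactor);
    }

    public static void main(String[] args) {
        // Hypothetical data path; pass your own as the first argument.
        File dataPath = new File(args.length > 0 ? args[0] : ".");
        long free = dataPath.getUsableSpace();
        long index = 300 * GB; // index size reported in this thread
        System.out.println("free GB: " + free / GB);
        System.out.println("enough for a worst-case merge: "
                + hasMergeHeadroom(index, free, 1.0));
    }
}
```

With 300 GB of index and 5 GB free, hasMergeHeadroom returns false under this assumption, which is consistent with the "No space left on device" error during the merge.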



On Wednesday, March 26, 2014 at 6:18:12 PM UTC+8, Mark Walkom wrote:
>
> Depends on how much data you have.
>
> How much disk space is on the machine? How much data is in ES?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>  
>
> On 26 March 2014 21:06, Ivan Ji wrote:
>
>> I just looked at Lucene's documentation. During a merge, at least 
>> double the index size is needed.
>> But does ES tune anything here, or does it just follow Lucene's rule 
>> of 2x the index size?
>>
>> On Wednesday, March 26, 2014 at 5:53:33 PM UTC+8, Ivan Ji wrote:
>>
>>> Hi all,
>>>
>>> I am using ES 1.0.1. I am wondering how much unused disk space ES 
>>> needs in order to keep running.
>>>
>>> I ask because I ran into this error:
>>>
>>> [2014-03-26 03:30:52,713][WARN ][index.merge.scheduler] [Rick Jones] [qusion][1] failed to merge
>>> java.io.IOException: No space left on device
>>>     at java.io.RandomAccessFile.writeBytes0(Native Method)
>>>     at java.io.RandomAccessFile.writeBytes(RandomAccessFile.java:520)
>>>     at java.io.RandomAccessFile.write(RandomAccessFile.java:550)
>>>     at org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:458)
>>>     at org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput.flushBuffer(RateLimitedFSDirectory.java:102)
>>>     at org.apache.lucene.store.BufferedChecksumIndexOutput.flushBuffer(BufferedChecksumIndexOutput.java:71)
>>>     at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:113)
>>>     at org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:102)
>>>     at org.apache.lucene.store.BufferedChecksumIndexOutput.flush(BufferedChecksumIndexOutput.java:86)
>>>     at org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:126)
>>>     at org.apache.lucene.store.BufferedChecksumIndexOutput.close(BufferedChecksumIndexOutput.java:61)
>>>     at org.elasticsearch.index.store.Store$StoreIndexOutput.close(Store.java:602)
>>>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsIndexWriter.close(CompressingStoredFieldsIndexWriter.java:205)
>>>     at org.apache.lucene.util.IOUtils.close(IOUtils.java:140)
>>>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.close(CompressingStoredFieldsWriter.java:138)
>>>     at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:318)
>>>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:94)
>>>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4071)
>>>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3668)
>>>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
>>>     at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:107)
>>>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
>>> [2014-03-26 03:30:53,382][WARN ][index.engine.internal] [Rick Jones] [qusion][1] failed engine
>>>
>>>
>>>
>>> Obviously, some free disk space is needed during a merge.
>>> And I think the larger the index, the more disk space the merge 
>>> operation needs.
>>>
>>>  Does anyone know how much that can be?
>>>
>>> Ivan
>>>


Re: Shards/routing documents imbalance problem

There are two hash function implementations: 
org.elasticsearch.cluster.routing.operation.hash.djb.DjbHashFunction 
and org.elasticsearch.cluster.routing.operation.hash.simple.SimpleHashFunction. 
The default is DjbHashFunction. You can compute the hash yourself by 
calling DjbHashFunction.DJB_HASH(your id).
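
For illustration, here is a minimal Java sketch of the classic DJB2 string hash, which is what I understand DJB_HASH to compute, plus a modulo shard lookup. It assumes the routing value is hashed in its string form (so the id 1125 is hashed as "1125"). The names RoutingHashSketch, djbHash and shardFor are made up for this example, and using Math.floorMod to keep the result non-negative is my own choice, so the shard numbers it prints may differ from what a real cluster does. Check the ES source for your version before relying on it.

```java
// Sketch of the classic DJB2 string hash and a modulo shard lookup.
// Assumption: this mirrors what DjbHashFunction.DJB_HASH and the routing
// code do; it is not copied from the Elasticsearch source.
final class RoutingHashSketch {

    // DJB2: start at 5381, then hash = hash * 33 + next character.
    static int djbHash(String value) {
        long hash = 5381;
        for (int i = 0; i < value.length(); i++) {
            hash = ((hash << 5) + hash) + value.charAt(i);
        }
        return (int) hash; // truncate to 32 bits
    }

    // hash(routing) % number_of_shards, kept in [0, numberOfShards).
    static int shardFor(String routing, int numberOfShards) {
        return Math.floorMod(djbHash(routing), numberOfShards);
    }

    public static void main(String[] args) {
        // Example ids mentioned in this thread (1125, 345), hashed as strings.
        for (String id : new String[] {"1125", "345"}) {
            System.out.println(id + " -> shard " + shardFor(id, 128));
        }
    }
}
```

The point of the sketch is that the shard depends on the hash of the string, not on some_id % 128, so counting distinct values of some_id % 128 does not predict how many shards are used.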




On Wednesday, March 26, 2014 9:49:10 PM UTC+11, Han JU wrote:
>
> Thanks for your reply.
>
> As far as I know, in Java the basic hash value of a positive int/long is 
> just the value itself (our ids are small values like 1125, 345, etc.).
> So I calculated some_id % 128 and got 116 distinct values. But in 
> reality far fewer shards are in use. 
>
> Does ElasticSearch use some special hash function?
>
> On Wednesday, March 26, 2014 at 11:39:15 AM UTC+1, Kevin Wang wrote:
>>
>> ES gets the shard id as hash(routing) % number of shards. In your case 
>> there are only 167 distinct values for 128 shards, so I think it's highly 
>> likely that there are fewer than 128 distinct hash values. So some of the 
>> shards will not have any data.
>>
>>
>> Kevin
>>
>> On Wednesday, March 26, 2014 9:30:36 PM UTC+11, Han JU wrote:
>>>
>>> Hi,
>>>
>>> We've indexed 25M documents into a single index of 128 shards with 1 
>>> replica. 
>>> The `routing` parameter is set to a path in the document, which is an 
>>> int value:
>>>
>>> "_routing": {
>>>   "path": "some_id",
>>>   "required": true
>>> }
>>>
>>>
>>> In our 25M documents, there are 167 distinct values of this "some_id", and 
>>> we expected ElasticSearch to route these documents evenly across all 
>>> shards.
>>> But we've found that, out of 128 shards, 53 are empty (0 documents 
>>> inside), so about 40% of the shards are not used at all.
>>>
>>> My questions: 
>>>
>>> - Is this normal? Are we missing something in the routing configuration? 
>>> - Does this imbalanced shard utilization affect indexing speed?
>>>
>>> We can confirm that all documents are correctly indexed and routing 
>>> works (when searching with routing only 1 shard responds with the correct 
>>> answer).
>>> ElasticSearch version is v1.0.1.
>>>
>>>  
>>> Thanks!
>>>
>>


Choking of Elasticsearch.

2014-03-26 Thread Narinder Kaur

Hi there,
 
We are using Elasticsearch 0.90.6 on our production server, with one 
replica. We need to import data via scripts, and we have millions of 
records to save. We are using bulk mode to save the documents to the 
Elasticsearch index. 

Our architecture uses two databases: one is MySQL and the other is 
Elasticsearch. The data is first saved in MySQL and then sent to 
Elasticsearch to be saved as documents. But while the data import script is 
running, the data is saved in MySQL but not pushed to Elasticsearch. 
On investigating the nodes via Elastic HQ, I found the following:



The filter eviction is shown in red, meaning it has crossed the limit. 
I need to understand the two terms used for this filter eviction 
(*filter_cache.eviction and query_total*): how are they calculated, how 
can I control them, and how do they affect saving and searching data? Is 
there anything else that could cause documents to be skipped when saving? 
Search is also very slow, although we have given 30 GB of RAM to the 
Elasticsearch server.

Thanks in advance.
Narinder Kaur


Re: Java API or REST API for client development ?

I think it's better to use the official client. The REST API also calls the 
Java API internally, so if you use REST it will be Java -> REST -> Java.
What do you mean by multiplatform? Android? I'm not sure whether the ES Java 
API works on Android, but I don't think an Android app should talk to ES 
directly.


Regards,
Kevin

On Wednesday, March 26, 2014 9:42:19 PM UTC+11, Subhadip Bagui wrote:
>
>
> My app is in Java only. So what I mean is: should I use the elasticsearch 
> Java client, or only the available REST APIs via HttpClient and the like?
>
> Which will be more flexible for multiplatform integration?
>


elasticsearch 1.1.0 compatibility? full cluster restart?

2014-03-26 Thread Graham Tackley

The release notes for elasticsearch 1.1.0 don't say anything about 
compatibility with 1.0 (or at least I didn't see it).

- can I mix 1.0.1 and 1.1.0 in the same cluster, i.e. do a rolling upgrade? 
 
- does the java 1.0.1 client library talk ok to a 1.1.0 cluster?

I'm really excited about some of the new stuff in 1.1.0...

g


Re: Shards/routing documents imbalance problem

2014-03-26 Thread Han JU

Thanks for your reply.

As far as I know, in Java the basic hash value of a positive int/long is 
just the value itself (our ids are small values like 1125, 345, etc.).
So I calculated some_id % 128 and got 116 distinct values. But in 
reality far fewer shards are in use. 

Does ElasticSearch use some special hash function?

On Wednesday, March 26, 2014 at 11:39:15 AM UTC+1, Kevin Wang wrote:
>
> ES gets the shard id as hash(routing) % number of shards. In your case 
> there are only 167 distinct values for 128 shards, so I think it's highly 
> likely that there are fewer than 128 distinct hash values. So some of the 
> shards will not have any data.
>
>
> Kevin
>
> On Wednesday, March 26, 2014 9:30:36 PM UTC+11, Han JU wrote:
>>
>> Hi,
>>
>> We've indexed 25M documents into a single index of 128 shards with 1 
>> replica. 
>> The `routing` parameter is set to a path in the document, which is an int 
>> value:
>>
>> "_routing": {
>>   "path": "some_id",
>>   "required": true
>> }
>>
>>
>> In our 25M documents, there are 167 distinct values of this "some_id", and 
>> we expected ElasticSearch to route these documents evenly across all 
>> shards.
>> But we've found that, out of 128 shards, 53 are empty (0 documents 
>> inside), so about 40% of the shards are not used at all.
>>
>> My questions: 
>>
>> - Is this normal? Are we missing something in the routing configuration? 
>> - Does this imbalanced shard utilization affect indexing speed?
>>
>> We can confirm that all documents are correctly indexed and routing works 
>> (when searching with routing only 1 shard responds with the correct answer).
>> ElasticSearch version is v1.0.1.
>>
>>  
>> Thanks!
>>
>


Re: IndexOutOfBoundsException at IndexShardRoutingTable class

pickIndex() will return the absolute value of the count, so it won't return 
a negative value. Can you provide more details?
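
One edge case may still be worth checking. The Java sketch below assumes pickIndex() is roughly Math.abs(counter.incrementAndGet()) and that activeShards holds 3 entries; both are assumptions for illustration, not taken from the ES source, and the class and method names are made up. Under those assumptions an absolute value can still be negative, and the remainder comes out as the same -2 that appears in the exception.

```java
// Sketch: why an "absolute value" round-robin index can still go negative.
// Assumptions: pickIndex() behaves like Math.abs(counter.incrementAndGet()),
// and the shard list has 3 entries. Neither is confirmed by the thread.
final class NegativeIndexSketch {

    // Same arithmetic as the quoted loop: (index + i) % size.
    static int locWithRemainder(int index, int i, int size) {
        return (index + i) % size;
    }

    // Alternative that always lands in [0, size).
    static int locWithFloorMod(int index, int i, int size) {
        return Math.floorMod(index + i, size);
    }

    public static void main(String[] args) {
        // Math.abs(Integer.MIN_VALUE) overflows and returns Integer.MIN_VALUE.
        int index = Math.abs(Integer.MIN_VALUE);
        System.out.println(index < 0);                      // true
        System.out.println(locWithRemainder(index, 0, 3));  // -2
        System.out.println(locWithFloorMod(index, 0, 3));   // 1
    }
}
```

The same thing happens when index is a large positive value and index + i overflows, for example index = Integer.MAX_VALUE and i = 1.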


Kevin


On Wednesday, March 26, 2014 3:53:15 PM UTC+11, Shinsuke Sugaya wrote:
>
> Hi
>
> I encountered the following problem:
>
> Caused by: java.lang.IndexOutOfBoundsException: index (-2) must not be negative
>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
>     at org.elasticsearch.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
>     at org.elasticsearch.common.collect.RegularImmutableList.get(RegularImmutableList.java:65)
>     at org.elasticsearch.cluster.routing.IndexShardRoutingTable.preferNodeActiveInitializingShardsIt(IndexShardRoutingTable.java:378)
>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.preferenceActiveShardIterator(PlainOperationRouting.java:210)
>     at org.elasticsearch.cluster.routing.operation.plain.PlainOperationRouting.getShards(PlainOperationRouting.java:80)
>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:80)
>     at org.elasticsearch.action.get.TransportGetAction.shards(TransportGetAction.java:42)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:121)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.<init>(TransportShardSingleOperationAction.java:97)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:74)
>     at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:49)
>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
>     at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
>     at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
>     at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:174)
>     ... 9 more
>
> My environment is:
>
>  - Elasticsearch 0.90.7
>  - 3 nodes in a cluster
>  - Send GET request with preference=_local
>
> Looking into the IndexShardRoutingTable class, it seems that "loc" is 
> an unexpected negative value in the following code. The pickIndex method 
> returns the value of "counter" (an incrementing value). If "counter" reaches 
> Integer.MAX_VALUE, I think "loc" becomes negative and then 
> activeShards.get(loc) throws the exception.
>
> int index = pickIndex();
> for (int i = 0; i < activeShards.size(); i++) {
> int loc = (index + i) % activeShards.size();
>
> If it's a bug, I'll file an issue.
>
> Best regards,
>  shinsuke
>
>


Re: Java API or REST API for client development ?

2014-03-26 Thread Subhadip Bagui


My app is in Java only. So what I mean is: should I use the elasticsearch 
Java client, or only the available REST APIs via HttpClient and the like?

Which will be more flexible for multiplatform integration?


Re: Shards/routing documents imbalance problem

ES gets the shard id as hash(routing) % number of shards. In your case 
there are only 167 distinct values for 128 shards, so I think it's highly 
likely that there are fewer than 128 distinct hash values. So some of the 
shards will not have any data.
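
As a rough estimate of how many empty shards to expect, here is a small Java sketch. It assumes the hash behaves like a uniform random assignment of each routing value to a shard, which a real hash over short numeric ids may not do. The class and method names are made up for this example.

```java
// Sketch: expected number of empty shards when k distinct routing values are
// spread over n shards.
// Assumption: each value lands on a shard uniformly at random.
final class EmptyShardEstimate {

    // A given shard is missed by one value with probability (1 - 1/n),
    // so it is missed by all k values with probability (1 - 1/n)^k.
    static double expectedEmptyShards(int shards, int distinctValues) {
        return shards * Math.pow(1.0 - 1.0 / shards, distinctValues);
    }

    public static void main(String[] args) {
        // 128 shards and 167 distinct routing values, as in this thread.
        System.out.println("expected empty shards: " + expectedEmptyShards(128, 167));
    }
}
```

Under that assumption the estimate comes out between 34 and 35 empty shards, so a fair number of empty shards is expected with only 167 values. The 53 reported here is higher than that, which may mean the hash does not spread these particular ids uniformly.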


Kevin

On Wednesday, March 26, 2014 9:30:36 PM UTC+11, Han JU wrote:
>
> Hi,
>
> We've indexed 25M documents into a single index of 128 shards with 1 
> replica. 
> The `routing` parameter is set to a path in the document, which is an int 
> value:
>
> "_routing": {
>   "path": "some_id",
>   "required": true
> }
>
>
> In our 25M documents, there are 167 distinct values of this "some_id", and 
> we expected ElasticSearch to route these documents evenly across all 
> shards.
> But we've found that, out of 128 shards, 53 are empty (0 documents 
> inside), so about 40% of the shards are not used at all.
>
> My questions: 
>
> - Is this normal? Are we missing something in the routing configuration? 
> - Does this imbalanced shard utilization affect indexing speed?
>
> We can confirm that all documents are correctly indexed and routing works 
> (when searching with routing only 1 shard responds with the correct answer).
> ElasticSearch version is v1.0.1.
>
>  
> Thanks!
>


Shards/routing documents imbalance problem

2014-03-26 Thread Han JU

Hi,

We've indexed 25M documents into a single index of 128 shards with 1 
replica. 
The `routing` parameter is set to a path in the document, which is an int 
value:

"_routing": {
  "path": "some_id",
  "required": true
}


In our 25M documents, there are 167 distinct values of this "some_id", and 
we expected ElasticSearch to route these documents evenly across all 
shards.
But we've found that, out of 128 shards, 53 are empty (0 documents 
inside), so about 40% of the shards are not used at all.

My questions: 

- Is this normal? Are we missing something in the routing configuration? 
- Does this imbalanced shard utilization affect indexing speed?

We can confirm that all documents are correctly indexed and routing works 
(when searching with routing only 1 shard responds with the correct answer).
ElasticSearch version is v1.0.1.

 
Thanks!


Re: Java API or REST API for client development ?