Re: Design practices for hosting multiple clusters/on-demand cluster creation?

2014-01-07 Thread David Pilato
You could look at the Chef cookbook:
https://github.com/elasticsearch/cookbook-elasticsearch
http://www.elasticsearch.org/tutorials/deploying-elasticsearch-with-chef-solo/

Does it help?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 8 Jan 2014, at 02:01, Josh Harrison wrote:

While ES is still in a pre-deployment stage at my job, there is growing 
interest in it. For various reasons, a monster cluster holding everyone's stuff 
is simply not possible. Individual projects require complete control over their 
data and the culture and security requirements here are such that doing 
something like always naming project 1's indexes PROJECT_1_ will not 
fly.
We have a fairly beefy hadoop cluster hosting our content currently, along with 
a separate head node acting as the master.
In this situation, is it simply a matter of starting up new processes on each 
node pointed at different configuration profiles and tying specific ports to 
specific projects/clusters?

Basically, is there an established way to build on-demand clusters, given a set 
of resources? We'll layer something in front of it to deal with access 
control/etc.

Thanks!
-Josh
-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ad2695f7-d1a2-4036-82b2-58bddf349681%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



Re: Transport Client hangs in my web application during search.

2014-01-07 Thread David Pilato
Your code looks good to me.
Don't create multiple clients; create only one for your whole application.

As Jason wrote, look at the logs.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 8 Jan 2014, at 07:40, Jason Wee wrote:

Does it show anything in the log? Perhaps wrap your code in a try/catch block 
and set a query timeout. 

HTH 

/Jason


> On Wed, Jan 8, 2014 at 4:41 AM, Search User  wrote:
> I have a web application in which I create a Transport Client using Spring 
> (singleton) and inject it into my service. When I receive a request in my 
> controller, controller calls the service and service uses the transport 
> client to execute the query and return the results. When I deploy this 
> application in tomcat, I have the client created but when I execute the 
> query, client hangs. 
> 
> If I create the client for every request (in my service) and run the query, 
> everything is fine. Can some one help me understand this behavior?
> 
> Following is my code to create the Client object.
> 
> Settings settings = ImmutableSettings.settingsBuilder()
>         .put("cluster.name", "mysearchcluster")
>         .put("client.transport.sniff", true)
>         .build();
> Client client = new TransportClient(settings)
>         .addTransportAddress(new InetSocketTransportAddress("10.150.200.101", 9300));
> 
> 
> 
> Thanks



Re: How to index an existing json file

2014-01-07 Thread David Pilato
Start with a clean index:

curl -XDELETE "http://localhost:9200/books/"

You probably have a bad mapping (some docs already indexed?)

If you still have problems, please gist a full curl recreation. See 
http://www.elasticsearch.org/help/


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 8 Jan 2014, at 03:10, ZenMaster80 wrote:

Great. Do you know why I am getting this error?
{"error":"MapperParsingException[failed to parse]; nested: 
JsonParseException[Unrecognized token 'life': was expecting ('true', 'false' or 
'null')\n at [Source: [B@5c9a9d06; line: 1, column: 35]]; ","status":400}

data:

{“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get 
rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda 
Jones”}]}




> On Tuesday, January 7, 2014 9:06:01 PM UTC-5, Ivan Brusic wrote:
> The JSON file is used by the curl command, so in your example it should be in 
> the same directory in which you executed the command (current directory).
> 
> -- 
> Ivan
> 
> 
>> On Tue, Jan 7, 2014 at 6:00 PM, ZenMaster80  wrote:
>> Hi,
>> 
>> I am just starting with ElasticSearch, I would like to know how to index a 
>> simple json document "books.json" that has the following in it: Where do I 
>> place the document? I placed it in root directory of elastic search and in 
>> /bin folder..
>> 
>> {“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get 
>> rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda 
>> Jones”]}}
>> 
>> 
>> $ curl -XPUT "http://localhost:9200/books/book/1" -d @books.json
>> 
>> Warning: Couldn't read data from file "books.json", this makes an empty POST.
>> {"error":"MapperParsingException[failed to parse, document is 
>> empty]","status":400}
>> 
>> 
>> 
>> Thanks
>> 



Re: Transport Client hangs in my web application during search.

2014-01-07 Thread Jason Wee
Does it show anything in the log? Perhaps wrap your code in a try/catch block
and set a query timeout.

HTH

/Jason


On Wed, Jan 8, 2014 at 4:41 AM, Search User  wrote:

> I have a web application in which I create a Transport Client using Spring
> (singleton) and inject it into my service. When I receive a request in my
> controller, controller calls the service and service uses the transport
> client to execute the query and return the results. When I deploy this
> application in tomcat, I have the client created but when I execute the
> query, client hangs.
>
> If I create the client for every request (in my service) and run the
> query, everything is fine. Can some one help me understand this behavior?
>
> Following is my code to create the Client object.
>
> Settings settings = ImmutableSettings.settingsBuilder()
>         .put("cluster.name", "mysearchcluster")
>         .put("client.transport.sniff", true)
>         .build();
> Client client = new TransportClient(settings)
>         .addTransportAddress(new InetSocketTransportAddress("10.150.200.101", 9300));
>
>
>
> Thanks
>



Re: Order results by value in one of the array entries.

2014-01-07 Thread Jun Ohtani
Hi Johan,

You could try script-based sorting.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-sort.html#_script_based_sorting

Or the function score query.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_script_score
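
To make the intended ordering concrete, here is a client-side Python sketch (it is not Elasticsearch API code). For each product it picks the stock of one warehouse and sorts descending; a sort script or script_score on the server would have to compute the same per-document value. The helper name stock_for and the second sample product are made up for illustration.

```python
def stock_for(product, warehouse):
    """Return the stock of `warehouse` for this product, or 0 if it is not listed."""
    for entry in product.get("inventory", []):
        if entry["warehouse"] == warehouse:
            return entry["stock"]
    return 0


products = [
    {"product_name": "product alfa",
     "inventory": [{"warehouse": "warehouse_a", "stock": 99},
                   {"warehouse": "warehouse_b", "stock": 19}]},
    # Hypothetical second product, added only to make the ordering visible.
    {"product_name": "product alfa plus",
     "inventory": [{"warehouse": "warehouse_a", "stock": 120},
                   {"warehouse": "warehouse_b", "stock": 5}]},
]

# Sort by the stock held in warehouse_a only, highest first.
by_stock = sorted(products, key=lambda p: stock_for(p, "warehouse_a"), reverse=True)
names = [p["product_name"] for p in by_stock]
```

The key function is the part that matters: the sort value depends on one array entry selected by warehouse name, not on the whole array.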

I hope this helps.

Regards,


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani




On 2014/01/07, at 19:45, Johan E wrote:

> Hi,
> 
> I'm trying to order the result of a query by a specified entry in an array.
> 
> Here is a sample entry
> 
> 
> {
>   "product_name": "product alfa",
>   "product_id": "4a86c92ccd26111d7ba0eada7da6a75af",
>   "description": "This is a sample product",
>   "image_id": "product_a.jpg",
>   "inventory": [
>     {
>       "warehouse": "warehouse_a",
>       "stock": 99
>     },
>     {
>       "warehouse": "warehouse_b",
>       "stock": 19
>     },
>     {
>       "warehouse": "warehouse_c",
>       "stock": 99
>     }
>   ]
> }
> 
> If there were more "products" containing alfa, I would (for example) want to 
> sort them by the stock of a warehouse.
> 
> I'm currently using a query like:
> 
> POST _search
> {
>   "query": {
>     "match": {
>       "product_name": {
>         "query": "alfa",
>         "type": "phrase"
>       }
>     }
>   },
>   "filter": {
>     "bool": {
>       "must": [
>         {
>           "term": {
>             "availability.warehouse": "warehouse_a"
>           }
>         }
>       ]
>     }
>   }
> }
> 
> I would like the results sorted by stock (for warehouse_a only) descending.
> 
> Any ideas?
> 
> 





Re: incrementally scaling ES from the small data

2014-01-07 Thread Adolfo Rodriguez
Thanks Ivan, makes sense. Still could not test how sockets relate to shards 
and why I automatically get 10 established sockets when opening a client:

node = builder.client(clientOnly).data(!clientOnly).local(local).node();

client = node.client();


on default ES configuration, and many many more sockets after (up to 200), 
and how this number changes when increasing/decreasing number of shards, 

but happily I managed to fix the initial issue of highlighting info being 
randomly lost by a config change as described here:

https://groups.google.com/d/msg/elasticsearch/3t6UL_vzM7o/TLnV2m2B1NAJ


so sockets do not look like an issue anymore. 

Regards.



Re: Too many open files

2014-01-07 Thread Adolfo Rodriguez
Happily, the problem of missing highlight records seems to be gone after 
making a config change.

* Initially I had 2 ES instances in 2 different apps (a Tomcat and a standalone), 
configured identically (both listening for incoming TransportClient requests on 
port 9300 and both opened with client(false)), and a third ES connecting to 
them, opened with new TransportClient(), to fetch highlighting info. It looks 
like this third ES was randomly losing highlighting records. (?)

* What I did to fix it was a configuration change to have only one 
client(false) ES listening for TransportClients and 2 new 
TransportClient()s connecting to it.

It looks like this change fixes the issue, which was some kind of coupling 
between both client(false) ESs listening on port 9300.

Regards



How many metadata fields exist of MP3 file ?

2014-01-07 Thread HongXuan Ji
Hi all,

I am wondering how many metadata fields of an MP3 file exist when I post the 
MP3 file to Elasticsearch using mapper-attachment. 

In Solr we can get the field information through the endpoint 
SOLR_HOST/update/extract?extractOnly=true, 

but is there any way to get such information in Elasticsearch? Besides 
MP3 files, how about doc files? 

I know Elasticsearch uses Tika to support these operations. Can you give 
me an example of fetching a specific field for a specific file format?

Regards,

Ivan 




Re: Using a nested object property within custom_filters_score script

2014-01-07 Thread Jun Ohtani
Hi Veda,

Could you try to use _source.colours instead of doc['colours'].values?
Maybe the field in doc[] refers to Lucene's field name.
I think Elasticsearch indexes the nested field as "colours.name", not as an 
object.


Regards,


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani




On 2014/01/02, at 2:49, Veda wrote:

> Hi,
>  An example document in the index is
>  {
>"title": "Red Gucci Dress",
>"merchant": "zara",
>"price": 1972.34,
>"colours": [{"name": "red", "emd": 1.98, "percentage": 45}, {"name":
> "blue", "emd": 1.99, "percentage": 40}]
> }
> 
> and I have defined a mapping to let ES know colours is a nested object by
> posting the following json 
> 
> {
>"nested_product" : {
>"properties" : {
>"colours" : {
>"type" : "nested"
>}
>}
>}
> }
> 
> I'm facing an issue when I want to use custom_filters_score and in the
> scoring script I want to use the matched colour's properties to generate the
> score. Below is the query 
> 
> {
>   "query": {
>     "custom_filters_score": {
>       "query": {"match_phrase": {"_all": "gucci dress"}},
>       "filters": [{
>         "filter": {"nested": {"path": "colours", "filter": {"term": {"colours.name": "blue"}}}},
>         "script": "foreach(colour : doc['colours'].values){ if (colour.name == 'blue') {return ((0.1 * colour.percentage * colour.emd));}}"
>       }],
>       "score_mode": "total"
>     }
>   }
> }
> 
> which throws an error 
> 
> Query Failed [Failed to execute main query]]; nested:
> CompileException[[Error: No field found for [colours] in mapping with types
> [nested_product]]\n[Near : {... foreach(colour : doc['colours' }]\n   
>  
> ^\n[Line: 1, Column: 1]]; nested: ElasticSearchIllegalArgumentException[No
> field found for [colours] in mapping with types [nested_product]]; "
> 
> Any help on it would be appreciated
> 
> Thanks in Advance,
> Veda
> 
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Using-a-nested-object-property-within-custom-filters-score-script-tp4046901.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
> 





Re: How to index an existing json file

2014-01-07 Thread ZenMaster80
Great. Do you know why I am getting this error?

{"error":"MapperParsingException[failed to parse]; nested: 
JsonParseException[Unrecognized token 'life': was expecting ('true', 
'false' or 'null')\n at [Source: [B@5c9a9d06; line: 1, column: 35]]; 
","status":400}

data:

{“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get 
rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda 
Jones”}]}
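
One likely cause, assuming books.json really contains the typographic quotes shown above (they could also be a mail-client artifact): JSON only accepts the ASCII double quote, so the parser fails at the first value it cannot read. The Python sketch below reproduces the failure with a shortened copy of the data and shows that it parses once the quotes are normalized. The helper name normalize_quotes is made up for illustration.

```python
import json

# Typographic quotes that word processors and mail clients substitute for '"'.
CURLY_QUOTES = {"\u201c": '"', "\u201d": '"'}


def normalize_quotes(text):
    """Replace typographic double quotes with ASCII double quotes."""
    for curly, straight in CURLY_QUOTES.items():
        text = text.replace(curly, straight)
    return text


# Shortened copy of the data from the message, curly quotes included.
raw = '{“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”}]}'

try:
    json.loads(raw)
    raw_is_valid = True
except ValueError:
    # json.JSONDecodeError is a subclass of ValueError.
    raw_is_valid = False

fixed = normalize_quotes(raw)
doc = json.loads(fixed)  # parses once the quotes are plain ASCII
```

A blind replacement like this would also change curly quotes that legitimately appear inside a string value, so retyping the file in a plain-text editor is the safer fix.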



On Tuesday, January 7, 2014 9:06:01 PM UTC-5, Ivan Brusic wrote:
>
> The JSON file is used by the curl command, so in your example it should be 
> in the same directory in which you executed the command (current directory).
>
> -- 
> Ivan
>
>
> On Tue, Jan 7, 2014 at 6:00 PM, ZenMaster80 wrote:
>
>> Hi,
>>
>> I am just starting with ElasticSearch, I would like to know how to index 
>> a simple json document "books.json" that has the following in it: Where do 
>> I place the document? I placed it in root directory of elastic search and 
>> in /bin folder..
>>
>> {“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get 
>> rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda 
>> Jones”]}}
>>
>>
>> $ curl -XPUT "http://localhost:9200/books/book/1" -d @books.json
>>
>> Warning: Couldn't read data from file "books.json", this makes an empty 
>> POST.
>>
>> {"error":"MapperParsingException[failed to parse, document is 
>> empty]","status":400}
>>
>>
>> Thanks
>>



Re: How to index an existing json file

2014-01-07 Thread Ivan Brusic
The JSON file is used by the curl command, so in your example it should be
in the same directory in which you executed the command (current directory).

-- 
Ivan


On Tue, Jan 7, 2014 at 6:00 PM, ZenMaster80  wrote:

> Hi,
>
> I am just starting with ElasticSearch, I would like to know how to index a
> simple json document "books.json" that has the following in it: Where do I
> place the document? I placed it in root directory of elastic search and in
> /bin folder..
>
> {“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get
> rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda
> Jones”]}}
>
>
> $ curl -XPUT "http://localhost:9200/books/book/1" -d @books.json
>
> Warning: Couldn't read data from file "books.json", this makes an empty
> POST.
>
> {"error":"MapperParsingException[failed to parse, document is
> empty]","status":400}
>
>
> Thanks
>



How to index an existing json file

2014-01-07 Thread ZenMaster80
Hi,

I am just starting with ElasticSearch, I would like to know how to index a 
simple json document "books.json" that has the following in it: Where do I 
place the document? I placed it in root directory of elastic search and in 
/bin folder..

{“books”:[{“name”:”life in heaven”,”author”:”Mike Smith”},{“name”:”get 
rich”,”author”:”Joe Shmoe”},{“name”:”luxury properties”,”author”:”Linda 
Jones”]}}


$ curl -XPUT "http://localhost:9200/books/book/1" -d @books.json

Warning: Couldn't read data from file "books.json", this makes an empty 
POST.

{"error":"MapperParsingException[failed to parse, document is 
empty]","status":400}


Thanks



Re: incrementally scaling ES from the small data

2014-01-07 Thread Ivan Brusic
An increase in shards will not cause an increase in the sockets used. Each node's
shard action is responsible for gathering the responses from each shard at the
file level before sending the response back to the client.

Since each shard is actually its own Lucene index, an increase in shards
will increase metrics at the IO level, especially the number of open file
descriptors.

It is advised to start off with 5 because that allows you to scale an
index horizontally without needing to reindex. You can grow your
cluster from 1 to 5 nodes and each node will have a piece of the index instead of
the entire index. Beyond that number, you can distribute the index
with more replicas. More shards increase availability IMHO. Ultimately you
do not want large shards for performance reasons.

-- 
Ivan


On Tue, Jan 7, 2014 at 5:23 PM, Adolfo Rodriguez wrote:
>
>
> I am worried about the 200 established sockets on my machine
> (running 2 ES) since I suspect they are causing some random data loss
> when getting highlighting information. And I was wondering if setting just 1
> shard/0 replica on each ES would get rid of these unwanted sockets (?). Why
> is it advised to start with (5-10) rather than with (1-0) * 2 ES? Any reason?
>
>



Re: incrementally scaling ES from the small data

2014-01-07 Thread Adolfo Rodriguez
Thanks Ivan,
 

> Elasticsearch uses consistent hashing, so you cannot change the number of 
> shards for an index.
>

So, I understand that, once the index is created, it is only possible to 
scale nodes, clusters and replicas up and down, but not shards. 
Interesting.
 

> IMHO, shard values in the high single digits (5-10) is a *great starting 
> point.* Even with a single node cluster, the default number of shards (5) 
> should not cause any performance issues.
>

I am worried about the 200 established sockets on my machine 
(running 2 ES) since I suspect they are causing some random data loss 
when getting highlighting information. And I was wondering if setting just 1 
shard/0 replica on each ES would get rid of these unwanted sockets (?). Why 
is it advised to start with (5-10) rather than with (1-0) * 2 ES? Any reason?

Thanks



Design practices for hosting multiple clusters/on-demand cluster creation?

2014-01-07 Thread Josh Harrison
While ES is still in a pre-deployment stage at my job, there is growing 
interest in it. For various reasons, a monster cluster holding everyone's 
stuff is simply not possible. Individual projects require complete control 
over their data and the culture and security requirements here are such 
that doing something like always naming project 1's indexes 
PROJECT_1_ will not fly.
We have a fairly beefy hadoop cluster hosting our content currently, along 
with a separate head node acting as the master.
In this situation, is it simply a matter of starting up new processes on 
each node pointed at different configuration profiles and tying specific 
ports to specific projects/clusters?

Basically, is there an established way to build on-demand clusters, given a 
set of resources? We'll layer something in front of it to deal with access 
control/etc.
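
The "one process per project, each with its own configuration profile" idea can be pictured with a small generator. This is only a sketch under stated assumptions: cluster.name, http.port, transport.tcp.port and path.data are regular elasticsearch.yml settings, while the project names, the port-offset scheme and the /data/es root are hypothetical.

```python
BASE_HTTP_PORT = 9200       # Elasticsearch default HTTP port
BASE_TRANSPORT_PORT = 9300  # Elasticsearch default transport port


def render_config(project, slot, data_root="/data/es"):
    """Build elasticsearch.yml text for one project's cluster.

    `slot` is a small integer unique per project on a host. It offsets the
    ports so that several processes can share the same machine.
    """
    lines = [
        "cluster.name: %s" % project,
        "http.port: %d" % (BASE_HTTP_PORT + slot),
        "transport.tcp.port: %d" % (BASE_TRANSPORT_PORT + slot),
        "path.data: %s/%s" % (data_root, project),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical project names; each gets its own cluster name, ports and data path.
configs = {name: render_config(name, slot)
           for slot, name in enumerate(["project_one", "project_two"])}
```

Giving every project a distinct cluster.name and data path is what keeps the processes from joining each other's cluster or sharing files; access control would still have to be layered in front, as the message says.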

Thanks!
-Josh



Re: Replicating one cluster to another cluster

2014-01-07 Thread Mark Walkom
There are a few ways:
- stream2es - https://github.com/elasticsearch/stream2es
- logstash with the elasticsearch and elasticsearch_http outputs -
  http://logstash.net/
There are also these two, which I haven't used:
- https://github.com/crate/elasticsearch-inout-plugin
- https://github.com/jprante/elasticsearch-knapsack

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 8 January 2014 11:26, Zuhaib Siddique  wrote:

> Hey,
>
> Was wondering if anyone has tried replicating one cluster to a new cluster
> and keeping it in "sync". For example, I have a production cluster and I need
> to reindex all data. I would like to do this in a 2nd cluster so I can
> compare the changes, but if an update happens on the original index I want
> it reflected on the replicated one.
>
> I am pretty sure I can whip something up with scroll/scan, but if someone has
> done this before and has code to share it would be great.
>
> Thanks
> Zuhaib
>



Re: incrementally scaling ES from the small data

2014-01-07 Thread Ivan Brusic
Elasticsearch uses consistent hashing, so you cannot change the number of
shards for an index.
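
Why the shard count is fixed can be illustrated with simple modulo routing. This is a sketch under the assumption that a document's shard is chosen as hash(id) modulo the number of primary shards; CRC32 is only a stand-in here, not the hash function Elasticsearch actually uses, and the document IDs are invented.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Pick a shard by hashing the document ID modulo the shard count."""
    # Stand-in hash: Elasticsearch uses its own hash function, not CRC32.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


doc_ids = ["doc-%d" % i for i in range(1000)]

# Count how many documents would land on a different shard if the same
# index went from 5 shards to 6.
moved = sum(1 for d in doc_ids if shard_for(d, 5) != shard_for(d, 6))
fraction_moved = moved / len(doc_ids)
```

Every document counted in moved would be looked up on a shard that does not hold it, which is why a different shard count means building a new index.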

If you can reindex data, then you can create a new index with a different
number of shards and simply reindex. If your data is temporal in nature,
you can create a new index per day/week/month and these new indices can
have a different shard value. You can search against multiple indices even
if they have different shard values.
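
For the time-based approach, the only moving part on the client side is the index name. Here is a minimal Python sketch; the prefix "events" and the date format are assumptions chosen for illustration, not a required convention.

```python
from datetime import date


def index_name(prefix, day, granularity="day"):
    """Return the name of the index that documents dated `day` should go to."""
    if granularity == "day":
        return "%s-%s" % (prefix, day.strftime("%Y.%m.%d"))
    if granularity == "month":
        return "%s-%s" % (prefix, day.strftime("%Y.%m"))
    raise ValueError("unsupported granularity: %r" % granularity)


daily = index_name("events", date(2014, 1, 7))
monthly = index_name("events", date(2014, 1, 7), granularity="month")
```

Each new index is created with whatever shard count suits the current data volume, and a search can still name several of these indices at once.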

IMHO, shard values in the high single digits (5-10) is a great starting
point. Even with a single node cluster, the default number of shards (5)
should not cause any performance issues.

Cheers,

Ivan


On Tue, Jan 7, 2014 at 4:47 PM, Adolfo Rodriguez wrote:

> Thanks both for your comments.
>
> Shards is a little harder, start with the *standard/default of 8 shards* and
> go from there.
>>
>
> * This is the point that is confusing me the most. For a very small
> initial deployment, with a few thousand docs, why not just define 1
> shard with no replica? What criteria did you use to set 8 shards as a default
> (BTW, defaults - in ES 0.90.5 - are 5 Successful Shards, 5 Unassigned
> Shards, aren't they?).
>
> * Suppose that you start with the smallest setup: 1 cluster, 1
> node, 1 shard, no replica. Will I be able to incrementally scale any of
> these settings up? And will I also be able to scale any of these settings down
> afterwards? (Or will I need to repopulate ES in any particular case?) The idea is
> to test different configs.
>
> * In my current particular case, can I scale down my current 5 shards/1
> replica (default 0.90.5 AFAIK) to 1 shard/no replica? And start from there?
>
> The reason I am concerned about this is that I see a lot of sockets (maybe
> 200 on my system - 2 ES on different apps on the same machine) and
> want to understand where they come from and how to allocate the optimum. I
> watched Shai's presentation yesterday but could not grasp this info.
>
> Thanks
>


Re: incrementally scaling ES from the small data

2014-01-07 Thread Adolfo Rodriguez
Thanks both for your comments.

> Shards is a little harder, start with the *standard/default of 8 shards*
> and go from there.

* This is the point that is confusing me the most. For a very small initial 
deployment, with a few thousand docs, why not just define 1 shard 
with no replica? What criteria did you use to set 8 shards as a default? 
(BTW, the defaults in ES 0.90.5 are 5 Successful Shards, 5 Unassigned 
Shards, aren't they?)

* Suppose that you start with the smallest setup: 1 cluster, 1 node, 
1 shard, no replica. Will I be able to incrementally scale any of these 
settings up? And will I also be able to scale any of these settings down 
afterwards (or will I need to repopulate ES in any particular case)? The 
idea is to test different configs.

* In my current particular case, can I scale down my current 5 shards/1 
replica (default 0.90.5 AFAIK) to 1 shard/no replica? And start from there?

The reason I am concerned about this is that I see a lot of sockets on my 
system (maybe 200, with 2 ES instances in different apps on the same 
machine) and want to understand where they come from and how to allocate 
the optimum. I watched Shay's presentation yesterday but could not grasp 
this info.

Thanks



Replicating one cluster to another cluster

2014-01-07 Thread Zuhaib Siddique
Hey,

I was wondering if anyone has tried replicating one cluster to a new cluster 
and keeping it in "sync". For example, I have a production cluster and I need 
to reindex all data. I would like to do this in a 2nd cluster so I can 
compare the changes, but if an update happens on the original index I want 
it reflected on the replicated one.

I am pretty sure I can whip something up with scroll/scan, but if someone has 
done this before and has code to share, that would be great.

Thanks
Zuhaib



Re: incrementally scaling ES from the small data

2014-01-07 Thread Mark Walkom
As a really, really rough guide:
Start with a small instance, 4-8G RAM (2-4G heap). Keep loading documents
until things start to slow down (i.e. query/update responsiveness drops).
Add a new node.
Rinse and repeat.

If you have one node there is no point using replicas as they have nowhere
to go. You can easily add replicas later though so it's no big deal.
Shards is a little harder, start with the standard/default of 8 shards and
go from there. Using aliases can allow you to reindex your data later if
you feel you may want to change this.
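
A minimal sketch of the two request bodies this advice relies on: raising the replica count later, and swapping an alias from an old index to a reindexed one. The index and alias names are placeholders; the bodies follow the standard index settings and `_aliases` actions formats.

```python
import json


def replica_settings(replicas: int) -> dict:
    """Body for PUT /<index>/_settings to change the replica count later."""
    return {"index": {"number_of_replicas": replicas}}


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases that repoints an alias in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Placeholder names: clients query "products"; data moves from v1 to v2.
    print(json.dumps(replica_settings(1)))
    print(json.dumps(alias_swap_actions("products", "products_v1", "products_v2"), indent=2))
```

If applications only ever talk to the alias, the reindex into an index with a different shard count can happen in the background and the swap is the only visible change.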

You can monitor your cluster with a range of monitoring plugins -
elasticHQ, kopf, elasticsearch-monitoring, bigdesk. Just search for them on
github.


As Boaz mentioned, it really does depend on what you are doing. Chances are
you will go through all this and get to a point where you want to rebuild
your cluster with all your gained knowledge!

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 8 January 2014 09:18, Boaz Leskes  wrote:

> Hi Adolfo,
>
> The best way to scale depends on your data and how it behaves. You can
> watch this great talk by Shay about two use cases to get inspired:
> http://www.elasticsearch.org/videos/big-data-search-and-analytics/
>
> Cheers,
> Boaz
>
>
> On Tuesday, January 7, 2014 8:13:18 PM UTC+1, Adolfo Rodriguez wrote:
>>
>> Hi, I plan to start with a small project, initially with little data (a
>> few thousand records) to learn how ES responds, and then incrementally
>> increase data and resources on demand, up to big data, taking advantage
>> of ES scalability.
>>
>> Is there a document describing such a strategy, i.e.:
>>
>> * how to properly configure a small basic deployment with good
>> performance on low resources? (shards, nodes, clusters...)
>>
>> * then, how to detect when it becomes necessary to add resources
>> (shards, nodes...) as the data load increases?
>>
>>
>> All the docs that I find on scaling ES start from deployments with
>> millions or billions of records.
>>
>> Alternatively, any advice on properly "configuring ES for the small
>> data"? (as a starting point?)
>>
>> Thanks
>>


Re: Match exact substring in not analyzed field

2014-01-07 Thread InquiringMind
This is an interesting problem. Typically, my view of stop words is dim. I 
would prefer that the client side avoid searching on them if that is 
desired, rather than having the engine ignore them. Then, phrase matching can 
work properly. And queries such as The Wall can look for just Wall (ignoring 
The as a stop word), but then the Google-like +The Wall can look for The 
Wall. Yeah, I know that ES is not Google; I only look to Google for ideas 
that are nice and for hints about their implementation based upon their 
external behavior.

Then, your problem could be solved using a phrase query with no slop.

Maybe your testMulti field is analyzed but no stop words are ignored. Or, 
maybe testMulti.raw is analyzed but with no stop words ignored. Either way, 
you'd have the full set of words indexed for a phrase query to quickly find 
the sub-match. At least, much, much more quickly than a grep-style wildcard 
search against a non-analyzed form of the field.

I also used phrases within my own table-based synonym matching. Instead of 
using ES synonyms, I create a separate type with lists of synonyms. A query 
for a synonym is first directed to that type to fetch a list of synonyms; 
then an OR query is generated. This has proven to be fast enough. It has 
the benefit of allowing the synonyms to be updated with no changes to the 
97 million documents that are already indexed. And synonyms can be phrases, 
for example: HUGE -> "VERY BIG". So now a synonym query for HUGE can find The 
Very Big Dog. Likewise, a synonym query for the phrase "VERY BIG" can find The 
Huge Dog. Really cool; just a matter of Java coding on the front end. And 
ES does the heavy lifting underneath. But I digress a little...
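
The synonym approach could be sketched roughly as follows. This is not my actual Java code: the in-memory `SYNONYMS` table stands in for the separate synonym type, and the `title` field and function names are assumptions. It only shows the shape of the generated OR query, one phrase clause per variant.

```python
# Stand-in for the separate synonym type; in practice this list would be
# fetched from Elasticsearch with a first query.
SYNONYMS = {
    "HUGE": ["VERY BIG"],
    "VERY BIG": ["HUGE"],
}


def expand(term: str, table: dict) -> list:
    """Return the term plus its synonyms, keeping the original term first."""
    key = term.upper()
    return [key] + [s for s in table.get(key, []) if s != key]


def synonym_query(field: str, term: str, table: dict) -> dict:
    """Build an OR (bool/should) query with one phrase clause per variant."""
    clauses = [{"match_phrase": {field: variant}} for variant in expand(term, table)]
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}


if __name__ == "__main__":
    import json
    print(json.dumps(synonym_query("title", "huge", SYNONYMS), indent=2))
```

Because every variant is a phrase clause, a single-word term and a multi-word synonym are handled the same way, and updating the table never touches the indexed documents.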

Hope this helps.

Brian



Re: incrementally scaling ES from the small data

2014-01-07 Thread Boaz Leskes
Hi Adolfo,

The best way to scale depends on your data and how it behaves. You can 
watch this great talk by Shay about two use cases to get 
inspired: http://www.elasticsearch.org/videos/big-data-search-and-analytics/

Cheers,
Boaz

On Tuesday, January 7, 2014 8:13:18 PM UTC+1, Adolfo Rodriguez wrote:
>
> Hi, I plan to start with a small project, initially with little data (a 
> few thousand records) to learn how ES responds, and then incrementally 
> increase data and resources on demand, up to big data, taking advantage 
> of ES scalability.
>
> Is there a document describing such a strategy, i.e.:
>
> * how to properly configure a small basic deployment with good 
> performance on low resources? (shards, nodes, clusters...)
>
> * then, how to detect when it becomes necessary to add resources 
> (shards, nodes...) as the data load increases?
>
>
> All the docs that I find on scaling ES start from deployments with 
> millions or billions of records.
>
> Alternatively, any advice on properly "configuring ES for the small data"? 
> (as a starting point?)
>
> Thanks
>



Re: Upgrades causing Elastic Search downtime

2014-01-07 Thread Mark Walkom
You can also use cluster.routing.allocation.disable_allocation to reduce
the need to wait for things to rebalance.
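
A rough sketch of how that setting could be toggled around each node restart. The setting name is the one mentioned above and the body targets the cluster update settings endpoint; the per-node plan and the node names are only an assumed outline, not a tested procedure.

```python
def allocation_toggle(disabled: bool) -> dict:
    """Body for PUT /_cluster/settings; transient settings do not survive a full cluster restart."""
    return {
        "transient": {
            "cluster.routing.allocation.disable_allocation": disabled
        }
    }


def rolling_restart_plan(nodes):
    """Yield (action, payload) steps: pause allocation, restart one node, resume."""
    for node in nodes:
        yield ("PUT /_cluster/settings", allocation_toggle(True))
        yield ("restart node", node)
        yield ("PUT /_cluster/settings", allocation_toggle(False))


if __name__ == "__main__":
    for action, payload in rolling_restart_plan(["node-1", "node-2"]):
        print(action, payload)
```

With allocation paused, the cluster does not start moving shards the moment a node goes away, so the node can rejoin and pick up its local shards again.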

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 8 January 2014 04:41, Ivan Brusic  wrote:

> Although elasticsearch should support clusters of nodes with different
> minor versions, I have seen issues between minor versions. Version 0.90.8
> did contain an upgrade of Lucene (4.6), but that does not look like it would
> cause your issue. You could look at the github issues tagged 0.90.[8-9] and
> see if something applies in your case.
>
> A couple of points about upgrading:
>
> If you want to use the double-the-nodes technique (which should not be
> necessary for minor version upgrades), you could "decommission" a node
> using the Shard API. Here is a good writeup:
> http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/
>
> Since you doubled the number of nodes in the cluster,
> the minimum_master_nodes setting would be temporarily incorrect and
> split-brain clusters could occur. In fact, that might have happened
> in your case, since the cluster state seems incorrect. Merely hypothesizing.
>
> Cheers,
>
> Ivan
>
>
> On Tue, Jan 7, 2014 at 9:26 AM, Jenny Sivapalan <
> jennifer.sivapa...@gmail.com> wrote:
>
>> Hello,
>>
>> We've upgraded Elastic Search twice over the last month and have
>> experienced downtime (roughly 8 minutes) during the roll out. I'm not sure
>> if it is something we are doing wrong or not.
>>
>> We use EC2 instances for our Elastic Search cluster and CloudFormation
>> to manage our stack. When we deploy a new version or change to Elastic
>> Search we upload the new artefact, double the number of EC2 instances and
>> wait for the new instances to join the cluster.
>>
>> For example 6 nodes form a cluster on v 0.90.7. We upload the 0.90.9
>> version via our deployment process and double the number of nodes in the
>> cluster (12). The 6 new nodes will join the cluster with the 0.90.9
>> version.
>>
>> We then want to remove each of the 0.90.7 nodes. We do this by shutting
>> down the node (using the head plugin), waiting for the cluster to rebalance
>> the shards and then terminating the EC2 instance. Then we repeat with the
>> next node. We leave the master node until last so that it does the
>> re-election just once.
>>
>> The issue we have found in the last two upgrades is that while the
>> penultimate node is shutting down the master starts throwing errors and the
>> cluster goes red. To fix this we've stopped the Elastic Search process on
>> master and have had to restart each of the other nodes (though perhaps they
>> would have rebalanced themselves in a longer time period?). We find that
>> we send an increased number of error responses to our clients during this
>> time.
>>
>> We've set our queue size for search to 300 and we start to see the queue
>> getting full:
>>at java.lang.Thread.run(Thread.java:724)
>> 2014-01-07 15:58:55,508 DEBUG action.search.type[Matt Murdock]
>> [92036651] Failed to execute fetch phase
>> org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
>> rejected execution (queue capacity 300) on
>> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction$2@23f1bc3
>> at
>> org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:61)
>> at
>> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>>
>>
>> But also we see the following error which we've been unable to find the
>> diagnosis for:
>>  2014-01-07 15:58:55,530 DEBUG index.shard.service   [Matt Murdock]
>> [index-name][4] Can not build 'doc stats' from engine shard state
>> [RECOVERING]
>> org.elasticsearch.index.shard.IllegalIndexShardStateException:
>> [index-name][4] CurrentState[RECOVERING] operations only allowed when
>> started/relocated
>> at
>> org.elasticsearch.index.shard.service.InternalIndexShard.readAllowed(InternalIndexShard.java:765)
>>
>>  Are we doing anything wrong or has anyone experienced this?
>>
>> Thanks,
>> Jenny
>>

Using a nested object property within custom_filters_score script

2014-01-07 Thread Veda
Hi,
  An example document in the index is

{
  "title": "Red Gucci Dress",
  "merchant": "zara",
  "price": 1972.34,
  "colours": [
    {"name": "red", "emd": 1.98, "percentage": 45},
    {"name": "blue", "emd": 1.99, "percentage": 40}
  ]
}

and I have defined a mapping to let ES know colours is a nested object by
posting the following JSON:

{
  "nested_product": {
    "properties": {
      "colours": {
        "type": "nested"
      }
    }
  }
}

I'm facing an issue when I want to use custom_filters_score and, in the
scoring script, use the matched colour's properties to generate the
score. Below is the query:

{
  "query": {
    "custom_filters_score": {
      "query": {"match_phrase": {"_all": "gucci dress"}},
      "filters": [
        {
          "filter": {
            "nested": {
              "path": "colours",
              "filter": {"term": {"colours.name": "blue"}}
            }
          },
          "script": "foreach(colour : doc['colours'].values){ if (colour.name == 'blue') {return ((0.1 * colour.percentage * colour.emd));}}"
        }
      ],
      "score_mode": "total"
    }
  }
}

which throws an error 

Query Failed [Failed to execute main query]]; nested:
CompileException[[Error: No field found for [colours] in mapping with types
[nested_product]]\n[Near : {... foreach(colour : doc['colours' }]\n 
   
^\n[Line: 1, Column: 1]]; nested: ElasticSearchIllegalArgumentException[No
field found for [colours] in mapping with types [nested_product]]; "

Any help on it would be appreciated

Thanks in Advance,
Veda






--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Using-a-nested-object-property-within-custom-filters-score-script-tp4046901.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: Throw TransportSerializationException when jdk 1.6.0_22 client connected jdk 1.7.0_45 Es server

2014-01-07 Thread liangwb2001
Thanks to dadoonet and Ivan for sharing.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Throw-TransportSerializationException-when-jdk-1-6-0-22-client-connected-jdk-1-7-0-45-Es-server-tp4046685p4046919.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Any fix timeline for split brain issue: 2488

2014-01-07 Thread bitsofinfo . g
Hi, is there any timeline on a fix 
for https://github.com/elasticsearch/elasticsearch/issues/2488 ?

thanks!



Re: Scoring and "Relative-ness" based on Business Rules

2014-01-07 Thread David Mitchell
Thanks for your answer. 

So, instead of relying on queries to pull out the right stuff, you're 
suggesting to model the documents to the queries. 

This suggests that there's a custom boost for every search term, which is 
what I was hoping to avoid, if only because of the impossible task of going 
through all our data and determining what to boost/not boost. This also 
implies that there's another key/value store of queries-to-boost keywords, 
which again could get costly to maintain. 

If I'm understanding you correctly, it would look similar to what I 
previously posted, but only with a larger (possibly dynamic) set of boost 
queries. 

Doing so is primarily a manual task - are there more automatic ways to 
build up relevancy, or even tools/processes that help? 

On Tuesday, January 7, 2014 11:50:40 AM UTC-8, Justin Treher wrote:
>
> I think you will find that for small documents, that aren't actually 
> documents at all, but really a mass of data points, such as a product 
> library, you won't even use the built in scoring at all. The built in 
> scoring works well for books and articles (long works of text). For a 
> product library, you will use an array of custom boosts through the 
> function score query. The key is to get all those data points in your 
> documents so that you can boost on matches.
>
> For example, with "xbox," you could have a keywords field that includes 
> xbox just for consoles. Maybe Xbox is the title of the product while games 
> just have Xbox listed as their console compatibility. Only matches in the 
> titles will score higher. 
>
> For the macbook, you could have an accessories flag where items flagged as 
> an accessory receive a negative boost. 
>
> For Apple food vs. Apple products, you can use sales data or user history. 
>
> The key to having relevancy that works for your organization is 
> providing all the data points elasticsearch needs to base its decisions 
> on. For products, your best solution is a big old set of constant score 
> queries wrapped in some wild function score queries.
>
> On Tuesday, January 7, 2014 12:36:43 PM UTC-5, David Mitchell wrote:
>>
>> What is the best way to make products more relevant outside of the 
>> default scoring?
>>
>> I have an unknown number of business rules that will dictate a document's 
>> "relativity". Meaning, if one document scores higher than the other, it's 
>> possible that the other document will be more relevant to the user. 
>>
>> Given two products with similar titles but different attributes and the 
>> query "ipad", I'd like to promote one over the other:
>>
>> {
>>"title_simple": "iPad Mini Case",
>>"description_simple": "Royce Leather iPad Mini Case:...",
>>"category": "Computers & Accessories",
>>"brand" : "Royce Leather",
>>"id": 794809052574
>> }
>>
>> {
>>   "title_simple": "Apple iPad mini (16GB, Wi-Fi + Sprint 4G, White)",
>>   "description_simple": "iPad mini features a beautiful 7.9\" display...",
>>   "category": "Electronics",
>>   "brand" : "Apple",
>>   "id": 885909689712
>> }
>>
>>
>> A simple query scores the iPad case high:
>>
>> {
>>"query": { "term": { "title_simple": "ipad" }}
>> }
>>
>>
>> But business rules dictate that the actual iPad be on the top. 
>>
>> I can run a filter or score based on the attribute or brand to get what 
>> I'm looking for:
>>
>> {
>>   "query": {
>>     "function_score": {
>>       "query": { "term": { "title_simple": "ipad" } },
>>       "functions": [{
>>         "filter": { "term": { "category_simple": "electronics" } },
>>         "boost_factor": 2
>>       }]
>>     }
>>   }
>> }
>>
>> But building a bunch of these isn't scalable or reasonable. 
>>
>> I have an unknown number of these and that number will continue to grow. 
>> Some other examples:
>>
>> - query "xbox" should promote consoles over games
>> - query "macbook" should promote Apple computers over macbook sleeves
>> - query "Apple" should promote Apple products and not food
>>
>> Building a thousand queries based on function filters is unreasonable 
>> and unscalable. 
>>
>> Some possible solutions I've considered:
>>
>> - Building a lookup table that will build the filter portion of the query 
>> (this could get unmaintainable)
>> - Including a pre-calculated score in the document (unfortunately, this 
>> doesn't work on a per-query basis, as the score may change based on the 
>> user's needs)
>> - Extending the DefaultSimilarity class (I'm not sure how this helps me in 
>> this scenario, though)
>>
>> What have other people done to solve these problems? Is there something 
>> else that I'm missing that could help?
>>
>> Here's a runnable gist - 
>> https://gist.github.com/dlmitchell/826e8fb7ca89bed30e4a/raw/613be2c202b26f5899bdcfeac714737beb49/sample_mapping.sh
>>
>>
>>
>>


Transport Client hangs in my web application during search.

2014-01-07 Thread Search User
I have a web application in which I create a Transport Client using Spring 
(singleton) and inject it into my service. When I receive a request in my 
controller, the controller calls the service and the service uses the 
transport client to execute the query and return the results. When I deploy 
this application in Tomcat, the client is created, but when I execute the 
query, the client hangs. 

If I create the client for every request (in my service) and run the query, 
everything is fine. Can someone help me understand this behavior?

Following is my code to create the Client object.

Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "mysearchcluster")
        .put("client.transport.sniff", true)
        .build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("10.150.200.101", 9300));



Thanks



Re: Scoring and "Relative-ness" based on Business Rules

2014-01-07 Thread Justin Treher
I think you will find that for small documents, that aren't actually 
documents at all, but really a mass of data points, such as a product 
library, you won't even use the built in scoring at all. The built in 
scoring works well for books and articles (long works of text). For a 
product library, you will use an array of custom boosts through the 
function score query. The key is to get all those data points in your 
documents so that you can boost on matches.

For example, with "xbox," you could have a keywords field that includes 
xbox just for consoles. Maybe Xbox is the title of the product while games 
just have Xbox listed as their console compatibility. Only matches in the 
titles will score higher. 

For the macbook, you could have an accessories flag where items flagged as 
an accessory receive a negative boost. 

For Apple food vs. Apple products, you can use sales data or user history. 

The key to having relevancy that works for your organization is 
providing all the data points elasticsearch needs to base its decisions 
on. For products, your best solution is a big old set of constant score 
queries wrapped in some wild function score queries.
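
As a rough illustration of driving the boosts from data points rather than hand-writing each query, the sketch below turns a small rules table into the `functions` array of a function score query. The `is_accessory` field, the boost values, and the helper name are assumptions; the query shape mirrors the one quoted below, and a "negative" boost is expressed here as a `boost_factor` below 1.

```python
import json

# Hypothetical rules: (field, value, boost_factor). A factor below 1 demotes.
RULES = [
    ("category_simple", "electronics", 2),
    ("is_accessory", True, 0.5),
]


def boosted_query(field: str, text: str, rules) -> dict:
    """Wrap a term query in function_score with one filter function per rule."""
    functions = [
        {"filter": {"term": {rule_field: value}}, "boost_factor": factor}
        for rule_field, value, factor in rules
    ]
    return {
        "query": {
            "function_score": {
                "query": {"term": {field: text}},
                "functions": functions,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(boosted_query("title_simple", "ipad", RULES), indent=2))
```

The rules apply to every query, so the table grows with the number of data points in the documents, not with the number of search terms.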

On Tuesday, January 7, 2014 12:36:43 PM UTC-5, David Mitchell wrote:
>
> What is the best way to make products more relevant outside of the default 
> scoring?
>
> I have an unknown number of business rules that will dictate a document's 
> "relativity". Meaning, if one document scores higher than the other, it's 
> possible that the other document will be more relevant to the user. 
>
> Given two products with similar titles but different attributes and the 
> query "ipad", I'd like to promote one over the other:
>
> {
>"title_simple": "iPad Mini Case",
>"description_simple": "Royce Leather iPad Mini Case:...",
>"category": "Computers & Accessories",
>"brand" : "Royce Leather",
>"id": 794809052574
> }
>
> {
>   "title_simple": "Apple iPad mini (16GB, Wi-Fi + Sprint 4G, White)",
>   "description_simple": "iPad mini features a beautiful 7.9\" display...",
>   "category": "Electronics",
>   "brand" : "Apple",
>   "id": 885909689712
> }
>
>
> A simple query scores the iPad case high:
>
> {
>"query": { "term": { "title_simple": "ipad" }}
> }
>
>
> But business rules dictate that the actual iPad be on the top. 
>
> I can run a filter or score based on the attribute or brand to get what 
> I'm looking for:
>
> {
>   "query": {
>     "function_score": {
>       "query": { "term": { "title_simple": "ipad" } },
>       "functions": [{
>         "filter": { "term": { "category_simple": "electronics" } },
>         "boost_factor": 2
>       }]
>     }
>   }
> }
>
> But building a bunch of these isn't scalable or reasonable. 
>
> I have an unknown number of these and that number will continue to grow. 
> Some other examples:
>
> - query "xbox" should promote consoles over games
> - query "macbook" should promote Apple computers over macbook sleeves
> - query "Apple" should promote Apple products and not food
>
> Building a thousand queries based on function filters is unreasonable and 
> unscalable. 
>
> Some possible solutions I've considered:
>
> - Building a lookup table that will build the filter portion of the query 
> (this could get unmaintainable)
> - Including a pre-calculated score in the document (unfortunately, this 
> doesn't work on a per-query basis, as the score may change based on the 
> user's needs)
> - Extending the DefaultSimilarity class (I'm not sure how this helps me in 
> this scenario, though)
>
> What have other people done to solve these problems? Is there something 
> else that I'm missing that could help?
>
> Here's a runnable gist - 
> https://gist.github.com/dlmitchell/826e8fb7ca89bed30e4a/raw/613be2c202b26f5899bdcfeac714737beb49/sample_mapping.sh
>
>
>
>



Re: Too many open files

2014-01-07 Thread Adolfo Rodriguez
I guess my problem with the excessive number of sockets could also be a 
consequence of having 2 JVMs running ES, one embedded in Tomcat and a second 
embedded in another Java app, as said here:

https://groups.google.com/forum/?hl=en-GB#!topicsearchin/elasticsearch/scale%7Csort:date%7Cspell:true/elasticsearch/m9IWpGzoLLE

Does anyone have experience running a single embedded ES (as jar files), for 
example in Tomcat's lib folder, consumed by several Tomcat apps and by 
other standalone apps in different JVMs?

Any opinion on this configuration as a starting point?



incrementally scaling ES from the small data

2014-01-07 Thread Adolfo Rodriguez
Hi, I plan to start with a small project, initially with little data (a few 
thousand records) to learn how ES responds, and then incrementally increase 
data and resources on demand, up to big data, taking advantage of ES 
scalability.

Is there a document describing such a strategy, i.e.:

* how to properly configure a small basic deployment with good performance 
on low resources? (shards, nodes, clusters...)

* then, how to detect when it becomes necessary to add resources 
(shards, nodes...) as the data load increases?


All the docs that I find on scaling ES start from deployments with millions 
or billions of records.

Alternatively, any advice on properly "configuring ES for the small data"? 
(as a starting point?)

Thanks



Beta2 Java Client: java.nio.channels.UnresolvedAddressException

2014-01-07 Thread davrob2
Hi,

I'm having difficulty connecting with the Java client to 1.0.0.Beta2. The 
cluster is up and healthy, and monitoring works fine using elasticsearch Head, 
elasticsearch HQ, etc.

This is the stack trace I am getting:

https://gist.github.com/dav-rob/8304130 

thanks,

David.



Re: Too many open files

2014-01-07 Thread Adolfo Rodriguez
Hi, my model is quite slow with just a few thousand documents.

I realised that, when opening a

node = builder.client(clientOnly).data(!clientOnly).local(local).node();
client = node.client();

from my Java program to ES with such a small model, ES automatically 
creates 10 sockets. Coincidentally, I have 10 shards (?).

* Is this the expected behavior?
* Can I reduce the number of ES shards dynamically to reduce the number of 
sockets, or should I redeploy my ES install?
* By opening other connections I eventually reach 200 simultaneously open 
sockets and I am afraid that, when fetching highlight information, some 
of the results are randomly being lost. Could these missing results be a 
consequence of too many open sockets?

Thanks for your pointers.

Thanks



Re: Upgrades causing Elastic Search downtime

2014-01-07 Thread Ivan Brusic
Although elasticsearch should support clusters of nodes with different minor
versions, I have seen issues between minor versions. Version 0.90.8 did
contain an upgrade of Lucene (4.6), but that does not look like it would
cause your issue. You could look at the github issues tagged 0.90.[8-9] and
see if something applies in your case.

A couple of points about upgrading:

If you want to use the double-the-nodes technique (which should not be
necessary for minor version upgrades), you could "decommission" a node
using the Shard API. Here is a good writeup:
http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/

Since you doubled the number of nodes in the cluster,
the minimum_master_nodes setting would be temporarily incorrect and a
split-brain cluster might occur. In fact, it might have occurred
in your case since the cluster state seems incorrect. Merely hypothesizing.
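
As a rough illustration of the two points above (Python, standard library only): the quorum usually recommended for minimum_master_nodes is floor(n / 2) + 1 of the master-eligible nodes, so the right value changes when 6 nodes become 12. The sketch also builds a cluster settings body that excludes a node by IP, which is my reading of the allocation-filtering approach in the linked writeup. The IP address and helper names are made up for the example, all nodes are assumed to be master-eligible, and nothing is sent to a cluster.

```python
import json


def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible_nodes // 2 + 1


def quorum_settings(master_eligible_nodes: int) -> dict:
    """Transient cluster settings body that sets the master quorum."""
    quorum = minimum_master_nodes(master_eligible_nodes)
    return {"transient": {"discovery.zen.minimum_master_nodes": quorum}}


def exclude_node_settings(ip_address: str) -> dict:
    """Transient cluster settings body that moves shards off one node."""
    return {"transient": {"cluster.routing.allocation.exclude._ip": ip_address}}


if __name__ == "__main__":
    for nodes in (6, 12):
        print(nodes, "nodes -> quorum", minimum_master_nodes(nodes))
    # Bodies intended for PUT /_cluster/settings; 10.0.0.1 is a placeholder.
    print(json.dumps(quorum_settings(12)))
    print(json.dumps(exclude_node_settings("10.0.0.1")))
```

With the quorum left at the 6-node value while 12 nodes are running, two halves of the cluster could each elect a master, which is the split-brain risk mentioned above.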

Cheers,

Ivan


On Tue, Jan 7, 2014 at 9:26 AM, Jenny Sivapalan <
jennifer.sivapa...@gmail.com> wrote:

> Hello,
>
> We've upgraded Elastic Search twice over the last month and have
> experienced downtime (roughly 8 minutes) during the roll out. I'm not sure
> if it is something we are doing wrong or not.
>
> We use EC2 instances for our Elastic Search cluster and CloudFormation to
> manage our stack. When we deploy a new version or a change to Elastic Search
> we upload the new artefact, double the number of EC2 instances and wait for
> the new instances to join the cluster.
>
> For example, 6 nodes form a cluster on v0.90.7. We upload the 0.90.9
> version via our deployment process and double the number of nodes in the
> cluster (12). The 6 new nodes join the cluster with the 0.90.9
> version.
>
> We then want to remove each of the 0.90.7 nodes. We do this by shutting
> down the node (using the head plugin), waiting for the cluster to rebalance
> the shards and then terminating the EC2 instance. We then repeat with the
> next node. We leave the master node until last so that re-election happens
> just once.
>
> The issue we have found in the last two upgrades is that while the
> penultimate node is shutting down, the master starts throwing errors and the
> cluster goes red. To fix this we've stopped the Elastic Search process on the
> master and have had to restart each of the other nodes (though perhaps they
> would have rebalanced themselves given more time?). We find that we send an
> increased number of error responses to our clients during this time.
>
> We've set our queue size for search to 300 and we start to see the queue
> fill up:
>at java.lang.Thread.run(Thread.java:724)
> 2014-01-07 15:58:55,508 DEBUG action.search.type[Matt Murdock]
> [92036651] Failed to execute fetch phase
> org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
> rejected execution (queue capacity 300) on
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction$2@23f1bc3
> at
> org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:61)
> at
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
>
>
> But we also see the following error, which we've been unable to
> diagnose:
> 2014-01-07 15:58:55,530 DEBUG index.shard.service   [Matt Murdock]
> [index-name][4] Can not build 'doc stats' from engine shard state
> [RECOVERING]
> org.elasticsearch.index.shard.IllegalIndexShardStateException:
> [index-name][4] CurrentState[RECOVERING] operations only allowed when
> started/relocated
> at
> org.elasticsearch.index.shard.service.InternalIndexShard.readAllowed(InternalIndexShard.java:765)
>
> Are we doing anything wrong or has anyone experienced this?
>
> Thanks,
> Jenny
>



Scoring and "Relative-ness" based on Business Rules

2014-01-07 Thread David Mitchell
What is the best way to make products more relevant outside of the default 
scoring?

I have an unknown number of business rules that will dictate a document's 
relevance. That is, even if one document scores higher than another, the 
lower-scoring document may be more relevant to the user. 

Given two products with similar titles but different attributes and the 
query "ipad", I'd like to promote one over the other:

{
   "title_simple": "iPad Mini Case",
   "description_simple": "Royce Leather iPad Mini Case:...",
   "category": "Computers & Accessories",
   "brand" : "Royce Leather",
   "id": 794809052574
}

{
  "title_simple": "Apple iPad mini (16GB, Wi-Fi + Sprint 4G, White)",
  "description_simple": "iPad mini features a beautiful 7.9\" display...",
  "category": "Electronics",
  "brand" : "Apple",
  "id": 885909689712
}


A simple query scores the iPad case high:

{
   "query": { "term": { "title_simple": "ipad" }}
}


But business rules dictate that the actual iPad be on the top. 

I can run a filter or score based on the attribute or brand to get what I'm 
looking for:

{
   "query": {
      "function_score": {
         "query": { "term": { "title_simple": "ipad" } },
         "functions": [{
            "filter": { "term": { "category_simple": "electronics" } },
            "boost_factor": 2
         }]
      }
   }
}

But building a bunch of these isn't scalable or reasonable. 

I have an unknown number of these and that number will continue to grow. 
Some other examples:

- query "xbox" should promote consoles over games
- query "macbook" should promote Apple computers over macbook sleeves
- query "Apple" should promote Apple products and not food

Building a thousand queries out of function filters like this is unreasonable 
and unscalable. 

Some possible solutions I've considered:

- building a lookup table that will build the filter portion of the query 
(this could get unmaintainable)
- Including a pre-calculated score in the document (unfortunately, doesn't 
work on a per query basis, as the score may change based on the user's 
needs)
- Extending the DefaultSimilarity class (I'm not sure how this helps me in 
this scenario, though)

What have other people done to solve these problems? Is there something 
else that I'm missing that could help?

Here's a runnable gist - 
https://gist.github.com/dlmitchell/826e8fb7ca89bed30e4a/raw/613be2c202b26f5899bdcfeac714737beb49/sample_mapping.sh
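
For what it's worth, the lookup-table idea from the list above could be sketched roughly as follows (Python, standard library only). The rule table, the "consoles" category value and the brand_simple field are invented for illustration; only title_simple, category_simple and boost_factor come from the query shown earlier. Rules live in data, and a single function turns them into the function_score body, so a new rule does not need a new hand-written query.

```python
import json

# Hypothetical rule table: query keyword -> list of (field, value, boost).
BOOST_RULES = {
    "ipad": [("category_simple", "electronics", 2)],
    "xbox": [("category_simple", "consoles", 2)],
    "apple": [("brand_simple", "apple", 3)],
}


def build_query(keyword: str, rules: dict = BOOST_RULES) -> dict:
    """Wrap the term query in function_score when rules exist for the keyword."""
    term = keyword.lower()
    base = {"term": {"title_simple": term}}
    matching = rules.get(term, [])
    if not matching:
        # No business rule for this keyword: plain relevance scoring.
        return {"query": base}
    functions = [
        {"filter": {"term": {field: value}}, "boost_factor": boost}
        for field, value, boost in matching
    ]
    return {"query": {"function_score": {"query": base, "functions": functions}}}


if __name__ == "__main__":
    print(json.dumps(build_query("ipad"), indent=2))
```

The table could just as well be loaded from a file or an index, which at least moves the maintenance burden out of the query code.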





Re: score based on term frequency only

2014-01-07 Thread Ivan Brusic
Great feature. However, it looks like it is only available in the master
branch: https://github.com/elasticsearch/elasticsearch/issues/3772

-- 
Ivan


On Tue, Jan 7, 2014 at 8:31 AM, Britta Weber  wrote:

> You could also use a script as described here:
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
>
>
> Cheers,
> Britta
>
> On Mon, Jan 6, 2014 at 2:13 AM, Ivan Brusic  wrote:
> > You could provide your own Similarity class as a plugin. Don't have any
> > sample code in front of me, but it would be based on TFIDFSimilarity and
> > you would basically need to ignore the norms and other values.
> >
> >
> http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> >
> > The IDF portion could probably remain since it ranks the different terms
> > in your query, not the score of each term.
> >
> > Cheers,
> >
> > Ivan
> >
> >
> >
> > On Sun, Jan 5, 2014 at 1:57 PM, Kevin S  wrote:
> >>
> >> I would like to score based entirely on term count.
> >>
> >> For example, given the following two documents:
> >>
> >> 1) { "apple" }
> >>
> >> 2) { "apple apple" }
> >>
> >> Searching "apple" ranks the first before the second.  I wish to rank the
> >> second, in which the term occurs twice, with a higher score.
> >>
> >> Can someone please point me in the right direction for this?
> >>
> >> Thank you.
> >>



Upgrades causing Elastic Search downtime

2014-01-07 Thread Jenny Sivapalan
Hello,

We've upgraded Elastic Search twice over the last month and have 
experienced downtime (roughly 8 minutes) during the roll out. I'm not sure 
if it is something we are doing wrong or not.

We use EC2 instances for our Elastic Search cluster and CloudFormation to 
manage our stack. When we deploy a new version or a change to Elastic Search 
we upload the new artefact, double the number of EC2 instances and wait for 
the new instances to join the cluster.

For example, 6 nodes form a cluster on v0.90.7. We upload the 0.90.9 
version via our deployment process and double the number of nodes in the 
cluster (12). The 6 new nodes join the cluster with the 0.90.9 version. 

We then want to remove each of the 0.90.7 nodes. We do this by shutting 
down the node (using the head plugin), waiting for the cluster to rebalance 
the shards and then terminating the EC2 instance. We then repeat with the 
next node. We leave the master node until last so that re-election happens 
just once.

The issue we have found in the last two upgrades is that while the 
penultimate node is shutting down, the master starts throwing errors and the 
cluster goes red. To fix this we've stopped the Elastic Search process on the 
master and have had to restart each of the other nodes (though perhaps they 
would have rebalanced themselves given more time?). We find that we send an 
increased number of error responses to our clients during this time.

We've set our queue size for search to 300 and we start to see the queue 
fill up:
   at java.lang.Thread.run(Thread.java:724)
2014-01-07 15:58:55,508 DEBUG action.search.type[Matt Murdock] 
[92036651] Failed to execute fetch phase
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: 
rejected execution (queue capacity 300) on 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction$2@23f1bc3
at 
org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:61)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
   

But we also see the following error, which we've been unable to 
diagnose:
2014-01-07 15:58:55,530 DEBUG index.shard.service   [Matt Murdock] 
[index-name][4] Can not build 'doc stats' from engine shard state 
[RECOVERING]
org.elasticsearch.index.shard.IllegalIndexShardStateException: 
[index-name][4] CurrentState[RECOVERING] operations only allowed when 
started/relocated
at 
org.elasticsearch.index.shard.service.InternalIndexShard.readAllowed(InternalIndexShard.java:765)

Are we doing anything wrong or has anyone experienced this? 

Thanks,
Jenny



Re: Hipchat & Elasticsearch

2014-01-07 Thread Ivan Brusic
Here are some related links, including a video of a talk:
http://www.meetup.com/Elasticsearch-San-Francisco/events/141698772/

-- 
Ivan


On Tue, Jan 7, 2014 at 1:43 AM, Ümit Seren  wrote:

> Interesting read about elasticsearch in HipChat
>
>
> http://highscalability.com/blog/2014/1/6/how-hipchat-stores-and-indexes-billions-of-messages-using-el.html
>



Re: score based on term frequency only

2014-01-07 Thread Britta Weber
You could also use a script as described here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html
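
A minimal sketch of what such a script-based query might look like, written as a Python helper that only builds the request body. The `_index['FIELD']['TERM'].tf()` expression is my understanding of the syntax on the page linked above (and, per the other reply in this thread, may only be available on master); the field name "text" is a placeholder. The second helper is just a local stand-in that counts occurrences, to show the ranking the script is meant to produce.

```python
import json


def tf_only_query(field: str, term: str) -> dict:
    """function_score body whose score is the raw term frequency (assumed syntax)."""
    script = "_index['{0}']['{1}'].tf()".format(field, term)
    return {
        "query": {
            "function_score": {
                "query": {"term": {field: term}},
                "script_score": {"script": script},
                # Replace the default score instead of multiplying into it.
                "boost_mode": "replace",
            }
        }
    }


def term_count_score(text: str, term: str) -> int:
    """Local illustration only: score = number of occurrences of the term."""
    return text.lower().split().count(term.lower())


if __name__ == "__main__":
    print(json.dumps(tf_only_query("text", "apple"), indent=2))
    docs = ["apple", "apple apple"]
    print(sorted(docs, key=lambda d: term_count_score(d, "apple"), reverse=True))
```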


Cheers,
Britta

On Mon, Jan 6, 2014 at 2:13 AM, Ivan Brusic  wrote:
> You could provide your own Similarity class as a plugin. Don't have any
> sample code in front of me, but it would be based on TFIDFSimilarity and
> you would basically need to ignore the norms and other values.
>
> http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> The IDF portion could probably remain since it ranks the different terms in
> your query, not the score of each term.
>
> Cheers,
>
> Ivan
>
>
>
> On Sun, Jan 5, 2014 at 1:57 PM, Kevin S  wrote:
>>
>> I would like to score based entirely on term count.
>>
>> For example, given the following two documents:
>>
>> 1) { "apple" }
>>
>> 2) { "apple apple" }
>>
>> Searching "apple" ranks the first before the second.  I wish to rank the
>> second, in which the term occurs twice, with a higher score.
>>
>> Can someone please point me in the right direction for this?
>>
>> Thank you.
>>



Re: ElasticsearchHadoop Hive integration issue

2014-01-07 Thread Costin Leau

Hi,

The 'es.resource' you specified is incorrect - you need to specify both an 
index and a type - e.g. myIndex/products


P.S. Are you using M1 or the current master - the latter should give a proper 
error (and message).
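
To make the index/type requirement concrete, here is a small, hypothetical pre-flight check in Python. It is not part of elasticsearch-hadoop; it only illustrates the shape the value needs to have before it goes into TBLPROPERTIES.

```python
def parse_es_resource(resource: str) -> tuple:
    """Split 'index/type' into its two parts; reject values without a type."""
    index, separator, type_path = resource.partition("/")
    if not separator or not index or not type_path:
        raise ValueError(
            "es.resource must look like 'index/type', got {0!r}".format(resource)
        )
    return index, type_path


if __name__ == "__main__":
    print(parse_es_resource("myIndex/products"))
    try:
        parse_es_resource("products")
    except ValueError as error:
        print(error)
```

A value such as 'products' has no slash at all, which would fit the "String index out of range: -1" error if the handler looks for the separator position.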

Thanks,


On 07/01/2014 9:48 AM, Badal Mohapatra wrote:

Hi,

I am trying to index data from a Hive table into elasticsearch and am using 
the latest elasticsearch-hadoop-master plugin.
My elasticsearch version is 0.90.9 and hive version is hive-0.11.0.

As per the documentation of elasticsearch-hadoop plugin (hive integration), I 
successfully created an external table
with the below command

CREATE EXTERNAL TABLE es_products (
    sku int,
    rating float,
    name string,
    type string,
    saleprice float,
    department string,
    manufacturer string,
    userid string,
    category_name string,
    query string)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.resource' = 'products');

Even though the external table is created, I am not able to either insert 
data into it or query it.
When I run select * from es_products; I get the exception below.

hive> select * from es_products;
OK
Failed with exception 
java.io.IOException:java.lang.StringIndexOutOfBoundsException: String index out 
of range: -1
Time taken: 1.699 seconds


Can someone please suggest what / where I am wrong!

Kind Regards,
Badal




--
Costin



Re: Sorting by deeply nested filters

2014-01-07 Thread Vesa Marttila
On Tuesday, January 7, 2014 12:15:42 PM UTC+2, Vesa Marttila wrote:
>
> Hi,
>
> I am trying to sort with a twice-nested filter, but this doesn't seem to 
> work. Is this even supposed to be possible? I can provide the query if 
> necessary, but it is quite complicated and would require some 
> obfuscation.
>
> Sincerely,
> Vesa Marttila
>

Just to add: the filter works as desired when used in queries; the 
problems only occur when using it for sorting.

Vesa



Order results by value in one of the array entries.

2014-01-07 Thread Johan E
Hi,

I'm trying to order the results of a query by a specific entry in an array.

Here is a sample entry


{
    "product_name": "product alfa",
    "product_id": "4a86c92ccd26111d7ba0eada7da6a75af",
    "description": "This is a sample product",
    "image_id": "product_a.jpg",
    "inventory": [
        {
            "warehouse": "warehouse_a",
            "stock": 99
        },
        {
            "warehouse": "warehouse_b",
            "stock": 19
        },
        {
            "warehouse": "warehouse_c",
            "stock": 99
        }
    ]
}

If there were more "products" containing alfa, I would (for example) want 
to sort them by the stock of one warehouse.

I'm currently using a query like:

POST _search
{
    "query": {
        "match": {
            "product_name": {
                "query": "alfa",
                "type": "phrase"
            }
        }
    },
    "filter": {
        "bool": {
            "must": [
                {
                    "term": {
                        "availability.warehouse": "warehouse_a"
                    }
                }
            ]
        }
    }
}

I would like the results sorted by stock (for warehouse_a only) descending.

Any ideas?
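
One possible direction, sketched in Python as a request-body builder: sort on inventory.stock with a nested_filter restricted to the wanted warehouse. This assumes the inventory array is mapped with type "nested" (not shown in the post) and uses the inventory.* field names from the sample document rather than the availability.* name in the filter above. The second function is only a local stand-in to check what ordering to expect; treating a missing warehouse as -1 is an arbitrary choice for the example.

```python
import json


def sort_by_warehouse_stock(warehouse: str) -> list:
    """Sort clause: inventory.stock of one warehouse, descending (nested mapping assumed)."""
    return [{
        "inventory.stock": {
            "order": "desc",
            "nested_filter": {"term": {"inventory.warehouse": warehouse}},
        }
    }]


def rank_locally(docs: list, warehouse: str) -> list:
    """Local illustration: product names ordered by that warehouse's stock."""
    def stock(doc: dict) -> int:
        values = [e["stock"] for e in doc["inventory"] if e["warehouse"] == warehouse]
        return max(values, default=-1)
    return [d["product_name"] for d in sorted(docs, key=stock, reverse=True)]


if __name__ == "__main__":
    body = {
        "query": {"match": {"product_name": {"query": "alfa", "type": "phrase"}}},
        "sort": sort_by_warehouse_stock("warehouse_a"),
    }
    print(json.dumps(body, indent=2))
```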



Re: Query: Parents with at least x children of type y

2014-01-07 Thread Alexander Stautner
Changing the parent doc should be avoided, because new child_types may be 
added or old child_types removed, and we want the child_types to be 
independent of the parent_type.

So it seems that there is currently no way to do such queries 
with elasticsearch?

Thanks for helping.

On Tuesday, 7 January 2014 10:39:41 UTC+1, David Pilato wrote:
>
> I would probably add a num_of_children field in parent doc and update it 
> when a new child is added or removed.
>
> But I guess it depends on your actual use case!
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 7 Jan 2014 at 08:15, Alexander Stautner wrote:
>
> Sorry for bumping, but I need to know whether the question above can be 
> answered with elasticsearch.
>
> On Thursday, 2 January 2014 15:22:32 UTC+1, Alexander Stautner wrote:
>>
>> Hello,
>> after some research without any results I have a question about 
>> parent/child relations.
>>
>> The case:
>>
>> I have a parent of type "parent_type" which has children of different 
>> types e.g. "child_type_1", "child_type_2", "child_type_3".
>>
>> My Question is:
>>
>> Is there any possibility to get only the parents which have at least x 
>> children of type "child_type_2" with a specific value in an attribute?
>>
>> e.g.
>>
>> parent_type: family
>> child_type_1: girl attribute:name
>> child_type_2: boy attribute:name
>> child_type_3: cat attribute:name
>>
>> I want to find all families which have at least three girls with the name 
>> "Jane".
>>
>>
>> Thank you for your help,
>> Alex 
>>



Sorting by deeply nested filters

2014-01-07 Thread Vesa Marttila
Hi,

I am trying to sort with a twice-nested filter, but this doesn't seem to 
work. Is this even supposed to be possible? I can provide the query if 
necessary, but it is quite complicated and would require some 
obfuscation.

Sincerely,
Vesa Marttila



Hipchat & Elasticsearch

2014-01-07 Thread Ümit Seren
Interesting read about elasticsearch in HipChat

http://highscalability.com/blog/2014/1/6/how-hipchat-stores-and-indexes-billions-of-messages-using-el.html



Re: Query: Parents with at least x children of type y

2014-01-07 Thread David Pilato
I would probably add a num_of_children field in parent doc and update it when a 
new child is added or removed.

But I guess it depends on your actual use case!
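
A rough sketch of that idea in Python, building only the request bodies: a scripted partial update that adjusts the counter on the parent, and a range filter that finds parents with at least x children. The num_of_children field name is the one suggested above; the helper names are made up, and in a real setup one counter per child type (or per attribute value) might be needed.

```python
import json


def child_count_update(delta: int) -> dict:
    """Body for the _update endpoint of the parent doc: add delta to the counter."""
    return {
        "script": "ctx._source.num_of_children += delta",
        "params": {"delta": delta},
    }


def parents_with_at_least(min_children: int) -> dict:
    """Search body: parents whose counter is >= min_children."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"range": {"num_of_children": {"gte": min_children}}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(child_count_update(1)))   # after indexing a child
    print(json.dumps(child_count_update(-1)))  # after deleting a child
    print(json.dumps(parents_with_at_least(3), indent=2))
```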

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 7 Jan 2014 at 08:15, Alexander Stautner wrote:

> Sorry for bumping, but I need to know whether the question above can be 
> answered with elasticsearch.
> 
> On Thursday, 2 January 2014 15:22:32 UTC+1, Alexander Stautner wrote:
>> 
>> Hello,
>> after some research without any results I have a question about parent/child 
>> relations.
>> 
>> The case:
>> 
>> I have a parent of type "parent_type" which has children of different types 
>> e.g. "child_type_1", "child_type_2", "child_type_3".
>> 
>> My Question is:
>> 
>> Is there any possibility to get only the parents which have at least x 
>> children of type "child_type_2" with a specific value in an attribute?
>> 
>> e.g.
>> 
>> parent_type: family
>> child_type_1: girl attribute:name
>> child_type_2: boy attribute:name
>> child_type_3: cat attribute:name
>> 
>> I want to find all families which have at least three girls with the name 
>> "Jane".
>> 
>> 
>> Thank you for your help,
>> Alex 
> 



Re: Exception cause unwrapping ran for 10 levels

2014-01-07 Thread Jason Wee
Jörg,

Done, https://github.com/elasticsearch/elasticsearch/issues/4639

Today I investigated this issue and queried for the timestamps at which the
exceptions occurred; data was indexed during that period. I ran the query
because we were worried that no data was being indexed while the exceptions
were happening, which would mean data loss.
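
To make that check concrete, here is a rough sketch of a search body that counts documents whose timestamp falls inside the window in which the exceptions were logged. The field name and the timestamps used when calling it are placeholders you would supply yourself; nothing here is taken from the cluster discussed in this thread.

```python
def window_count_body(timestamp_field, start, end):
    """Search body matching documents with start <= timestamp < end.

    "size": 0 asks for no hits to be returned, so only hits.total in the
    response matters: a non-zero total means documents carrying a
    timestamp inside the window are present in the index.
    """
    return {
        "size": 0,
        "query": {"range": {timestamp_field: {"gte": start, "lt": end}}},
    }
```

Posting the result of, say, `window_count_body("@timestamp", "2014-01-07T08:00:00", "2014-01-07T09:00:00")` to `_search` on the affected index and reading `hits.total` gives the count for that hour.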

Thank you.

Jason


On Tue, Jan 7, 2014 at 4:34 PM, joergpra...@gmail.com  wrote:

> Yes, it looks like two nodes do not agree about an update action and a
> version conflict is pinging between them, node1 and node4.
>
> Not sure if this happens while index recovery or while an update is
> executed, but it is definitely worth raising an issue at the Elasticsearch
> github to let the Elasticsearch core team have a look. It might be some
> kind of a deadlock.
>
> Jörg
>


Re: Cannot search the field of the attachment type?

2014-01-07 Thread HongXuan Ji
Oh!
That's so embarrassing. Anyway, thanks for the tip.

Best,

Ivan

On Tuesday, 7 January 2014 at 5:00:25 PM UTC+8, David Pilato wrote:
>
> I used http://www.base64decode.org/ to look at your indexed document.
>
> Result is: "This page moved to [...]"
> I guess you now have the explanation! :-)
>
> -- 
> David Pilato | Technical Advocate | Elasticsearch.com
> @dadoonet | @elasticsearchfr
>
>
> On 7 January 2014 at 09:16:03, HongXuan Ji (hxu...@gmail.com) wrote:
>
>  Hi, David,
>
> This is the result of _search?q=*&pretty:
>
>  {
> "took": 0,
> "timed_out": false,
> "_shards": {
> "total": 1,
> "successful": 1,
> "failed": 0
> },
> "hits": {
> "total": 1,
> "max_score": 1,
> "hits": [
> {
> "_index": "test",
> "_type": "attachment",
> "_id": "9wsReuxrSzCC0qrRHsYYPg",
> "_score": 1,
> "_source": {
> "file": 
> "PGh0bWw+VGhpcyBwYWdlIG1vdmVkIHRvIDxhIGhyZWY9Ig0KL2NvbnRlbnQvZGFtL0ludGVyc2lsL2RvY3VtZW50cy9mbjY3L2ZuNjc0Mi5wZGYNCg==Ij5oZXJlPC9hPjwvaHRtbD4NCg=="
> }
> }
> ]
> }
> }
>  
> BTW, I did restart the ES after the plugin installed. 
>
> Ideas?
>
> Thanks, a lot.
>
>
> On Tuesday, 7 January 2014 at 3:08:49 PM UTC+8, David Pilato wrote:
>>
>>  Hard to say. Not enough details about what you did.
>> Could you gist a curl recreation and also add in it the result of a 
>> _search?q=*&pretty
>>
>> Did you restart elasticsearch after plugin installation?
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>>  
>> On 7 Jan 2014, at 07:52, HongXuan Ji wrote:
>>
>>  Dear all,
>>
>> I am not able to query the field of attachment type. I followed the 
>> instructions in 
>> http://es-cn.medcl.net/tutorials/2011/07/18/attachment-type-in-action.html.
>>
>> And the result of the search query:
>>
>>   curl "http://localhost:9200/_search?pretty=true" -d '{
>>  "fields" : ["title"],
>>  "query" : {
>>  "query_string" : {
>>  "query" : "amplifier"
>>  }
>>  },
>>  "highlight" : {
>>  "fields" : {
>>  "file" : {}
>>  }
>>  }
>>  }'
>>  
>> result:
>>
>> {
>>   "took" : 1,
>>   "timed_out" : false,
>>   "_shards" : {
>> "total" : 1,
>> "successful" : 1,
>> "failed" : 0
>>   },
>>   "hits" : {
>> "total" : 0,
>> "max_score" : null,
>> "hits" : [ ]
>>   }
>> }
>>
>>
>>  My ElasticSearch is the latest version, elasticsearch-0.90.9 and the plugin 
>> of mapper-attachment is 1.9.0 
>> (https://github.com/elasticsearch/elasticsearch-mapper-attachments).
>>
>> In fact, the environment finished the setup yesterday. I have no clue why it 
>> cannot find anything.
>>
>>
>>  Any ideas?
>>
>>
>>  Regards,
>>
>>
>>  Ivan
>>


Re: which web ui is mostly used for ElasticSearch?

2014-01-07 Thread David Pilato
It's somehow hard to build a UI on top of … something we don't know… :-)

Are you indexing logs, office documents, customs goods declarations, chess games, 
hotel room reservations? 

As you can see, it's hard to know how to represent YOUR data.
It's somehow the same question you could ask when using a SQL database.

There are some generic tools. Of course, my favorite is Kibana! :-)

If you want to build your own UI for local disk documents for example, you 
could fork http://www.scrutmydocs.org/ and see what we have done here although 
it has not been updated for a while.


My 2 cents

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 7 January 2014 at 02:52:58, 沈国权 (myronleos...@gmail.com) wrote:

I am new to ElasticSearch. As far as I know, ElasticSearch has no web UI. So 
if I want to use a web UI to search ElasticSearch, which one is a good fit? 
I know of one web UI called ElasticSearch-Head. If any of you know of other 
web UIs that can be used with ElasticSearch, please feel free to share your 
ideas and recommendations.


Re: Cannot search the field of the attachment type?

2014-01-07 Thread David Pilato
I used http://www.base64decode.org/ to look at your indexed document.
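
The same check can be done locally. The sketch below decodes the `file` value copied from the `_search?q=*&pretty` hit posted in this thread. The value contains `==` in the middle, so it looks like two padded base64 chunks glued together (that reading is an assumption); since some decoders stop at the first padding, the helper splits on padding and decodes each chunk separately.

```python
import base64
import re

# The "file" value from the search hit posted in this thread.
FILE_FIELD = (
    "PGh0bWw+VGhpcyBwYWdlIG1vdmVkIHRvIDxhIGhyZWY9Ig0KL2NvbnRlbnQvZGFtL0ludGVyc2ls"
    "L2RvY3VtZW50cy9mbjY3L2ZuNjc0Mi5wZGYNCg==Ij5oZXJlPC9hPjwvaHRtbD4NCg=="
)


def decode_concatenated_base64(blob):
    """Decode a string made of one or more padded base64 chunks glued together."""
    # Each chunk is a run of base64 characters followed by up to two '=' signs.
    chunks = re.findall(r"[A-Za-z0-9+/]+={0,2}", blob)
    return b"".join(base64.b64decode(chunk) for chunk in chunks)


decoded = decode_concatenated_base64(FILE_FIELD).decode("utf-8")
print(decoded)
```

The printed text is a short HTML "This page moved to ..." redirect stub rather than the bytes of a PDF, so there is no "amplifier" text in it for the attachment mapper to extract.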

Result is: "This page moved to [...]"
I guess you now have the explanation! :-)

On 7 Jan 2014, at 07:52, HongXuan Ji wrote:

Dear all, 

I am not able to query the field of attachment type. I followed the instructions 
in http://es-cn.medcl.net/tutorials/2011/07/18/attachment-type-in-action.html.

And the result of the search query:

curl "http://localhost:9200/_search?pretty=true" -d '{
  "fields" : ["title"],
  "query" : {
    "query_string" : {
      "query" : "amplifier"
    }
  },
  "highlight" : {
    "fields" : {
      "file" : {}
    }
  }
}'

result:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}



I am running the latest ElasticSearch release, elasticsearch-0.90.9, with 
version 1.9.0 of the mapper-attachments plugin 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments).
In fact, I only finished setting up the environment yesterday. I have no clue 
why the search cannot find anything.


Any ideas?


Regards,


Ivan


Re: Exception cause unwrapping ran for 10 levels

2014-01-07 Thread joergpra...@gmail.com
Yes, it looks like two nodes do not agree about an update action and a
version conflict is pinging between them, node1 and node4.

Not sure if this happens while index recovery or while an update is
executed, but it is definitely worth raising an issue at the Elasticsearch
github to let the Elasticsearch core team have a look. It might be some
kind of a deadlock.

Jörg



Re: 'More Like This'-functionality suited for comparing entities?

2014-01-07 Thread Maarten Roosendaal
Hi Jörg,

How do I provide the user's wishlist when using the mlt query?
I've tried:

GET http://localhost:9200/wishlists/list/[id of the user's wishlist]/_search/
{
  "more_like_this" : {
    "fields" : ["product_id"],
    "min_term_freq" : 1,
    "max_query_terms" : 12
  }
}

but it does not work.
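
One thing that may be worth trying (this is an assumption, not something confirmed in this thread): 0.90.x also has a document-based More Like This endpoint, `GET /{index}/{type}/{id}/_mlt`, which uses an already indexed document as the "like" input instead of a text. The sketch below only builds such a URL. The wishlist id `42` in the usage note is hypothetical, and it presumes `product_id` is a field from which MLT can extract terms.

```python
from urllib.parse import quote, urlencode


def mlt_url(base, index, doc_type, doc_id, fields, **options):
    """Build a URL for the document-based More Like This endpoint.

    `fields` becomes the comma-separated mlt_fields parameter; any extra
    keyword arguments (e.g. min_term_freq) are passed through as query
    parameters, sorted by name so the result is deterministic.
    """
    params = {"mlt_fields": ",".join(fields)}
    params.update(options)
    path = "/".join(quote(str(part), safe="") for part in (index, doc_type, doc_id))
    query = urlencode(sorted(params.items()))
    return "%s/%s/_mlt?%s" % (base.rstrip("/"), path, query)
```

For example, `mlt_url("http://localhost:9200", "wishlists", "list", "42", ["product_id"], min_term_freq=1, min_doc_freq=1, max_query_terms=12)` gives a URL that can be fetched with a plain GET. Lowering `min_doc_freq` can matter on small test indices where few documents share a term.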

On Monday, 6 January 2014 13:42:13 UTC+1, Jörg Prante wrote:
>
> Have you tried the More Like This query?
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html
>
> You don't have to provide "text"; instead, just provide the user's 
> wishlist that you want to find similar wishlists for.
>
> Jörg
>
>



Re: Cannot search the field of the attachment type?

2014-01-07 Thread HongXuan Ji
Hi, David,

This is the result of _search?q=*&pretty:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "attachment",
        "_id": "9wsReuxrSzCC0qrRHsYYPg",
        "_score": 1,
        "_source": {
          "file": "PGh0bWw+VGhpcyBwYWdlIG1vdmVkIHRvIDxhIGhyZWY9Ig0KL2NvbnRlbnQvZGFtL0ludGVyc2lsL2RvY3VtZW50cy9mbjY3L2ZuNjc0Mi5wZGYNCg==Ij5oZXJlPC9hPjwvaHRtbD4NCg=="
        }
      }
    ]
  }
}

BTW, I did restart ES after installing the plugin.

Ideas?

Thanks a lot.


On Tuesday, 7 January 2014 at 3:08:49 PM UTC+8, David Pilato wrote:
>
> Hard to say. Not enough details about what you did.
> Could you gist a curl recreation and also add in it the result of a 
> _search?q=*&pretty
>
> Did you restart elasticsearch after plugin installation?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 7 Jan 2014, at 07:52, HongXuan Ji wrote:
>
> Dear all, 
>
> I am not able to query the field of attachment type. I followed the 
> instructions in 
> http://es-cn.medcl.net/tutorials/2011/07/18/attachment-type-in-action.html.
>
> And the result of the search query:
>
> curl "http://localhost:9200/_search?pretty=true" -d '{
>   "fields" : ["title"],
>   "query" : {
> "query_string" : {
>   "query" : "amplifier"
> }
>   },
>   "highlight" : {
> "fields" : {
>   "file" : {}
> }
>   }
> }'
>
> result:
>
> {
>   "took" : 1,
>   "timed_out" : false,
>   "_shards" : {
> "total" : 1,
> "successful" : 1,
> "failed" : 0
>   },
>   "hits" : {
> "total" : 0,
> "max_score" : null,
> "hits" : [ ]
>   }
> }
>
>
> My ElasticSearch is the latest version, elasticsearch-0.90.9 and the plugin 
> of mapper-attachment is 1.9.0 
> (https://github.com/elasticsearch/elasticsearch-mapper-attachments).
>
> In fact, the environment finished the setup yesterday. I have no clue why it 
> cannot find anything.
>
>
> Any ideas?
>
>
> Regards,
>
>
> Ivan
>