Re: If I set index.number_of_replicas: 1, then the minimum number of nodes should be 3 to ensure that the status of the cluster is green?

2014-06-06 Thread joergpra...@gmail.com
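1. No. With one replica, two data nodes are enough for a green cluster,
because a replica only needs a node other than the one holding its primary.
Did you change the configuration? Do you have two data nodes connected?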

2. You do not need to be concerned about where primary shards are allocated.
Replica shards play the same role (except that primaries receive writes
a few milliseconds earlier than replicas). Elasticsearch spreads shards
across all the available nodes, and never places a replica on the same node
as its primary. If nodes go away, replica shards on the remaining nodes are
promoted to primaries automatically.

Jörg
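
The arithmetic behind answer 1 can be sketched as follows. This is only an
illustration, not Elasticsearch code: ShardMath and its method names are made
up here, and it assumes default allocation settings, that every node is a data
node, and that no disk or awareness constraints block allocation.

```java
// Hypothetical helper for illustration; it only does the arithmetic.
final class ShardMath {
    /** Total shard copies in an index: primaries * (1 + replicas). */
    static int totalShardCopies(int primaries, int replicas) {
        return primaries * (1 + replicas);
    }

    /** Each copy of a shard needs its own data node, so green needs replicas + 1. */
    static int minDataNodesForGreen(int replicas) {
        return replicas + 1;
    }

    /** Replica copies left unassigned (yellow) for a given number of data nodes (>= 1). */
    static int unassignedReplicas(int primaries, int replicas, int dataNodes) {
        // Per shard, at most (dataNodes - 1) replicas can sit next to the primary.
        int placeablePerShard = Math.min(replicas, Math.max(dataNodes - 1, 0));
        return primaries * (replicas - placeablePerShard);
    }

    public static void main(String[] args) {
        // Settings from the question: 10 shards, 1 replica.
        System.out.println("shard copies: " + totalShardCopies(10, 1));
        System.out.println("data nodes needed for green: " + minDataNodesForGreen(1));
        System.out.println("unassigned replicas on 1 data node: " + unassignedReplicas(10, 1, 1));
        System.out.println("unassigned replicas on 2 data nodes: " + unassignedReplicas(10, 1, 2));
    }
}
```

Under these assumptions a two-node cluster should be able to go green, so a
yellow state points at something else, for example a node that is not a data
node or has not joined the cluster.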


On Fri, Jun 6, 2014 at 5:41 AM, flyer flyer...@gmail.com wrote:

 I have a cluster of two nodes, and set the shard and replica counts as
 follows:

 index.number_of_shards: 10

 index.number_of_replicas: 1


 The master node is elected automatically.

 Before I index data, the state of the cluster is green. After I index
 data, the state of the cluster becomes yellow. Some replica shards are not
 allocated.

 But if I add another node and now the cluster consists of three nodes,
 then I index data and the state of the cluster becomes green.

 The version of ES that I use is 1.1.

 Question 1:
 If the replica number is 1, do I need at least 3 nodes to ensure that the
 state of the cluster is green?

 Question 2:
 I have observed that one of the three nodes (not the master node) holds
 only primary shards. What mechanism does ES use to allocate primary/replica
 shards?

 Thanks anyway.

 --
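 Unmoved by honor or disgrace, idly watching the flowers bloom and fall in
 the courtyard; indifferent to staying or leaving, drifting along with the
 clouds as they gather and scatter across the sky.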

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKLbBG-o6JmkbPAAvGBP8t8Bdp-ONEWEFK5kGW%3Dcyn-h-O6dsg%40mail.gmail.com
 For more options, visit https://groups.google.com/d/optout.




Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread joergpra...@gmail.com
I drink Kölsch only :) "ävver et hätt noh immer joot jejange" (but it has
always turned out fine so far)

Greetings from Cologne!

Jörg


On Fri, Jun 6, 2014 at 7:14 AM, Mario Mueller ma...@xenji.com wrote:

 You guys are totally awesome! Thanks a lot! If you ever visit Duesseldorf
 drop me a line, I owe you a beer.

 @Brian:
 Interesting approach, but wouldn't this go against the initial "no
 additional proxy" requirement, if I understood you correctly?



 On Thursday, June 5, 2014 19:45:33 UTC+2, Mario Mueller wrote:

 Hey folks,

 I kindly ask for a hint to achieve the following thing:

 The goal is to deliver only a JSON array of source objects to the client.
 The PHP app that sits on the other side uses JMS\Serializer to deserialize
 the response into entities. At the moment the app incurs the overhead of
 deserializing the response, extracting the source and serializing it again.
 Then the serialized result is passed to the entity deserializer. That's
 really painful.

 I've found a thread that suggests a proxy in between to handle this type
 of problem, but this is not possible in our env.

 The real question is: Is this achievable by writing a plugin? And if so,
 what type of plugin? I've looked at the RestFilter, but I don't know if
 this is the right way to go...

 Any hints are welcome!

 Regards,
 Mario



Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread ohw
Hi folks

I just asked a question on Stack Overflow; please have a look if you have
encountered a similar problem or have some input on it.

Thanks in advance!



Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Boaz Leskes
Hi Pawel,

I see - your cluster state (nodes + routing only, not metadata) seems to
be larger than 16KB when rendered to SMILE, which is quite big - does this
make sense?

Above 16KB, an underlying paging system introduced in the ES 1.x branch
kicks in. That breaks something in Marvel, which normally ships very small
documents.

I'll work on a fix. Can you confirm your cluster state (again, without the 
metadata) is indeed very large?

Cheers,
Boaz
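
The failure mode described above can be modelled without any Elasticsearch
classes. The sketch below is an assumption-laden illustration, not Marvel's or
Elasticsearch's code: PagedBytes and safeBytes are hypothetical names, and the
16KB page size is simply taken from the message. It shows why calling array()
on a buffer that spans several pages throws "array not available", and how a
caller can check hasArray() first and fall back to copying.

```java
/** Hypothetical paged buffer; a simplified model, not the Elasticsearch class. */
final class PagedBytes {
    static final int PAGE_SIZE = 16 * 1024; // page size assumed from the message
    private final byte[][] pages;
    private final int length;

    PagedBytes(byte[] data) {
        length = data.length;
        int pageCount = Math.max(1, (length + PAGE_SIZE - 1) / PAGE_SIZE);
        pages = new byte[pageCount][];
        for (int p = 0; p < pageCount; p++) {
            int start = p * PAGE_SIZE;
            int size = Math.min(PAGE_SIZE, length - start);
            pages[p] = new byte[size];
            System.arraycopy(data, start, pages[p], 0, size);
        }
    }

    /** True only while the content fits into a single page. */
    boolean hasArray() {
        return pages.length == 1;
    }

    /** Direct access to the backing array; fails once paging has kicked in. */
    byte[] array() {
        if (!hasArray()) {
            throw new IllegalStateException("array not available");
        }
        return pages[0];
    }

    /** Always works: copies all pages into one contiguous array. */
    byte[] toBytes() {
        byte[] out = new byte[length];
        int offset = 0;
        for (byte[] page : pages) {
            System.arraycopy(page, 0, out, offset, page.length);
            offset += page.length;
        }
        return out;
    }

    /** Caller-side pattern: use the backing array if present, else copy. */
    static byte[] safeBytes(PagedBytes ref) {
        return ref.hasArray() ? ref.array() : ref.toBytes();
    }

    public static void main(String[] args) {
        PagedBytes small = new PagedBytes(new byte[100]);
        PagedBytes big = new PagedBytes(new byte[20000]);
        System.out.println("small hasArray: " + small.hasArray());
        System.out.println("big hasArray: " + big.hasArray());
        System.out.println("big safeBytes length: " + safeBytes(big).length);
    }
}
```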

On Thursday, June 5, 2014 10:56:00 AM UTC+2, Paweł Krzaczkowski wrote:

 Hi.

 After upgrading Marvel to 1.2.0 (running on Elasticsearch 1.2.1) I'm
 getting errors like:

 [2014-06-05 10:47:25,346][INFO ][node ] [es-m-3] version[1.2.1], pid[68924], build[6c95b75/2014-06-03T15:02:52Z]
 [2014-06-05 10:47:25,347][INFO ][node ] [es-m-3] initializing ...
 [2014-06-05 10:47:25,367][INFO ][plugins  ] [es-m-3] loaded [marvel, analysis-icu], sites [marvel, head, segmentspy, browser, paramedic]
 [2014-06-05 10:47:28,455][INFO ][node ] [es-m-3] initialized
 [2014-06-05 10:47:28,456][INFO ][node ] [es-m-3] starting ...
 [2014-06-05 10:47:28,597][INFO ][transport] [es-m-3] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.0.212:9300]}
 [2014-06-05 10:47:42,340][INFO ][cluster.service  ] [es-m-3] new_master [es-m-3][0H3grrJxTJunU1U6FmkIEg][es-m-3][inet[192.168.0.212/192.168.0.212:9300]]{data=false, master=true}, reason: zen-disco-join (elected_as_master)
 [2014-06-05 10:47:42,350][INFO ][discovery] [es-m-3] freshmind/0H3grrJxTJunU1U6FmkIEg
 [2014-06-05 10:47:42,365][INFO ][http ] [es-m-3] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.0.212:9200]}
 [2014-06-05 10:47:42,368][INFO ][node ] [es-m-3] started
 [2014-06-05 10:47:44,098][INFO ][cluster.service  ] [es-m-3] added {[es-m-1][MHl5Ls-cRXCwc7OC-P0J5w][es-m-1][inet[/192.168.0.210:9300]]{data=false, machine=44454c4c-5300-1052-8038-b9c04f5a5a31, master=true},}, reason: zen-disco-receive(join from node[[es-m-1][MHl5Ls-cRXCwc7OC-P0J5w][es-m-1][inet[/192.168.0.210:9300]]{data=false, machine=44454c4c-5300-1052-8038-b9c04f5a5a31, master=true}])
 [2014-06-05 10:47:44,401][INFO ][gateway  ] [es-m-3] recovered [28] indices into cluster_state
 [2014-06-05 10:47:48,683][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
 java.lang.IllegalStateException: array not available
     at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.exportXContent(ESExporter.java:252)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.exportEvents(ESExporter.java:161)
     at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.exportEvents(AgentService.java:305)
     at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:240)
     at java.lang.Thread.run(Thread.java:745)
 [2014-06-05 10:47:58,738][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
 java.lang.IllegalStateException: array not available
     at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.exportXContent(ESExporter.java:252)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.exportEvents(ESExporter.java:161)
     at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.exportEvents(AgentService.java:305)
     at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:240)
     at java.lang.Thread.run(Thread.java:745)
 [2014-06-05 10:48:36,572][INFO ][cluster.service  ] [es-m-3] added {[es-m-2][e5uEqGRhS7uEioNxaYkwTg][es-m-2][inet[/192.168.0.211:9300]]{data=false, master=true},}, reason: zen-disco-receive(join from node[[es-m-2][e5uEqGRhS7uEioNxaYkwTg][es-m-2][inet[/192.168.0.211:9300]]{data=false, master=true}])
 [2014-06-05 10:48:38,859][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
 java.lang.IllegalStateException: array not available
     at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
     at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
 at 
 

Re: Correct way to use TransportClient connection object

2014-06-06 Thread joergpra...@gmail.com
Closing the transport client may not be enough.

Try this:

- wait for all outstanding actions to complete (all actions send their
responses asynchronously)
- then shut down client.threadPool() (with shutdown() or shutdownNow());
this effectively prevents new actions from being started
- then close the transport client with client.close()
- then set the client instance to null to allow the GC to clean up

Also, you should manage the static transport client object from a
ServletContextListener, so the web app container is able to manage start and
stop in contextInitialized() and contextDestroyed().

Besides this, memory leak warnings from Tomcat are just warnings. They may be
right or wrong.

Jörg
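
The lifecycle advice above can be sketched without the servlet or
Elasticsearch jars. Everything here is a stand-in for illustration:
SearchClient plays the role of the transport client, and ClientHolder with its
start()/get()/stop() methods is a hypothetical holder that a
ServletContextListener would call from contextInitialized() and
contextDestroyed(). The point is that the shared client is created once,
borrowed by requests without being closed, and closed and nulled once at
shutdown. In the code from the question, closing the client after a search
leaves the static field non-null, so the next getInstance() call hands back
the already closed client.

```java
/** Stand-in for the real client; only close() matters for this sketch. */
interface SearchClient extends AutoCloseable {
    @Override
    void close();
}

/** Holds one shared client for the lifetime of the web application. */
final class ClientHolder {
    private static SearchClient instance;

    private ClientHolder() {
    }

    /** Call once at application start, e.g. from contextInitialized(). */
    static synchronized void start(SearchClient client) {
        if (instance != null) {
            throw new IllegalStateException("client already started");
        }
        instance = client;
    }

    /** Request handlers borrow the shared client and must not close it. */
    static synchronized SearchClient get() {
        if (instance == null) {
            throw new IllegalStateException("client not started");
        }
        return instance;
    }

    /** Call once at shutdown, e.g. from contextDestroyed(), after requests have drained. */
    static synchronized void stop() {
        if (instance != null) {
            instance.close();
            instance = null; // drop the reference so the GC can clean up
        }
    }

    public static void main(String[] args) {
        int[] closes = {0};
        ClientHolder.start(() -> closes[0]++); // fake client that counts close() calls
        SearchClient a = ClientHolder.get();
        SearchClient b = ClientHolder.get();
        System.out.println("same instance: " + (a == b));
        ClientHolder.stop();
        System.out.println("close() calls: " + closes[0]);
    }
}
```

With a real client, stop() would also be the place to shut down the client's
thread pool before calling close(), in the order listed above.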


On Fri, Jun 6, 2014 at 7:43 AM, Subhadip Bagui i.ba...@gmail.com wrote:

 Hi,

 I'm using the code below to get a singleton TransportClient object. I call
 getInstance() to get the client object, which is already alive in the web
 application.

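 import org.elasticsearch.client.Client;
 import org.elasticsearch.client.transport.TransportClient;
 import org.elasticsearch.common.settings.ImmutableSettings;
 import org.elasticsearch.common.transport.InetSocketTransportAddress;

 // Lazily creates one shared client; synchronized so that concurrent
 // requests cannot create two instances.
 public static synchronized Client getInstance()
 {
     if (instance == null)
     {
         logger.debug("the client instance is null, creating a new instance");
         ImmutableSettings.Builder settings = ImmutableSettings.settingsBuilder();
         settings.put("node.client", true);
         settings.put("node.data", false);
         settings.put("node.name", "node-client");
         settings.put("cluster.name", "elasticsearch");
         // The builder is passed to the constructor directly.
         instance = new TransportClient(settings)
                 .addTransportAddress(new InetSocketTransportAddress("10.203.238.139", 9300));
         logger.debug("returning the newly created client instance...");
     }
     return instance;
 }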

 I get the client from the search API as follows:
 Client client = ESClientFactory.getInstance();

 The issue is that if I don't close the client with client.close(), I get a
 memory leak warning from the Tomcat web server. If I do close the
 connection with client.close() after the search API call, I get a
 NoNodeAvailableException.

 Please suggest the correct way to use the connection object.

 Thanks,
 Subhadip



Re: Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread joergpra...@gmail.com
Please ask your question here. Thanks.

Jörg




Re: Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread ohw
Sure, here it is:

-

We are migrating our Lucene-based search codebase to Elasticsearch. The
major problem we encountered is how we should migrate our QueryParsers.

In our old solution, the QueryParsers take in a human input query string
and transform it into Lucene's Query object, which is then fed into
Lucene's IndexSearcher. However, in Elasticsearch we don't interact with
IndexSearcher directly; instead we can only build the queries on the
client side using the Query DSL
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html)
and send the JSON to the Elasticsearch server. The Elasticsearch server then
(possibly) rewrites/analyzes the JSON query to build a Lucene query.

To make use of the existing, sophisticated logic in our QueryParsers, we
decided that we can stick to our old approach by:

   1. Explicitly telling Elasticsearch to *NOT analyze* any query at
   search time.
   2. Doing *ALL* the query-related analysis (tokenizing, synonyms, etc.) in
   the Java client.
   3. Assuming that Elasticsearch's Query DSL is more or less a one-to-one
   mapping to Lucene's Query classes.

The questions are:

   1. Is this approach feasible?
   2. What are the potential problems in doing so?
   3. What is the best practice?

By the way, don't worry about the scoring process; we are writing our
scorer scripts as an Elasticsearch plugin.

---

Thank you!

Odin
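
One way to read the numbered approach above is to send only term-level
queries, since a `term` query is not analyzed on the server. The sketch below
is an illustration under stated assumptions, not an answer from the thread:
analyze() is a hypothetical stand-in for the existing QueryParser analysis
chain (here just lowercasing and whitespace splitting), the JSON is built by
hand with minimal escaping, and the field name is made up. Term queries match
exact indexed terms, so this only works if index-time analysis produces the
same tokens as the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ClientSideAnalysis {
    private ClientSideAnalysis() {
    }

    /** Hypothetical stand-in for the client's own analysis chain. */
    static List<String> analyze(String input) {
        List<String> tokens = new ArrayList<>();
        for (String token : input.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Minimal JSON string quoting: handles backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a bool query whose must clauses are term queries, one per token. */
    static String toTermQueryJson(String field, List<String> tokens) {
        StringBuilder sb = new StringBuilder("{\"query\":{\"bool\":{\"must\":[");
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"term\":{")
              .append(quote(field)).append(':').append(quote(tokens.get(i)))
              .append("}}");
        }
        return sb.append("]}}}").toString();
    }

    public static void main(String[] args) {
        List<String> tokens = analyze("Quick Fox");
        System.out.println(toTermQueryJson("title", tokens));
    }
}
```

A production client would use a JSON library or the Java API's query builders
instead of string concatenation; the hand-built string just keeps the example
free of dependencies.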



Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Mario Mueller
"Et kütt wie et kütt" (it comes as it comes) ;) The Kölsch is no problem,
you can get it here in DUS too.

Again thanks to all!



Kibana-MySQL connection

2014-06-06 Thread srinu konda
Hi All,

I need help with Kibana: I need to connect to a MySQL database from Kibana.
Please let me know how I can achieve this, and share pseudocode if anyone
has it.



Thanks & Regards,
Srinivas.



Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Paweł Krzaczkowski
Hi Boaz

My cluster state from http://192.168.0.211:9200/_cluster/state?human&pretty

http://pastebin.com/ZEihQhc1 - well, you may say it's big...

Pawel


Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Paweł Krzaczkowski
This one is without metadata

http://pastebin.com/tmJGA5Kq
http://xxx:9200/_cluster/state/version,master_node,nodes,routing_table,blocks/?human&pretty

Pawel


Re: If I set index.number_of_replicas: 1, then the minimum number of nodes should be 3 to ensure that the status of the cluster is green?

2014-06-06 Thread flyer
The configurations on the two nodes are almost the same, except for
node.name.

The behavior behind the first question is very strange. I ran the same
config on another cluster of two nodes, and after I indexed data the state
of that cluster was green. I have no idea why this happened. Is there
something I am overlooking?

I want to know how ES allocates shards to nodes. Is there a reference? I
googled but couldn't find one.

Thank you :D


On Fri, Jun 6, 2014 at 3:05 PM, joergpra...@gmail.com joergpra...@gmail.com
 wrote:

 1. No. Did you change the configuration? You have two data nodes connected?

 2. You do not need to be concerned where primary shards are allocated,
 secondary shards play the same role (except primaries receive writes first
 a few milliseconds earlier than secondaries). Elasticsearch randomly
 allocates shards on all the available nodes. If nodes go away, shards on
 remaining nodes are promoted to primaries automatically.

 Jörg


 On Fri, Jun 6, 2014 at 5:41 AM, flyer flyer...@gmail.com wrote:

 I have a cluster of two nodes, and set the configs for shard number and
 replica number as following:

 index.number_of_shards: 10

 index.number_of_replicas: 1


 The master node is elected automatically.

 Before I index data, the state of the cluster is green. After I index
 data, the state of the cluster becomes yellow. Some replica shards are not
 allocated.

 But if I add another node and now the cluster consists of three nodes,
 then I index data and the state of the cluster becomes green.

 The version of ES that I use is 1.1.

 Question 1:
 If the replica number is 1, do I have to have at least 3 nodes to
 assure the state of the cluster is green?

 Question 2:
 I have observed that one (not the master node) of the three nodes
 holds only primary shards. What mechanism does ES use to allocate
 primary/replica shards?

 Thanks anyway.

 --
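 Unmoved by favor or disgrace, idly watch the flowers bloom and fall before the courtyard; indifferent to staying or leaving, drift with the clouds as they gather and disperse across the sky.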

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKLbBG-o6JmkbPAAvGBP8t8Bdp-ONEWEFK5kGW%3Dcyn-h-O6dsg%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAKLbBG-o6JmkbPAAvGBP8t8Bdp-ONEWEFK5kGW%3Dcyn-h-O6dsg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: Kibana-MySQL connection

2014-06-06 Thread Tomislav Poljak
Hi Srinu,
Kibana is a JavaScript search/visualisation front-end for Elasticsearch,
which is a distributed search engine that doesn't feature MySQL
connection/integration out of the box. But there is a plugin,
https://github.com/jprante/elasticsearch-river-jdbc, which you can use
to import data from MySQL into Elasticsearch and then use Kibana to
visualise it.

So steps would be:

1) Deploy elasticsearch
2) Install jdbc plugin with mysql driver in elasticsearch
3) Create and submit jdbc river configuration (with mysql connection
data and import query) which will import the data
4) Deploy kibana and visualise data
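
As an illustration of step 3, a river is created by PUTting a JSON document to _river/<name>/_meta. The sketch below only builds that request and sends nothing. The river name, database URL, credentials, and SQL statement are placeholders, and the exact set of options depends on the plugin version, so please check the plugin README before relying on it.

```python
import json


def jdbc_river_request(es_url, river_name, jdbc_url, user, password, sql):
    """Return (url, body) for registering a JDBC river; nothing is sent."""
    body = {
        "type": "jdbc",
        "jdbc": {
            "url": jdbc_url,        # JDBC connection string of the MySQL server
            "user": user,
            "password": password,
            "sql": sql,             # import query; each row becomes a document
        },
    }
    url = "%s/_river/%s/_meta" % (es_url.rstrip("/"), river_name)
    return url, json.dumps(body)


# All values below are placeholders.
url, body = jdbc_river_request(
    "http://localhost:9200", "my_jdbc_river",
    "jdbc:mysql://localhost:3306/test", "", "", "select * from orders")
print("PUT", url)
print(body)
```

The printed URL and body can then be sent with curl or any HTTP client once the plugin and the MySQL driver are installed.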

Hope this helps,
Tomislav



2014-06-06 10:02 GMT+02:00 srinu konda konda.srin...@gmail.com:
 Hi All,

 I need help with Kibana: I need to connect to a MySQL database from Kibana.
 Please let me know how I can achieve this, and share pseudo code if anyone
 has it.



 Thanks & Regards,
 Srinivas.



Re: Get by _id doesn't work but search does.

2014-06-06 Thread Luke Wilson-Mawer
Great, thanks Adrien. I will eagerly await the tool.

Kind regards,

Luke


On Thu, Jun 5, 2014 at 1:34 PM, Adrien Grand adrien.gr...@elasticsearch.com
 wrote:

 Hi,

 This is very likely because of
 https://github.com/elasticsearch/elasticsearch/pull/6393

 See http://www.elasticsearch.org/blog/elasticsearch-1-2-1-released/ for
 more information, we are currently working on a tool that would help
 relocate documents to the right shard.


 On Thu, Jun 5, 2014 at 1:54 PM, Luke Wilson-Mawer 
 lukewilsonma...@gmail.com wrote:

 Hi,

 I'm seeing weird behaviours with ids on elasticsearch 1.2.0 (recently
 upgraded from 1.0.1).

 A search retrieves my document, showing the correct value for _id:

 [terminal] curl 'myServer:9200/global/_search?q=someField:something'

 {"took":79,"timed_out":false,"_shards":{"total":12,"successful":12,"failed":0},"hits":{"total":1,"max_score":17.715034,"hits":[{"_index":"global","_type":"user","_id":"7a113e4f-44de-3b2b-a3f1-fb881da1b00a",...

 But a direct lookup on id doesn't:
 [terminal] curl 'myServer:9200/global/user/7a113e4f-44de-3b2b-a3f1-fb881da1b00a'

 {"_index":"global","_type":"user","_id":"7a113e4f-44de-3b2b-a3f1-fb881da1b00a","found":false}

 Any ideas? I'm hoping I won't have to reindex...

 Kind regards,

 Luke






 --
 Adrien Grand



Re: Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread joergpra...@gmail.com
The idea is:

- the basic entry point code for how search works is in
org.elasticsearch.rest.action.search.RestSearchAction, if you want to
expose an enhanced search to REST

- building the query works with
org.elasticsearch.search.builder.SearchSourceBuilder which has a convenient
method query(queryBuilder) for Java API

- org.elasticsearch.indices.query.IndicesQueriesModule is responsible for
managing the query parsers; it has addQuery() and addFilter() methods,
which must be invoked at plugin initialization time

- so you can write a pair of My...QueryBuilder and My...QueryParser for
each of your queries in your plugin

- copy/paste RestSearchAction to something like My...RestSearchAction with
a custom endpoint, for example _mysearch, and then you can use your query
implementation, wrapped in JSON, just like you would in the _search REST
action. The new REST endpoint must be registered during plugin
initialization

- for studying implementation details, the existing standard query
parser/builder impls in org.elasticsearch.index.query are useful

Jörg




On Fri, Jun 6, 2014 at 11:19 AM, Heng Wang o...@zhihu.com wrote:

 Thank you Jörg, I didn't realize that I can plug the query parsers into
 elasticsearch, would you please elaborate more on this?


 On Fri, Jun 6, 2014 at 4:53 PM, joergpra...@gmail.com 
 joergpra...@gmail.com wrote:

 The Query DSL is not equivalent to Lucene Query but close to, with
 enhancements.

 If you want to make use of Lucene Query, and you already decided to write
 a plugin for scoring, so why don't you just add your query parsers to the
 plugin?

 Jörg


 On Fri, Jun 6, 2014 at 9:39 AM, ohw o...@zhihu.com wrote:

 Sure, here it is:

 -

 We are migrating our Lucene based search codebase to Elasticsearch. The
 major problem we encountered is how we should migrate our QueryParsers.

 In our old solution, the QueryParsers take in a human input query
 string, and transform that to Lucene's Query object, which is then fed
 into Lucene's IndexSearcher. However, in Elasticsearch we don't directly
 interact with IndexSearcher; instead we can only build the queries on the
 client side using the Query DSL
 (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html)
 and send the JSON to the Elasticsearch server. The Elasticsearch server then
 (possibly) rewrites/analyses the JSON query to build a Lucene query.

 To make use of our existing and sophisticated logic in QueryParsers, we
 decided that we can stick to our old approach by:

1. Explicitly tell Elasticsearch *NOT to analyze* any query at search
time.
2. Do *ALL* the query-related analysis (tokenizing, synonyms, etc.) in
the Java client.
3. Assume that Elasticsearch's Query DSL is roughly a one-to-one
mapping to Lucene's Query
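
The approach in points 1 to 3 could be sketched roughly as follows: the client runs its own analysis and emits only term clauses, which Elasticsearch matches as given without analyzing them. This is an assumption-laden illustration, not the poster's code. The analyze function is a trivial stand-in for the real QueryParser logic, the field name is hypothetical, and Python is used only for brevity where the thread talks about a Java client.

```python
import json


def analyze(text):
    # Stand-in for the client-side analysis chain (tokenizing, synonyms, ...).
    # Here it only lowercases and splits on whitespace.
    return [token.lower() for token in text.split()]


def build_query(field, text):
    # term clauses are not analyzed on the server, so the tokens produced
    # on the client are matched against the index exactly as they are.
    clauses = [{"term": {field: token}} for token in analyze(text)]
    return {"query": {"bool": {"must": clauses}}}


print(json.dumps(build_query("title", "Quick Brown fox")))
```

For this to match anything, the terms in the index must have been produced by the same analysis at index time, which is one of the main risks of the approach.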

 The questions are:

1. Is this approach feasible?
2. What are the potential problems in doing so?
3. What is the best practice?

 By the way, don't worry about the scoring process; we are writing our
 scorer scripts as an Elasticsearch plugin.

 ---

 Thank you!

 Odin

 On Friday, June 6, 2014 3:36:54 PM UTC+8, Jörg Prante wrote:

 Please ask your question here. Thanks.

 Jörg


 On Fri, Jun 6, 2014 at 9:28 AM, ohw o...@zhihu.com wrote:

 Hi folks

 I just asked a question on Stack Overflow; please have a look if you
 have encountered a similar problem or have some input on it.

 Thanks in advance!


Re: If I set index.number_of_replica:1, then the minimum number of nodes should be 3 to assure that the status of the cluster is gree?

2014-06-06 Thread joergpra...@gmail.com
You must use the same version throughout the cluster. A combination of
different versions is not always detected and no warning is given; it will
result in unexpected behavior or strange error messages.
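
One way to spot such a mix is to compare the version reported by each node in the output of GET /_nodes. The sketch below works on an already parsed response and assumes the usual shape, a "nodes" object keyed by node id with "name" and "version" fields. The sample data is invented to mirror the 1.1.1 / 1.1.2 case from this thread.

```python
def versions_by_node(nodes_info):
    """Group node names by the Elasticsearch version each node reports."""
    versions = {}
    for node in nodes_info.get("nodes", {}).values():
        versions.setdefault(node.get("version"), []).append(node.get("name"))
    return versions


# Invented sample in the assumed shape of a GET /_nodes response.
sample = {
    "nodes": {
        "id-a": {"name": "node-1", "version": "1.1.1"},
        "id-b": {"name": "node-2", "version": "1.1.2"},
    }
}

found = versions_by_node(sample)
if len(found) > 1:
    print("mixed versions in cluster:", found)
```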

Jörg


On Fri, Jun 6, 2014 at 11:27 AM, flyer flyer...@gmail.com wrote:

 It's the ES version that caused the first problem.

 In my previous test, one node used ES-1.1.1 and the other node used
 ES-1.1.2. After I indexed data, the state of the cluster was yellow rather
 than green. I think there are some compatibility problems between the two
 versions.

 Thank you very much :D




Re: Get by _id doesn't work but search does.

2014-06-06 Thread joergpra...@gmail.com
Look here for the tool and how to use it

http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/

Jörg




For any 1.2.0 users - You will want to read this

2014-06-06 Thread Mark Walkom
http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/


 In the 1.2.1 release we fixed a routing bug that had been introduced in
 the 1.2.0 release. This was described in the 1.2.1 blog post. The contents
 of this post apply only to users that ran 1.2.0.



Before we can discuss the tool we developed to help fix damage from the
 bug, we have to understand the problem the bug created. When Elasticsearch
 stores a document it has to decide which shard to put the document in. It
 does this by computing a hash of the document’s “UID”, which is by default
 the type#id tuple of the document. Elasticsearch then uses that hash modulo
 the number of shards to pick a shard. We have to make sure that this hash
 function works the same across different versions of Elasticsearch. If it
 doesn’t, version X could expect to find a given document in shard 1 and
 version X+1 could expect the same document in shard 2. In this case,
 version X+1 could not directly get the document if it were indexed while
 running version X.
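
The routing rule described above (hash of the routing key modulo the number of shards) can be sketched as follows. This is only an illustration of why a changed hash breaks direct lookups. The two hash functions are stand-ins chosen for the example (a DJB2-style hash and CRC32), not a reproduction of what 1.2.0 and 1.2.1 actually computed, and the shard count of 12 is taken from the search response quoted earlier in this digest.

```python
import zlib


def djb2(key):
    # Classic DJB2 string hash, truncated to 32 bits.
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h


def crc32(key):
    # Second, unrelated hash used only to show the effect of a change.
    return zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF


def pick_shard(routing_key, num_shards, hash_fn):
    # The shard is the hash of the routing key modulo the shard count.
    return hash_fn(routing_key) % num_shards


doc_id = "7a113e4f-44de-3b2b-a3f1-fb881da1b00a"
print("shard with hash A:", pick_shard(doc_id, 12, djb2))
print("shard with hash B:", pick_shard(doc_id, 12, crc32))
```

If the two numbers differ, a get by id computed with hash B looks in a shard that never received the document, while a search still finds it because searches fan out to all shards.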


Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com



Re: iptablex trojan experiences?

2014-06-06 Thread 'Adolfo Rodriguez' via elasticsearch
probably related 

http://bouk.co/blog/elasticsearch-rce/



Re: Impossible to implement real custom boost query when the weight is in the child document?

2014-06-06 Thread Csaba Dezsényi
I could find only one related post:
https://groups.google.com/forum/#!msg/elasticsearch/EGCeJZbhVtA/i32ROGVmFswJ
But that one asks a different question...



Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread ES USER
So if I understand you correctly, if the data is stored in Hadoop then 
es-hadoop is really just acting as a job manager? If that is the case, what 
is the rule of thumb for how many ES nodes and shards should be set?



On Thursday, June 5, 2014 6:45:09 PM UTC-4, Costin Leau wrote:

 Think of es-hadoop as a connector between Hadoop and Elasticsearch. You 
 would use it to index data in Hadoop to ES or run queries in ES directly 
 from Hadoop.
 Where does ES store the data? That depends on its configuration 
 (completely separate from es-hadoop itself). In general (and the default) 
 is to store it onto the local file-system. If you want to use it on a 
 shared file-system or HDFS you can easily do that by mounting it locally 
 (for example, mount HDFS through NFS as a local disk) and point ES to it. 
 ES is happy to work with it; however, the performance will be _significantly_ 
 degraded and most of its real-time nature will go out the window, 
 since HDFS is a distributed file-system (and thus even basic operations 
 like opening a file or closing a file mean at least one call over the 
 network) plus you're giving up the amazing OS file-system cache (since the 
 fs is not local). If the FS is slow, anything that sits on top of it (like 
 ES) will be slow as well.

 Hope this helps,

 P.S. By the way, if you want/need to snapshot/restore data to/from ES 
 from/to HDFS you can use the HDFS repository (more info here: 
 http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/setup.html
 )



 On Thu, Jun 5, 2014 at 8:41 PM, ES USER es.use...@gmail.com wrote:

 Try as I might and I have read all the stuff I can find on ES' website 
 about this I understand somewhat how the integration works but not the 
 actual nuts and bolts of it.

 For example:

 Is Hadoop just storing the files that would normally be stored in the 
 local filesystem for the ES indexes or is it storing the data that would 
 normally be in those indexes and just accessed through es-hadoop?

 If it is the latter, how do you go about determining what to set for the 
 number of nodes and shards?


 If anyone has any information on this or even better yet a place to point 
 me to that has better references so that I can research this on my own it 
 would be much appreciated.

 Thanks.



Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread Georgi Ivanov
Hmm, I am not sure I understand your questions.
Hadoop is a distributed storage system (HDFS) and a MapReduce framework (MR) 
(among other things).
ES is a distributed storage/search system (among other things).

So what es-hadoop gives you:

You can read data from ES and do some complex analysis, taking advantage of 
MR.
You can write data to ES: for example, one can process some data stored on 
HDFS and write some pre-aggregated data to ES.

es-hadoop is basically a connector between ES and Hadoop.

I hope this helps

On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

 Try as I might and I have read all the stuff I can find on ES' website 
 about this I understand somewhat how the integration works but not the 
 actual nuts and bolts of it.

 For example:

 Is Hadoop just storing the files that would normally be stored in the 
 local filesystem for the ES indexes or is it storing the data that would 
 normally be in those indexes and just accessed through es-hadoop?

 If it is the latter, how do you go about determining what to set for the 
 number of nodes and shards?


 If anyone has any information on this or even better yet a place to point 
 me to that has better references so that I can research this on my own it 
 would be much appreciated.

 Thanks.




Re: What's using memory in ElasticSearch? (Details to follow...)

2014-06-06 Thread joergpra...@gmail.com
No, the settings will not merge existing segments unless you call the
_optimize action via the API.

And have some patience. Thousands of segments take time; also, they need
quite a few memory resources to merge...

I suggest backing up your data first, to stay safe if the merging fails /
aborts...
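
For reference, in the 1.x line the merge is requested per index through the _optimize endpoint, which accepts a max_num_segments parameter. The sketch below only builds the request URL and sends nothing. The host and index name are placeholders, and the right target segment count for a given cluster is not something this thread settles.

```python
from urllib.parse import urlencode


def optimize_url(base_url, index, max_num_segments=1):
    # Build the URL for a POST to the index's _optimize endpoint.
    query = urlencode({"max_num_segments": max_num_segments})
    return "%s/%s/_optimize?%s" % (base_url.rstrip("/"), index, query)


# Placeholder host and index name.
print("POST", optimize_url("http://localhost:9200", "my_index"))
```

With 120+ indexes it is probably gentler on the heap to issue this one index at a time rather than for all indexes at once.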

Jörg


On Thu, Jun 5, 2014 at 7:30 AM, jegansp jega...@gmail.com wrote:

 Is this related to
 https://github.com/elasticsearch/elasticsearch/issues/5779

 We are also facing this issue. Our version is also 1.1.0. After running
 fine for a few days, the cluster slowly starts facing memory issues. At some
 point it spends almost 90% of the time in garbage collection, with full
 GCs each taking more than 15 secs. Tuning JVM parameters didn't help us
 either.

 We have a two-node cluster with 120+ indexes. In our case the segment count
 is too high (one node has 16000+ and the other has 25000+).

 The optimize API call doesn't return either, as indicated here
 https://groups.google.com/forum/#!topic/elasticsearch/kqTRRADQBwc

 I am going to try with these settings to see if the situation improves.

 https://gist.github.com/jprante/10666960

 But, not sure if this will merge existing segments.







Max doc size for indexing over HTTP

2014-06-06 Thread eperezks
I'm trying to index a document over 1gb in size, but I get the following 
error:

org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: 
HTTP content length exceeded 1073741824 bytes.
1073741824 = 1gb

My elasticsearch.yml has http.max_content_length set to 1900mb. I found 
this issue, which says Netty has a 2gb limit 
(https://github.com/elasticsearch/elasticsearch/issues/2237), so I was 
expecting to be able to index documents of approximately 2gb. Does 
Elasticsearch impose a limit of 1gb even though I specified 1900mb? If not, 
I can file an issue on GitHub.
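
The sizes in this question can be checked with a few lines of arithmetic. This is only a sketch: it assumes the mb/gb suffixes are binary multiples (which matches 1073741824 = 1gb in the error message) and that the 2gb ceiling from the linked issue corresponds to the largest signed 32-bit integer. `to_bytes` is a hypothetical helper, not an Elasticsearch function.

```python
# Hypothetical helper: convert a size string such as "1900mb" to bytes,
# assuming binary multiples (1kb = 1024 bytes).
UNITS = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def to_bytes(size: str) -> int:
    number, unit = size[:-2], size[-2:].lower()
    return int(number) * UNITS[unit]


INT32_MAX = 2 ** 31 - 1  # assumed origin of the "2gb" ceiling

reported_limit = 1073741824       # taken from the TooLongFrameException
configured = to_bytes("1900mb")   # http.max_content_length from the .yml

# Is the reported limit exactly 1gb, and does 1900mb fit under the ceiling?
print(reported_limit == to_bytes("1gb"))
print(configured, configured <= INT32_MAX)
```

Under these assumptions the configured value is below the ceiling while the exception reports exactly 1gb, which is consistent with the configured value not being the limit that was applied.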


This is my node info

{
  "status" : 200,
  "name" : "SOURCEONE-elastic-ubuntu-4",
  "version" : {
    "number" : "1.2.0",
    "build_hash" : "c82387f290c21505f781c695f365d0ef4098b272",
    "build_timestamp" : "2014-05-22T12:49:13Z",
    "build_snapshot" : false,
    "lucene_version" : "4.8"
  },
  "tagline" : "You Know, for Search"
}



Re: Max doc size for indexing over HTTP

2014-06-06 Thread joergpra...@gmail.com
1gb is a very large document, and it is unusual to index documents of that
size.

There is a limit check against the heap. To process a document of that
length, you need a large heap just to hold the document source. Depending on
the analyzer, heap demand increases even more.

You can index documents of arbitrary length if you preprocess them first
and split them into smaller chunks.
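
The preprocessing step could look roughly like the sketch below. It is a minimal illustration, not a recommended schema: the field names `parent_id`, `chunk` and `body` and the helper names are hypothetical, and the chunk size is counted in characters rather than bytes.

```python
from typing import Dict, Iterator, List


def split_text(text: str, max_chars: int) -> Iterator[str]:
    """Yield consecutive pieces of text, each at most max_chars long,
    cutting at a space where possible so that words stay whole."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        yield text[start:end]
        # Skip the space we cut at, if any, before starting the next piece.
        start = end + 1 if end < len(text) and text[end] == " " else end


def to_chunk_docs(doc_id: str, text: str, max_chars: int) -> List[Dict]:
    """Turn one large text into small documents that can be indexed one by one."""
    return [
        {"parent_id": doc_id, "chunk": i, "body": piece}
        for i, piece in enumerate(split_text(text, max_chars))
    ]


docs = to_chunk_docs("big-1", "aaa bbb ccc", max_chars=4)
print(docs)
```

Each small document keeps a reference to its parent, so results can be grouped back together at query time.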

Jörg






XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Enno Shioji
Hi, I created my index using this mapping JSON:

{
    "myidx": {
        "index.refresh_interval": "5m",
        "mappings": {
            "dynamic": "strict",
            "message": {
                "_ttl": {
                    "enabled": true
                },
                "properties": {
                    "my_nested_thing": {
                        "type": "nested",
                        "properties": {
                            "some_id": {
                                "type": "string", "index": "not_analyzed"
                            },
                            "count": {
                                "type": "long"
                            },
                        }
                    }
                }
            }
        }
    }
}

If I do a GET to _mapping after indexing some documents, it will 
essentially return:

{
    "message": {
        "properties": {
            "my_nested_thing": {
                "properties": {
                    "some_id": {
                        "type": "string", "index": "not_analyzed"
                    },
                    "count": {
                        "type": "long"
                    },
                }
            }
        }
    }
}

I.e. "_ttl": {"enabled": true} and "type": "nested" are missing from the 
returned mapping. I also noticed that the mapping is still updated 
automatically despite the "dynamic": "strict" instruction.

Does this mean these instructions are not being applied? If so, 
what am I doing wrong?

I'm using version 1.2.1









Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Brian
Hi, Mario. Yes, I suppose this kind of goes against the "no additional 
proxy" requirement you have.

Hehehe. I'm a seeker of loopholes. In my scenario, it's still a plug-in 
design, but ES is my plug-in and not the other way around. Still only one 
HTTP interface in the mix, but it's mine and not ES's.

I also have avoided the plug-in approach because I've read that it's marked 
for deprecation and eventual removal. Yet logstash and ES Head are still 
offered as plug-ins as are a boat-load of other facilities, so I am not 
really sure if that's still the case.

And of course, your own plug-in has a much better chance to be updated to 
match exactly each new ES version to which you migrate. That's one of the 
downsides of third-party plug-ins: They lock you into older ES versions 
until the author gets a chance to update the plug-in.

Brian

On Friday, June 6, 2014 1:14:00 AM UTC-4, Mario Mueller wrote:


 @Brian:
 Interesting approach, but wouldn't this go against the initial no 
 additional proxy statement, if I got you right ..





Re: XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Jun Ohtani
Hi,

Which API call do you use to create the index?

I think the "dynamic" property is in the wrong place.

I created the index with the following JSON and then fetched the mapping:

curl -XPOST 'localhost:9200/myidx' -d '
{
  "settings": {
    "index.refresh_interval": "5m"
  },
  "mappings": {
    "message": {
      "dynamic": "strict",
      "_ttl": {
        "enabled": true
      },
      "properties": {
        "my_nested_thing": {
          "type": "nested",
          "properties": {
            "some_id": {
              "type": "string",
              "index": "not_analyzed"
            },
            "count": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}'

curl -XGET 'localhost:9200/myidx/_mapping?pretty'

{
  "myidx" : {
    "mappings" : {
      "message" : {
        "dynamic" : "strict",
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "my_nested_thing" : {
            "type" : "nested",
            "properties" : {
              "count" : {
                "type" : "long"
              },
              "some_id" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          }
        }
      }
    }
  }
}

Does it make sense?
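
One way to avoid misplacing keys is to build the request body programmatically and serialize it, rather than writing the JSON by hand. The sketch below only mirrors the layout of the request in this message; `create_index_body` is a hypothetical helper, and sending the body to the cluster is left out.

```python
import json


def create_index_body(type_name, refresh_interval, type_mapping):
    """Build a create-index body with "settings" and "mappings" as siblings."""
    return {
        "settings": {"index.refresh_interval": refresh_interval},
        "mappings": {type_name: type_mapping},
    }


message_mapping = {
    "dynamic": "strict",  # sits inside the type, next to "properties"
    "_ttl": {"enabled": True},
    "properties": {
        "my_nested_thing": {
            "type": "nested",
            "properties": {
                "some_id": {"type": "string", "index": "not_analyzed"},
                "count": {"type": "long"},
            },
        }
    },
}

body = create_index_body("message", "5m", message_mapping)
print(json.dumps(body, indent=2))
```

Serializing from a dictionary also rules out trailing commas and missing quotes in the payload.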


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani



Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Hi everyone.

I use an embedded ES node as part of a Java application.

Node node = nodeBuilder().clusterName("OSM-Gazetteer").node();
Client client = node.client();

I try to fetch some data page by page.

SearchRequestBuilder searchQ = client.prepareSearch("gazetteer")
        .setSearchType(SearchType.QUERY_AND_FETCH)
        .setNoFields()
        .setQuery(QueryBuilders.matchAllQuery())
        .setExplain(false);

searchQ.setSize(PAGE_SIZE);          // PAGE_SIZE = 5
searchQ.setFrom(page * PAGE_SIZE);   // page = 0

Here is the query generated by the client:

{ "from" : 0, "size" : 5, "query" : { "match_all" : { } },
  "explain" : false, "fields" : [ ] }

The curl version returns 5 hits, as expected, but Java returns 20 hits:

searchQ.get().getHits().getHits().length; // = 20

Index settings are default. The hits do not contain duplicates.

Is there a workaround?
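
One likely explanation, going by the documented behaviour of the query_and_fetch search type, is that every shard returns up to `size` hits and the results are simply concatenated. The toy model below illustrates that effect only; it is not Elasticsearch code, and the shard layout (four shards holding documents, one empty) is an assumption chosen to reproduce the 20 hits reported here.

```python
def query_and_fetch(shards, size):
    """Each shard contributes up to `size` hits; results are concatenated."""
    hits = []
    for shard_hits in shards:
        hits.extend(shard_hits[:size])
    return hits


def query_then_fetch(shards, size):
    """Shard results are merged and ordered first; one global page comes back."""
    merged = sorted(hit for shard_hits in shards for hit in shard_hits)
    return merged[:size]


# Integers stand in for hits; sorting them stands in for ranking by score.
shards = [list(range(s, 100, 5)) for s in range(4)] + [[]]

print(len(query_and_fetch(shards, 5)))
print(len(query_then_fetch(shards, 5)))
```

If that is what happens here, leaving the search type at its default rather than setting QUERY_AND_FETCH should make the Java result match the curl result.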



Re: XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Enno Shioji
Hi Jun,

Ah, I must be doing something wrong then. I'll correct the JSON and test 
again.

Btw, thank you very much for taking the trouble to test this!





Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Could you print your searchQ object?

Maybe by calling toString() on it.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs




Understanding merge statistics from Marvel

2014-06-06 Thread John Smith
Running Elasticsearch 1.2.1 with Java 1.7_55 on CentOS 6.5.

The machine has 32 cores and 96GB of RAM with standard spinning disks, but I 
also installed one Samsung Evo 840 for testing ES.
The Evo is rated at 500MB/s, though the Linux perf test reported about 
300MB/s read and about 250MB/s write. The board is SATA II, which explains 
why it maxes out at 300MB/s.

I'm using JMeter to send index requests to ES, executing about 6200 puts/s.

Marvel reports:
2200 IOPS
20MB merges/s

And iostat for the drive:

sdf   0.00 14214.00 0.00 2021.33 0.00 62.35 63.17 10.49 5.17 0.48 97.27

I'm also seeing this on the console: stop throttling indexing: 
numMergesInFlight=4, maxNumMerges=5

Are these numbers good?
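
As a sanity check, the iostat figures can be cross-checked against each other. The sketch below uses the numbers from the iostat line in this post with the fused columns separated. The column names are an assumption (the layout printed by `iostat -x -m`); the post does not say which flags were used.

```python
# Column names are assumed (iostat -x -m layout); the post gives only the values.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]
LINE = "sdf 0.00 14214.00 0.00 2021.33 0.00 62.35 63.17 10.49 5.17 0.48 97.27"


def parse_iostat(line):
    """Split one device line into the device name and a column -> value map."""
    device, *values = line.split()
    return device, dict(zip(COLUMNS, map(float, values)))


device, stats = parse_iostat(LINE)

SECTOR_BYTES = 512  # avgrq-sz is assumed to be in 512-byte sectors
derived_mb = stats["w/s"] * stats["avgrq-sz"] * SECTOR_BYTES / 2 ** 20
request_kb = stats["avgrq-sz"] * SECTOR_BYTES / 1024

print(device, stats["w/s"], stats["wMB/s"], stats["%util"])
print(round(derived_mb, 2), round(request_kb, 1))
```

Under that reading, writes per second times the average request size reproduces the reported write throughput, and the roughly 2000 writes/s line up with the IOPS figure from Marvel. The drive is reported as almost fully utilised while writing far less than its measured sequential rate, which would fit a workload of many small requests.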



Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Ivan Brusic
Plugins are essential to ES's success and are not going away any time soon.
The river plugins, aka cluster singletons, are the ones whose use is
discouraged. Go ahead and create more plugins!

-- 
Ivan


On Fri, Jun 6, 2014 at 7:23 AM, Brian brian.from...@gmail.com wrote:


 I also have avoided the plug-in approach because I've read that it's
 marked for deprecation and eventual removal. Yet logstash and ES Head are
 still offered as plug-ins as are a boat-load of other facilities, so I am
 not really sure if that's still the case.





Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread virgil
Good idea! One thing I am not quite clear on: to write a custom
ScoreFunction, will I have to modify the Elasticsearch source code and
compile it? Or is there another way to do it? Thank you.





Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread Ivan Brusic
Function score code is perhaps the easiest to write for Elasticsearch. You
do not need to create a plugin for it; simply write the code with the
appropriate classes and deploy it as a jar to Elasticsearch's lib
directory. Done. That said, I still prefer to write native scripts as
plugins because they are logged as plugins on server startup. That makes it
far easier to debug if something goes wrong.

Two tutorials:
http://www.spacevatican.org/2012/5/12/elasticsearch-native-scripts-for-dummies/
https://github.com/imotov/elasticsearch-native-script-example

Cheers,

Ivan






Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread Costin Leau
Adding to what Georgi wrote, es-hadoop does not create the shards for you;
that's up to you or to index templates (which I highly recommend). However,
es-hadoop is aware of the target shards and will use them to parallelize
the reads/writes (such as one task per shard).


On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi.r.iva...@gmail.com
wrote:

 and i don't think this anyhow related with number of shards and nodes


 On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

 Try as I might, and although I have read everything I can find on ES'
 website about this, I only understand roughly how the integration works,
 not the actual nuts and bolts of it.

 For example:

 Is Hadoop just storing the files that would normally be stored in the
 local filesystem for the ES indexes, or is it storing the data that would
 normally be in those indexes and just accessed through es-hadoop?

 If it is the latter, how do you go about determining what to set for the
 number of nodes and shards?


 If anyone has any information on this or even better yet a place to point
 me to that has better references so that I can research this on my own it
 would be much appreciated.

 Thanks.



Re: Impossible to implement real custom boost query when the weight is in the child document?

2014-06-06 Thread Ivan Brusic
Did you change the boost_mode of your function score script? The default
should be multiply, which is the behavior you want, not sum, which is
what you are experiencing.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

I have never used it with nested documents, so perhaps it is a bug (or a
feature :) )
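
To make the difference concrete, here is a small sketch of how the query score and the function score are combined under the boost_mode values listed in the function_score documentation linked above. The `combine` helper is purely illustrative and is not part of any Elasticsearch client.

```python
def combine(query_score, function_score, boost_mode="multiply"):
    """Combine the two scores the way each boost_mode describes."""
    modes = {
        "multiply": lambda q, f: q * f,
        "sum": lambda q, f: q + f,
        "replace": lambda q, f: f,
        "avg": lambda q, f: (q + f) / 2,
        "max": max,
        "min": min,
    }
    return modes[boost_mode](query_score, function_score)


# Example values: a query score of 2.0 and a weight of 3.0 from the script.
for mode in ("multiply", "sum", "replace"):
    print(mode, combine(2.0, 3.0, mode))
```

Comparing an observed score against these two results is a quick way to tell whether scores are being added or multiplied.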

-- 
Ivan


On Fri, Jun 6, 2014 at 3:55 AM, Csaba Dezsényi csaba.dezse...@gmail.com
wrote:

 I could find only one related post:

 https://groups.google.com/forum/#!msg/elasticsearch/EGCeJZbhVtA/i32ROGVmFswJ
 But this has different question...



Speculative deletes

2014-06-06 Thread Nikolas Everett
I'm in the position where I need to make _sure_ a document is deleted from
the index when something occurs in my source system.  I want to just hit it
with a DELETE every time.  Is that a good idea?

It looks to me like DELETEing documents via ID is pretty cheap if the
document doesn't exist.  It looks like it sets some dirty flags but is
otherwise pretty quick.  My alternative is to do a GET and DELETE if it
exists or, maybe, do something cute like:
curl -XPOST 'localhost:9200/index/type/id/_update' -d '{
    "script" : "ctx.op = \"delete\""
}'
which will short circuit that.
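
A sketch of the "just hit it with a DELETE" variant is below. It assumes the server answers a DELETE for a missing document with HTTP 404, so both outcomes mean the document is gone. The function names, the base URL argument and the timeout are hypothetical choices.

```python
import urllib.error
import urllib.request


def delete_is_done(status: int) -> bool:
    """Treat both "deleted" (2xx) and "was not there" (404) as success."""
    return 200 <= status < 300 or status == 404


def speculative_delete(base_url, index, doc_type, doc_id, timeout=10):
    """Send the DELETE unconditionally; return True if the document is gone."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    request = urllib.request.Request(url, method="DELETE")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return delete_is_done(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; 404 means nothing to delete.
        return delete_is_done(err.code)
```

Compared with a GET followed by a DELETE, this needs a single round trip and has no window in which the document could appear between the two calls.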


Thoughts?

Nik



Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread joergpra...@gmail.com
I mean, you can add a MyOwnFunctionBuilder/MyOwnFunctionParser to
Elasticsearch via a plugin. See the package
org.elasticsearch.index.query.functionscore for the standard
implementations.

The functionscore code is masterpiece quality - no need to modify existing
code! It is pluggable.

A close example of what you want to achieve is
org.elasticsearch.index.query.functionscore.fieldvaluefactor

This would be an alternative to the AbstractSearchScript
implementation path you have chosen so far.
Jörg







Copy index from production to development instance

2014-06-06 Thread Brian Lamb
Hi all,

I want to do a one-time copy of the data on my production Elasticsearch 
instance to my development Elasticsearch instance. Both are managed by AWS, 
if that makes this easier. Here is what I tried:

On production:

curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
    "type": "s3",
    "settings": {
        "access_key": "productionAccessKey",
        "bucket": "productionBucketName",
        "region": "region",
        "secret_key": "productionSecretKey"
    }
}'
curl -XPUT "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02"

This uploads a snapshot of the instance to a production-level S3 bucket.

Then, in the AWS console, I copy all of it to a development-level S3 bucket.

Next, on development:

curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
    "type": "s3",
    "settings": {
        "access_key": "developmentAccessKey",
        "bucket": "developmentBucketName",
        "region": "region",
        "secret_key": "developmentSecretKey"
    }
}'
curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore"

This gives me the following message:

$ curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore?pretty=true"
{
  "error" : "SnapshotException[[my_s3_repository:snapshot_2014_06_02] failed to get snapshots]; nested: IOException[Failed to get [snapshot-snapshot_2014_06_02]]; nested: AmazonS3Exception[Status Code: 404, AWS Service: Amazon S3, AWS Request ID: RequestId, AWS Error Code: NoSuchKey, AWS Error Message: The specified key does not exist.]; ",
  "status" : 500
}

Also, when I try to get the snapshots, I get the following:

$ curl -XGET 'localhost:9200/_snapshot/_status?pretty=true'
{
  "snapshots" : [ ]
}

This leads me to believe that I am not connecting the snapshot correctly 
but I'm not sure what I am doing incorrectly. Regenerating the index on 
development is not really a possibility as it took a few months to generate 
the index the first time around. If there is a better way to do this, I'm 
all for it. 
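
A note on the check itself: in the snapshot API, `_snapshot/_status` reports snapshots that are currently running, so an empty list there does not show whether the repository contains any. Listing a repository's contents uses the repository name followed by `_all`. The sketch below only assembles the relevant URLs for the names used in this post; `snapshot_urls` is a hypothetical helper and makes no requests.

```python
def snapshot_urls(host, repository, snapshot):
    """Return the snapshot API URLs relevant to checking and restoring."""
    base = f"http://{host}/_snapshot/{repository}"
    return {
        "repository": base,                        # GET: registered settings
        "list_all": f"{base}/_all",                # GET: snapshots in the repository
        "one": f"{base}/{snapshot}",               # GET: a single snapshot
        "restore": f"{base}/{snapshot}/_restore",  # POST: start a restore
    }


urls = snapshot_urls("localhost:9200", "my_s3_repository", "snapshot_2014_06_02")
for name, url in urls.items():
    print(name, url)
```

Requesting the `list_all` URL on the development instance would show whether the copied bucket is visible as a repository with the expected snapshot in it.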

Thanks,

Brian Lamb



Re: Copy index from production to development instance

2014-06-06 Thread Brian Lamb
I should also point out that I had to edit the metadata-snapshot file to 
change the S3 keys and bucket name to match what development was 
expecting.

On Friday, June 6, 2014 1:11:57 PM UTC-4, Brian Lamb wrote:

 Hi all,

 I want to do a one-time copy of the data on my production Elasticsearch 
 instance to my development Elasticsearch instance. Both are managed by AWS 
 if that makes this easier. Here is what I tried:

 On production:

 curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
   "type": "s3",
   "settings": {
     "access_key": "productionAccessKey",
     "bucket": "productionBucketName",
     "region": "region",
     "secret_key": "productionSecretKey"
   }
 }'
 curl -XPUT "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02"




Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread joergpra...@gmail.com
For an example function score plugin implementation, see

https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/functionscore/FunctionScorePluginTests.java

Jörg


On Fri, Jun 6, 2014 at 7:10 PM, joergpra...@gmail.com joergpra...@gmail.com
 wrote:

 I mean, you can add a MyOwnFunctionBuilder/MyOwnFunctionParser to
 Elasticsearch via plugin. See
 package org.elasticsearch.index.query.functionscore for the standard
 implementations.

 The functionscore code is masterpiece quality - no need to modify existing
 code! It is pluggable.

 A close example to what you want to achieve is
 org.elasticsearch.index.query.functionscore.fieldvaluefactor

 This would be an alternative to the AbstractSearchScript call
 implementation path you have chosen so far

 Jörg





 On Fri, Jun 6, 2014 at 6:33 PM, virgil virgil...@gmail.com wrote:

 Good idea! One thing I am not quite clear is that writing a custom
 ScoreFunction, I will have to modify elasticsearch source code and compile
 it right? Or there is any other way to do it? Thank you.





Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Yep.

System.out.println(searchQ.toString());

{
  "from" : 0,
  "size" : 5,
  "query" : {
    "match_all" : { }
  },
  "explain" : false,
  "fields" : [ ]
}

Also, I think it might be connected with sharding.
I changed number_of_shards to 1 and paging started to act as I
expect.

With number_of_shards set to 5, I get 25 hits.
With number_of_shards set to 4, I get 20 hits.

It seems like from and size are applied to every shard separately in my case.
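
The pattern reported here (hits = size x number_of_shards) is what one would see if each shard's top "size" hits were concatenated without a final trim on the coordinating node. Elasticsearch's query_and_fetch search type is documented to return results that way; whether the embedded client is actually using it is an assumption that this thread does not confirm. A small simulation with made-up data and hypothetical function names:

```python
def top_hits_per_shard(shards, size):
    """Each shard returns its own best `size` hits (by score, then id)."""
    return [sorted(shard, key=lambda hit: (-hit[0], hit[1]))[:size]
            for shard in shards]


def concatenate(shards, size):
    """No final trim: the caller receives size * number_of_shards hits."""
    return [hit for top in top_hits_per_shard(shards, size) for hit in top]


def merge_and_trim(shards, size):
    """Coordinator merges per-shard candidates and keeps the global top `size`."""
    merged = concatenate(shards, size)
    return sorted(merged, key=lambda hit: (-hit[0], hit[1]))[:size]


def make_shards(number_of_shards, docs_per_shard=100):
    # Hypothetical data: every document scores 1.0, as with match_all.
    return [[(1.0, f"doc-{s}-{i:03d}") for i in range(docs_per_shard)]
            for s in range(number_of_shards)]


for n in (1, 4, 5):
    shards = make_shards(n)
    print(n, len(concatenate(shards, 5)), len(merge_and_trim(shards, 5)))
```

For 1, 4 and 5 shards the untrimmed variant yields 5, 20 and 25 hits, matching the numbers above, while the merged variant always yields 5.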

-- 
Thank you for your time. Best regards.
Dmitry.



Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread virgil
Thank you for the links!





Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Can you also print the full response object (toString())?

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr




Relevancy with location

2014-06-06 Thread Scott Vickers
I am trying to wire up a search that gives documents near a location a 
higher ranking.  Pretty simple but in our case some of the documents won't 
always have a location associated with them.  When using a function_score 
query the documents with no location always get scored higher than the ones 
with a location, even if I search with the exact coordinates.  Here is an 
example:

Documents:
PUT /test/colors/1
{
  "name": "red"
}

PUT /test/colors/2
{
  "name": "red",
  "location": {"lat": 47, "lon": -122}
}

Mapping:
"mappings" : {
  "colors" : {
    "properties" : {
      "location" : {
        "lat_lon" : true,
        "type" : "geo_point",
        "geohash" : true
      },
      "name" : {
        "type" : "string"
      }
    }
  }
}

Query:
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "type": "phrase",
          "query": "red",
          "fields": ["name"],
          "tie_breaker": 0.3
        }
      },
      "functions": [
        {
          "gauss": {
            "location": {
              "origin": "47,-122",
              "scale": "10mi"
            }
          }
        }
      ]
    }
  }
}

Results:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "test",
        "_type": "colors",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "name": "red"
        }
      },
      {
        "_index": "test",
        "_type": "colors",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "name": "red",
          "location": {
            "lat": 47,
            "lon": -122
          }
        }
      }
    ]
  }
}

Is there any way to factor proximity in for some results and not others?
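
For context, the Gaussian decay that function_score applies can be sketched as follows. This follows the formula given in the Elasticsearch documentation for decay functions, with the default decay of 0.5 and offset of 0; the function name gauss_decay and the use of None for a missing field are assumptions of this sketch. As far as the documentation describes, a decay function returns 1 when the field is missing, so a document without a location gets the same multiplier as one sitting exactly at the origin, which is consistent with the identical scores in the results above.

```python
import math


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay multiplier for a document at `distance` from the origin.

    `distance` and `scale` must use the same unit (miles in the query above).
    A missing field (distance is None) is treated as multiplier 1.0, which is
    the documented behaviour as far as this sketch assumes.
    """
    if distance is None:
        return 1.0
    # sigma^2 is chosen so that the multiplier equals `decay` at distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = max(0.0, distance - offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


at_origin = gauss_decay(0.0, 10.0)     # document 2, exactly at 47,-122
no_location = gauss_decay(None, 10.0)  # document 1, no location field
at_scale = gauss_decay(10.0, 10.0)     # a document 10 miles away
```

Under these assumptions the decay alone cannot rank document 2 above document 1; separating them would need something extra for documents that lack the field, for example a different function or weight applied through a filter on the missing location.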



Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
The total hit number is really inconsistent:

With one shard you get: 656523
With 5 shards you get: 3879

I think you are doing something wrong but I can't tell more without looking at 
the full source code.
Could you share how you actually execute the query?

Are you sure your Java client is connected to the right instance/cluster?


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 6 June 2014 at 19:49:47, Дмитрий Киселев (dmitry.v.kise...@gmail.com) wrote:

This is with 5 shards.
{
  "took" : 81,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3879,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0016087997-w162848733-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0088827105-n2270743905-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0097856729-n2270743903-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0145983393-w154644839-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0209772668-n1884206099-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0015203925-w147150792-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0017569140-n2495059507-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0071389729-w147150672-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0130455978-w145925771-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0139624280-w147150701-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-007978-w194531715-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0096243499-w194531714-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0162691059-w164700540-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0202220208-w164698447-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0287053820-n2270743890-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0013765594-w145949343-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0026389358-w147150656-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0093401200-w162848869-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0203517601-n2270743895-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0217217459-n2270743898-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0004074128-w145925740-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0007815983-w175372179-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0039697750-w164700428-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0053473990-w271448695-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0129549665-w162848862-regular",
      "_score" : 1.0
    } ]
  }
}

And this is with one shard
{
  "took" : 123,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 656523,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675290941-w116699544-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675314442-n1557245109-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675335611-w210502362-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675352866-w245359553-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675354643-w235622232-regular",
      "_score" : 1.0
    } ]
  }
}



Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Brian
Thanks so much, Ivan. That's a very important distinction.

Brian

On Friday, June 6, 2014 12:28:56 PM UTC-4, Ivan Brusic wrote:

 Plugins are essential to ES's success and are not going away any time 
 soon. The river plugins, aka cluster singletons, are the ones which are 
 discouraged from use. Go ahead and create more plugins!

 -- 
 Ivan




Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Sorry, that was a slightly different dataset.
Here is the response with the same data and 1 shard:
{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 3879,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0004074128-w145925740-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0007815983-w175372179-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0013765594-w145949343-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0015203925-w147150792-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0016087997-w162848733-regular",
      "_score" : 1.0
    } ]
  }
}



Re: A couple of questions about keys and routing

2014-06-06 Thread Chris Berry
I apologize for having to reply to my own message, 
but please replace all CID with MID below. I started editing the text and 
inadvertently hit Send.
Thanks,
-- Chris 

On Friday, June 6, 2014 1:03:20 PM UTC-5, Chris Berry wrote:

 Greetings,

 I have a couple of questions about keys and routing.

 Let’s imagine that I have:

    1. A set of time-based Indexes. The indexes are time-based because the
    overall set is unbounded (it grows by over 1M/day). Thus, we'll have,
    say, an Index per Quarter to keep them bounded individually and
    therefore maintain predictable performance.
    2. The documents stored in these Indexes are “keyed” by MID and
    “routed to” by PID.
    3. Every document has a unique MID (its _id) and contains a single PID
    field whose value is not unique across documents.
    4. The common use case is to want all the MIDs for a given PID.

 So no problem there. (Hopefully that makes sense…)

 My first question; when I lookup by MID in a given Index, and I do not 
 have a PID (routing key), is that an inefficient lookup? 
  I.e. will it have to scan all of the Shards to find it??
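
 To illustrate the first question: Elasticsearch picks the shard for a document as hash(routing) % number_of_primary_shards, where the routing value defaults to the document id. The sketch below mimics that rule with zlib.crc32 as a stand-in hash (Elasticsearch uses a different hash function), and all names in it (shard_for, index_doc, get_doc, scan_all_shards) are hypothetical. It shows why a lookup by MID alone is not guaranteed to reach the shard that the PID routing selected.

```python
import zlib

NUMBER_OF_SHARDS = 5  # assumed value, for illustration only


def shard_for(routing, number_of_shards=NUMBER_OF_SHARDS):
    """Stand-in for the routing rule: hash(routing) % number_of_shards."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_shards


def index_doc(shards, mid, pid, source):
    # Documents are keyed by MID but placed on the shard chosen by PID.
    shards[shard_for(pid, len(shards))][mid] = source


def get_doc(shards, mid, routing=None):
    # Without a routing value, the id itself is hashed, so only one shard
    # is consulted, and it may not be the one that holds the document.
    key = routing if routing is not None else mid
    return shards[shard_for(key, len(shards))].get(mid)


def scan_all_shards(shards, mid):
    # What a search without routing has to do: ask every shard.
    for shard in shards:
        if mid in shard:
            return shard[mid]
    return None


shards = [dict() for _ in range(NUMBER_OF_SHARDS)]
index_doc(shards, "mid-1", "pid-42", {"pid": "pid-42"})
index_doc(shards, "mid-2", "pid-42", {"pid": "pid-42"})
```

 In this model, get_doc(shards, "mid-1", routing="pid-42") always finds the document, get_doc(shards, "mid-1") finds it only if both keys happen to hash to the same shard, and scan_all_shards finds it at the cost of touching every shard.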

 And my second question is really more about design.
 I also need to be able to look up a document by its CID, and I don’t 
 necessarily know which time-based Index I will find it in. (E.g. it may 
 have been inactive and then be resurrected.) 
 I was hoping to avoid having some sort of metadata Index that could yield 
 this info for me (i.e. given a CID, return its PID and current Index) 
 because it will end up being unbounded (we are talking billions of entries 
 eventually). 
 Perhaps this won’t really matter because it is only looked up by key, 
 which should be fast as it would go to the correct Shard. 
 And the data set is only a few bytes?? But still, it seems like I would 
 be creating an eventual problem.
 Although the alternative, looking in every time-based index, seems much 
 worse.

 Any advice would be greatly appreciated.

 Thanks,
 — Chris 






Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
So? What's wrong here?

You asked for 5 docs and you get 5.

I'm missing something I guess.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
I asked for 5 docs.

With 1 shard - I got 5 docs.
With 5 shards - I got 25 docs.
With 5 shards, using curl instead of embedded java client - I got 5 docs.



Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread ES USER
I guess the problem I'm having wrapping my head around is exactly where the 
data resides and in what format.

If I understand Georgi's email above, you can run MapReduce jobs against 
data stored in local ES by utilizing es-hadoop, and you can also run ES 
queries against data in Hadoop utilizing es-hadoop.


  Is that correct?



On Friday, June 6, 2014 12:39:44 PM UTC-4, Costin Leau wrote:

 Adding to what Georgi wrote, es-hadoop does not create the shards for you 
 - that's up to you or index templates (which I highly recommend). However 
 es-hadoop is aware of the target shards and will use them to parallelize 
 the reads/writes (such as one task per shard).


 On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi@gmail.com wrote:

 And I don't think this is in any way related to the number of shards and nodes.


 On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

 I have read everything I can find on the ES website about this. I 
 understand somewhat how the integration works, but not the actual nuts 
 and bolts of it.

 For example:

 Is Hadoop just storing the files that would normally be stored in the 
 local filesystem for the ES indexes, or is it storing the data that would 
 normally be in those indexes, which is then accessed through es-hadoop?

 If it is the latter, how do you go about determining what to set for the 
 number of nodes and shards?


 If anyone has any information on this or even better yet a place to 
 point me to that has better references so that I can research this on my 
 own it would be much appreciated.

 Thanks.







IllegalArgumentException[No type mapped for [43]], version 1.2.1

2014-06-06 Thread Brooke Babcock
In one part of our application we use Elasticsearch as an object store. 
Therefore, when indexing, we supply our own _id. Likewise, when accessing a 
document we use the simple GET method to fetch by _id. This has worked well 
for us, up until recently. Normally, this is what we get:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test1?pretty=true'
{
  "_index" : "data-2014.06.06",
  "_type" : "key",
  "_id" : "test1",
  "_version" : 1,
  "found" : true,
  "_source" : {"sData": "test data 1"}
}


Now, we often encounter a recently indexed document that throws the 
following error when we try to fetch it:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test2?pretty=true'
{
  "error" : "IllegalArgumentException[No type mapped for [43]]",
  "status" : 500
}



This condition persists anywhere from 1 to 25 minutes or so, at which point 
we no longer receive the error for that document and the GET succeeds as 
normal. From that point on, we are able to consistently retrieve that 
document by _id without issue. But, soon after, we will find a different 
newly indexed document caught in the same bad state.

We know the documents are successfully indexed. Our bulk sender (which uses 
the Java transport client) indicates no error during indexing and we are 
still able to locate the document by doing an ids query, such as:

curl -XPOST "http://127.0.0.1:9200/data-2014.06.06/key/_search?pretty=true" -d '
{
  "query": {
    "ids": {
      "values": ["test2"]
    }
  }
}'

Which responds:
{
  "took": 543,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [ {
      "_index": "data-2014.06.06",
      "_type": "key",
      "_id": "test2",
      "_score": 1.0,
      "_source": {"sData": "test data 2"}
    } ]
  }
}


We first noticed this behavior in version 1.2.0. When we upgraded to 1.2.1, 
we deleted all indexes and started with a fresh cluster. We hoped our 
problem would be solved by the big fix that came in 1.2.1, but we are still 
regularly seeing it. Although our situation may sound like the routing bug 
introduced in 1.2.0, we are certain that it is not. This appears to be a 
significant issue with the translog - we hope the developers will be able 
to look at what may have changed. We did not notice this problem in version 
1.1.1.

Just in case, here is the mapping being used:
curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/_mapping?pretty=true'
{
  "data-2014.06.06" : {
    "mappings" : {
      "key" : {
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "sData" : {
            "type" : "string",
            "index" : "no"
          }
        }
      }
    }
  }
}


Thanks for your help.





Bug in context suggester?

2014-06-06 Thread Gilbert Chang
I am observing an issue with field reference in context suggester. Consider 
the following use case:

PUT /services/service/_mapping
{
  "service": {
    "properties": {
      "name": {
        "type": "string"
      },
      "network_id": {
        "type": "long"
      },
      "suggest_field": {
        "type": "completion",
        "context": {
          "network": {
            "type": "category",
            "path": "network_id"
          }
        }
      }
    }
  }
}

PUT /services/service/1
{
  "name": "knapsack",
  "network_id": 1,
  "suggest_field": {
    "input": ["knacksack", "backpack", "daypack"]
  }
}

POST /services/_suggest?pretty
{
  "suggest": {
    "text": "k",
    "completion": {
      "field": "suggest_field",
      "size": 10,
      "context": {
        "network": 1
      }
    }
  }
}

The suggest query does not return any option. However, if I change the 
field type of network_id to 'string' (and not_analyzed), the suggest query 
works. Explicitly indexing the network_id as the context in the 
suggest_field works as well:

PUT /services/service/1
{
  "name": "knapsack",
  "network_id": 1,
  "suggest_field": {
    "input": ["knacksack", "backpack", "daypack"],
    "context": {"network_id": 1}
  }
}

Seems like a bug?



Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Any chance you could share your code? In particular, I'd like to see how you 
run the query.

If you could reproduce it with a test case that would be awesome.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On 6 June 2014 at 20:26, Дмитрий Киселев dmitry.v.kise...@gmail.com wrote:
 
 I asked for 5 docs.
 
 With 1 shard - I got 5 docs.
 With 5 shards - I got 25 docs.
 With 5 shards, using curl instead of embedded java client - I got 5 docs.
 
 
 2014-06-06 20:20 GMT+02:00 David Pilato da...@pilato.fr:
 So? What's wrong here?
 
 You asked for 5 docs and you get 5.
 
 I'm missing something I guess.
 
 
 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
 
 On 6 June 2014 at 20:02, Дмитрий Киселев dmitry.v.kise...@gmail.com wrote:
 
 Sorry, that was a slightly different dataset.
 Here is the answer with the same data and 1 shard:
 {
   took : 63,
   timed_out : false,
   _shards : {
 total : 1,
 successful : 1,
 failed : 0
   },
   hits : {
 total : 3879,
 max_score : 1.0,
 hits : [ {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0004074128-w145925740-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0007815983-w175372179-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0013765594-w145949343-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0015203925-w147150792-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0016087997-w162848733-regular,
   _score : 1.0
 } ]
   }
 }
 
 
 2014-06-06 19:54 GMT+02:00 David Pilato da...@pilato.fr:
 The total hit number is really inconsistent:
 
 With one shard you get: 656523
 With 5 shards you get: 3879
 
 I think you are doing something wrong but I can't tell more without 
 looking at the full source code.
 Could you share how you actually execute the query?
 
 Are you sure your Java client is connected to the right instance/cluster?
 
 
 -- 
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr
 
 
 On 6 June 2014 at 19:49:47, Дмитрий Киселев (dmitry.v.kise...@gmail.com) wrote:
 
 This is with 5 shards.
 {
   took : 81,
   timed_out : false,
   _shards : {
 total : 5,
 successful : 5,
 failed : 0
   },
   hits : {
 total : 3879,
 max_score : 1.0,
 hits : [ {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0016087997-w162848733-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0088827105-n2270743905-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0097856729-n2270743903-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0145983393-w154644839-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0209772668-n1884206099-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0015203925-w147150792-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0017569140-n2495059507-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0071389729-w147150672-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0130455978-w145925771-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0139624280-w147150701-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-007978-w194531715-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0096243499-w194531714-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0162691059-w164700540-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0202220208-w164698447-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0287053820-n2270743890-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0013765594-w145949343-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0026389358-w147150656-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0093401200-w162848869-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : 

Re: Aggregation bug? Or user error?

2014-06-06 Thread mooky
 
OK. I have written a test case that (if run enough times) will reproduce it. 
It's an intermittent bug.
I have raised an issue:
https://github.com/elasticsearch/elasticsearch/issues/6435 



scala elastic4s usage question

2014-06-06 Thread Ramdev Wudali
Hi,
  I have started using the elastic4s Scala library. I am running into a 
problem creating a mapping that looks straightforward but contains complex 
object fields.

Example :

{
  "index": {
    "mappings": {
      "OA": {
        "properties": {
          "AdminStatus": {
            "properties": {
              "content": {
                "type": "string"
              },
              "effectiveFrom": {
                "type": "date",
                "format": "dateOptionalTime"
              }
            }
          },
          "IsPublicFlag": {
            "type": "boolean"
          },
          "OrganizationAddress": {
            "properties": {
              "OrganizationAddressCity": {
                "type": "string"
              },
              "OrganizationAddressCountryCode": {
                "type": "string"
              },
              "OrganizationAddressLine1": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}





I am not able to figure out how to define the OrganizationAddress field 
(which is a complex object).

If I define it as follows:

indexClient.execute {
  create index "index" mappings (
    "OA" as (
      "AdminStatus" as (
        "content" typed StringType,
        "effectiveFrom" typed DateType
      ),
      "IsPublicFlag" typed BooleanType,
      "OrganizationAddress" as (
        "OrganizationAddressCity" typed StringType,
        "OrganizationAddressLine1" typed StringType
      )
    )
  )
}

I get a compilation  error :
Error:(52, 69) type mismatch;
 found   : com.sksamuel.elastic4s.mapping.MappingDefinition
 required: com.sksamuel.elastic4s.mapping.TypedFieldDefinition
  AdminStatus as (
^

How can I map complex objects using elastic4s, the Scala client?

Thanks

Ramdev



Using Regex in Query via Kibana

2014-06-06 Thread Bill Paulson
I think the '$' on the end of the regex may cause trouble. Lucene regular 
expressions are already anchored to start and end of the string.
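
As a rough illustration of what "already anchored" means, here is a minimal Python sketch. It uses Python's re.fullmatch as a stand-in for Lucene's whole-term matching; Python's regex syntax is not identical to Lucene's, and the helper name lucene_style_match is made up for this example. As far as I know, Lucene's syntax has no '$' anchor operator, so a trailing '$' is likely to be read as a literal character rather than "end of string".

```python
import re


def lucene_style_match(pattern: str, term: str) -> bool:
    """Return True only if the pattern matches the whole term.

    Hypothetical helper: re.fullmatch stands in for Lucene's implicit
    anchoring at both ends of the term.
    """
    return re.fullmatch(pattern, term) is not None


# The pattern has to cover the entire term, so no '^' or '$' is needed.
print(lucene_style_match("err.*", "error"))    # pattern covers the whole term
print(lucene_style_match("rro", "error"))      # substring only, so no match
print(lucene_style_match(".*rro.*", "error"))  # wildcards on both sides
```

To match a substring under this kind of anchoring, pad the pattern with .* on both sides instead of adding anchors.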



Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread Costin Leau
ES stores data in its own internal format, which typically resides locally.
What you are stating is partially correct - with the connector you would
move/copy data between Hadoop and ES since, in order for ES to work with
data, it needs to actually index it (that is, to see it).
So you would use es-hadoop to index data from Hadoop into ES and/or query ES
directly from Hadoop.


On Fri, Jun 6, 2014 at 9:29 PM, ES USER es.user.2...@gmail.com wrote:

 I guess the problem I am having wrapping my head around is exactly where the
 data resides and in what format.

 If I understand Georgi's email above, you can run MapReduce jobs against
 data stored in a local ES by using es-hadoop, and you can also run ES
 queries against data in Hadoop using es-hadoop.


   Is that correct?




 On Friday, June 6, 2014 12:39:44 PM UTC-4, Costin Leau wrote:

 Adding to what Georgi wrote, es-hadoop does not create the shards for you
 - that's up to you or index templates (which I highly recommend). However
 es-hadoop is aware of the target shards and will use them to parallelize
 the reads/writes (such as one task per shard).


 On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi@gmail.com
 wrote:

 And I don't think this is in any way related to the number of shards and nodes.


 On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

 I have read everything I can find on the ES website about this. I
 understand somewhat how the integration works, but not the actual nuts
 and bolts of it.

 For example:

 Is Hadoop just storing the files that would normally be stored in the
 local filesystem for the ES indexes, or is it storing the data that would
 normally be in those indexes, which is then accessed through es-hadoop?

 If it is the latter, how do you go about determining what to set for the
 number of nodes and shards?


 If anyone has any information on this or even better yet a place to
 point me to that has better references so that I can research this on my
 own it would be much appreciated.

 Thanks.







Re: Need help setting up autocomplete suggestions using phrase suggester

2014-06-06 Thread Paul Bormans
I just wanted to let everyone know that I didn't succeed with the phrase 
suggester, so I switched to the completion suggester, which actually works 
very well for my purpose.
Paul



Re: Copy index from production to development instance

2014-06-06 Thread Antonio Augusto Santos
Take a look at stream2es https://github.com/elasticsearch/stream2es

On Friday, June 6, 2014 2:13:06 PM UTC-3, Brian Lamb wrote:

 I should also point out that I had to edit a file in the metadata-snapshot 
 file to change around the s3 keys and bucket name to match what development 
 was expecting.

 On Friday, June 6, 2014 1:11:57 PM UTC-4, Brian Lamb wrote:

 Hi all,

 I want to do a one time copy of the data on my production elastic search 
 instance to my development elastic search instance. Both are managed by AWS 
 if that makes this easier. Here is what I tried:

 On production:

 curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
   "type": "s3",
   "settings": {
     "access_key": "productionAccessKey",
     "bucket": "productionBucketName",
     "region": "region",
     "secret_key": "productionSecretKey"
   }
 }'
 curl -XPUT "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02"

 What this does is upload the instance to a production level s3 bucket.

 Then in the aws console, I copy all of it to a development level s3 
 bucket.

 Next on development:

 curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
   "type": "s3",
   "settings": {
     "access_key": "developmentAccessKey",
     "bucket": "developmentBucketName",
     "region": "region",
     "secret_key": "developmentSecretKey"
   }
 }'
 curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore"
 

 This gives me the following message:

 $ curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore?pretty=true"
 {
   "error" : "SnapshotException[[my_s3_repository:snapshot_2014_06_02] 
 failed to get snapshots]; nested: IOException[Failed to get 
 [snapshot-snapshot_2014_06_02]]; nested: AmazonS3Exception[Status Code: 
 404, AWS Service: Amazon S3, AWS Request ID: RequestId, AWS Error Code: 
 NoSuchKey, AWS Error Message: The specified key does not exist.]; ",
   "status" : 500
 }

 Also, when I try to get the snapshots, I get the following:

 $ curl -XGET localhost:9200/_snapshot/_status?pretty=true
 {
   "snapshots" : [ ]
 }

 This leads me to believe that I am not connecting the snapshot correctly 
 but I'm not sure what I am doing incorrectly. Regenerating the index on 
 development is not really a possibility as it took a few months to generate 
 the index the first time around. If there is a better way to do this, I'm 
 all for it. 

 Thanks,

 Brian Lamb





Re: Help with cluster.routing.allocation.enable Property

2014-06-06 Thread Ivan Brusic
The cluster state will be yellow while shards are being rebalanced, so you
can just execute a cluster health request and wait for green status:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-health.html#request-params

You can also poll the same API and wait until all the moving shards
(relocating_shards, initializing_shards, unassigned_shards) are 0.
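
The polling approach could look roughly like the following Python sketch. It assumes a node reachable at http://localhost:9200 and uses only the three counters named above from the cluster health response; the function names, the polling interval and the retry limit are made up for illustration.

```python
import json
import time
import urllib.request

# Counters from the cluster health response that indicate shards in motion.
MOVING_KEYS = ("relocating_shards", "initializing_shards", "unassigned_shards")


def shards_settled(health: dict) -> bool:
    """Return True when no shard is relocating, initializing or unassigned."""
    return all(health[key] == 0 for key in MOVING_KEYS)


def wait_until_settled(base_url="http://localhost:9200", interval=5.0, max_polls=120):
    """Poll /_cluster/health until all moving-shard counters reach zero.

    base_url, interval and max_polls are assumptions for this sketch.
    """
    for _ in range(max_polls):
        with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
            health = json.load(resp)
        if shards_settled(health):
            return health
        time.sleep(interval)
    raise TimeoutError("shards were still moving when the polling limit was reached")
```

A script could call wait_until_settled() after re-enabling allocation and only then switch allocation off again.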

Cheers,

Ivan


On Fri, Jun 6, 2014 at 11:03 AM, sagarl sagar...@gmail.com wrote:

 Hi folks,

 Our requirement is as follows:

 We have our ES cluster running on aws instances where nodes can come and
 go. We would like to disable Shard reallocation on cluster in Normal state
 so that Node going down should not rebalance the shards.(Just to avoid
 unnecessary Network i/o).

 When the node comes back up (within a few minutes), we would like to enable it
 so that the new node gets back its own shards, and after rebalancing is
 finished, we would like to disable shard reallocation again.

 We would like this to be an automated process rather than manual one.

 I have already taken a look at following stackoverflow question/answer
 http://stackoverflow.com/questions/16231614/quick-recovery-after-node-restart-in-elasticsearch
 but I am not sure if it can work in automated way.

 curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable": "none"}}'
 /etc/init.d/elasticsearch restart
 curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable": "all"}}'


 So I would like to know, is there any way to Listen to the event which
 says Shard Rebalancing is finished (by adding a new listener) and then
 disable cluster.routing.allocation.enable Property ?

 -- Thanks





Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
I've made a snippet.

Code
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java

Test results
https://github.com/kiselev-dv/es-test/blob/master/test.log
First test 1 shard (test.log line 27) - everything ok
Second test 5 shards (test.log line 86) - error

Search and paging generation
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L73

One more strange thing: the search didn't find anything until I added a small
delay
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L89



2014-06-06 20:53 GMT+02:00 David Pilato da...@pilato.fr:

 Any chance you could share your code? In particular, I'd like to see how
 you run the query.

 If you could reproduce it with a test case that would be awesome.

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On 6 June 2014 at 20:26, Дмитрий Киселев dmitry.v.kise...@gmail.com wrote:

 I asked for 5 docs.

 With 1 shard - I got 5 docs.
 With 5 shards - I got 25 docs.
 With 5 shards, using curl instead of embedded java client - I got 5 docs.


 2014-06-06 20:20 GMT+02:00 David Pilato da...@pilato.fr:

 So? What's wrong here?

 You asked for 5 docs and you get 5.

 I'm missing something I guess.


 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On 6 June 2014 at 20:02, Дмитрий Киселев dmitry.v.kise...@gmail.com wrote:

 Sorry, that was a slightly different dataset.
 Here is the answer with the same data and 1 shard:
 {
   took : 63,
   timed_out : false,
   _shards : {
 total : 1,
 successful : 1,
 failed : 0
   },
   hits : {
 total : 3879,
 max_score : 1.0,
 hits : [ {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0004074128-w145925740-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0007815983-w175372179-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0013765594-w145949343-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0015203925-w147150792-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0016087997-w162848733-regular,
   _score : 1.0
 } ]
   }
 }


 2014-06-06 19:54 GMT+02:00 David Pilato da...@pilato.fr:

 The total hit number is really inconsistent:

 With one shard you get: 656523
 With 5 shards you get: 3879

 I think you are doing something wrong but I can't tell more without
 looking at the full source code.
 Could you share how you actually execute the query?

 Are you sure your Java client is connected to the right instance/cluster?


  --
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr


 On 6 June 2014 at 19:49:47, Дмитрий Киселев (dmitry.v.kise...@gmail.com) wrote:

 This is with 5 shards.
 {
   took : 81,
   timed_out : false,
   _shards : {
 total : 5,
 successful : 5,
 failed : 0
   },
   hits : {
 total : 3879,
 max_score : 1.0,
 hits : [ {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0016087997-w162848733-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0088827105-n2270743905-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0097856729-n2270743903-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0145983393-w154644839-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0209772668-n1884206099-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0015203925-w147150792-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0017569140-n2495059507-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0071389729-w147150672-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0130455978-w145925771-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0139624280-w147150701-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-007978-w194531715-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0096243499-w194531714-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0162691059-w164700540-regular,
   _score : 1.0
 

Re: Need help setting up autocomplete suggestions using phrase suggester

2014-06-06 Thread Nikolas Everett
Yeah, sorry for not getting back to you earlier.  The phrase and term
suggesters are more for providing "did you mean"-style spelling
corrections than typeahead.  The completion suggester is for typeahead.
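
To make the distinction concrete, here is a toy Python sketch. It is only an analogy and not how Elasticsearch implements either suggester: a prefix filter stands in for typeahead, and difflib.get_close_matches stands in for spelling correction. The term list and both function names are made up for this example.

```python
import difflib

# Hypothetical vocabulary for the example.
TERMS = ["knapsack", "backpack", "daypack"]


def typeahead(prefix, terms=TERMS):
    """Completion-style lookup: return the terms that start with the typed prefix."""
    return [t for t in terms if t.startswith(prefix)]


def did_you_mean(word, terms=TERMS):
    """Spelling-style lookup: return the closest known term to a whole (misspelled) word."""
    return difflib.get_close_matches(word, terms, n=1)


print(typeahead("k"))           # prefix lookup works on partial input
print(did_you_mean("knapsak"))  # fuzzy lookup repairs a complete but misspelled word
```

A one-letter prefix gives the fuzzy lookup nothing useful to compare against, and a misspelled whole word is not a prefix of anything, which is roughly why the two kinds of suggester are not interchangeable.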


On Fri, Jun 6, 2014 at 5:05 PM, Paul Bormans pee...@gmail.com wrote:

 I just wanted to let anyone know i didn't succeed with the phrase
 suggester and i switched to the completion suggester, that actually works
 very well to my purpose.
 Paul





Re: Help with cluster.routing.allocation.enable Property

2014-06-06 Thread Drew Raines

sagarl wrote:

We have our ES cluster running on aws instances where nodes can 
come and go. We would like to disable Shard reallocation on 
cluster in Normal state so that Node going down should not 
rebalance the shards.(Just to avoid unnecessary Network i/o).


If your shard topology is relatively static, you could just leave 
allocation disabled (or set it to new_primaries for the occasional 
index creation).  But honestly you're trying to micro-manage a 
process that ES does pretty well.  There's nothing wrong with 
having a yellow cluster.  If you're concerned about availability 
you should add replicas.
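
For anyone scripting this, the request bodies for the different allocation modes could be generated with a small helper like the one below. This is a minimal sketch: the function name is hypothetical, the body has the same shape as the _cluster/settings calls in the original question, and the list of accepted values is my understanding of what cluster.routing.allocation.enable takes.

```python
import json

# Values accepted by cluster.routing.allocation.enable, as far as I know.
ALLOWED_MODES = ("all", "primaries", "new_primaries", "none")


def allocation_settings_body(mode: str) -> str:
    """Build the JSON body for a transient cluster settings update."""
    if mode not in ALLOWED_MODES:
        raise ValueError("unknown allocation mode: " + mode)
    return json.dumps({"transient": {"cluster.routing.allocation.enable": mode}})


print(allocation_settings_body("new_primaries"))
```

The resulting string is what would be sent with PUT to /_cluster/settings.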


Drew



[ElasticSearch Hadoop] Error in configuring object

2014-06-06 Thread bharath bhat
Hi, 

I'm trying to get a simple mapreduce job working with ES hadoop. I followed 
the docs to set up a job but I keep getting 'Error in configuring object' 
when I try to run it on Hadoop in pseudo distributed mode. I am using the 
old API with Hadoop 2.4.0. 

Here's my code I am using:

public class SomeMapper extends MapReduceBase implements Mapper<Object, Object, Text, MapWritable> {

    // Dummy
    public void map(Object key, Object value, OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {
        Text docId = (Text) key;
        MapWritable doc = (MapWritable) value;
        output.collect(docId, doc);
    }
}

public class SomeReducer extends MapReduceBase implements Reducer<Text, MapWritable, Text, MapWritable> {

    // Dummy
    public void reduce(Text key, Iterator<MapWritable> values, OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {
        Text docId = (Text) key;
        while (values.hasNext()) {
            MapWritable out = (MapWritable) (values.next());
            output.collect(docId, out);
        }
    }
}

public static void main(String[] args) throws Exception {

    JobConf conf = new JobConf();
    conf.setJobName("elastic search hello world");
    conf.setSpeculativeExecution(false);

    conf.set("es.nodes", "localhost:9200");
    conf.set("es.resource", "answers_development/answer");
    conf.set("es.resource.read", "answers_development/answer");
    conf.set("es.resource.write", "questions_development_20140603205720870/question");
    conf.set("es.query", "{}");

    conf.setOutputFormat(EsOutputFormat.class);
    conf.setInputFormat(EsInputFormat.class);

    conf.setMapOutputKeyClass(Text.class);
    conf.setMapOutputValueClass(MapWritable.class);

    conf.setMapperClass(SomeMapper.class);
    conf.setReducerClass(SomeReducer.class);

    JobClient.runJob(conf);

}

Here's the relevant portion of the stack trace:

14/06/06 21:35:27 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/06/06 21:35:28 INFO mapred.LocalJobRunner: map task executor complete.
14/06/06 21:35:28 WARN mapred.LocalJobRunner: job_local141518769_0001
java.lang.Exception: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 10 more
Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 15 more
Caused by: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at java.lang.Class.getConstructor0(Class.java:2810)
    at java.lang.Class.getDeclaredConstructor(Class.java:2053)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
    ... 16 more
14/06/06 21:35:28 INFO mapreduce.Job: Job job_local141518769_0001 failed with state FAILED due to: NA
14/06/06 21:35:28 INFO mapreduce.Job: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
    at com.edcast.cards.MapReduceHelloWorld.main(MapReduceHelloWorld.java:79)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at

Re: [ElasticSearch Hadoop] Error in configuring object

2014-06-06 Thread Costin Leau

Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 15 more
Caused by: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at java.lang.Class.getConstructor0(Class.java:2810)
    at java.lang.Class.getDeclaredConstructor(Class.java:2053)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)

In other words, your Mapper has no default constructor defined for it. That
is, Hadoop cannot instantiate your mapper because your class definition is
incorrect.



Re: [ElasticSearch Hadoop] Error in configuring object

2014-06-06 Thread Costin Leau
By the way, quickly looking at your class, it's likely because your
mapper and reducer are defined as inner classes but are not static, and
thus cannot be instantiated without an instance of their enclosing class.
In other words, declare them as 'static'.



Re: Help with cluster.routing.allocation.enable Property

2014-06-06 Thread sagarl
Thanks Ivan for the suggestions, I'll try to make use of them.

Drew, sorry, I didn't get the part where you said "There's nothing wrong with 
having a yellow cluster". And yes, I agree ES does a good job of rebalancing, 
but think of a case where you have close to a few hundred GB of data 
per shard, which will move around the cluster even though the new node takes 
only a few minutes to come back up. Hence, I'm looking to avoid unnecessary 
network I/O.

-- Sagar




Re: [ElasticSearch Hadoop] Error in configuring object

2014-06-06 Thread bharath bhat
That was it! Thank you. 




'Shard Allocation' dashboard in Marvel 1.2 showing SearchParseExceptions

2014-06-06 Thread philsttr
I'm running elasticsearch 1.2.1, and just upgraded to Marvel 1.2.

When I go to the new Shard Allocation dashboard in Marvel, I see the 
following error:

SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed; shardFailures {[5FNjdIADRE2rRuOkDZdieA][.marvel-2014.06.04][0]: RemoteTransportException[[][inet[/x.x.x.x:9300]][search/phase/query+fetch]];
nested: SearchParseException[[.marvel-2014.06.04][0]: from[0],size[300]: Parse Failure [Failed to parse source [{"size":300,"from":0,"fields":["@timestamp","message","status"],"sort":{"@timestamp":{"order":"desc"}},"query":{"filtered":{"filter":{"range":{"@timestamp":{"from":"2014-05-30T22:33:17.689Z","to":"2014-06-06T22:33:17.689Z"}}]]];
nested: SearchParseException[[.marvel-2014.06.04][0]: from[0],size[300]: Parse Failure [No mapping found for [@timestamp] in order to sort on]]; }]



I see the following in my elasticsearch.log


[2014-06-06 22:48:07,919][DEBUG][action.search.type   ] [] All shards failed for phase: [query_fetch]
[2014-06-06 22:48:17,937][DEBUG][action.search.type   ] [] [.marvel-2014.06.06][0], node[5FNjdIADRE2rRuOkDZdieA], [R], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@6af079e8]
org.elasticsearch.search.SearchParseException: [.marvel-2014.06.06][0]: from[-1],size[1]: Parse Failure [Failed to parse source [{"size":1,"sort":{"@timestamp":{"order":"desc"}}}]]
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:649)
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:511)
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:483)
    at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:324)
    at org.elasticsearch.search.action.SearchServiceTransportAction$11.call(SearchServiceTransportAction.java:308)
    at org.elasticsearch.search.action.SearchServiceTransportAction$11.call(SearchServiceTransportAction.java:305)
    at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.search.SearchParseException: [.marvel-2014.06.06][0]: from[-1],size[1]: Parse Failure [No mapping found for [@timestamp] in order to sort on]
    at org.elasticsearch.search.sort.SortParseElement.addSortField(SortParseElement.java:198)
    at org.elasticsearch.search.sort.SortParseElement.addCompoundSortField(SortParseElement.java:172)
    at org.elasticsearch.search.sort.SortParseElement.parse(SortParseElement.java:90)
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:633)
    ... 9 more


I have confirmed that marvel is upgraded on every node in the cluster.

I tried deleting the .marvel* indexes, which didn't help.
I also shut down the cluster and restarted it, which didn't help either.

Anybody have ideas?



UnavailableShardsException after loading 1.5M documents

2014-06-06 Thread Chris Mildebrandt


Hi all,


I'm using the Python API (pyes) to perform the bulk loading of our data, 
here's the important part of the code:


import os
from pyes import ES

max_docs = 1
es = ES(server='hadoop42.robinsystems.com:9200')

for prefix in xrange(1, 105):
    f_name = os.path.join('data', str(prefix) + '.json')
    with open(f_name, 'rb') as f:
        for line in f:
            es.index(line, str(prefix), 'my_type', bulk=True)


It loops through files (1.json, 2.json, 3.json, etc) and loads them into 
indexes ('1', '2', '3', etc). The API does 400 documents at a time. It hums 
along until about 1.5M documents, then the process fails with the following 
error:

Traceback (most recent call last):
  File "load_data.py", line 24, in <module>
    es.index(line, str(prefix), 'my_type', bulk=True)
  File "/usr/local/lib/python2.7/site-packages/pyes/es.py", line 729, in index
    return self.flush_bulk()
  File "/usr/local/lib/python2.7/site-packages/pyes/es.py", line 763, in flush_bulk
    return self.bulker.flush_bulk(forced)
  File "/usr/local/lib/python2.7/site-packages/pyes/models.py", line 204, in flush_bulk
    "\n".join(batch) + "\n")
  File "/usr/local/lib/python2.7/site-packages/pyes/es.py", line 441, in _send_request
    response = self.connection.execute(request)
  File "/usr/local/lib/python2.7/site-packages/pyes/connection_http.py", line 109, in execute
    self._local.server = server = self._get_server()
  File "/usr/local/lib/python2.7/site-packages/pyes/connection_http.py", line 145, in _get_server
    raise NoServerAvailable(ex)
pyes.exceptions.NoServerAvailable: list index out of range


After that, I can't even load one document into the system:

curl -XPOST http://hadoop42.robinsystems.com:9200/_bulk --data-binary @t.json

{"took":60001,"errors":true,"items":[{"create":{"_index":"21","_type":"my_type","_id":"unj0OWVgQZCNXYqfChaOVg","status":503,"error":"UnavailableShardsException[[21][5] [3] shardIt, [1] active : Timeout waiting for [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@5e27693e]"}}]}

The t.json file has one document in it. I restarted the cluster and I get 
the same error. All my primary shards are active, the replicas are coming 
up slowly. The current state of the cluster is yellow. I would expect to be 
able to still load documents in this state. 
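
One possible reading of the "[3] shardIt, [1] active" part of the error, assuming the index uses the default write consistency of quorum in Elasticsearch 1.x: with 2 replicas there are 3 copies of each shard, and a write waits until a majority of them (2) is active, so a shard with only its primary started would time out. The sketch below reproduces that rule as I understand it; the function names are hypothetical and this is not the actual Elasticsearch code.

```python
def required_active_copies(number_of_replicas, consistency="quorum"):
    """Active shard copies a write waits for (assumed ES 1.x behaviour)."""
    copies = number_of_replicas + 1  # the primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return copies
    if consistency == "quorum":
        # A majority is only enforced when there are more than two copies.
        return copies // 2 + 1 if copies > 2 else 1
    raise ValueError("unknown consistency level: %r" % (consistency,))


def can_write(number_of_replicas, active_copies, consistency="quorum"):
    """True if enough copies are active for the write to proceed."""
    return active_copies >= required_active_copies(number_of_replicas, consistency)


# The case from the error message: 3 copies configured, 1 active.
blocked = not can_write(number_of_replicas=2, active_copies=1)
```

Under that assumption, a yellow cluster with 1 replica still accepts writes, while one with 2 replicas needs at least one replica per shard to be started first.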

Here are some more details of our setup:


   - 6 node cluster with 256GB RAM, 120GB set as ES_HEAP
   - 104 indexes with 10 shards each and 2 replicas
   - Each index holds 80,000 documents and each document is about 20KB
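
For a sense of scale, the figures in the list above can be turned into shard and data counts. This is back-of-the-envelope arithmetic only: it assumes decimal units (1 GB = 1,000,000 KB), an even spread of shards across nodes, and no index overhead. The function names are hypothetical.

```python
def shard_copies(indexes, shards_per_index, replicas):
    """Total shard copies (primaries plus replicas) in the cluster."""
    return indexes * shards_per_index * (replicas + 1)


def data_size_gb(indexes, docs_per_index, doc_kb, replicas=0):
    """Approximate raw document volume in GB, counting replica copies."""
    return indexes * docs_per_index * doc_kb * (replicas + 1) / 1e6


total_copies = shard_copies(104, 10, 2)        # 3120 shard copies
copies_per_node = total_copies // 6            # 520 per node on 6 nodes
primary_gb = data_size_gb(104, 80000, 20)      # roughly 166 GB of source documents
with_replicas_gb = data_size_gb(104, 80000, 20, replicas=2)  # roughly 500 GB
```

So each node would carry several hundred shard copies, which is worth keeping in mind when sizing the heap.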
   

Any idea why I'd be unable to load documents into my cluster after this 
point?

Thanks,
-Chris



Keyword Position for Relevance

2014-06-06 Thread samuel dean
Hi,

I'm looking for help improving relevance based on keyword position.

Search term: ipad mini

Search result 1:
ipad mini 16 gb

Search result 2:
black leather case ipad mini / ipad air

Can search result 1 be made more relevant, since the match on the search 
term is at the beginning of the title? 
At the moment search result 2 scores higher because 'ipad' is mentioned 
twice. However, the best match for the search term is search result 1.
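
One way to favour matches near the start of a field is to add a span_first clause as an optional "should" next to the normal match query, so documents where the phrase starts the title get an extra score contribution. The sketch below only builds the query body as a Python dict and sends nothing; the field name "title", the boost value, and the helper name are assumptions for illustration, and the span_term values must be the analyzed (typically lowercased) tokens.

```python
def position_boosted_query(field, terms, boost=2.0):
    """Match on `terms`, scoring higher when they open the field (hypothetical helper)."""
    clauses = [{"span_term": {field: term}} for term in terms]
    if len(clauses) == 1:
        span = clauses[0]
    else:
        # The terms must appear adjacent and in the given order.
        span = {"span_near": {"clauses": clauses, "slop": 0, "in_order": True}}
    return {
        "bool": {
            # Every hit must still match the plain query.
            "must": {"match": {field: " ".join(terms)}},
            # Extra score when the span ends within the first len(terms) positions.
            "should": {"span_first": {"match": span, "end": len(terms), "boost": boost}},
        }
    }


query = {"query": position_boosted_query("title", ["ipad", "mini"])}
```

With this shape, "ipad mini 16 gb" satisfies both clauses, while "black leather case ipad mini / ipad air" only satisfies the match clause.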

Thanks

Sam



Re: What's using memory in ElasticSearch? (Details to follow...)

2014-06-06 Thread jegansp
Thanks for your replies, Mark and Jörg.

My index size is around 500GB.

After applying the settings (provided in my last post), the segment count
initially came down to around 200 (without any optimize calls), but after
some time it started increasing again and now stands at 2000+ on each node.
Is this expected?

I will certainly try the optimize API on each index. Are there any other
settings I need to be aware of?

Thanks,

Jegan





Re: UnavailableShardsException after loading 1.5M documents

2014-06-06 Thread Chris Mildebrandt
It looks like the problem is on my end. I misplaced the HEAP size parameter 
and was only running with 1GB. After bumping it up to a more respectable 
amount, the loading is humming along again.

-Chris



Kibana/nginx could not connect to elasticsearch

2014-06-06 Thread Harsh Jha
 

I have Elasticsearch and Kibana with nginx running on an EC2 instance, but 
Kibana/nginx is not able to connect to Elasticsearch, and I'm getting the error 
"Could not contact Elasticsearch at . Please ensure that 
Elasticsearch is reachable from your system."

When I look at the nginx error logs, I see:

[error] 13067#0: *1 access forbidden by rule, client: x, server: 127.0.0.1, request: GET /index.html HTTP/1.1, host: x
[error] 13283#0: *14 no user/password was provided for basic authentication, client: x, server: xx, request: GET //cgi-bin/php5 HTTP/1.1, host: x

My conf file is at http://pastebin.com/JuxifP2n



Re: Help with cluster.routing.allocation.enable Property

2014-06-06 Thread Drew Raines

sagarl wrote:

Drew, sorry, I didn't get the part where you said "There's nothing 
wrong with having a yellow cluster". And yes, I agree ES does a 
good job of rebalancing, but think of a case where you have 
close to a few hundred GB of data per shard, which will move 
around the cluster even though the new node takes only a few 
minutes to come back up. Hence, I'm looking to avoid unnecessary 
network I/O.


If you were referring to /planned/ node outages, then I agree.  I 
thought you were trying to reinvent schemes for dealing with the 
unpredictable.  Apologies for misinterpreting!


Drew
