date:20140606

1. No. Did you change the configuration? Do you have two data nodes connected?

2. You do not need to be concerned about where primary shards are allocated;
replica shards play the same role (except that primaries receive writes a
few milliseconds earlier than replicas). Elasticsearch distributes shards
across all the available nodes. If nodes go away, replica shards on the
remaining nodes are promoted to primaries automatically.

Jörg
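
The answer to question 1 can be checked with a little arithmetic. The following is a minimal sketch, not Elasticsearch code: it assumes only the rule that two copies of the same shard are never placed on the same data node, and it ignores everything else that can leave replicas unassigned in practice (disk thresholds, allocation filtering, nodes started with data disabled). The class and method names are made up for illustration.

```java
/** Toy model of the rule that copies of one shard never share a data node. */
final class ShardCopyCheck {

    /** Copies of one shard that can be assigned: at most one per data node. */
    static int assignableCopies(int dataNodes, int replicas) {
        int wanted = 1 + replicas;            // one primary plus its replicas
        return Math.min(wanted, dataNodes);
    }

    /** Health colour this toy model predicts for a single index. */
    static String health(int dataNodes, int replicas) {
        int assigned = assignableCopies(dataNodes, replicas);
        if (assigned == 0) {
            return "red";                     // not even the primary fits
        }
        if (assigned < 1 + replicas) {
            return "yellow";                  // some replicas stay unassigned
        }
        return "green";
    }

    public static void main(String[] args) {
        // index.number_of_replicas: 1, as in the question
        for (int nodes = 1; nodes <= 3; nodes++) {
            System.out.println(nodes + " data node(s): " + health(nodes, 1));
        }
    }
}
```

Under these assumptions two data nodes are enough for a green cluster with one replica, which is why a yellow state on two nodes suggests that one of them is not acting as a data node or that some other allocation setting is in play.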

On Fri, Jun 6, 2014 at 5:41 AM, flyer flyer...@gmail.com wrote:

I have a cluster of two nodes, and set the configs for shard number and
replica number as follows:

index.number_of_shards: 10

index.number_of_replicas: 1

The master node is elected automatically.

Before I index data, the state of the cluster is green. After I index
data, the state of the cluster becomes yellow. Some replica shards are not
allocated.

But if I add another node and now the cluster consists of three nodes,
then I index data and the state of the cluster becomes green.

The version of ES that I use is 1.1.

Question 1:
If the replica number is 1, do I need at least 3 nodes to
ensure that the state of the cluster is green?

Question 2:
I have observed that one (not the master node) of the three nodes just
has primary shards. What's the mechanism ES uses to allocate primary/replica
shards?

Thanks anyway.

--
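Unmoved by favor or disgrace, I idly watch the flowers in the courtyard bloom and fall; indifferent to staying or leaving, I drift with the clouds as they gather and scatter across the sky.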

--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAKLbBG-o6JmkbPAAvGBP8t8Bdp-ONEWEFK5kGW%3Dcyn-h-O6dsg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: A plugin to change the result set before sending it back to the http client

I drink only Kölsch :) But so far it has always turned out fine.

Greetings from Cologne!

Jörg

On Fri, Jun 6, 2014 at 7:14 AM, Mario Mueller ma...@xenji.com wrote:

You guys are totally awesome! Thanks a lot! If you ever visit Duesseldorf
drop me a line, I owe you a beer.

@Brian:
Interesting approach, but wouldn't this go against the initial "no
additional proxy" requirement, if I understood you correctly?

On Thursday, June 5, 2014 at 19:45:33 UTC+2, Mario Mueller wrote:

Hey folks,

I kindly ask for a hint to achieve the following thing:

The goal is to deliver only a JSON array of source objects to the client.
The PHP app that sits on the other side uses JMS\Serializer to deserialize
the response into entities. At the moment the app incurs the overhead of
deserializing the response, extracting the source, and serializing it again.
The serialized result is then passed to the entity deserializer. That's
really painful.

I've found a thread that suggests a proxy in between to handle this type
of problem, but this is not possible in our env.

The real question is: Is this achievable by writing a plugin? And if so,
what type of plugin? I've looked at the RestFilter, but I don't know if
this is the right way to go...

Any hints are welcome!

Regards,
Mario

Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread ohw

Hi folks

I just asked a question on Stack Overflow; please have a look if you have
encountered a similar problem or have some input on it.

Thanks in advance!

Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Boaz Leskes

Hi Pawel,

I see. Your cluster state (nodes + routing only, not metadata) seems to
be larger than 16KB when rendered to SMILE, which is quite big. Does this
make sense?

Above 16KB, an underlying paging system introduced in the ES 1.x branch
kicks in. That breaks something in Marvel, which normally ships very small
documents.

I'll work on a fix. Can you confirm your cluster state (again, without the 
metadata) is indeed very large?

Cheers,
Boaz

On Thursday, June 5, 2014 10:56:00 AM UTC+2, Paweł Krzaczkowski wrote:

Hi.

After upgrading Marvel to 1.2.0 (running on Elasticsearch 1.2.1) I'm
getting errors like:

[2014-06-05 10:47:25,346][INFO ][node ] [es-m-3] version[1.2.1], pid[68924], build[6c95b75/2014-06-03T15:02:52Z]
[2014-06-05 10:47:25,347][INFO ][node ] [es-m-3] initializing ...
[2014-06-05 10:47:25,367][INFO ][plugins  ] [es-m-3] loaded [marvel, analysis-icu], sites [marvel, head, segmentspy, browser, paramedic]
[2014-06-05 10:47:28,455][INFO ][node ] [es-m-3] initialized
[2014-06-05 10:47:28,456][INFO ][node ] [es-m-3] starting ...
[2014-06-05 10:47:28,597][INFO ][transport] [es-m-3] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.0.212:9300]}
[2014-06-05 10:47:42,340][INFO ][cluster.service  ] [es-m-3] new_master [es-m-3][0H3grrJxTJunU1U6FmkIEg][es-m-3][inet[192.168.0.212/192.168.0.212:9300]]{data=false, master=true}, reason: zen-disco-join (elected_as_master)
[2014-06-05 10:47:42,350][INFO ][discovery] [es-m-3] freshmind/0H3grrJxTJunU1U6FmkIEg
[2014-06-05 10:47:42,365][INFO ][http ] [es-m-3] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.0.212:9200]}
[2014-06-05 10:47:42,368][INFO ][node ] [es-m-3] started
[2014-06-05 10:47:44,098][INFO ][cluster.service  ] [es-m-3] added {[es-m-1][MHl5Ls-cRXCwc7OC-P0J5w][es-m-1][inet[/192.168.0.210:9300]]{data=false, machine=44454c4c-5300-1052-8038-b9c04f5a5a31, master=true},}, reason: zen-disco-receive(join from node[[es-m-1][MHl5Ls-cRXCwc7OC-P0J5w][es-m-1][inet[/192.168.0.210:9300]]{data=false, machine=44454c4c-5300-1052-8038-b9c04f5a5a31, master=true}])
[2014-06-05 10:47:44,401][INFO ][gateway  ] [es-m-3] recovered [28] indices into cluster_state
[2014-06-05 10:47:48,683][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
java.lang.IllegalStateException: array not available
    at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.exportXContent(ESExporter.java:252)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.exportEvents(ESExporter.java:161)
    at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.exportEvents(AgentService.java:305)
    at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:240)
    at java.lang.Thread.run(Thread.java:745)
[2014-06-05 10:47:58,738][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
java.lang.IllegalStateException: array not available
    at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.exportXContent(ESExporter.java:252)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.exportEvents(ESExporter.java:161)
    at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.exportEvents(AgentService.java:305)
    at org.elasticsearch.marvel.agent.AgentService$ExportingWorker.run(AgentService.java:240)
    at java.lang.Thread.run(Thread.java:745)
[2014-06-05 10:48:36,572][INFO ][cluster.service  ] [es-m-3] added {[es-m-2][e5uEqGRhS7uEioNxaYkwTg][es-m-2][inet[/192.168.0.211:9300]]{data=false, master=true},}, reason: zen-disco-receive(join from node[[es-m-2][e5uEqGRhS7uEioNxaYkwTg][es-m-2][inet[/192.168.0.211:9300]]{data=false, master=true}])
[2014-06-05 10:48:38,859][ERROR][marvel.agent ] [es-m-3] exporter [es_exporter] has thrown an exception:
java.lang.IllegalStateException: array not available
    at org.elasticsearch.common.bytes.PagedBytesReference.array(PagedBytesReference.java:289)
    at org.elasticsearch.marvel.agent.exporter.ESExporter.addXContentRendererToConnection(ESExporter.java:209)
    at

Re: Correct way to use TransportClient connection object

Closing the transport client may not be enough.

Try this:

- wait for all outstanding actions to complete (all actions send their
responses asynchronously)
- then shut down client.threadpool() (perhaps with shutdownNow() or
shutdown()); this effectively prevents new actions from being started
- then close the transport client with client.close()
- then set the client instance to null to allow GC to clean up

Also, you should put the static transport client object in a
ServletContextListener so the web app container is able to manage start and
stop in contextInitialized() and contextDestroyed().

Besides this, memory leak warnings from Tomcat are just warnings. They may
be right or wrong.

Jörg
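
The shutdown order described above can be sketched with the JDK alone. This is an illustration under stated assumptions, not Elasticsearch or servlet code: a plain ExecutorService stands in for the client's thread pool, an AutoCloseable stands in for the TransportClient, and ClientHolder with its start() and stop() methods is a hypothetical name. In a web application, start() and stop() would presumably be called from contextInitialized() and contextDestroyed().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Owns one shared client-like resource for the lifetime of the web app. */
final class ClientHolder {

    // Stand-in for the thread pool that runs the client's asynchronous actions.
    private static ExecutorService pool;
    // Stand-in for the transport client instance.
    private static AutoCloseable client;

    /** Call once at startup. */
    static synchronized void start(AutoCloseable newClient) {
        pool = Executors.newFixedThreadPool(2);
        client = newClient;
    }

    /** Runs an action on the pool; rejected once stop() has been called. */
    static synchronized void submit(Runnable action) {
        if (pool == null) {
            throw new IllegalStateException("client is not running");
        }
        pool.execute(action);
    }

    /** Call once at shutdown. */
    static synchronized void stop(long timeoutSeconds) {
        if (client == null) {
            return;                            // never started or already stopped
        }
        pool.shutdown();                       // refuse new actions, keep running ones
        try {
            // wait for outstanding actions, then give up on stragglers
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        try {
            client.close();                    // release the client itself
        } catch (Exception e) {
            throw new IllegalStateException("closing the client failed", e);
        } finally {
            client = null;                     // drop references so GC can clean up
            pool = null;
        }
    }

    static synchronized boolean isRunning() {
        return client != null;
    }

    public static void main(String[] args) {
        start(() -> System.out.println("client closed"));
        submit(() -> System.out.println("search finished"));
        stop(5);
        System.out.println("running: " + isRunning());
    }
}
```

The point of the design is that the client is created once and closed once, by the container lifecycle, rather than after every search call. Closing after each request is consistent with the NoNodeAvailableException reported in the question, since later requests would reuse a closed client.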


On Fri, Jun 6, 2014 at 7:43 AM, Subhadip Bagui i.ba...@gmail.com wrote:

 Hi,

I'm using the code below to get a singleton TransportClient object. I call
getInstance() to get the client object that is already alive in the web
application.

public static Client getInstance()
{
    // Lazily create the shared client on first use.
    if (instance == null)
    {
        logger.debug("the client instance is null, creating a new instance");
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("node.client", true)
                .put("node.data", false)
                .put("node.name", "node-client")
                .put("cluster.name", "elasticsearch")
                .build();
        instance = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("10.203.238.139", 9300));
        logger.debug("returning the newly created client instance...");
    }
    return instance;
}

I call the client as below from the search API:
Client client = ESClientFactory.getInstance();

Now the issue is that if I don't close the client with client.close(), I get
a memory leak warning from the Tomcat web server. If I do close the
connection with client.close() after the search API call, then I get a
NoNodeAvailableException.

Please suggest the correct way to use the connection object.

 Thanks,
 Subhadip

Re: Analyzing queries in the client side of Elasticsearch but not on the server

Please ask your question here. Thanks.

Jörg


Re: Analyzing queries in the client side of Elasticsearch but not on the server

2014-06-06 Thread ohw

Sure, here it is:

We are migrating our Lucene-based search codebase to Elasticsearch. The
major problem we encountered is how we should migrate our QueryParsers.

In our old solution, the QueryParsers take a human-entered query string
and transform it into Lucene's Query object, which is then fed into
Lucene's IndexSearcher. In Elasticsearch, however, we don't interact with
IndexSearcher directly; instead we can only build queries on the client
side using the Query DSL
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html)
and send the JSON to the Elasticsearch server. The server then (possibly)
rewrites/analyzes the JSON query to build a Lucene query.

To make use of the existing, sophisticated logic in our QueryParsers, we
decided that we could stick to our old approach by:

1. Explicitly telling Elasticsearch *NOT to analyze* any query at search
time.
2. Doing *ALL* the query-related analysis (tokenizing, synonyms, etc.) in
the Java client.
3. Assuming that Elasticsearch's Query DSL is more or less a one-to-one
mapping to Lucene's Query.

The questions are:

1. Is this approach feasible?
2. What are the potential problems in doing so?
3. What is the best practice?

By the way, don't worry about the scoring process; we are writing our
scorer scripts as an Elasticsearch plugin.

---

Thank you!

Odin
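
One way to picture the approach described above is to analyze the input on the client and emit only term queries, which Elasticsearch does not analyze at search time. The sketch below is a minimal illustration using the JDK only. The analyze() method is a hypothetical stand-in for the existing QueryParser logic, the field name "body" is made up, and the JSON is assembled by hand purely to keep the example self-contained (a real client would use a JSON library or the Java API's query builders). It also assumes the indexed terms were produced by a compatible analysis chain, otherwise the term queries will not match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Builds a Query DSL body from tokens that were analyzed on the client. */
final class ClientSideQuery {

    /** Hypothetical stand-in for the existing client-side analysis chain. */
    static List<String> analyze(String input) {
        return Arrays.asList(input.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    /** Quotes a JSON string; escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** One term query per token, combined in a bool/must clause. */
    static String toQueryDsl(String field, List<String> tokens) {
        StringBuilder must = new StringBuilder();
        for (String token : tokens) {
            if (must.length() > 0) {
                must.append(',');
            }
            must.append("{\"term\":{").append(quote(field)).append(':')
                .append(quote(token)).append("}}");
        }
        return "{\"query\":{\"bool\":{\"must\":[" + must + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(toQueryDsl("body", analyze("Quick  Brown fox")));
    }
}
```

Whether every Lucene Query produced by the old parsers has such a direct Query DSL counterpart is exactly the open question in point 3, so this covers only the simplest case.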

Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Mario Mueller

What will be, will be ;) The Kölsch is no problem, you can get it here in
DUS too.

Again thanks to all!

Kibana-MySQL connection

2014-06-06 Thread srinu konda

Hi All,

I need help with Kibana: I need to connect to a MySQL database from Kibana.
Please let me know how I can achieve this, and share some pseudocode if
anyone has it.



Thanks & Regards,
Srinivas.

Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Paweł Krzaczkowski

Hi Boaz

My cluster state from http://192.168.0.211:9200/_cluster/state?human&pretty:

http://pastebin.com/ZEihQhc1 - well, you may say it's big ...

Pawel

Re: Marvel 1.2.0 java.lang.IllegalStateException

2014-06-06 Thread Paweł Krzaczkowski

This one is without metadata:

http://pastebin.com/tmJGA5Kq
http://xxx:9200/_cluster/state/version,master_node,nodes,routing_table,blocks/?human&pretty

Pawel

Re: If I set index.number_of_replicas: 1, then the minimum number of nodes should be 3 to assure that the status of the cluster is green?

2014-06-06 Thread flyer

The configurations on the two nodes are almost the same except for
node.name.

The first question is very strange.
I ran the same config on another cluster of two nodes, and after I indexed
data the state of the cluster was green. I have no idea why this
happened. Is there something I'm overlooking?

I want to know how ES allocates shards. Is there some reference? I googled
but couldn't find one.

Thank you :D

Re: Kibana-MySQL connection

2014-06-06 Thread Tomislav Poljak

Hi Srinu,
Kibana is a JavaScript search/visualisation front-end for Elasticsearch.
Elasticsearch is a distributed search engine and has no MySQL
connection/integration out of the box. However, there is a plugin,
https://github.com/jprante/elasticsearch-river-jdbc, which you can use
to import data from MySQL into Elasticsearch and then visualise it with
Kibana.

So steps would be:

1) Deploy elasticsearch
2) Install jdbc plugin with mysql driver in elasticsearch
3) Create and submit jdbc river configuration (with mysql connection
data and import query) which will import the data
4) Deploy kibana and visualise data
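
Step 3 could look roughly like the sketch below, which only builds the request path and JSON body for registering a river. The field names (`url`, `user`, `password`, `sql`) follow the plugin's README as far as I recall it, so check them against the plugin version you install. The river name, database URL, credentials and query are all placeholders.

```python
import json


def jdbc_river_request(river_name, jdbc_url, user, password, sql):
    """Return (path, body) for registering a JDBC river.

    The body layout is an assumption based on the river-jdbc README;
    every concrete value passed in below is a placeholder.
    """
    path = "/_river/{}/_meta".format(river_name)
    body = {
        "type": "jdbc",
        "jdbc": {
            "url": jdbc_url,
            "user": user,
            "password": password,
            "sql": sql,
        },
    }
    return path, json.dumps(body, indent=2)


if __name__ == "__main__":
    path, body = jdbc_river_request(
        "my_jdbc_river",                     # hypothetical river name
        "jdbc:mysql://localhost:3306/test",  # hypothetical database
        "dbuser",
        "dbpassword",
        "select * from orders",              # hypothetical import query
    )
    print("PUT", path)
    print(body)
```

The printed body would be sent with an HTTP PUT to the printed path on the Elasticsearch node.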

Hope this helps,
Tomislav



2014-06-06 10:02 GMT+02:00 srinu konda konda.srin...@gmail.com:
 Hi All,

 I need help with Kibana: I need to connect to a MySQL database from Kibana.
 Please let me know how I can achieve this, and share pseudocode if anyone
 has some.

 Thanks & Regards,
 Srinivas.


Re: Get by _id doesn't work but search does.

2014-06-06 Thread Luke Wilson-Mawer

Great, thanks Adrien. I will eagerly await the tool.

Kind regards,

Luke

On Thu, Jun 5, 2014 at 1:34 PM, Adrien Grand adrien.gr...@elasticsearch.com
wrote:

Hi,

This is very likely because of
https://github.com/elasticsearch/elasticsearch/pull/6393

See http://www.elasticsearch.org/blog/elasticsearch-1-2-1-released/ for
more information. We are currently working on a tool that will help
relocate documents to the right shard.

On Thu, Jun 5, 2014 at 1:54 PM, Luke Wilson-Mawer
lukewilsonma...@gmail.com wrote:

Hi,

I'm seeing weird behaviours with ids on elasticsearch 1.2.0 (recently
upgraded from 1.0.1).

A search retrieves my document, showing the correct value for _id:

[terminal] curl 'myServer:9200/global/_search?q=someField:something'

{"took":79,"timed_out":false,"_shards":{"total":12,"successful":12,"failed":0},"hits":{"total":1,"max_score":17.715034,"hits":[{"_index":"global","_type":"user","_id":"7a113e4f-44de-3b2b-a3f1-fb881da1b00a",...

But a direct lookup on id doesn't:
[terminal] curl 'myServer:9200/global/user/7a113e4f-44de-3b2b-a3f1-fb881da1b00a'

{"_index":"global","_type":"user","_id":"7a113e4f-44de-3b2b-a3f1-fb881da1b00a","found":false}

Any ideas? I'm hoping I won't have to reindex...

Kind regards,

Luke


--
Adrien Grand


Re: Analyzing queries in the client side of Elasticsearch but not on the server

The idea is:

- the basic entry point code for how search works is in
org.elasticsearch.rest.action.search.RestSearchAction, if you want to
expose an enhanced search to REST

- building the query works with
org.elasticsearch.search.builder.SearchSourceBuilder which has a convenient
method query(queryBuilder) for Java API

- org.elasticsearch.indices.query.IndicesQueriesModule is responsible for
managing the query parsers; it has addQuery() and addFilter() methods,
which must be invoked at plugin initialization time

- so you can write a pair of My...QueryBuilder and My...QueryParser for
each of your queries in your plugin

- copy/paste RestSearchAction to something like My...RestSearchAction with
a custom endpoint, for example _mysearch, and then you can use your query
implementation, wrapped in JSON, just like you would do in _search REST
action. The new REST endpoint must be registered in the plugin
initialization

- for studying implementation details, the existing standard query
parser/builder impls in org.elasticsearch.index.query are useful

Jörg

On Fri, Jun 6, 2014 at 11:19 AM, Heng Wang o...@zhihu.com wrote:

Thank you Jörg, I didn't realize that I can plug the query parsers into
elasticsearch, would you please elaborate more on this?

On Fri, Jun 6, 2014 at 4:53 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

The Query DSL is not equivalent to the Lucene Query classes, but it is
close, with enhancements.

If you want to make use of Lucene Query, and you have already decided to
write a plugin for scoring, why not add your query parsers to the same
plugin?

Jörg

On Fri, Jun 6, 2014 at 9:39 AM, ohw o...@zhihu.com wrote:

Sure, here it is:

We are migrating our Lucene based search codebase to Elasticsearch. The
major problem we encountered is how we should migrate our QueryParsers.

In our old solution, the QueryParsers take in a human input query
string, and transform that to Lucene's Query object, which is then fed
into Lucene's IndexSearcher. However, in Elasticsearch we don't directly
interact with IndexSearcher; instead we can only build the queries on the
client side using the Query DSL
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html)
and send the JSON to the Elasticsearch server. The Elasticsearch server then
(possibly) rewrites/analyses the JSON query to build a Lucene query.

To make use of our existing and sophisticated logic in QueryParsers, we
decided that we can stick to our old approach by:

1. Explicitly tell Elasticsearch *NOT to analyze* any query at search
time.
2. Do *ALL* the query-related analysis (tokenizing, synonyms, etc.) in
the Java client.
3. Assume that Elasticsearch's Query DSL maps more or less one-to-one
onto Lucene's Query classes.
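
A minimal sketch of what points 1 and 2 could look like on the client side, written in Python for brevity. It relies on the fact that a `term` query is not analyzed by Elasticsearch, so tokens produced by the client reach Lucene unchanged. The `client_side_analyze` function is a hypothetical stand-in for the existing QueryParser logic, and the field name is invented.

```python
import json


def client_side_analyze(text):
    """Hypothetical stand-in for the existing analysis chain:
    lowercases the input and splits it on whitespace."""
    return text.lower().split()


def build_term_query(field, user_input):
    """Build a bool query of term clauses from client-analyzed tokens.

    term queries are matched as-is against the index, so no server-side
    analysis is applied to the tokens.
    """
    tokens = client_side_analyze(user_input)
    return {
        "query": {
            "bool": {
                "must": [{"term": {field: token}} for token in tokens]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_term_query("title", "Quick Brown Fox"), indent=2))
```

For this to match anything, the tokens in the index have to be produced by the same analysis, which is one of the potential problems with the approach.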

The questions are:

1. Is this approach feasible?
2. What are the potential problems in doing so?
3. What is the best practice?

By the way, don't worry about the scoring process; we are writing our
scorer scripts as an Elasticsearch plugin.

---

Thank you!

Odin

On Friday, June 6, 2014 3:36:54 PM UTC+8, Jörg Prante wrote:

Please ask your question here. Thanks.

Jörg

On Fri, Jun 6, 2014 at 9:28 AM, ohw o...@zhihu.com wrote:

Hi folks

I just asked a question in StackOverflow, please have a look if you
have encountered similar problem or have some input to it.

Thanks in advance!


Re: If I set index.number_of_replica:1, then the minimum number of nodes should be 3 to assure that the status of the cluster is green?

You must use the same version throughout the cluster. A mix of different
versions is not always detected and no warning is given; it can result in
unexpected behavior or strange error messages.
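
One way to catch a mixed-version cluster early is to compare the versions that the nodes report. The sketch below assumes you have already fetched and parsed a nodes info response (for example from `GET /_nodes`) and that it has the shape `{"nodes": {node_id: {"name": ..., "version": ...}}}`. The sample data is invented.

```python
def node_versions(nodes_info):
    """Group node names by the version each node reports.

    `nodes_info` is assumed to be parsed JSON shaped like a nodes info
    response: {"nodes": {node_id: {"name": ..., "version": ...}}}.
    """
    by_version = {}
    for node in nodes_info.get("nodes", {}).values():
        version = node.get("version", "unknown")
        by_version.setdefault(version, []).append(node.get("name", "?"))
    return by_version


def has_mixed_versions(nodes_info):
    """True if the nodes report more than one distinct version."""
    return len(node_versions(nodes_info)) > 1


if __name__ == "__main__":
    sample = {  # invented sample, mirroring the two versions in this thread
        "nodes": {
            "a1": {"name": "node-1", "version": "1.1.1"},
            "b2": {"name": "node-2", "version": "1.1.2"},
        }
    }
    print(node_versions(sample))
    print("mixed:", has_mixed_versions(sample))
```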

Jörg

On Fri, Jun 6, 2014 at 11:27 AM, flyer flyer...@gmail.com wrote:

It was the ES version that caused the first problem.

In my previous test, one node used ES-1.1.1 and the other node used
ES-1.1.2. After I indexed data, the state of the cluster was yellow rather
than green. I think there are some compatibility problems between the two
versions.

Thank you very much :D

On Fri, Jun 6, 2014 at 4:49 PM, flyer flyer...@gmail.com wrote:

The configurations on the two nodes are almost the same except for
node.name.

I want to know how ES allocates shards. Is there a reference? I googled
but couldn't find one.

Thank you :D

On Fri, Jun 6, 2014 at 3:05 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

1. No. Did you change the configuration? You have two data nodes
connected?

Jörg


Re: Get by _id doesn't work but search does.

Look here for the tool and how to use it

http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/

Jörg


For any 1.2.0 users - You will want to read this

2014-06-06 Thread Mark Walkom

http://www.elasticsearch.org/blog/tool-help-routing-issues-elasticsearch-1-2-0/

In the 1.2.1 release we fixed a routing bug that had been introduced in
the 1.2.0 release. This was described in the 1.2.1 blog post. The contents
of this post apply only to users that ran 1.2.0.

Before we can discuss the tool we developed to help fix damage from the
bug, we have to understand the problem the bug created. When Elasticsearch
stores a document it has to decide which shard to put the document in. It
does this by computing a hash of the document’s “UID”, which is by default
the type#id tuple of the document. Elasticsearch then uses that hash modulo
the number of shards to pick a shard. We have to make sure that this hash
function works the same across different versions of Elasticsearch. If it
doesn’t, version X could expect to find a given document in shard 1 and
version X+1 could expect the same document in shard 2. In this case,
version X+1 could not directly get the document if it were indexed while
running version X.
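
The routing rule described above can be illustrated with a small sketch. The hash functions used here (`zlib.crc32` and `zlib.adler32`) are stand-ins chosen only because they ship with Python; they are not the hash Elasticsearch actually uses. The document type, id and shard count are taken from the "Get by _id" thread.

```python
import zlib


def pick_shard(doc_type, doc_id, num_shards, hash_fn=zlib.crc32):
    """Pick a shard as hash(type#id) modulo the number of shards.

    hash_fn is a stand-in; the real hash function in Elasticsearch differs.
    """
    uid = "{}#{}".format(doc_type, doc_id).encode("utf-8")
    return hash_fn(uid) % num_shards


if __name__ == "__main__":
    doc_id = "7a113e4f-44de-3b2b-a3f1-fb881da1b00a"
    # "Version X" and "version X+1" hashing the same UID differently:
    shard_at_index_time = pick_shard("user", doc_id, 12, hash_fn=zlib.crc32)
    shard_at_get_time = pick_shard("user", doc_id, 12, hash_fn=zlib.adler32)
    print("indexed into shard", shard_at_index_time)
    print("GET looks in shard", shard_at_get_time)
```

If the two numbers differ, a GET by id asks the wrong shard and reports the document as not found, while a search still finds it because a search queries every shard.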

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


Re: iptablex trojan experiences?

2014-06-06 Thread 'Adolfo Rodriguez' via elasticsearch

probably related 

http://bouk.co/blog/elasticsearch-rce/


Re: Impossible to implement real custom boost query when the weight is in the child document?

2014-06-06 Thread Csaba Dezsényi

I could find only one related post:
https://groups.google.com/forum/#!msg/elasticsearch/EGCeJZbhVtA/i32ROGVmFswJ
But that one asks a different question...


Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread ES USER

So if I understand you correctly, if the data is stored in Hadoop then
es-hadoop is really just acting as a job manager? If that is the case, what
is the rule of thumb for how many ES nodes and shards should be set?

On Thursday, June 5, 2014 6:45:09 PM UTC-4, Costin Leau wrote:

Think of es-hadoop as a connector between Hadoop and Elasticsearch. You
would use it to index data from Hadoop into ES or to run queries against ES
directly from Hadoop.
Where does ES store the data? That depends on its configuration
(completely separate from es-hadoop itself). In general (and by default)
it stores data on the local file-system. If you want to use a
shared file-system or HDFS, you can do that by mounting it locally
(for example, mount HDFS through NFS as a local disk) and pointing ES to it.
ES is happy to work with it; however, performance will be _significantly_
degraded and most of its real-time nature will go out the window,
since HDFS is a distributed file-system (and thus even basic operations
like opening or closing a file mean at least one call over the
network). On top of that, you give up the amazing OS file-system cache
(since the fs is not local). If the FS is slow, anything that sits on top
of it (like ES) will be slow as well.

Hope this helps,

P.S. By the way, if you want/need to snapshot/restore data to/from ES
from/to HDFS you can use the HDFS repository (more info here:
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/setup.html
)

On Thu, Jun 5, 2014 at 8:41 PM, ES USER es.use...@gmail.com wrote:

Try as I might, and I have read everything I can find on ES' website
about this, I understand somewhat how the integration works but not the
actual nuts and bolts of it.

For example:

Is Hadoop just storing the files that would normally be stored in the
local filesystem for the ES indexes or is it storing the data that would
normally be in those indexes and just accessed through es-hadoop?

If it is the latter, how do you go about determining what to set for the
number of nodes and shards?

If anyone has any information on this or even better yet a place to point
me to that has better references so that I can research this on my own it
would be much appreciated.

Thanks.


Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread Georgi Ivanov

Hmm, I am not sure I understand your questions.
Hadoop is a distributed storage system (HDFS) and a Map-Reduce framework
(MR), among other things.
ES is a distributed storage/search system (among other things).

So what es-hadoop gives you:

You can read data from ES and do some complex analysis, taking advantage
of MR.
You can write data to ES: for example, one can process data stored on HDFS
and write pre-aggregated data to ES.

es-hadoop is basically a connector between ES and Hadoop.

I hope this helps


Re: What's using memory in ElasticSearch? (Details to follow...)

No, the settings will not merge existing segments unless you call the
_optimize action via the API.

And have some patience. Thousands of segments take time to merge, and
merging also needs quite a lot of memory...

I suggest backing up your data first, to stay safe if the merge fails or
aborts...
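
For reference, a sketch of how the _optimize call could be prepared from Python with only the standard library. The host, port and index name are placeholders, and `max_num_segments=1` is just an example value. The function only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def optimize_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST to the index's _optimize endpoint.

    base_url and index are placeholders supplied by the caller.
    """
    query = urlencode({"max_num_segments": max_num_segments})
    url = "{}/{}/_optimize?{}".format(base_url.rstrip("/"), index, query)
    return Request(url, data=b"", method="POST")


if __name__ == "__main__":
    req = optimize_request("http://localhost:9200", "myindex")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. With thousands of segments, expect the call to block for a long time, so use a generous timeout.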

Jörg

On Thu, Jun 5, 2014 at 7:30 AM, jegansp jega...@gmail.com wrote:

Is this related to
https://github.com/elasticsearch/elasticsearch/issues/5779

We are also facing this issue. Our version is also 1.1.0. After running
fine for a few days, the cluster slowly starts facing memory issues. At some
point it spends almost 90% of the time in garbage collection, with full GCs
each taking more than 15 secs. Tuning JVM parameters didn't help us
either.

We have a two-node cluster with 120+ indexes. In our case the segment count
is too high (one node has 16000+ and the other has 25000+).

The optimize API call doesn't return either, as indicated here:
https://groups.google.com/forum/#!topic/elasticsearch/kqTRRADQBwc

I am going to try these settings to see if the situation improves.

https://gist.github.com/jprante/10666960

But I am not sure if this will merge existing segments.

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/What-s-using-memory-in-ElasticSearch-Details-to-follow-tp4056989p4057097.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


Max doc size for indexing over HTTP

2014-06-06 Thread eperezks

I'm trying to index a document over 1gb in size but I get the following 
error. 

org.elasticsearch.common.netty.handler.codec.frame.TooLongFrameException: 
HTTP content length exceeded 1073741824 bytes.
1073741824  = 1gb

My elasticsearch.yml has http.max_content_length set to 1900mb. I found
this post that says Netty has a 2gb limit (
https://github.com/elasticsearch/elasticsearch/issues/2237 ) so I was
expecting to be able to index documents of approximately 2gb. Does
elasticsearch impose a limit of 1gb even though I specified 1900mb? If not,
I can file an issue on github.


This is my node info

{
  "status" : 200,
  "name" : "SOURCEONE-elastic-ubuntu-4",
  "version" : {
    "number" : "1.2.0",
    "build_hash" : "c82387f290c21505f781c695f365d0ef4098b272",
    "build_timestamp" : "2014-05-22T12:49:13Z",
    "build_snapshot" : false,
    "lucene_version" : "4.8"
  },
  "tagline" : "You Know, for Search"
}


Re: Max doc size for indexing over HTTP

1gb is a very large document and it is unusual to index documents of that
size.

There is a limit check against the heap. To process a document of that
length, you need a large heap just to hold the document source.
Depending on the analyzer, heap demand increases even more.

You can index documents of arbitrary length if you preprocess them first
and split them into smaller chunks.
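
A rough sketch of such preprocessing, assuming the document is plain text and that splitting at fixed character offsets is acceptable. The field names `parent_id`, `chunk`, `total_chunks` and `content` are invented for this example; a real splitter would more likely cut at paragraph or sentence boundaries.

```python
def split_into_chunks(text, max_chars):
    """Cut text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def chunk_documents(doc_id, text, max_chars):
    """Turn one large document into several small ones that can be indexed
    separately and tied back together through parent_id (hypothetical field)."""
    chunks = split_into_chunks(text, max_chars)
    return [
        {
            "parent_id": doc_id,
            "chunk": number,
            "total_chunks": len(chunks),
            "content": content,
        }
        for number, content in enumerate(chunks)
    ]


if __name__ == "__main__":
    for doc in chunk_documents("big-doc-1", "abcdefghij", 4):
        print(doc)
```

Each resulting document can then be indexed on its own, for example through the bulk API, and a search on `parent_id` brings the pieces back together.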

Jörg


XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Enno Shioji

Hi, I created my index using this mapping JSON:

{
  "myidx": {
    "index.refresh_interval": "5m",
    "mappings": {
      "dynamic": "strict",
      "message": {
        "_ttl": {
          "enabled": true
        },
        "properties": {
          "my_nested_thing": {
            "type": "nested",
            "properties": {
              "some_id": {
                "type": "string", "index": "not_analyzed"
              },
              "count": {
                "type": "long"
              }
            }
          }
        }
      }
    }
  }
}

If I do a GET to _mapping after indexing some documents, it will 
essentially return:

{
  "message": {
    "properties": {
      "my_nested_thing": {
        "properties": {
          "some_id": {
            "type": "string", "index": "not_analyzed"
          },
          "count": {
            "type": "long"
          }
        }
      }
    }
  }
}

I.e. _ttl: enabled=true and type: nested are absent from the
mapping. I also noticed that it allows automatic updates to the mapping
despite the dynamic: strict instruction.

Does this mean these instructions are somehow not being reflected? If so, 
what am I doing wrong?

I'm using version 1.2.1








Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Brian

Hi, Mario. Yes, I suppose this kind of goes against the "no additional
proxy" requirement you have.

Hehehe. I'm a seeker of loopholes. In my scenario, it's still a plug-in 
design, but ES is my plug-in and not the other way around. Still only one 
HTTP interface in the mix, but it's mine and not ES's.

I also have avoided the plug-in approach because I've read that it's marked 
for deprecation and eventual removal. Yet logstash and ES Head are still 
offered as plug-ins as are a boat-load of other facilities, so I am not 
really sure if that's still the case.

And of course, your own plug-in has a much better chance to be updated to 
match exactly each new ES version to which you migrate. That's one of the 
downsides of third-party plug-ins: They lock you into older ES versions 
until the author gets a chance to update the plug-in.

Brian

On Friday, June 6, 2014 1:14:00 AM UTC-4, Mario Mueller wrote:


 @Brian:
 Interesting approach, but wouldn't this go against the initial no 
 additional proxy statement, if I got you right ..




Re: XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Jun Ohtani

Hi,

How are you calling the API to create the index?

I think the “dynamic” property is in the wrong place.

I created an index with the following JSON and then fetched its mapping:

curl -XPOST localhost:9200/myidx -d '
{
  "settings": {
    "index.refresh_interval": "5m"
  },
  "mappings": {
    "message": {
      "dynamic": "strict",
      "_ttl": {
        "enabled": true
      },
      "properties": {
        "my_nested_thing": {
          "type": "nested",
          "properties": {
            "some_id": {
              "type": "string",
              "index": "not_analyzed"
            },
            "count": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}'

curl -XGET 'localhost:9200/myidx/_mapping?pretty'

{
  "myidx" : {
    "mappings" : {
      "message" : {
        "dynamic" : "strict",
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "my_nested_thing" : {
            "type" : "nested",
            "properties" : {
              "count" : {
                "type" : "long"
              },
              "some_id" : {
                "type" : "string",
                "index" : "not_analyzed"
              }
            }
          }
        }
      }
    }
  }
}

Does it make sense?
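
The placement point above can be checked before the body is ever sent. The following is only a sketch in Python, not part of the original exchange: the helper names `build_index_body` and `misplaced_mapping_keys` are made up for illustration, and the check assumes that every key directly under "mappings" should be a type name whose value is an object.

```python
import json

def build_index_body():
    """Create-index body with "dynamic" inside the "message" type mapping."""
    return {
        "settings": {"index.refresh_interval": "5m"},
        "mappings": {
            "message": {
                "dynamic": "strict",
                "_ttl": {"enabled": True},
                "properties": {
                    "my_nested_thing": {
                        "type": "nested",
                        "properties": {
                            "some_id": {"type": "string", "index": "not_analyzed"},
                            "count": {"type": "long"},
                        },
                    }
                },
            }
        },
    }

def misplaced_mapping_keys(body):
    """Return keys directly under "mappings" whose value is not an object.

    Assumption: each key under "mappings" is a type name, so a scalar such
    as "dynamic": "strict" at that level is probably in the wrong place.
    """
    return [k for k, v in body.get("mappings", {}).items() if not isinstance(v, dict)]

body = build_index_body()
print(json.dumps(body, indent=2))      # usable as the -d payload for curl
print(misplaced_mapping_keys(body))    # keys that look misplaced, if any
```

Running the same helper on a body that has "dynamic" directly under "mappings" would flag that key, which is the mistake suspected in this thread.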


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani

On 2014/06/06 at 22:31, Enno Shioji eshi...@gmail.com wrote:

 Hi, I created my index using this mapping JSON:

 {
   "myidx": {
     "index.refresh_interval": "5m",
     "mappings": {
       "dynamic": "strict",
       "message": {
         "_ttl": {
           "enabled": true
         },
         "properties": {
           "my_nested_thing": {
             "type": "nested",
             "properties": {
               "some_id": {
                 "type": "string", "index": "not_analyzed"
               },
               "count": {
                 "type": "long"
               },
             }
           }
         }
       }
     }
   }
 }

 If I do a GET to _mapping after indexing some documents, it will essentially
 return:

 {
   "message": {
     "properties": {
       "my_nested_thing": {
         "properties": {
           "some_id": {
             "type": "string", "index": "not_analyzed"
           },
           "count": {
             "type": "long"
           },
         }
       }
     }
   }
 }

 I.e. "_ttl": enabled=true and "type": "nested" are missing from the
 mapping. I also noticed that it allows automatic updates to the mapping
 despite the "dynamic": "strict" instruction.

 Does this mean these instructions are somehow not being reflected? If so,
 what am I doing wrong?

 I'm using version 1.2.1

Wrong paging with embedded es instance

Hi everyone.

I use an embedded ES node as part of a Java application.

Node node = nodeBuilder().clusterName("OSM-Gazetteer").node();
Client client = node.client();

I am trying to fetch some data page by page.

SearchRequestBuilder searchQ = client.prepareSearch("gazetteer")
        .setSearchType(SearchType.QUERY_AND_FETCH)
        .setNoFields()
        .setQuery(QueryBuilders.matchAllQuery())
        .setExplain(false);

searchQ.setSize(PAGE_SIZE);         // PAGE_SIZE = 5
searchQ.setFrom(page * PAGE_SIZE);  // page = 0

Here is the query generated by the client:

{ "from" : 0, "size" : 5, "query" : { "match_all" : { } },
  "explain" : false, "fields" : [ ] }

The curl version returns 5 hits, as expected, but Java returns 20 hits:

searchQ.get().getHits().getHits().length; // = 20

Index settings are default. The hits do not contain duplicates.

Is there a workaround?


Re: XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-06 Thread Enno Shioji

Hi Jun,

Ah, I must be doing something wrong then. I'll correct the JSON and test 
again.

Btw, thank you very much for taking the trouble to test this!




Re: Wrong paging with embedded es instance

Could you print your searchQ object?

Maybe by calling toString() on it.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Trying to understand Marvel stats and merge with SSD

2014-06-06 Thread John Smith

Running Elasticsearch 1.2.1 with Java 1.7_55 on CentOS 6.5

The machine has 32 cores and 96GB of RAM with standard spinning disks, but I 
also installed 1 Samsung Evo 840 for testing ES.
The Evo is rated at 500MB/s though the Linux perf test reported about 
300MB/s read and about 250MB/s write. The board is SATA II which explains 
why it's 300MB/s max.

Using JMeter to send index requests to ES

Executing about 6200 puts/s

Marvel reports:
2200 IOPS
20MB/s merges

And iostat for the drive (column headers added, assuming extended 
"iostat -x -m" output):

Device  rrqm/s    wrqm/s   r/s      w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdf       0.00  14214.00  0.00  2021.33   0.00  62.35     63.17     10.49   5.17   0.48  97.27

Also seeing on the console: stop throttling indexing: numMergesInFlight=4, 
maxNumMerges=5

Are these numbers good?
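
One way to sanity-check the iostat figures is to see whether they agree with each other. The sketch below is an illustration in Python, not something from the original post. It assumes the line is extended "iostat -x -m" output, so that the fourth, sixth and seventh values are w/s, wMB/s and avgrq-sz, and that avgrq-sz is counted in 512-byte sectors.

```python
SECTOR_BYTES = 512  # assumption: iostat reports avgrq-sz in 512-byte sectors

def write_throughput_mb(writes_per_s, avg_request_sectors):
    """Approximate write throughput in MB/s from request rate and request size."""
    return writes_per_s * avg_request_sectors * SECTOR_BYTES / (1024 * 1024)

# Values copied from the iostat line; the column meanings are assumed.
w_per_s = 2021.33    # w/s
avgrq_sz = 63.17     # avgrq-sz (sectors per request)
reported_mb = 62.35  # wMB/s

computed_mb = write_throughput_mb(w_per_s, avgrq_sz)
request_kib = avgrq_sz * SECTOR_BYTES / 1024

print("computed write MB/s:", round(computed_mb, 2))
print("average request size in KiB:", round(request_kib, 1))
print("matches reported wMB/s:", abs(computed_mb - reported_mb) < 0.05)
```

If the column assumption holds, the drive is writing roughly 62MB/s in requests of about 32KiB while reporting about 97% utilisation, which suggests the disk is busy most of the time even though the throughput is well below its rated sequential speed.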



Re: A plugin to change the result set before sending it back to the http client

Plugins are essential to ES's success and are not going away any time soon.
The river plugins, aka cluster singletons, are the ones which are
discouraged from use. Go ahead and create more plugins!

-- 
Ivan


On Fri, Jun 6, 2014 at 7:23 AM, Brian brian.from...@gmail.com wrote:


 I also have avoided the plug-in approach because I've read that it's
 marked for deprecation and eventual removal. Yet logstash and ES Head are
 still offered as plug-ins as are a boat-load of other facilities, so I am
 not really sure if that's still the case.




Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread virgil

Good idea! One thing I am not quite clear about: to write a custom
ScoreFunction, will I have to modify the Elasticsearch source code and compile
it? Or is there another way to do it? Thank you.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/ANN-Elasticsearch-Simple-Action-Plugin-tp4056971p4057232.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


Re: [ANN] Elasticsearch Simple Action Plugin

Function score code is perhaps the easiest to write for Elasticsearch. You
do not need to create a plugin for it, simply write the code with the
appropriate classes and deploy it as a jar to Elasticsearch's lib
directory. Done. That said, I still prefer to write native scripts as
plugins because they are logged as plugins on server startup. Far easier to
debug if something goes wrong.

Two tutorials:
http://www.spacevatican.org/2012/5/12/elasticsearch-native-scripts-for-dummies/
https://github.com/imotov/elasticsearch-native-script-example

Cheers,

Ivan


Re: Elasticsearch and Hadoop Questions

Adding to what Georgi wrote, es-hadoop does not create the shards for you -
that's up to you or index templates (which I highly recommend). However
es-hadoop is aware of the target shards and will use them to parallelize
the reads/writes (such as one task per shard).
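
As a rough illustration of the index-template route, the Python sketch below builds a template body that fixes the shard and replica counts for matching indices. It is not from the thread: the helper name `shard_template` and the pattern "hadoop-*" are made up, and it assumes the Elasticsearch 1.x template format, in which the index name pattern goes under the "template" key and the body is sent with a PUT to the _template endpoint.

```python
import json

def shard_template(pattern, shards, replicas):
    """Build an index-template body that fixes shard and replica counts."""
    return {
        "template": pattern,  # index name pattern the template applies to
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
    }

# Hypothetical pattern and counts, chosen only for the example.
body = shard_template("hadoop-*", shards=5, replicas=1)
print(json.dumps(body, indent=2))
```

With a template like this in place, any index whose name matches the pattern would be created with five primary shards, so a job writing through es-hadoop could be split into up to five parallel tasks under the one-task-per-shard behaviour described above.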

On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi.r.iva...@gmail.com
wrote:

and I don't think this is related in any way to the number of shards and nodes

On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

Try as I might, and although I have read everything I can find on ES's
website about this, I only somewhat understand how the integration works,
not the actual nuts and bolts of it.

For example:

If it is the latter how do you go about determining what to set for the
number of nodes and shards.

If anyone has any information on this or even better yet a place to point
me to that has better references so that I can research this on my own it
would be much appreciated.

Thanks.


Re: Impossible to implement real custom boost query when the weight is in the child document?

Did you change the boost_mode of your function score script? The default
should be "multiply", which is the behavior you want, not "sum", which is
what you are experiencing.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

I have never used it with nested documents, so perhaps it is a bug (or a
feature :) )
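
To make the difference between the two modes concrete, here is a small Python sketch of how a query score and a function score could be combined under each boost_mode. It only mimics the arithmetic and is not Elasticsearch code; the function name `combine` is made up, and the list of modes is the one I recall from the function_score documentation.

```python
def combine(query_score, function_score, boost_mode="multiply"):
    """Combine the query score with the function score for a given boost_mode."""
    modes = {
        "multiply": lambda q, f: q * f,   # assumed default
        "sum": lambda q, f: q + f,
        "replace": lambda q, f: f,        # ignore the query score
        "avg": lambda q, f: (q + f) / 2.0,
        "max": max,
        "min": min,
    }
    if boost_mode not in modes:
        raise ValueError("unknown boost_mode: %s" % boost_mode)
    return modes[boost_mode](query_score, function_score)

# Example values chosen only for illustration.
for mode in ("multiply", "sum"):
    print(mode, combine(2.0, 3.0, mode))
```

If the scores you see look like the query score plus the weight rather than their product, that points to "sum" being in effect somewhere, whether set explicitly or applied by the nested query.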

--
Ivan

On Fri, Jun 6, 2014 at 3:55 AM, Csaba Dezsényi csaba.dezse...@gmail.com
wrote:

I could find only one related post:

https://groups.google.com/forum/#!msg/elasticsearch/EGCeJZbhVtA/i32ROGVmFswJ
But this asks a different question...


Speculative deletes

2014-06-06 Thread Nikolas Everett

I'm in the position where I need to make _sure_ a document is deleted from
the index when something occurs in my source system.  I want to just hit it
with a DELETE every time.  Is that a good idea?

It looks to me like DELETEing documents via ID is pretty cheap if the
document doesn't exist.  It looks like it sets some dirty flags but is
otherwise pretty quick.  My alternative is to do a GET and DELETE if it
exists or, maybe, do something cute like:
curl -XPOST 'localhost:9200/index/type/id/_update' -d '{
    "script" : "ctx.op = \"delete\""
}'
which will short circuit that.


Thoughts?
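
The "just DELETE every time" option can be sketched as follows. This is an illustration in Python rather than anything from the original post: `ensure_deleted` and `make_fake_transport` are made-up names, the transport is a stand-in so the example runs without a cluster, and it assumes that a delete by ID answers 200 when the document existed and 404 when it did not.

```python
def ensure_deleted(send_delete, doc_url):
    """Issue a DELETE and treat both "deleted" and "not found" as success.

    send_delete is any callable that takes a URL and returns an HTTP status.
    """
    status = send_delete(doc_url)
    if status in (200, 404):  # assumed: 200 = deleted, 404 = was not there
        return True
    raise RuntimeError("unexpected status %d for %s" % (status, doc_url))

def make_fake_transport(existing_urls):
    """In-memory stand-in for an HTTP client, used only for this example."""
    store = set(existing_urls)

    def send_delete(url):
        if url in store:
            store.remove(url)
            return 200
        return 404

    return send_delete

url = "http://localhost:9200/index/type/1"  # hypothetical document URL
send = make_fake_transport({url})
print(ensure_deleted(send, url))  # document existed
print(ensure_deleted(send, url))  # document already gone
```

The point of the design is that the caller never needs a separate GET: one request covers both cases, and only unexpected statuses are treated as errors.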

Nik


Re: [ANN] Elasticsearch Simple Action Plugin

I mean, you can add a MyOwnFunctionBuilder/MyOwnFunctionParser to
Elasticsearch via plugin. See
package org.elasticsearch.index.query.functionscore for the standard
implementations.

The functionscore code is masterpiece quality - no need to modify existing
code! It is pluggable.

A close example to what you want to achieve is
org.elasticsearch.index.query.functionscore.fieldvaluefactor

This would be an alternative to the AbstractSearchScript call
implementation path you have chosen so far

Jörg


Copy index from production to development instance

2014-06-06 Thread Brian Lamb

Hi all,

I want to do a one-time copy of the data on my production Elasticsearch 
instance to my development Elasticsearch instance. Both are managed by AWS 
if that makes this easier. Here is what I tried:

On production:

curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
    "type": "s3",
    "settings": {
        "access_key": "productionAccessKey",
        "bucket": "productionBucketName",
        "region": "region",
        "secret_key": "productionSecretKey"
    }
}'
curl -XPUT "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02"

What this does is upload the instance to a production level s3 bucket.

Then in the aws console, I copy all of it to a development level s3 bucket.

Next on development:

curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
    "type": "s3",
    "settings": {
        "access_key": "developmentAccessKey",
        "bucket": "developmentBucketName",
        "region": "region",
        "secret_key": "developmentSecretKey"
    }
}'
curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore"

This gives me the following message:

$ curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore?pretty=true"
{
  "error" : "SnapshotException[[my_s3_repository:snapshot_2014_06_02] failed to get snapshots]; nested: IOException[Failed to get [snapshot-snapshot_2014_06_02]]; nested: AmazonS3Exception[Status Code: 404, AWS Service: Amazon S3, AWS Request ID: RequestId, AWS Error Code: NoSuchKey, AWS Error Message: The specified key does not exist.]; ",
  "status" : 500
}

Also, when I try to get the snapshots, I get the following:

$ curl -XGET "localhost:9200/_snapshot/_status?pretty=true"
{
  "snapshots" : [ ]
}

This leads me to believe that I am not connecting the snapshot correctly 
but I'm not sure what I am doing incorrectly. Regenerating the index on 
development is not really a possibility as it took a few months to generate 
the index the first time around. If there is a better way to do this, I'm 
all for it. 

Thanks,

Brian Lamb


Re: Copy index from production to development instance

2014-06-06 Thread Brian Lamb

I should also point out that I had to edit the metadata-snapshot file to 
change the S3 keys and bucket name to match what development was expecting.


Re: [ANN] Elasticsearch Simple Action Plugin

For an example function score plugin implementation, see

https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search/functionscore/FunctionScorePluginTests.java

Jörg


Re: Wrong paging with embedded es instance

Yep.

System.out.println(searchQ.toString());

{
  "from" : 0,
  "size" : 5,
  "query" : {
    "match_all" : { }
  },
  "explain" : false,
  "fields" : [ ]
}

Also, I think it might be connected with sharding.
I've tried changing number_of_shards to 1, and paging starts to act as I
expect.

Set number_of_shards to 5 and get 25 hits.
Set number_of_shards to 4 and get 20 hits.

It seems like from and size are applied to every shard separately in my case.
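
The pattern of 20 hits with 4 shards and 25 hits with 5 shards is what one would expect if each shard returns its own top "size" hits and nothing trims the merged list. To my understanding that is how the QUERY_AND_FETCH search type used in the original code is documented to behave, while the default query_then_fetch returns "size" hits overall. The Python sketch below is only a simulation of that difference with made-up scores; it does not call Elasticsearch, and the function names are illustrative.

```python
def query_and_fetch(shards, size):
    """Each shard contributes its own top `size` hits; nothing is trimmed."""
    hits = []
    for shard in shards:
        hits.extend(sorted(shard, reverse=True)[:size])
    return hits

def query_then_fetch(shards, size):
    """Shards send candidates; the coordinator keeps the global top `size`."""
    candidates = []
    for shard in shards:
        candidates.extend(sorted(shard, reverse=True)[:size])
    return sorted(candidates, reverse=True)[:size]

# Four shards with 50 made-up scores each.
shards = [[float(s * 100 + i) for i in range(50)] for s in range(4)]

print(len(query_and_fetch(shards, 5)))   # size times number of shards
print(len(query_then_fetch(shards, 5)))  # size
```

If this is indeed the cause, removing the setSearchType(SearchType.QUERY_AND_FETCH) call so that the default search type is used would be the first thing to try.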

-- 
Thank you for your time. Best regards.
Dmitry.


Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-06 Thread virgil

Thank you for the links!




Re: Wrong paging with embedded es instance

Can you also print the full response object (toString())?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Relevancy with location

2014-06-06 Thread Scott Vickers

I am trying to wire up a search that gives documents near a location a 
higher ranking.  Pretty simple but in our case some of the documents won't 
always have a location associated with them.  When using a function_score 
query the documents with no location always get scored higher than the ones 
with a location, even if I search with the exact coordinates.  Here is an 
example:

Documents:
PUT /test/colors/1
{
  name:red
}

PUT /test/colors/2
{
  name:red,
  location:{lat:47,lon:-122}
}

Mapping:
"mappings" : {
  "colors" : {
    "properties" : {
      "location" : {
        "lat_lon" : true,
        "type" : "geo_point",
        "geohash" : true
      },
      "name" : {
        "type" : "string"
      }
    }
  }
}

Query:
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "type": "phrase",
          "query": "red",
          "fields": ["name"],
          "tie_breaker": 0.3
        }
      },
      "functions": [
        {
          "gauss": {
            "location": {
              "origin": "47,-122",
              "scale": "10mi"
            }
          }
        }
      ]
    }
  }
}

Results:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "test",
        "_type": "colors",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "name": "red"
        }
      },
      {
        "_index": "test",
        "_type": "colors",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "name": "red",
          "location": {
            "lat": 47,
            "lon": -122
          }
        }
      }
    ]
  }
}

Is there any way to factor proximity in for some results and not others?
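
For readers following along, here is a rough Python sketch of how a Gaussian decay factor behaves. It assumes the formula given in the function_score documentation (the factor is 1.0 at the origin and falls to `decay`, 0.5 by default, at a distance of `scale`) and that a document without the field gets a factor of 1, which would explain the identical scores in the results above. The helper names and the `missing` parameter are hypothetical and are not part of any Elasticsearch API:

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: 1.0 at the origin, `decay` at `scale` beyond `offset`."""
    sigma_sq = -scale ** 2 / (2 * math.log(decay))
    d = max(0.0, distance - offset)
    return math.exp(-d ** 2 / (2 * sigma_sq))


def location_factor(doc, origin, scale_miles, missing=1.0):
    """Score factor for one document; `missing` is used when there is no location."""
    loc = doc.get("location")
    if loc is None:
        return missing
    dist = haversine_miles(loc["lat"], loc["lon"], origin[0], origin[1])
    return gauss_decay(dist, scale_miles)


docs = [
    {"name": "red"},
    {"name": "red", "location": {"lat": 47, "lon": -122}},
]
origin = (47, -122)

# With missing=1.0 both documents get the same factor, as in the results above.
print([location_factor(d, origin, 10) for d in docs])
# A lower value for missing locations would rank located documents first.
print([location_factor(d, origin, 10, missing=0.5) for d in docs])
```

Under these assumptions, a document with no location can never score below one that has a location, so any fix has to give the missing case a smaller factor by some other means.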


Re: Wrong paging with embedded es instance

The total hit number is really inconsistent:

With one shard you get: 656523
With 5 shards you get: 3879

I think you are doing something wrong but I can't tell more without looking at 
the full source code.
Could you share how you actually execute the query?

Are you sure your Java client is connected to the right instance/cluster?


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 6 June 2014 at 19:49:47, Дмитрий Киселев (dmitry.v.kise...@gmail.com) wrote:

This is with 5 shards.
{
  "took" : 81,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3879,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0016087997-w162848733-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0088827105-n2270743905-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0097856729-n2270743903-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0145983393-w154644839-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0209772668-n1884206099-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0015203925-w147150792-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0017569140-n2495059507-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0071389729-w147150672-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0130455978-w145925771-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0139624280-w147150701-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-007978-w194531715-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0096243499-w194531714-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0162691059-w164700540-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0202220208-w164698447-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0287053820-n2270743890-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0013765594-w145949343-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0026389358-w147150656-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0093401200-w162848869-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0203517601-n2270743895-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0217217459-n2270743898-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0004074128-w145925740-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0007815983-w175372179-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0039697750-w164700428-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0053473990-w271448695-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0129549665-w162848862-regular",
      "_score" : 1.0
    } ]
  }
}

And this is with one shard
{
  "took" : 123,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 656523,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675290941-w116699544-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675314442-n1557245109-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675335611-w210502362-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675352866-w245359553-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0675354643-w235622232-regular",
      "_score" : 1.0
    } ]
  }
}


2014-06-06 19:40 GMT+02:00 David Pilato da...@pilato.fr:
Can you also print the full response object (toString())?


Re: A plugin to change the result set before sending it back to the http client

2014-06-06 Thread Brian

Thanks so much, Ivan. That's a very important distinction.

Brian

On Friday, June 6, 2014 12:28:56 PM UTC-4, Ivan Brusic wrote:

 Plugins are essential to ES's success and are not going away any time 
 soon. The river plugins, aka cluster singletons, are the ones which are 
 discouraged from use. Go ahead and create more plugins!

 -- 
 Ivan



Re: Wrong paging with embedded es instance

Sorry, that was a slightly different dataset.
Here is the response with the same data and 1 shard:
{
  "took" : 63,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 3879,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0004074128-w145925740-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0007815983-w175372179-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0013765594-w145949343-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0015203925-w147150792-regular",
      "_score" : 1.0
    }, {
      "_index" : "gazetteer",
      "_type" : "location",
      "_id" : "adrpnt-0016087997-w162848733-regular",
      "_score" : 1.0
    } ]
  }
}



Re: A couple of questions about keys and routing

2014-06-06 Thread Chris Berry

I apologize having to reply to my own message.
But please replace all CID w/ MID below. I started editing the text and
inadvertently hit Send.
Thanks,
-- Chris

On Friday, June 6, 2014 1:03:20 PM UTC-5, Chris Berry wrote:

Greetings,

I have a couple of questions about keys and routing.

Let’s imagine that I have

1. A set of time-based Indexes. The indexes are time-based because the
overall set is unbounded (it grows by over 1M/day). Thus, we'll have,
say, an Index per Quarter to keep them bounded individually and therefore
maintain predictable performance.
2. The documents stored in these Indexes are “keyed” by MID and
“routed to” by PID.
3. Every document has a unique MID (its _id) and contains a PID field
that is unique to the document but not across all documents.
4. The common use case is to want all the MIDs for a given PID.

So no problem there. (Hopefully that makes sense…)

My first question; when I lookup by MID in a given Index, and I do not
have a PID (routing key), is that an inefficient lookup?
I.e. will it have to scan all of the Shards to find it??
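
As a rough illustration of why the routing key matters for this question, here is a toy Python model of hash-based shard selection. It assumes only the general scheme of hashing the routing value modulo the number of primary shards. `zlib.crc32` is a stand-in for the real hash function, and `shard_for` and `shards_to_query` are hypothetical helpers:

```python
import zlib


def shard_for(routing_value, number_of_shards):
    """Pick a shard from a routing value (crc32 is a stand-in hash)."""
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


def shards_to_query(number_of_shards, routing_value=None):
    """With a routing value one shard is enough; without it, every shard is a candidate."""
    if routing_value is None:
        return list(range(number_of_shards))
    return [shard_for(routing_value, number_of_shards)]


print(shards_to_query(10, routing_value="PID-42"))  # a single shard
print(len(shards_to_query(10)))                     # 10
```

In this model a document stored under its PID can be located directly only when that PID is supplied. A lookup that knows just the MID has to consider every shard of the index.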

And my second question is really more about design.
I also need to be able to look up a document by its CID and I don’t
necessarily know which time-based Index I will find it in. (E.g. it may
have been inactive and then be resurrected.)
I was hoping to avoid having some sort of metadata Index that could yield
this info for me (i.e. given a CID, return its PID and current Index)
because it will end up being unbounded (we are talking billions of entries
eventually).
Perhaps this won’t really matter because it is only looked up by key,
which should be fast as it would go to the correct Shard.
And the data set is only a few bytes?? But still, it seems like I would
be creating an eventual problem.
Although the alternative, looking in every time-based index, seems much
worse.

Any advice would be greatly appreciated.

Thanks,
— Chris


Re: Wrong paging with embedded es instance

So? What's wrong here?

You asked for 5 docs and you get 5.

I'm missing something I guess.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Wrong paging with embedded es instance

I asked for 5 docs.

With 1 shard - I got 5 docs.
With 5 shards - I got 25 docs.
With 5 shards, using curl instead of embedded java client - I got 5 docs.



Re: Elasticsearch and Hadoop Questions

2014-06-06 Thread ES USER

I guess what I am having trouble wrapping my head around is exactly where
the data resides and in what format.

If I understand Georgi's email above, you can run MapReduce jobs against
data stored in a local ES by utilizing es-hadoop, and you can also run ES
queries against data in Hadoop by utilizing es-hadoop.

Is that correct?

On Friday, June 6, 2014 12:39:44 PM UTC-4, Costin Leau wrote:

Adding to what Georgi wrote, es-hadoop does not create the shards for you
- that's up to you or index templates (which I highly recommend). However
es-hadoop is aware of the target shards and will use them to parallelize
the reads/writes (such as one task per shard).
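
The "one task per shard" idea can be pictured with a small Python sketch. This is only an illustration of the partitioning pattern, not es-hadoop code. `read_shard` and `read_in_parallel` are hypothetical names, and the shards are plain in-memory lists:

```python
from concurrent.futures import ThreadPoolExecutor


def read_shard(shard_id, shard_docs):
    """Stand-in for a task that reads one shard; here it just counts docs."""
    return shard_id, len(shard_docs)


def read_in_parallel(shards):
    """Run one task per shard and collect the per-shard results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        futures = [pool.submit(read_shard, sid, docs) for sid, docs in shards.items()]
        return dict(f.result() for f in futures)


# Three toy shards holding different numbers of documents.
shards = {0: ["a", "b"], 1: ["c"], 2: []}
print(read_in_parallel(shards))
```

The number of shards therefore bounds how many tasks can work on an index at once in this pattern, which is one reason to choose the shard count deliberately when the index is created.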

On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi@gmail.com wrote:

and I don't think this is in any way related to the number of shards and nodes

On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

Try as I might, and I have read everything I can find on the ES website
about this, I understand roughly how the integration works but not the
actual nuts and bolts of it.

For example:

If it is the latter, how do you go about determining what to set for the
number of nodes and shards?

If anyone has any information on this or even better yet a place to
point me to that has better references so that I can research this on my
own it would be much appreciated.

Thanks.


IllegalArgumentException[No type mapped for [43]], version 1.2.1

2014-06-06 Thread Brooke Babcock

In one part of our application we use Elasticsearch as an object store. 
Therefore, when indexing, we supply our own _id. Likewise, when accessing a 
document we use the simple GET method to fetch by _id. This has worked well 
for us, up until recently. Normally, this is what we get:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test1?pretty=true'
{
  "_index" : "data-2014.06.06",
  "_type" : "key",
  "_id" : "test1",
  "_version" : 1,
  "found" : true,
  "_source":{"sData":"test data 1"}
}


Now, we often encounter a recently indexed document that throws the 
following error when we try to fetch it:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test2?pretty=true'
{
  "error":"IllegalArgumentException[No type mapped for [43]]",
  "status":500
}



This condition persists anywhere from 1 to 25 minutes or so, at which point 
we no longer receive the error for that document and the GET succeeds as 
normal. From that point on, we are able to consistently retrieve that 
document by _id without issue. But, soon after, we will find a different 
newly indexed document caught in the same bad state.

We know the documents are successfully indexed. Our bulk sender (which uses 
the Java transport client) indicates no error during indexing and we are 
still able to locate the document by doing an ids query, such as:

curl -XPOST "http://127.0.0.1:9200/data-2014.06.06/key/_search?pretty=true" -d '
{
  "query": {
    "ids": {
      "values": ["test2"]
    }
  }
}'

Which responds:
{
  "took": 543,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [ {
      "_index": "data-2014.06.06",
      "_type": "key",
      "_id": "test2",
      "_score": 1.0,
      "_source":{"sData": "test data 2"}
    } ]
  }
}
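
For anyone reproducing this, the body of the ids query above can be generated rather than typed by hand, which avoids shell-quoting mistakes. A minimal Python sketch, where `ids_query` is a hypothetical helper and not part of any client library:

```python
import json


def ids_query(*doc_ids):
    """Build the body of an ids query for the given document ids."""
    return {"query": {"ids": {"values": list(doc_ids)}}}


# Print the JSON body that would be passed to curl with -d.
print(json.dumps(ids_query("test2"), indent=2))
```

The printed JSON can be sent to the same `_search` endpoint used in the curl command above.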


We first noticed this behavior in version 1.2.0. When we upgraded to 1.2.1, 
we deleted all indexes and started with a fresh cluster. We hoped our 
problem would be solved by the big fix that came in 1.2.1, but we are still 
regularly seeing it. Although our situation may sound like the routing bug 
introduced in 1.2.0, we are certain that it is not. This appears to be a 
significant issue with the translog - we hope the developers will be able 
to look at what may have changed. We did not notice this problem in version 
1.1.1.

Just in case, here is the mapping being used:
curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/_mapping?pretty=true'
{
  "data-2014.06.06" : {
    "mappings" : {
      "key" : {
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "sData" : {
            "type" : "string",
            "index" : "no"
          }
        }
      }
    }
  }
}


Thanks for your help.




Bug in context suggester?

2014-06-06 Thread Gilbert Chang

I am observing an issue with field reference in context suggester. Consider 
the following use case:

PUT /services/service/_mapping
{
  "service": {
    "properties": {
      "name": {
        "type" : "string"
      },
      "network_id": {
        "type" : "long"
      },
      "suggest_field": {
        "type": "completion",
        "context": {
          "network": {
            "type": "category",
            "path": "network_id"
          }
        }
      }
    }
  }
}

PUT /services/service/1
{
  "name": "knapsack",
  "network_id": 1,
  "suggest_field": {
    "input": ["knacksack", "backpack", "daypack"]
  }
}

POST /services/_suggest?pretty
{
  "suggest" : {
    "text" : "k",
    "completion" : {
      "field" : "suggest_field",
      "size": 10,
      "context": {
        "network": 1
      }
    }
  }
}

The suggest query does not return any option. However, if I change the 
field type of network_id to 'string' (and not_analyzed), the suggest query 
works. Explicitly indexing the network_id as the context in the 
suggest_field works as well:

PUT /services/service/1
{
  "name": "knapsack",
  "network_id": 1,
  "suggest_field": {
    "input": ["knacksack", "backpack", "daypack"],
    "context": {"network_id": 1}
  }
}

Seems like a bug?


Re: Wrong paging with embedded es instance

Any chance you could share your code? In particular, I'd like to see how you 
run the query.

If you could reproduce it with a test case that would be awesome.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Aggregation bug? Or user error?

2014-06-06 Thread mooky

 
Ok. I have written a test case that (if run enough times) will reproduce it. It's 
an intermittent bug.
I have raised an issue:
https://github.com/elasticsearch/elasticsearch/issues/6435 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0b666a91-2f49-4787-ba2f-fb33a8fc023e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

scala elastic4s usage question

2014-06-06 Thread Ramdev Wudali

Hi:
  I have started using the Scala library elastic4s. I am running into a 
problem creating a mapping whose definition is straightforward, although it 
contains nested objects.

Example :

{
  "index": {
    "mappings": {
      "OA": {
        "properties": {
          "AdminStatus": {
            "properties": {
              "content": {
                "type": "string"
              },
              "effectiveFrom": {
                "type": "date",
                "format": "dateOptionalTime"
              }
            }
          },
          "IsPublicFlag": {
            "type": "boolean"
          },
          "OrganizationAddress": {
            "properties": {
              "OrganizationAddressCity": {
                "type": "string"
              },
              "OrganizationAddressCountryCode": {
                "type": "string"
              },
              "OrganizationAddressLine1": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

I am not able to figure out how to define the OrganizationAddress field (which 
is a complex object).

If I define it like this:

indexClient.execute {
  create index "index" mappings (
    "OA" as (
      "AdminStatus" as (
        "content" typed StringType,
        "effectiveFrom" typed DateType
      ),
      "IsPublicFlag" typed BooleanType,
      "OrganizationAddress" as (
        "OrganizationAddressCity" typed StringType,
        "OrganizationAddressLine1" typed StringType
      )
    )
  )
}

I get a compilation  error :
Error:(52, 69) type mismatch;
 found   : com.sksamuel.elastic4s.mapping.MappingDefinition
 required: com.sksamuel.elastic4s.mapping.TypedFieldDefinition
  AdminStatus as (
^

How can I map complex objects using the Scala client elastic4s?

Thanks

Ramdev


Using Regex in Query via Kibana

2014-06-06 Thread Bill Paulson

I think the '$' on the end of the regex may cause trouble. Lucene regular 
expressions are already anchored to start and end of the string.
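
A small Java sketch of what this implicit anchoring means. It uses java.util.regex purely as an analogy, since Lucene's regular expression syntax is a separate dialect; as far as I know that dialect has no '^' or '$' anchor operators at all. Matcher.matches() must consume the whole input, much like a Lucene regexp must match the whole term, while find() accepts a match anywhere. The class name and sample values are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustration only: java.util.regex stands in for Lucene's regexp dialect.
final class AnchoringDemo {

    // Whole-string semantics, comparable to a Lucene regexp having to match the entire term.
    static boolean matchesWholeTerm(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    // Substring semantics, which is what many other regex tools default to.
    static boolean matchesAnywhere(String regex, String term) {
        return Pattern.compile(regex).matcher(term).find();
    }

    public static void main(String[] args) {
        String term = "timeout_error";
        System.out.println(matchesWholeTerm("error", term));   // false: pattern covers only part of the term
        System.out.println(matchesWholeTerm(".*error", term)); // true: pattern covers the whole term
        System.out.println(matchesAnywhere("error", term));    // true: substring match
    }
}
```

Under whole-term matching, a pattern such as .*error already behaves as if it ended in '$', so the explicit '$' is at best redundant.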


Re: Elasticsearch and Hadoop Questions

ES stores data in its own internal format, which typically resides locally.
What you are stating is partially correct: with the connector you would
move/copy data between Hadoop and ES since, in order for ES to work with
data, it needs to actually index it (that is, to see it).
So you would use es-hadoop to index data from Hadoop in ES and/or query ES
directly from Hadoop.

On Fri, Jun 6, 2014 at 9:29 PM, ES USER es.user.2...@gmail.com wrote:

I guess what I'm having trouble wrapping my head around is exactly where the
data resides and in what format.

Is that correct?

On Friday, June 6, 2014 12:39:44 PM UTC-4, Costin Leau wrote:

Adding to what Georgi wrote, es-hadoop does not create the shards for you
- that's up to you or index templates (which I highly recommend). However
es-hadoop is aware of the target shards and will use them to parallelize
the reads/writes (such as one task per shard).

On Fri, Jun 6, 2014 at 2:45 PM, Georgi Ivanov georgi@gmail.com
wrote:

And I don't think this is related in any way to the number of shards and nodes.

On Thursday, June 5, 2014 7:41:34 PM UTC+2, ES USER wrote:

I have read everything I can find on ES's website about this and, try as I
might, I only somewhat understand how the integration works, not the
actual nuts and bolts of it.

For example:

If it is the latter, how do you go about determining what to set for the
number of nodes and shards?

If anyone has any information on this or even better yet a place to
point me to that has better references so that I can research this on my
own it would be much appreciated.

Thanks.


Re: Need help setting up autocomplete suggestions using phrase suggester

2014-06-06 Thread Paul Bormans

I just wanted to let everyone know I didn't succeed with the phrase suggester 
and I switched to the completion suggester, which actually works very well 
for my purpose.
Paul


Re: Copy index from production to development instance

2014-06-06 Thread Antonio Augusto Santos

Take a look at stream2es https://github.com/elasticsearch/stream2es

On Friday, June 6, 2014 2:13:06 PM UTC-3, Brian Lamb wrote:

 I should also point out that I had to edit the metadata-snapshot file to 
 change the S3 keys and bucket name to match what development was 
 expecting.

 On Friday, June 6, 2014 1:11:57 PM UTC-4, Brian Lamb wrote:

 Hi all,

 I want to do a one-time copy of the data on my production Elasticsearch 
 instance to my development Elasticsearch instance. Both are managed by AWS, 
 if that makes this easier. Here is what I tried:

 On production:

 curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
   "type": "s3",
   "settings": {
     "access_key": "productionAccessKey",
     "bucket": "productionBucketName",
     "region": "region",
     "secret_key": "productionSecretKey"
   }
 }'
 curl -XPUT "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02"

 This uploads a snapshot of the instance to a production-level S3 bucket.

 Then, in the AWS console, I copy all of it to a development-level S3 
 bucket.

 Next on development:

 curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
   "type": "s3",
   "settings": {
     "access_key": "developmentAccessKey",
     "bucket": "developmentBucketName",
     "region": "region",
     "secret_key": "developmentSecretKey"
   }
 }'
 curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore"
 

 This gives me the following message:

 $ curl -XPOST "http://localhost:9200/_snapshot/my_s3_repository/snapshot_2014_06_02/_restore?pretty=true"
 {
   "error" : "SnapshotException[[my_s3_repository:snapshot_2014_06_02] failed to get snapshots]; nested: IOException[Failed to get [snapshot-snapshot_2014_06_02]]; nested: AmazonS3Exception[Status Code: 404, AWS Service: Amazon S3, AWS Request ID: RequestId, AWS Error Code: NoSuchKey, AWS Error Message: The specified key does not exist.]; ",
   "status" : 500
 }

 Also, when I try to get the snapshots, I get the following:

 $ curl -XGET "localhost:9200/_snapshot/_status?pretty=true"
 {
   "snapshots" : [ ]
 }

 This leads me to believe that I am not connecting the snapshot correctly 
 but I'm not sure what I am doing incorrectly. Regenerating the index on 
 development is not really a possibility as it took a few months to generate 
 the index the first time around. If there is a better way to do this, I'm 
 all for it. 

 Thanks,

 Brian Lamb




Re: Help with cluster.routing.allocation.enable Property

The cluster health will be yellow while replica shards are unassigned or still
initializing, so you can just execute a cluster health request and wait for
green status:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-health.html#request-params

You can also poll the same API and wait until all the moving shards
(relocating_shards, initializing_shards, unassigned_shards) are 0.
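
The polling approach could be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the health response is assumed to be already parsed into a Map, and fetchHealth is a hypothetical supplier that would perform the actual cluster health request. Only the field names (status, relocating_shards, initializing_shards, unassigned_shards) come from the cluster health API; the class and method names are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch: decide from a parsed cluster health response whether shard movement has finished.
final class HealthPoller {

    // True when the cluster is green and no shard is relocating, initializing or unassigned.
    static boolean isSettled(Map<String, Object> health) {
        return "green".equals(health.get("status"))
                && count(health, "relocating_shards") == 0
                && count(health, "initializing_shards") == 0
                && count(health, "unassigned_shards") == 0;
    }

    // Missing or non-numeric fields yield -1, so the cluster counts as "not settled".
    private static int count(Map<String, Object> health, String field) {
        Object value = health.get(field);
        return value instanceof Number ? ((Number) value).intValue() : -1;
    }

    // Polls the hypothetical fetcher until the cluster settles or maxAttempts is reached.
    static boolean waitUntilSettled(Supplier<Map<String, Object>> fetchHealth,
                                    int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isSettled(fetchHealth.get())) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    // Builds a sample response for the usage example below.
    static Map<String, Object> sample(String status, int relocating, int initializing, int unassigned) {
        Map<String, Object> health = new HashMap<>();
        health.put("status", status);
        health.put("relocating_shards", relocating);
        health.put("initializing_shards", initializing);
        health.put("unassigned_shards", unassigned);
        return health;
    }

    public static void main(String[] args) {
        System.out.println(isSettled(sample("yellow", 0, 2, 3)));
        System.out.println(isSettled(sample("green", 0, 0, 0)));
        System.out.println(waitUntilSettled(() -> sample("green", 0, 0, 0), 3, 10L));
    }
}
```

Once waitUntilSettled returns true, an automated script could disable allocation again; a false result means the attempt limit was hit and the cluster should be inspected by hand.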

Cheers,

Ivan

On Fri, Jun 6, 2014 at 11:03 AM, sagarl sagar...@gmail.com wrote:

Hi folks,

Our requirement is as follows:

We have our ES cluster running on AWS instances where nodes can come and
go. We would like to disable shard reallocation on the cluster in its normal
state, so that a node going down does not trigger shard rebalancing (just to
avoid unnecessary network I/O).

When the node comes back up (within a few minutes), we would like to enable
reallocation so that the new node gets its own shards back. After rebalancing
has finished, we would like to disable shard reallocation again.

We would like this to be an automated process rather than manual one.

I have already taken a look at following stackoverflow question/answer
http://stackoverflow.com/questions/16231614/quick-recovery-after-node-restart-in-elasticsearch
but I am not sure if it can work in automated way.

curl -XPUT localhost:9200/_cluster/settings -d \
  '{"transient":{"cluster.routing.allocation.enable": "none"}}'
/etc/init.d/elasticsearch restart
curl -XPUT localhost:9200/_cluster/settings -d \
  '{"transient":{"cluster.routing.allocation.enable": "all"}}'

So I would like to know, is there any way to Listen to the event which
says Shard Rebalancing is finished (by adding a new listener) and then
disable cluster.routing.allocation.enable Property ?

-- Thanks


Re: Wrong paging with embedded es instance

I've made a snippet.

Code
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java

Test results
https://github.com/kiselev-dv/es-test/blob/master/test.log
First test 1 shard (test.log line 27) - everything ok
Second test 5 shards (test.log line 86) - error

Search and paging generation
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L73

One more strange thing: the search didn't find anything until I added a small
delay
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L89



2014-06-06 20:53 GMT+02:00 David Pilato da...@pilato.fr:

 Any chance you could share your code. In particular, i'd like to see how
 you run the query.

 If you could reproduce it with a test case that would be awesome.

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On June 6, 2014 at 20:26, Дмитрий Киселев dmitry.v.kise...@gmail.com
 wrote:

 I asked for 5 docs.

 With 1 shard - I got 5 docs.
 With 5 shards - I got 25 docs.
 With 5 shards, using curl instead of embedded java client - I got 5 docs.


 2014-06-06 20:20 GMT+02:00 David Pilato da...@pilato.fr:

 So? What's wrong here?

 You asked for 5 docs and you get 5.

 I'm missing something I guess.


 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On June 6, 2014 at 20:02, Дмитрий Киселев dmitry.v.kise...@gmail.com
 wrote:

 Sorry, that was a slightly different dataset.
 Here is the response with the same data and 1 shard:
 {
   took : 63,
   timed_out : false,
   _shards : {
 total : 1,
 successful : 1,
 failed : 0
   },
   hits : {
 total : 3879,
 max_score : 1.0,
 hits : [ {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0004074128-w145925740-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0007815983-w175372179-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0013765594-w145949343-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0015203925-w147150792-regular,
   _score : 1.0
 }, {
   _index : gazetteer,
   _type : location,
   _id : adrpnt-0016087997-w162848733-regular,
   _score : 1.0
 } ]
   }
 }



Re: Need help setting up autocomplete suggestions using phrase suggester

2014-06-06 Thread Nikolas Everett

Yeah, sorry for not getting back to you earlier.  The phrase and term
suggesters are more for providing "did you mean:" style spelling
corrections than typeahead.  The completion suggester is for typeahead.


On Fri, Jun 6, 2014 at 5:05 PM, Paul Bormans pee...@gmail.com wrote:

 I just wanted to let everyone know I didn't succeed with the phrase
 suggester and I switched to the completion suggester, which actually works
 very well for my purpose.
 Paul


Re: Help with cluster.routing.allocation.enable Property

2014-06-06 Thread Drew Raines


sagarl wrote:

We have our ES cluster running on aws instances where nodes can 
come and go. We would like to disable Shard reallocation on 
cluster in Normal state so that Node going down should not 
rebalance the shards.(Just to avoid unnecessary Network i/o).


If your shard topology is relatively static, you could just leave 
allocation disabled (or set it to new_primaries for the occasional 
index creation).  But honestly you're trying to micro-manage a 
process that ES does pretty well.  There's nothing wrong with 
having a yellow cluster.  If you're concerned about availability 
you should add replicas.
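
For reference, the request body for such a setting change could be built like this in Java. It is a sketch only: the class and method names are hypothetical, and the list of modes ("all", "primaries", "new_primaries", "none") is what I understand cluster.routing.allocation.enable to accept. The resulting string would be sent with a PUT to /_cluster/settings, as in the curl commands quoted earlier in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: build the body for a transient cluster settings update.
final class AllocationSettings {

    // Values assumed to be accepted by cluster.routing.allocation.enable.
    static final List<String> MODES = Arrays.asList("all", "primaries", "new_primaries", "none");

    // Returns the JSON body for the given mode, rejecting unknown modes early.
    static String transientBody(String mode) {
        if (!MODES.contains(mode)) {
            throw new IllegalArgumentException("unknown allocation mode: " + mode);
        }
        return "{\"transient\":{\"cluster.routing.allocation.enable\":\"" + mode + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(transientBody("new_primaries"));
    }
}
```

Validating the mode on the client side is a design choice for the sketch: it turns a typo into an immediate exception instead of a rejected request.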


Drew


[ElasticSearch Hadoop] Error in configuring object

2014-06-06 Thread bharath bhat

Hi, 

I'm trying to get a simple mapreduce job working with ES hadoop. I followed 
the docs to set up a job but I keep getting 'Error in configuring object' 
when I try to run it on Hadoop in pseudo distributed mode. I am using the 
old API with Hadoop 2.4.0. 

Here's my code I am using:

public class SomeMapper extends MapReduceBase implements Mapper<Object, Object, Text, MapWritable> {

    // Dummy
    public void map(Object key, Object value, OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {
        Text docId = (Text) key;
        MapWritable doc = (MapWritable) value;
        output.collect(docId, doc);
    }
}

public class SomeReducer extends MapReduceBase implements Reducer<Text, MapWritable, Text, MapWritable> {

    // Dummy
    public void reduce(Text key, Iterator<MapWritable> values,
            OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {
        Text docId = (Text) key;
        while (values.hasNext()) {
            MapWritable out = (MapWritable) (values.next());
            output.collect(docId, out);
        }
    }
}

public static void main(String[] args) throws Exception {

    JobConf conf = new JobConf();
    conf.setJobName("elastic search hello world");
    conf.setSpeculativeExecution(false);

    conf.set("es.nodes", "localhost:9200");
    conf.set("es.resource", "answers_development/answer");
    conf.set("es.resource.read", "answers_development/answer");
    conf.set("es.resource.write", "questions_development_20140603205720870/question");
    conf.set("es.query", "{}");

    conf.setOutputFormat(EsOutputFormat.class);
    conf.setInputFormat(EsInputFormat.class);

    conf.setMapOutputKeyClass(Text.class);
    conf.setMapOutputValueClass(MapWritable.class);

    conf.setMapperClass(SomeMapper.class);
    conf.setReducerClass(SomeReducer.class);

    JobClient.runJob(conf);

}

Here's the relevant portion of the stack trace:

14/06/06 21:35:27 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/06/06 21:35:28 INFO mapred.LocalJobRunner: map task executor complete.
14/06/06 21:35:28 WARN mapred.LocalJobRunner: job_local141518769_0001
java.lang.Exception: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 10 more
Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 15 more
Caused by: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at java.lang.Class.getConstructor0(Class.java:2810)
    at java.lang.Class.getDeclaredConstructor(Class.java:2053)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
    ... 16 more
14/06/06 21:35:28 INFO mapreduce.Job: Job job_local141518769_0001 failed with state FAILED due to: NA
14/06/06 21:35:28 INFO mapreduce.Job: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
    at com.edcast.cards.MapReduceHelloWorld.main(MapReduceHelloWorld.java:79)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at

Re: [ElasticSearch Hadoop] Error in configuring object


Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 15 more
Caused by: java.lang.NoSuchMethodException: com.edcast.cards.MapReduceHelloWorld$SomeMapper.<init>()
    at java.lang.Class.getConstructor0(Class.java:2810)
    at java.lang.Class.getDeclaredConstructor(Class.java:2053)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)

In other words, your Mapper has no default constructor defined for it. That
is, Hadoop cannot instantiate your mapper since your class definition is
incorrect.
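
One likely reason, sketched below as an assumption rather than a diagnosis: the name MapReduceHelloWorld$SomeMapper in the trace shows that the mapper is nested inside MapReduceHelloWorld. If it is declared without "static" (the enclosing class is not shown in the post, so this is a guess), its only constructor takes the enclosing instance as a hidden parameter, and a reflective lookup of a no-argument constructor fails. The class names in the sketch are hypothetical stand-ins; it uses plain Java reflection and no Hadoop classes.

```java
import java.lang.reflect.Constructor;

// Sketch: a no-argument constructor lookup fails for an inner (non-static) class.
final class NestedClassDemo {

    // Inner class: its implicit constructor takes the enclosing NestedClassDemo instance.
    class InnerMapper { }

    // Static nested class: has a genuine no-argument constructor.
    static class StaticMapper { }

    // Mirrors the Class.getDeclaredConstructor() call visible in the stack trace.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            return ctor != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(InnerMapper.class));
        System.out.println(hasNoArgConstructor(StaticMapper.class));
    }
}
```

Under that assumption, declaring SomeMapper and SomeReducer as "public static class" (or moving them into their own top-level files) would give them the no-argument constructor the lookup needs.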



Re: [ElasticSearch Hadoop] Error in configuring object