Re: High cpu load but low memory usage.

2014-10-26 Thread joergpra...@gmail.com
For backup/restore, do not use cp. There is snapshot/restore for that. It
works on primary shards only.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

You cannot reduce the number of shards in an existing index. Use export/import
tools and create a new index.
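For reference, on ES 1.x the snapshot/restore flow from the page above looks roughly like this (the repository name, snapshot name, and path are placeholders; the location must be reachable by every node, e.g. a shared filesystem on a multi-node cluster):

```
# Register a filesystem repository (placeholder path)
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_backup"
  }
}

# Take a snapshot of all open indices
PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true

# Restore it later (close or delete the target indices first)
POST /_snapshot/my_backup/snapshot_1/_restore
```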

Jörg

On Sun, Oct 26, 2014 at 3:44 AM, Atrus anhhu...@gmail.com wrote:

 Thanks Mathieu for your response. I will try your suggestions.

 It's a huge value. Shards can be split between nodes; do you intend to
 use 15 nodes?

 Hi Mat,

 - For example, if I have just one node, shards = 5, replicas = 0, then I
 can easily back up the data with: cp -rfp /var/lib/elasticsearch/nodename
 /somewhere/backup

 - Now I add one more node, so the cluster has two nodes, shards = 5,
 replicas = 0. The shards are redistributed; maybe the 1st node holds 0, 2, 4 and the 2nd
 node holds 1, 3. => How can I back up? Each node does not hold the whole
 data, so I cannot simply cp ...

 - If I update replicas = 1, each node then holds all 5 shards, and I can easily
 back up with cp on any node.

 If you know a better way to back up that handles distributed shards,
 please let me know.

 Thank you.

 PS: Can I reduce shards from 5 to 4 without losing data?

 On Sunday, October 26, 2014 12:57:40 AM UTC+7, Mathieu Lecarme wrote:



 On Friday, October 24, 2014 at 10:43:21 AM UTC+2, Atrus wrote:

 - There are 15 shards per index; is this too much or enough? I've used
 the default config. I know that this could affect the load, but I don't know
 how to figure out the right number.

 It's a huge value. Shards can be split between nodes; do you intend to
 use 15 nodes?


 - Is there any way to show the running queries? Something like MySQL's
 SHOW PROCESSLIST, to show which queries eat a lot of CPU. I have enabled the
 slow query log at 1s but found nothing.

 You can watch HTTP traffic with pcap (I hack on Packetbeat for that). That's
 from the outside; from the inside, use the hot threads API. strace can help, too.


 - Any suggestion is appreciated.

 Do you poll the _nodes/stats URL? With a monitoring tool, or a web page like
 kopf?

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/4ab77f45-177d-4546-b953-5f38c7f4f5d1%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/4ab77f45-177d-4546-b953-5f38c7f4f5d1%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




Re: High cpu load but low memory usage.

2014-10-26 Thread Atrus
Thanks Jörg.

"Use export/import tools and create a new index." Such as?

Could you recommend one?

Thanks so much.

BRs.

On Sunday, October 26, 2014 5:08:33 PM UTC+7, Jörg Prante wrote:

 For backup/restore, do not use cp. There is snapshot/restore for that. It 
 works on primary shards only. 


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

 You cannot reduce the number of shards in an existing index. Use export/import tools 
 and create a new index.

 Jörg







Re: High cpu load but low memory usage.

2014-10-26 Thread joergpra...@gmail.com
I have written a plugin for that:

https://github.com/jprante/elasticsearch-knapsack

Maybe it fits your requirements.

Jörg

On Sun, Oct 26, 2014 at 11:21 AM, Atrus anhhu...@gmail.com wrote:

 Thanks Jorg.

 "Use export/import tools and create a new index." Such as?

 Could you recommend one?

 Thanks so much.

 BRs.

 On Sunday, October 26, 2014 5:08:33 PM UTC+7, Jörg Prante wrote:

 For backup/restore, do not use cp. There is snapshot/restore for that. It
 works on primary shards only.

 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

 You cannot reduce the number of shards in an existing index. Use export/import tools
 and create a new index.

 Jörg





lowercase tokenizer and filter, but including digits also

2014-10-26 Thread Andrew Gaydenko
Hi!

I widely use the lowercase tokenizer and the lowercase filter in different places. 
In some cases I'd like to extend their character set to include digits. How can I do that?



Re: Aggregation on last element

2014-10-26 Thread vineeth mohan
Hello Michaël,

I can't think of a way to do this in a single call.
Maybe you should try the following:

(terms aggregation on element) -> (top hits aggregation, sorted by date, size = 1) -> (filter aggregation by type A)

With this you will get the elements that you are looking for. Then do a
filter on those elements and a terms aggregation on the element field to
get the results.
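A sketch of that first step, using the element/date field names from the question below (top_hits is available from ES 1.3; sorting descending is an assumption here so that the single returned hit is the latest document per element):

```json
{
  "size": 0,
  "aggs": {
    "by_element": {
      "terms": { "field": "element", "size": 0 },
      "aggs": {
        "latest": {
          "top_hits": {
            "size": 1,
            "sort": [ { "date": { "order": "desc" } } ]
          }
        }
      }
    }
  }
}
```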

Thanks
  Vineeth



On Sun, Oct 26, 2014 at 1:04 AM, Michaël Gallego mich...@maestrooo.com
wrote:

 Hi,

 I have a type whose data looks like this:

 {
   "date": "2014-01-01",
   "element": "abc",
   "type": "A"
 },
 {
   "date": "2014-01-02",
   "element": "abc",
   "type": "B"
 },
 {
   "date": "2014-01-03",
   "element": "def",
   "type": "A"
 }

 I'd like to group the data by element and count the elements whose LAST
 document by date has a type of A. In this case, I want the result to be 1:
 the second document has the same element as the first but a later date, and
 its type is B rather than A, so element abc is not counted; the last document
 is the only one with element def, and its type is A.

 I'm not sure this is even possible. Please note that the cardinality of
 element can be quite high (up to 20 000 different values).

 Thank you in advance!





Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
Hi Vineeth,

I'm afraid this won't work because, as I said, element can have high 
cardinality (while it's not bounded in theory, in practice it will range 
from 500 to 4). Therefore if I do a terms on element, then a top hits, 
it will need to generate maybe 4 sub-buckets. I think this will kill 
performance.

For now, I've rethought my format so it now looks like this:

{
  "element": "abc",
  "history": [
    { "type": "A", "date": "2014-01-01" },
    { "type": "B", "date": "2014-01-02" }
  ]
}
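The mapping implied by this format might look like the sketch below; marking the string fields not_analyzed is an assumption, so that the terms aggregation buckets on whole values:

```json
{
  "properties": {
    "element": { "type": "string", "index": "not_analyzed" },
    "history": {
      "type": "nested",
      "properties": {
        "type": { "type": "string", "index": "not_analyzed" },
        "date": { "type": "date" }
      }
    }
  }
}
```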

Where history is mapped as nested. Now I can do this:

{
  "aggs": {
    "history": {
      "nested": {
        "path": "history"
      },
      "aggs": {
        "latest-history": {
          "filter": {
            "limit": {
              "value": 1
            }
          },
          "aggs": {
            "by-type": {
              "terms": {
                "field": "history.type",
                "size": 0
              }
            }
          }
        }
      }
    }
  }
}

This will get the nested history, limit it to 1, then group by type, so I can 
get the count of the ones I'm interested in (type A or type B). The only 
drawback is that inside the nested history I need to sort the history by 
date in my application (I have not found any way to sort the nested documents 
by date before applying the limit filter...), and that while the history is 
typically quite small (around 10-200 elements), it is not bounded, and 
updating is harder to do...

If anyone has any other idea, don't hesitate to share!




Re: Connection Pool

2014-10-26 Thread joergpra...@gmail.com
The TransportClient creates connections automatically.

You can set the number of worker threads with the setting
transport.netty.worker_count

See also source code

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/transport/netty/NettyTransport.java#L101

for more settings regarding connections per node.

Jörg

On Sun, Oct 26, 2014 at 6:09 PM, pooja.fis...@gmail.com wrote:

 Hi,

 I want to know how to create a connection pool for connecting to
 Elasticsearch using Java.

 Right now I am using TransportClient in Java.

 Also, how does TransportClient handle connections?

 Looking forward to a quick reply.

 Thanks.





How to setup the ES cluster using the Client APIs

2014-10-26 Thread Gaurav gupta
Hi,

Somehow the default multicast discovery is not working for me, so I am
exploring the unicast discovery mechanism to set up an ES cluster on two
physical machines. I was able to set it up using the default ES
installation on two machines by configuring both YML files to point
to each other using unicast discovery. Now I want to achieve the same
thing using the ES client API, and I am using the code below. Could
you let me know whether the code below is correct for pointing to two nodes
with unicast, or kindly point me in the right direction or to sample client
code for it. Unicast discovery is not working in the code below.
(I have around one week of experience with ES and want to migrate one of our
products from Lucene to ES.)

Settings settings1 = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "PBcluster")
    .put("node.name", "PBnode1")
    .put("node.master", true)
    .put("node.data", true)
    //.put("http.port", 9250)
    //.put("network.bind_host", "152.144.214.62")
    //.put("network.publish_host", "152.144.214.62")
    .put("discovery.zen.ping.multicast.enabled", false)
    //.put("discovery.zen.ping.unicast.hosts", "[\"152.144.226.12\", \"152.144.214.62\"]")
    .put("discovery.zen.ping.unicast.hosts", "152.144.214.62")
    .put("discovery.zen.minimum_master_nodes", 1)
    .build();

node1 = nodeBuilder().settings(settings1).node();
System.out.println("Starting node1...");
node1.start();
System.out.println("Started node1...");
...
Node clientNode =
    nodeBuilder().clusterName("PBcluster").client(true).node();
Client client = clientNode.client();
BulkRequestBuilder bulkRequest = client.prepareBulk();
...

Thanks
Gaurav



Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
After some testing, it appears that my solution does not work, but I'm not 
sure I understand why. The filter returns fewer results than expected.



Setting up Packetbeat/Elasticsearch/kibana

2014-10-26 Thread Sarah Larysz
I have set up Packetbeat/Elasticsearch/kibana on a Windows 2012 server as 
per the instructions.

1:  The rev of Kibana is 3.1.1. 
2:  Packetbeat is collecting some kind of data, and indexes have been 
created in Elasticsearch. 
3:  A web browser pointing to the Elasticsearch source responds with a 
JSON object listing. 
4:  Kibana shows a generic dashboard, but there is no listing of dashboards 
under Load, and I don't know if Kibana is seeing any data at all from 
Packetbeat. 

What have I done wrong, and how can I test for packetbeat data? Here's my 
site:

http://gazoslive.com/kibana/#/dashboard/file/default.json

Here is my config.js file (though I cannot find where in the javascript 
code this file is read):

/** @scratch /configuration/config.js/1
 * == Configuration
 * config.js is where you will find the core Kibana configuration. This file contains
 * parameters that must be set before kibana is run for the first time.
 */
define(['settings'], function (Settings) {
  /** @scratch /configuration/config.js/2
   * === Parameters
   */
  return new Settings({
    /** @scratch /configuration/config.js/5
     * ==== elasticsearch
     * The URL to your elasticsearch server. You almost certainly don't
     * want +http://localhost:9200+ here. Even if Kibana and Elasticsearch are on
     * the same host. By default this will attempt to reach ES at the same host you have
     * kibana installed on. You probably want to set it to the FQDN of your
     * elasticsearch host
     *
     * Note: this can also be an object if you want to pass options to the http
     * client. For example:
     *
     *  +elasticsearch: {server: "http://localhost:9200", withCredentials: true}+
     *
     * elasticsearch: "http://"+window.location.hostname+":9200",
     */
    elasticsearch: "http://ihpgazos.cloudapp.net:9200",
    /** @scratch /configuration/config.js/5
     * ==== default_route
     * This is the default landing page when you don't specify a dashboard to load. You
     * can specify files, scripts or saved dashboards here. For example, if you had saved
     * a dashboard called `WebLogs' to elasticsearch you might use:
     *
     * default_route: '/dashboard/elasticsearch/WebLogs',
     */
    default_route : '/dashboard/file/default.json',
    /** @scratch /configuration/config.js/5
     * ==== kibana-int
     * The default ES index to use for storing Kibana specific object
     * such as stored dashboards
     */
    kibana_index: "kibana-int",
    /** @scratch /configuration/config.js/5
     * ==== panel_names
     * An array of panel modules available. Panels will only be loaded when they are
     * defined in the dashboard, but this list is used in the add panel interface.
     */
    panel_names: [
      'histogram', 'map', 'goal', 'table', 'filtering', 'timepicker',
      'text', 'hits', 'column', 'trends', 'bettermap', 'query',
      'terms', 'stats', 'sparklines'
    ]
  });
});


All suggestions will be valued.



Re: Setting up Packetbeat/Elasticsearch/kibana

2014-10-26 Thread Mark Walkom
Opening up ES to the world is asking for a lot of pain. You really need to
lock it down.

Did you install kibana from the PB repo and load the dashboards script?

On 27 October 2014 07:13, Sarah Larysz slar...@gazos.com wrote:

 I have set up Packetbeat/Elasticsearch/kibana on a Windows 2012 server as
 per the instructions.

 1:  The rev of Kibana is 3.1.1.
 2:  Packetbeat is collecting some kind of data and indexes have been
 created in Elasticsearch.
 3:  A web browser pointing to the Elasticsearch source responds with a
 JASON object listing.
 4:  Kibana shows a generic dashboard, but there is no listing of
 dashboards under load and I don't know if kibana is seeing any data at
 all from packetbeat.

 What have I done wrong, and how can I test for packetbeat data? Here's my
 site:

 http://gazoslive.com/kibana/#/dashboard/file/default.json




Scrolling over Parent/Child Documents

2014-10-26 Thread Govind Chandrasekhar
I have an index of ~500 million documents which I'd like to scroll 
through, either sorted by a common field (across multiple types) or with 
parent-child documents clustered together.

Here's how a typical set of documents is laid out:
PARENT: { "id": "PARENT_ID1", "index": "CommonIndex", "type": "TypeA",
"data": { "parent_id": "PARENT_ID1", "name": "..." } }
CHILD1: { "id": "CHILD_ID1A", "index": "CommonIndex", "type": "TypeB",
"parent": "PARENT_ID1", "data": { "parent_id": "PARENT_ID1", "name": "..." } }
CHILD2: { "id": "CHILD_ID1B", "index": "CommonIndex", "type": "TypeB",
"parent": "PARENT_ID1", "data": { "parent_id": "PARENT_ID1", "name": "..." } }
CHILD3: { "id": "CHILD_ID1C", "index": "CommonIndex", "type": "TypeC",
"parent": "PARENT_ID1", "data": { "parent_id": "PARENT_ID1", "name": "..." } }

I'd like to retrieve parent and child documents clustered together, merge 
them, and store the merged result in a separate index.
*Which leads me to my question*: is there a way to scan/scroll and retrieve 
parent-child documents bundled / clustered together?

--

*Alternative approach tried with partial success:*

FYI, I tried using sort with scroll (without scan set) on the 
parent_id field as follows: {"sort": [{"parent_id": "desc"}]}, as a 
shot in the dark. While the results did appear sorted at first glance, upon 
closer examination I noticed that the results were at times out of order. 
Here's an extract of just the parent_id fields printed:

7n3uxOf1dYUq
7n3uxOf1dYUq
7n3Zj2oadUKM
7n3Zj2oadUKM
7n3U2l0DkeYg
7n3U2l0DkeYg
7n3SOHKjAG2K  -- out of order
7n3U2l0DkeYg
7n3U2l0DkeYg

Thanks!



plan for river

2014-10-26 Thread Mungeol Heo
Hi,

My question is: will ES remove all river-related plugins in the future?
If it will, is there a substitute for the JDBC river?
Thanks.

Best regards,

- Mungeol



plan for river

2014-10-26 Thread Mungeol Heo
Hi,

My question is: will ES remove all river-related plugins in the future?
If it will, is there any kind of substitute for the JDBC river?
Thanks.

Best regards,

- Mungeol



FacetPhaseExecutionException with new Marvel installation

2014-10-26 Thread Ross Simpson
I've got a brand-new Marvel installation and am having some frustrating 
issues with it: on the overview screen, I constantly get errors like:
*Oops!* FacetPhaseExecutionException[Facet [timestamp]: failed to find 
mapping for node.ip_port.raw]

*Production cluster:*
* ElasticSearch 1.1.1
* Marvel 1.2.1
* Running in vSphere

*Monitoring cluster:*
* ElasticSearch 1.3.4
* Marvel 1.2.1
* Running in AWS

After installing the plugin and bouncing all nodes in both clusters, Marvel 
seems to be working -- an index has been created in the monitoring cluster (
.marvel-2014.10.26), and I see thousands of documents in there.  There are 
documents with the following types: cluster_state, cluster_stats, 
index_stats, indices_stats, node_stats.  So, it does seem that data is 
being shipped from the prod cluster to the monitoring cluster.

I've seen in the user group that other people have had similar issues. 
 Some of those mention problems with the Marvel index template.  I don't 
seem to have any templates at all in my monitoring cluster:

$ curl -XGET localhost:9200/_template/
{}

I tried manually adding the default template (as described in 
http://www.elasticsearch.org/guide/en/marvel/current/#config-marvel-indices), 
but that didn't seem to have any effect.
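
For what it's worth, the errors above suggest the `.raw` not_analyzed sub-fields that Marvel's facets use are missing from the index mappings, which is exactly what the index template normally provides. A minimal sketch of such a template body follows; the template name and the dynamic-template mapping are illustrative only, so compare against the official marvel_index_template.json shipped with the plugin before using anything like it:

```python
import json

# Sketch of an index template for the .marvel-* indices that gives every
# string field a not_analyzed ".raw" sub-field, which the Marvel facets
# (index.raw, node.ip_port.raw) rely on. Names and structure here are
# assumptions, not the official template.
template = {
    "template": ".marvel-*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [{
                "strings": {
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "string",
                        "fields": {
                            "raw": {"type": "string", "index": "not_analyzed"}
                        },
                    },
                }
            }]
        }
    },
}

# This JSON body is what would be registered under the _template endpoint.
print(json.dumps(template, indent=2))
```

A body like this could be registered with `curl -XPUT localhost:9200/_template/marvel -d @template.json`. Note that templates only apply to indices created after the template exists, so it would only take effect with the next day's `.marvel-*` index.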

So far, I've seen just two specific errors in Marvel:
* FacetPhaseExecutionException[Facet [timestamp]: failed to find mapping 
for node.ip_port.raw]
* FacetPhaseExecutionException[Facet [timestamp]: failed to find mapping 
for index.raw]

I've also looked through the logs on both the production and monitoring 
clusters, and the only errors are in the monitoring cluster resulting from 
queries from the Marvel UI, like this:

[2014-10-27 11:08:13,427][DEBUG][action.search.type   ] [ip-10-4-1-187] 
[.marvel-2014.10.27][1], node[SR_hriFmTCav-8ofbKU-8g], [R], s[STARTED]: 
Failed to execute [org.elasticsearch.action.search.SearchRequest@661dc47e]
org.elasticsearch.search.SearchParseException: [.marvel-2014.10.27][1]: 
query[ConstantScore(BooleanFilter(+*:* +cache(_type:index_stats) +cache(
@timestamp:[141436788 TO 141436854])))],from[-1],size[10]: Parse 
Failure [Failed to parse source [{"size":10,"query":{"filtered":{"query":{
"match_all":{}},"filter":{"bool":{"must":[{"match_all":{}},{"term":{"_type":
"index_stats"}},{"range":{"@timestamp":{"from":"now-10m/m","to":"now/m"
}}}]}}}},"facets":{"timestamp":{"terms_stats":{"key_field":"index.raw",
"value_field":"@timestamp","order":"term","size":2000}},
"primaries.docs.count":{"terms_stats":{"key_field":"index.raw","value_field"
:"primaries.docs.count","order":"term","size":2000}},
"primaries.indexing.index_total":{"terms_stats":{"key_field":"index.raw",
"value_field":"primaries.indexing.index_total","order":"term","size":2000}},
"total.search.query_total":{"terms_stats":{"key_field":"index.raw",
"value_field":"total.search.query_total","order":"term","size":2000}},
"total.merges.total_size_in_bytes":{"terms_stats":{"key_field":"index.raw",
"value_field":"total.merges.total_size_in_bytes","order":"term","size":2000
}},"total.fielddata.memory_size_in_bytes":{"terms_stats":{"key_field":
"index.raw","value_field":"total.fielddata.memory_size_in_bytes","order":
"term","size":2000}}}}]]
at org.elasticsearch.search.SearchService.parseSource(SearchService.
java:660)
at org.elasticsearch.search.SearchService.createContext(
SearchService.java:516)
at org.elasticsearch.search.SearchService.createAndPutContext(
SearchService.java:488)
at org.elasticsearch.search.SearchService.executeQueryPhase(
SearchService.java:257)
at org.elasticsearch.search.action.SearchServiceTransportAction$5.
call(SearchServiceTransportAction.java:206)
at org.elasticsearch.search.action.SearchServiceTransportAction$5.
call(SearchServiceTransportAction.java:203)
at org.elasticsearch.search.action.SearchServiceTransportAction$23.
run(SearchServiceTransportAction.java:517)
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.search.facet.FacetPhaseExecutionException: 
Facet [timestamp]: failed to find mapping for index.raw
at org.elasticsearch.search.facet.termsstats.TermsStatsFacetParser.
parse(TermsStatsFacetParser.java:126)
at org.elasticsearch.search.facet.FacetParseElement.parse(
FacetParseElement.java:93)
at org.elasticsearch.search.SearchService.parseSource(SearchService.
java:644)
... 9 more
[2014-10-27 11:08:13,427][DEBUG][action.search.type   ] [ip-10-4-1-187] 
All shards failed for phase: [query]

Both clusters use the same timezone and their clocks are synchronized via 
NTP.

Does anyone have any suggestions on what to do next?  I've reinstalled the 
plugin across both clusters without any changes.

Thanks much,
Ross

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails 


Re: plan for river

2014-10-26 Thread vineeth mohan
Hello Mungeol ,

As far as I know, the plan is to deprecate rivers and move their
functionality to Logstash.
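
For JDBC specifically, the usual river-free pattern is a standalone feeder that polls the database and pushes documents through the bulk API. A minimal, illustrative sketch follows; sqlite3 stands in for a real JDBC source here, and the index and type names are made up:

```python
import json
import sqlite3

def rows_to_bulk(rows, index, doc_type):
    """Serialize (id, document) pairs into an Elasticsearch bulk-API body."""
    lines = []
    for doc_id, doc in rows:
        # Action/metadata line, then the document source line.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# Demo: an in-memory table stands in for the real JDBC source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "fork"), (2, "spoon")])
rows = [(rid, {"name": name})
        for rid, name in conn.execute("SELECT id, name FROM products")]
body = rows_to_bulk(rows, "catalog", "product")
# body is what a feeder would POST to http://localhost:9200/_bulk
print(body)
```

A real feeder would additionally track a "last seen" column or timestamp between polls so it only re-indexes changed rows, which is roughly what the JDBC river did internally.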

Thanks
   Vineeth

On Mon, Oct 27, 2014 at 5:19 AM, Mungeol Heo mungeol@gmail.com wrote:

 Hi,

 My question is that will es remove all river related plugin in the future?
 If it will, I'd like to know that is there substitution for JDBC?
 Thanks.

 Best regards,

 - Mungeol

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/cc6d541f-1609-4218-932b-064a27e9692a%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/cc6d541f-1609-4218-932b-064a27e9692a%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5nRAy3o1LWi9n9%3DWEMEDb%3DBE60v_D8LiZmy%2BM5g6%2BPrdA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Announcing release of v1.0 RC elasticsearch plugin for Liferay (elasticray)

2014-10-26 Thread kcr
Hi all,

We released the v1.0 RC version of elasticray:
https://github.com/R-Knowsys/elasticray/releases/tag/v1.0-RC

This has quite a few bug fixes and is pretty much a drop-in replacement for 
the Solr plugin.
We have tested this on large datasets in beta environments.

Please try it out in your test/beta environments. 

We are continuing to test this and should have a production-ready plugin 
either this week or the next.

Thanks,
kc

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f34cdce8-c59e-4029-9627-b07d6f503ac1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.