Re: impact of stored fields on performance

2014-08-13 Thread Adrien Grand
OK, so quite large pages. Another question would be how much memory you
have on each node, how much is given to elasticsearch (ES_HEAP_SIZE) and
how large is the data/ directory.

For example, if you used to have ${ES_HEAP_SIZE} + ${size of data} < ${total
memory of the machine}, it meant that your whole index could fit in the
filesystem cache, which is very fast (that would explain why you got such
good response times of 40-60ms in spite of having a size of 200). But if the
sum is greater now, it means that the disk often needs to perform actual
seeks (on magnetic storage, around 5 to 10ms per seek), which can badly
degrade latency.
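
One quick way to check this on each node (a rough sketch; the data path, port
and exact commands are assumptions and depend on your install):

  du -sh /var/lib/elasticsearch/data        # on-disk size of the data/ directory (path is an example)
  curl -s 'localhost:9200/_nodes/stats/jvm?pretty' | grep heap_max   # configured heap per node
  free -g                                   # total memory of the machine

If heap plus data no longer fits in RAM, the filesystem cache can no longer
hold the whole index.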


On Tue, Aug 12, 2014 at 11:33 PM, Ashish Mishra laughingbud...@gmail.com
wrote:

 The query size parameter is 200.
 Actual hit totals vary widely, generally around 1000-1.  A minority
 are much lower.  About 10% of queries end up with just 1 or 0 hits.



 On Tuesday, August 12, 2014 6:31:29 AM UTC-7, Adrien Grand wrote:

 Hi Ashish,

 How many documents do your queries typically retrieve? (the value of the
 `size` parameter)


 On Tue, Aug 12, 2014 at 12:48 AM, Ashish Mishra laughin...@gmail.com
 wrote:

 I recently added a binary type field to all documents with the mapping
 "store": true.  The field contents are large, and as a result the on-disk
 index size rose by ~3x, from 2.5GB/shard to ~8GB/shard.

 After this change I've seen a big jump in query latency.  Searches which
 previously took 40-60ms now take 800ms and longer.  This is the case even
 for queries which *don't* return the binary field.
 I tried optimizing the index down to max_num_segments=1, but query
 latency remains high.

 Is this expected?  Obviously queries returning the new field will take a
 hit (since the field data needs to be loaded from disk).  But I wouldn't have
 expected other queries to be much affected.

 Is the problem that larger file sizes make memory-mapping and the FS
 cache less efficient?  Or are stored fields still getting loaded from disk
 even when not included in the fields list?





 --
 Adrien Grand





-- 
Adrien Grand



Match on whole title

2014-08-13 Thread Thami Inaflas
Hello,

How can I match only the documents whose whole title is included in the query,
for example:

If the list of document titles is: "Elasticsearch server", "Elasticsearch
experts"
If the query is "Configure Elasticsearch Server", only the first one
should match
If the query is "Elasticsearch", no documents should match

Thanks for any help.



Re: Match on whole title

2014-08-13 Thread David Pilato
I think this should help you: 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/phrase-matching.html#phrase-matching

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 août 2014 à 10:24:23, Thami Inaflas (inaflas.th...@gmail.com) a écrit:

Hello,

How to only match documents which whole the title is included in the query, for 
example :

If the list of document titles is : Elasticsearch server, Elasticsearch 
experts
If the query is : Configure Elasticsearch Server only the first one should 
match
If the query is : Elasticsearch No documents should match

Thanks for any help.



Curiosity : elasticsearch frontend

2014-08-13 Thread Arthur BABEY
Hi everyone, 

I'm currently a trainee at Pages Jaunes. During my internship I worked on an 
application to give non-technical staff easier access to data. 
That's how Curiosity was born. 

The goal of this tool is to provide a simple way to build complex queries 
and aggregations, save those queries, and share them with other users. 
Curiosity comes with a template system that you can use to personalize 
your result lists and aggregations (there are default templates, like a 
pie chart).

The tool is already stable and powerful, but not yet very friendly. I worked 
hard to provide good documentation, but as you can see my English isn't great 
and neither are my explanations.
There is one thing I'm not so bad at, though: development. So if you have some 
time to try Curiosity, I will be very happy to get your feedback. 

I am open to criticism, and if you like the tool and want new 
features I will be proud to develop them. 

If you have any questions about the functionality or the code, ask me and I 
will try to give you the best answer I can.

You can find Curiosity on GitHub https://github.com/pagesjaunes/curiosity 
(it's fully open source) and the documentation is here: 
http://pagesjaunes.github.io/curiosity/.

Hope your eyes didn't bleed too much. 
Have a good day or night.

Arthur



Re: Match on whole title

2014-08-13 Thread Thami Inaflas
The problem with phrase matching is that the request "Configure 
Elasticsearch Server" won't match the document "Elasticsearch server", 
because it behaves like an AND over the query terms and the word "Configure" 
isn't in the title.
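
One way to turn the problem around (this is not from the thread, and the
index, type and field names below are invented) is to register each title as
a percolator query with operator "and", then percolate the user's query text
as if it were a document. A stored title then matches only when all of its
terms appear in the query:

PUT /titles/.percolator/1
{
  "query": {
    "match": {
      "content": {
        "query": "Elasticsearch server",
        "operator": "and"
      }
    }
  }
}

GET /titles/doc/_percolate
{
  "doc": { "content": "Configure Elasticsearch Server" }
}

With this, percolating "Configure Elasticsearch Server" returns the stored
title, while percolating just "Elasticsearch" returns nothing, which is the
behaviour asked for above.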


Le mercredi 13 août 2014 10:34:53 UTC+2, David Pilato a écrit :

 I think this should help you: 
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/phrase-matching.html#phrase-matching

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr


 Le 13 août 2014 à 10:24:23, Thami Inaflas (inafla...@gmail.com 
 javascript:) a écrit:

 Hello,

 How to only match documents which whole the title is included in the 
 query, for example :

 If the list of document titles is : Elasticsearch server, Elasticsearch 
 experts
 If the query is : Configure Elasticsearch Server only the first one 
 should match
 If the query is : Elasticsearch No documents should match

 Thanks for any help.





Rescore Sorting

2014-08-13 Thread Shawn Ritchie
Hi Guys,

I'm getting a weird score from my rescore. Would anyone mind explaining why
the rescore score is getting ignored in my query?

{
    "from": 0,
    "size": 10,
    "explain": false,
    "sort": ["_score", {
        "networks": { "order": "desc", "mode": "sum" }
    }, {
        "rich": { "order": "desc", "mode": "sum" }
    }, {
        "picture": { "order": "desc", "mode": "sum" }
    }],
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    "should": [{
                        "constant_score": {
                            "query": { "match": { "_all": { "query": "Daryl" } } },
                            "boost": 1.0
                        }
                    }, {
                        "constant_score": {
                            "query": { "match": { "_all": { "query": "Davies" } } },
                            "boost": 1.0
                        }
                    }],
                    "disable_coord": 1
                }
            },
            "filter": [{
                "or": [{
                    "query": { "match": { "_all": { "query": "Daryl" } } }
                }, {
                    "query": { "match": { "_all": { "query": "Davies" } } }
                }]
            }]
        }
    },
    "rescore": [{
        "query": {
            "query_weight": 0.0,
            "rescore_query_weight": 1.0,
            "score_mode": "total",
            "rescore_query": {
                "bool": {
                    "should": [{
                        "match_all": { "boost": 20.0 }
                    }],
                    "disable_coord": 1
                }
            }
        },
        "window_size": 50
    }]
}


Results showing weird scores, when the scores should all total up to 20.0:

{
  "took" : 470,
  "timed_out" : false,
  "_shards" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "hits" : {
    "total" : 84244,
    "max_score" : null,
    "hits" : [ {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "51c3c56f6dd2bc0854e9e333",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 2, 66, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "51c6d25e6dd2bc08543fa577",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 2, 54, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "51f342e06dd2bc0b788c6372",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 298, -9223372036854775808 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "51e162e16dd2bc08549fb2b5",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 253, -9223372036854775808 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "52a3b8ad6dd2bc053c372722",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 104, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "5155e11bd25e9d09d0602aa9",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 97, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "52a798656dd2bc053c7e6950",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 67, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" : "51db66f86dd2bc0854e0ddc2",
      "_score" : 1.4142135,
      "sort" : [ 1.4142135, 1, 63, 1 ]
    }, {
      "_index" : "crawledpeople_completepeople_v1",
      "_type" : "zeepexmanager_buisnessobjects_zeepexsearch_zeepexprofile",
      "_id" :

Re: Match on whole title

2014-08-13 Thread David Pilato
Did you read the full chapter? 

Maybe the other pages in the same chapter give you more info?
 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/slop.html#slop
 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/proximity-relevance.html

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 août 2014 à 10:46:31, Thami Inaflas (inaflas.th...@gmail.com) a écrit:

The problem with phrase-matching is that the request Configure Elasticsearch 
Server won't match the document Elasticsearch Server, because it's based on 
AND operator but the word Configure isn't in the title.


Le mercredi 13 août 2014 10:34:53 UTC+2, David Pilato a écrit :
I think this should help you: 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/phrase-matching.html#phrase-matching

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 août 2014 à 10:24:23, Thami Inaflas (inafla...@gmail.com) a écrit:

Hello,

How to only match documents which whole the title is included in the query, for 
example :

If the list of document titles is : Elasticsearch server, Elasticsearch 
experts
If the query is : Configure Elasticsearch Server only the first one should 
match
If the query is : Elasticsearch No documents should match

Thanks for any help.



Join and merge documents

2014-08-13 Thread tao hiko
Hi All,

My background is as a DBA (RDBMS) and I'm a starter with elasticsearch.
My question may duplicate another question, but I could not find the answer.

My environment, I have 2 indexes and 1 document in each index as below

index : ind1
type : doc1
_id : running number not automatic generate id (1,2,3xx)
name : string
status : string
last_update : date

index : ind2
type : doc2
_id : automatic generate id
doc_id : number
status : string
last_update : date

The relation is ind1.doc1._id = ind2.doc2.doc_id.

My purpose: I need to insert data from ind2.doc2 into ind1.doc1 when it 
doesn't exist in ind1.doc1, and update the status and last_update fields when 
it does exist.

So the parent/child method cannot be used, because data has already been 
loaded into both indexes.

Could you please advise me in this case? Is it possible to use 
aggregations or a nested method across indexes? If not, how can I join and 
merge data in the same index?

Thank you,
Hiko
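
(A sketch of one common pattern, not from this thread: there is no
server-side join in Elasticsearch, so a client would scan/scroll over
ind2.doc2 and, for each document, send an update to ind1.doc1 using
doc_as_upsert, which creates the document when it is missing and only changes
the given fields when it exists. The id and field values below are made up.)

POST /ind1/doc1/42/_update
{
  "doc": {
    "status": "closed",
    "last_update": "2014-08-13"
  },
  "doc_as_upsert": true
}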



Re: Curiosity : elasticsearch frontend

2014-08-13 Thread David Pilato
And very nice effort around the documentation. Congrats!

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 août 2014 à 10:42:51, Arthur BABEY (arthur.babey...@gmail.com) a écrit:

Hi everyone,

I'm currently a Pages Jaunes trainee. During my internship i worked on an 
application to facilitate access to data to non-technical staff.
Curiosity was born.

The goal of this tool is to provide a simple way to built complex queries and 
aggregations, save those queries, and share them with other users.
Curiosity comes with a template system, that you can use to personalize your 
results list and aggregation (there is default template, like piechart).

Yet the tool is stable and powerful but not very friendly. I worked hard to 
provide a good documentation but as you can see my English suxx and that good 
in explanation.
But there is something i am not so bad : development. So if you have some time 
to try curiosity, i will be very happy if you give me some feedback.

I am open to criticism, and maybe if you liked the tool and want new features i 
will be proud to develop them.

If you have any questions about functionality or code, ask me, i will try to 
give you the best answer.

You can find Curiosity on Github (it's fully open source) and the documentation 
is here.

Hope your eyes didn't bleed to much.
Have a good day or night.

Arthur



Query post processing

2014-08-13 Thread Pawel
Hi,
I need an elasticsearch plugin which will build and add a filter to each
query passed to Elasticsearch. A short scenario is shown below:

1. Query string is sent to Elasticsearch
2. Query is parsed and prepared for execution on the cluster
3. A filter is built, based e.g. on the query, the current time etc., and
added to the query
4. The modified query is executed on the Elasticsearch cluster

I'm wondering if I can do a plugin which will be triggered each time a
query is being processed.

I know I can make a plugin with my own query parser but it also has
drawback. Because all queries will have to contain something like:

{
  "query" : {
    "bool": {
      "must": [
        {
          "term": {
            "some_field": "some_term"
          }
        },
        {
          "CUSTOM_QUERY_TYPE": {
            "enabled": true
          }
        }
      ]
    }
  }
}

But this does not look very convenient, because the client application will
always have to extend the original query.

Is it possible to build a plugin with query post-processing and avoid
writing a query parser or REST action?
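
(If the extra filter can be expressed up front, one plugin-free workaround is
a filtered alias: every search that goes through the alias gets the filter
applied. This is only a sketch of the idea, not something from this thread;
the index, alias and field names are made up, and it obviously cannot express
filters that depend on the query itself.)

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "myindex",
        "alias": "myindex_recent",
        "filter": { "range": { "timestamp": { "gte": "now-1d" } } }
      }
    }
  ]
}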

--
Paweł Róg



Unexpected results in regexp query

2014-08-13 Thread Markos Fragkakis
Hi all,

I have created my own mapping that does sentence tokenization:

This is the relevant part:

"analysis": {
   "filter": {
      "csr_token_length_filter": {
         "type": "length",
         "max": 32776,
         "min": 0
      }
   },
   "char_filter": {
      "csr_new_line_character_filter": {
         "type": "mapping",
         "mappings": [
            "-\\n => ",
            "—\\n => ",
            "\\n => \\u0020"
         ]
      }
   },
   "analyzer": {
      "csr_sentence_analyzer": {
         "filter": "csr_token_length_filter",
         "char_filter": [
            "html_strip",
            "csr_new_line_character_filter"
         ],
         "type": "custom",
         "tokenizer": "csr_sentence_tokenizer"
      }
   },
   "tokenizer": {
      "csr_sentence_tokenizer": {
         "flags": [
            "NONE"
         ],
         "type": "pattern",
         "pattern": "(?<=[.?!])\\s+(?=[\\da-zA-Z])"
      }
   }
},

I am indexing a string containing the lorem ipsum text. So, when I tokenize 
it in marvel with this:

GET /markosindex/_analyze?analyzer=csr_sentence_analyzer
{
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod 
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, 
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo 
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse 
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat 
non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
}

I get this:

{
   "tokens": [
      {
         "token": "{ Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
         "start_offset": 0,
         "end_offset": 126,
         "type": "word",
         "position": 1
      },
      {
         "token": "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
         "start_offset": 127,
         "end_offset": 234,
         "type": "word",
         "position": 2
      },
      {
         "token": "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.",
         "start_offset": 235,
         "end_offset": 337,
         "type": "word",
         "position": 3
      },
      {
         "token": "Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. } ",
         "start_offset": 338,
         "end_offset": 451,
         "type": "word",
         "position": 4
      }
   ]
}

But when I run a regexp query on this field, I don't get any results:

GET /markosindex/_search
{
  "fields" : ["filename", "mime"],

  "query": {
    "filtered": {
      "query": {
        "regexp": {
          "fileTextContent.fileTextContentSentenceAnalyzed": ".*consectetur\\s+adipisicing\\s+elit.*"
        }
      },
      "filter" : {
        "query" : {
          "match_all" : { }
        }
      }
    }
  },
  "highlight": {
    "fields": {"fileTextContent.fileTextContentSentenceAnalyzed": {}}
  }
}


However, when I change the intermediate \\s+ to .+, the query works:

GET /markosindex/_search
{
  "fields" : ["filename", "mime"],

  "query": {
    "filtered": {
      "query": {
        "regexp": {
          "fileTextContent.fileTextContentSentenceAnalyzed": ".*consectetur.+adipisicing.+elit.*"
        }
      },
      "filter" : {
        "query" : {
          "match_all" : { }
        }
      }
    }
  },
  "highlight": {
    "fields": {"fileTextContent.fileTextContentSentenceAnalyzed": {}}
  }
}

Any idea what is going on? My analyzer shows spaces between words in each 
token (sentence). However, the regexp query does not work for me.

Cheers,

Markos
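
(A guess at what is going on, not an answer from the thread: the regexp query
uses Lucene's own regular expression syntax, which does not support
Perl-style character classes such as \s; a backslash simply escapes the next
character, so \\s+ ends up looking for a literal run of the letter "s", which
never occurs between those words. The pattern with .+ works because . also
matches the spaces. Keeping stricter whitespace semantics would look
something like this sketch:)

"regexp": {
  "fileTextContent.fileTextContentSentenceAnalyzed": ".*consectetur[ ]+adipisicing[ ]+elit.*"
}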



Re: Unexpected results in regexp query

2014-08-13 Thread Markos Fragkakis
Forgot to add the relevant part of my mapping:

"fileTextContent": {
   "type": "string",
   "index": "no",
   "fields": {
      "fileTextContentSentenceAnalyzed": {
         "type": "string",
         "analyzer": "csr_sentence_analyzer"
      },
      "fileTextContentAnalyzed": {
         "type": "string"
      }
   }
},



Re: Recommendations needed for large ELK system design

2014-08-13 Thread Alex
Hi Mark,

I've done more investigating and it seems that a Client (AKA Query) node 
cannot also be a Master node. As it says here 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election

*Nodes can be excluded from becoming a master by setting node.master to 
false. Note, once a node is a client node (node.client set to true), it 
will not be allowed to become a master (node.master is automatically set to 
false).*

And from the elasticsearch.yml config file it says:

# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the coordinator of your cluster.
#
#node.master: true
#node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a search load balancer (fetching data from nodes,
#    aggregating results, etc.)
#
#node.master: false
#node.data: false

So I'm wondering how exactly you set up your client nodes to also be master 
nodes. It seems like a master node can only either be purely a master or 
master + data.

Regards, Alex
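
(For what it's worth, a sketch of how the roles are usually expressed, under
the assumption that a "master that also handles queries" simply means a
master-eligible node without data that clients send their searches to, since
any node can coordinate a search:

# master-eligible node that holds no data; can also coordinate searches
node.master: true
node.data: false

# pure search load balancer: never master, never holds data
node.master: false
node.data: false

node.client: true is just shorthand for the second combination, which is why
setting it prevents a node from becoming master.)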

On Thursday, 31 July 2014 23:57:26 UTC+1, Mark Walkom wrote:

 1 - Curator FTW.
 2 - Masters handle cluster state, shard allocation and a whole bunch of 
 other stuff around managing the cluster and it's members and data. A node 
 that is master and data set to false is considered a search node. But the 
 role of being a master is not onerous, so it made sense for us to double up 
 the roles. We then just round robin any queries to these three masters.
 3 - Yes, but it's entirely dependent on your environment. If you're 
 happy with that and you can get the go-ahead then see where it takes you.
 4 - Quorum is automatic and having the n/2+1 means that the majority of 
 nodes will have to take place in an election, which reduces the possibility 
 of split brain. If you set the discovery settings then you are also 
 essentially setting the quorum settings.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com javascript:
 web: www.campaignmonitor.com


 On 31 July 2014 22:27, Alex alex@gmail.com javascript: wrote:

 Hello Mark,

 Thank you for your reply, it certainly helps to clarify many things.

 Of course I have some new questions for you!

1. I haven't looked into it much yet but I'm guessing Curator can 
handle different index naming schemes. E.g. logs-2014.06.30 and 
stats-2014.06.30. We'd actually be wanting to store the stats data for 2 
years and logs for 90 days so it would indeed be helpful to split the 
 data 
into different index sets. Do you use Curator?

2. You say that you have 3 masters that also handle queries... but 
I thought all masters did was handle queries? What is a master node that 
*doesn't* handle queries? Should we have search load balancer nodes? 
AKA not master and not data nodes.

3. In the interests of reducing the number of node combinations for 
us to test out would you say, then, that 3 master (and query(??)) only 
nodes, and the 6 1TB data only nodes would be good?

4. Quorum and split brain are new to me. This webpage 

 http://blog.trifork.com/2013/10/24/how-to-avoid-the-split-brain-problem-in-elasticsearch/
  about 
split brain recommends setting *discovery.zen.minimum_master_nodes* equal 
to *N/2 + 1*. This formula is similar to the one given in the 
documentation for quorum 

 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency:
  
index operations only succeed if a quorum (replicas/2+1) of active 
 shards 
are available. I completely understand the split brain issue, but not 
quorum. Is quorum handled automatically or should I change some settings? 

 Thanks again for your help, we appreciate your time and knowledge!
 Regards,
 Alex


 On Thursday, 31 July 2014 05:57:35 UTC+1, Mark Walkom wrote:

 1 - Looks ok, but why two replicas? You're chewing up disk for what 
 reason? Extra comments below.
 2 - It's personal preference really and depends on how your end points 
 send to redis.
 3 - 4GB for redis will cache quite a lot of data if you're only doing 50 
 events p/s (ie hours or even days based on what I've seen).
 4 - No, spread it out to all the nodes. More on that below though.
 5 - No it will handle that itself. Again, more on that below though.

 Suggestions;
 Set your indexes to (factors of) 6 shards, ie one per node, it spreads 
 query performance. I say factors of in that you can set it to 12 shards 
 per index to start and easily scale the node count and still spread the 
 load.
 Split your stats and your log data into different indexes, it'll make 
 management and retention easier.
 You can consider a master only node or (ideally) three that also handle 
 queries.
 Preferably have an uneven number of master eligible nodes, whether you 
 make them VMs or 

Re: Rescore Sorting

2014-08-13 Thread Shawn Ritchie
Tried simplifying the query, but no luck; still getting the original query
score.


{
    "from": 0,
    "size": 10,
    "explain": false,
    "sort": ["_score", {
        "networks": { "order": "desc", "mode": "sum" }
    }, {
        "rich": { "order": "desc", "mode": "sum" }
    }, {
        "picture": { "order": "desc", "mode": "sum" }
    }],
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    "should": [{
                        "constant_score": {
                            "query": { "match": { "_all": { "query": "Daryl" } } },
                            "boost": 1.0
                        }
                    }, {
                        "constant_score": {
                            "query": { "match": { "_all": { "query": "Davies" } } },
                            "boost": 1.0
                        }
                    }],
                    "disable_coord": 1
                }
            },
            "filter": [{
                "or": [{
                    "query": { "match": { "_all": { "query": "Daryl" } } }
                }, {
                    "query": { "match": { "_all": { "query": "Davies" } } }
                }]
            }]
        }
    },
    "rescore": [{
        "query": {
            "query_weight": 0.0,
            "rescore_query_weight": 1.0,
            "score_mode": "total",
            "rescore_query": {
                "constant_score": {
                    "query": { "match_all": {} },
                    "boost": 20.0
                }
            }
        },
        "window_size": 50
    }]
}

  


On Wednesday, 13 August 2014 10:53:45 UTC+2, Shawn Ritchie wrote:

 Hi Guys,

 I'm getting weird score from my rescore anyone mind explaining why the 
 rescore score is getting ignored in my query?

 {

 from: 0,

 size: 10,

 explain: false,

 sort: [_score, {

 networks: {

 order: desc,

 mode: sum

 }

 }, {

 rich: {

 order: desc,

 mode: sum

 }

 }, {

 picture: {

 order: desc,

 mode: sum

 }

 }],

 query: {

 filtered: {

 query: {

 bool: {

 should: [{

 constant_score: {

 query: {

 match: {

 _all: {

 query: Daryl

 }

 }

 },

 boost: 1.0

 }

 }, {

 constant_score: {

 query: {

 match: {

 _all: {

 query: Davies

 }

 }

 },

 boost: 1.0

 }

 }],

 disable_coord: 1

 }

 },

 filter: [{

 or: [{

 query: {

 match: {

 _all: {

 query: Daryl

 }

 }

 }

 }, {

 query: {

 match: {

 _all: {

 query: Davies

 }

 }

 }

 }]

 }]

 }

 },

 rescore: [{

 query: {

 query_weight: 0.0,

 rescore_query_weight: 1.0,

 score_mode: total,

 

Re: Embedded ElasticSearch On Java

2014-08-13 Thread feenz
I must be doing something wrong. I continue to get the same error. I
removed the cluster.name in the settings and still get the same error.

Elasticsearch is running on the same machine that I am trying to connect
with the client. Is TransportClient the correct way to connect? Or should I
be using Node?

I tried this

Node node =
nodeBuilder().client(true).loadConfigSettings(false).settings(settings).node();

This returns the error NoSuchMethodError: java.lang.NoSuchMethodError:
org.apache.log4j.Logger.isTraceEnabled()Z at
org.elasticsearch.common.logging.log4j.Log4jESLogger.isTraceEnabled()

I have included the log4j-1.4.0.jar that came packaged with the version of
elasticsearch-1.3.1 
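
(A guess, not from this thread: Logger.isTraceEnabled() only exists in log4j
1.2.12 and later, so a NoSuchMethodError there usually means an older log4j
jar is also on the classpath, e.g. pulled in by another dependency. Checking
with mvn dependency:tree and pinning a recent 1.2.x may help, for example:)

<!-- hypothetical pom.xml snippet; 1.2.17 is the log4j line current at the time -->
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.17</version>
</dependency>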


On Tue, Aug 12, 2014 at 12:13 PM, Vivek Sachdeva [via ElasticSearch Users] 
ml-node+s115913n4061741...@n3.nabble.com wrote:

 The default cluster name is elasticsearch. Changing it in your code
 works


 On Tue, Aug 12, 2014 at 9:33 PM, Vivek Sachdeva [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4061741i=0 wrote:

 Your code works if you dont add cluster name to it. Tried with Java
 this time.. :)


 On Tue, Aug 12, 2014 at 7:47 PM, Kfeenz [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4061741i=1 wrote:

 @Jorg,

 Thanks for the advice, I will make sure that I do so during actual
 implementation, but this is purely for testing the connection.. Also, I see
 a client.close() and a client.threadPool().shutdown(), but I do not see a
 client.threadPool().close(). I am using ES v1.3.1.

 @ Vivek,

 I am not sure how you were able to use 'localhost' vise localhost.
 Java complains about an invalid character constant because 'localhost' is
 not a character but a String...

 My current code is as follows... with still no luck...

 Settings settings = ImmutableSettings.settingsBuilder().put(
 cluster.name, mycluster).build();

 Client client = new TransportClient(settings).addTransportAddress(new
 InetSocketTransportAddress(localhost, 9300));

 ClusterStatsRequestBuilder builder =
 client.admin().cluster().prepareClusterStats();

 ClusterStatsResponse response = builder.execute().actionGet(); // fails
 on execute... NoNodeAvailableException

 assertEquals(mycluster, response.getClusterName()); // never gets to
 this point

 NoNodeAvailableException: None of the configured nodes are available []

 If I add a setting to the settings object

 .put(client.transport.sniff, true);

 I get a different error - [org.elasticsearch.client.transport] [Argus]
 failed to get local cluster state info for [#transport#-1]...

 I can query the cluster using 
 *http://localhost:9200/_cluster/health?pretty=true
 http://localhost:9200/_cluster/health?pretty=true* which returns

 {
   cluster_name : mycluster,
   status : green,
   timed_out : false,
   number_of_nodes : 1,
   number_of_data_nodes : 1,
   active_primary_shards : 0,
   active_shards : 0,
   relocating_shards : 0,
   initializing_shards : 0,
   unassigned_shards : 0
 }

 I am on Windows 7 64-bit.
 I am using Java 1.7_u55.
 I am using ES version 1.3.1.
 I have included in my pom.xml:
   - elasticsearch-1.3.1.jar
   - lucene-core-4.9.0.jar

 Any other suggestions are greatly appreciated.



 On Tuesday, August 12, 2014 5:45:16 AM UTC-4, Vivek Sachdeva wrote:

 Replace

 .setTransportAddress(new InetSocketTransportAddress(localhost,
 9300));

 with

 .addTransportAddress(new InetSocketTransportAddress('localhost',
 9300)).

 And I guess if you dont give cluster name, it automatically joins the
 default cluster.

 I tried the code that you provided and changed above mentioned code. It
 works on my end. Can you try it?

 On Monday, August 11, 2014 11:34:43 PM UTC+5:30, Kfeenz wrote:

 So I am very new to elasticsearch... so I apologize in advance..

 I started a local instance of elasticsearch and I am trying to connect
 to it through the Java API.

 I was under the impression that the transport client was for remote
 clients?

 I tried:

 @Test
 public void testIndexResponse() {

   Client client = new TransportClient().setTransportAddress(new
 InetSocketTransportAddress(localhost, 9300));

   String json = { +
 \user\:\kimchy\, +



 \postDate\:\2013-01-30\, +
 \message\:\trying out Elasticsearch\ +



 };

   IndexResponse response = client.prepareIndex(twitter, tweet)



 .setSource(json)
 .execute()
 .actionGet();


   client.close();

   System.out.println(response.getIndex());
 }

 I receive org.elasticsearch.client.transport.NoNodeAvailableException:
 None of the configured nodes are available: [].



 On Monday, August 11, 2014 1:19:06 PM UTC-4, Vivek Sachdeva wrote:

 Have you tried using transport client for connecting...

 On Monday, August 11, 2014 10:26:29 PM UTC+5:30, Kfeenz wrote:

 All,

 I know this post is old, but I continue to have an issue with
 this...

 I get an NoSuchMethodError: org.apache.log4j.Logger.isTraceEnabled()Z
 exception when I run

 Node node = 

Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Log Muncher
Hi,

Simple question, but there seems to be a lack of detailed examples for 
using the otherwise very useful Search::Elasticsearch CPAN module!

I'm getting syslog data into elasticsearch via fluentd.

What I'd like to do now is run a Perl search that will give me results for 
notice, emerg and crit events.  As a test (seeing as I don't get many 
emerg/crit events!), I've tried the script below, but it only seems to pick up 
notice events and doesn't return any info events!

Help welcome !

Thanks.

Tim

#!/usr/bin/perl

use 5.014;
use strict;
use warnings;
use autodie;

use Data::Dumper;
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new();

my $results = $e->search(
    index => 'logstash-2014.08.13',
    body  => {
        query => {
            bool => {
                must => { match => { severity => 'notice' }, match => { severity => 'info' } }
            }
        }
    }
);

print Dumper($results);

 



Re: About elasticsearch.org

2014-08-13 Thread James Green
The BGP routing table burst past 512k entries yesterday, I'm told. Any
old routers with that as a limit would have been affected, limiting
traffic apparently at random. A few ISPs were caught out.


On 12 August 2014 18:21, Jack Park jackp...@topicquests.org wrote:

 I just hear second hand that the outage is pretty large.

 On Tue, Aug 12, 2014 at 10:14 AM, Jack Park jackp...@topicquests.org
 wrote:
  It appears that liquidweb (their host) has some problems.  I and a
  friend in Canada can open it on my cell phone, but nobody I know
  around here can raise it on some other networks.
 
  On Tue, Aug 12, 2014 at 9:58 AM, Antonio Augusto Santos
  mkha...@gmail.com wrote:
  Working fine for me (in Brazil).
 
 
  On Tuesday, August 12, 2014 1:57:48 PM UTC-3, Jack Park wrote:
 
  Just curious:
  on all browsers here in silicon valley, I cannot raise any
  elasticsearch.org
 
  Is it just me (or comcast?)
  Other websites appear fine.
 




Re: New version of Kibana in the works?

2014-08-13 Thread James Green
There's been nothing since 13 Jun according to Github. Most disappointing,
lots of people waiting for their PRs to be attended to.


On 12 August 2014 17:44, Antonio Augusto Santos mkha...@gmail.com wrote:

 This one is for the devs, and Rashid in special: there is any new version
 of Kibana in the works?
 I'm asking this because I'm about to start a project in my company for log
 management, and there are some requisites to it (user separation, event
 correlation, histogram to compare two values, and so on).

 So, any chance of these functionalities landing in Kibana 4.0? ;)





Re: Embedded ElasticSearch On Java

2014-08-13 Thread Vivek Sachdeva
If your project is not really big, maybe you can send it here or directly...

vivek.sachd...@intelligrape.com


On Tue, Aug 12, 2014 at 11:19 PM, feenz kfeeney5...@gmail.com wrote:

 I must be doing something wrong I continue to get the same error I
 removed the cluster.name in the settings and still get the same error.

 Elasticsearch is running on the same machine that I am trying to connect
 with the client. Is TransportClient the correct way to connect? Or should I
 be using Node?

 I tried this

 Node node =
 nodeBuilder().client(true).loadConfigSettings(false).settings(settings).node();

 This returns the error NoSuchMethodError: java.lang.NoSuchMethodError:
 org.apache.log4j.Logger.isTraceEnabled()Z at
 org.elasticsearch.common.logging.log4j.Log4jESLogger.isTraceEnabled()

 I have included the log4j-1.4.0.jar that came packaged with the version of
 elasticsearch-1.3.1 


 On Tue, Aug 12, 2014 at 12:13 PM, Vivek Sachdeva [via ElasticSearch Users]
 [hidden email] http://user/SendEmail.jtp?type=nodenode=4061750i=0
 wrote:

 The default cluster name is elasticsearch. Changing it in your code
 works


 On Tue, Aug 12, 2014 at 9:33 PM, Vivek Sachdeva [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4061741i=0 wrote:

 Your code works if you dont add cluster name to it. Tried with Java
 this time.. :)


 On Tue, Aug 12, 2014 at 7:47 PM, Kfeenz [hidden email]
 http://user/SendEmail.jtp?type=nodenode=4061741i=1 wrote:

 @Jorg,

 Thanks for the advice, I will make sure that I do so during actual
 implementation, but this is purely for testing the connection.. Also, I see
 a client.close() and a client.threadPool().shutdown(), but I do not see a
 client.threadPool().close(). I am using ES v1.3.1.

 @ Vivek,

 I am not sure how you were able to use 'localhost' vise localhost.
 Java complains about an invalid character constant because 'localhost' is
 not a character but a String...

 My current code is as follows... with still no luck...

 Settings settings = ImmutableSettings.settingsBuilder().put(
 cluster.name, mycluster).build();

 Client client = new TransportClient(settings).addTransportAddress(new
 InetSocketTransportAddress(localhost, 9300));

 ClusterStatsRequestBuilder builder =
 client.admin().cluster().prepareClusterStats();

 ClusterStatsResponse response = builder.execute().actionGet(); // fails
 on execute... NoNodeAvailableException

 assertEquals(mycluster, response.getClusterName()); // never gets to
 this point

 NoNodeAvailableException: None of the configured nodes are available []

 If I add a setting to the settings object

 .put(client.transport.sniff, true);

 I get a different error - [org.elasticsearch.client.transport] [Argus]
 failed to get local cluster state info for [#transport#-1]...

 I can query the cluster using 
 *http://localhost:9200/_cluster/health?pretty=true
 http://localhost:9200/_cluster/health?pretty=true* which returns

 {
   cluster_name : mycluster,
   status : green,
   timed_out : false,
   number_of_nodes : 1,
   number_of_data_nodes : 1,
   active_primary_shards : 0,
   active_shards : 0,
   relocating_shards : 0,
   initializing_shards : 0,
   unassigned_shards : 0
 }

 I am on Windows 7 64-bit.
 I am using Java 1.7_u55.
 I am using ES version 1.3.1.
 I have included in my pom.xml:
   - elasticsearch-1.3.1.jar
   - lucene-core-4.9.0.jar

 Any other suggestions are greatly appreciated.



 On Tuesday, August 12, 2014 5:45:16 AM UTC-4, Vivek Sachdeva wrote:

 Replace

 .setTransportAddress(new InetSocketTransportAddress(localhost,
 9300));

 with

 .addTransportAddress(new InetSocketTransportAddress('localhost',
 9300)).

 And I guess if you dont give cluster name, it automatically joins the
 default cluster.

 I tried the code that you provided and changed above mentioned code.
 It works on my end. Can you try it?

 On Monday, August 11, 2014 11:34:43 PM UTC+5:30, Kfeenz wrote:

 So I am very new to elasticsearch... so I apologize in advance..

 I started a local instance of elasticsearch and I am trying to
 connect to it through the Java API.

 I was under the impression that the transport client was for remote
 clients?

 I tried:

 @Test
 public void testIndexResponse() {

   Client client = new TransportClient().setTransportAddress(new
 InetSocketTransportAddress(localhost, 9300));

   String json = { +
 \user\:\kimchy\, +





 \postDate\:\2013-01-30\, +
 \message\:\trying out Elasticsearch\ +





 };

   IndexResponse response = client.prepareIndex(twitter, tweet)





 .setSource(json)
 .execute()
 .actionGet();


   client.close();

   System.out.println(response.getIndex());
 }

 I receive org.elasticsearch.client.transport.NoNodeAvailableException:
 None of the configured nodes are available: [].



 On Monday, August 11, 2014 1:19:06 PM UTC-4, Vivek Sachdeva wrote:

 Have you tried using transport client for connecting...

 On Monday, August 11, 2014 10:26:29 PM UTC+5:30, Kfeenz wrote:

Re: About elasticsearch.org

2014-08-13 Thread joergpra...@gmail.com
Seems that some did not catch the news about the millions of new gTLDs.

Jörg


On Wed, Aug 13, 2014 at 12:03 PM, James Green james.mk.gr...@gmail.com
wrote:

 The BGP routing table burst past 512k entries yesterday, I'm told. Any
 old routers with that as a limit would have been affected limiting
 traffic apparently at random. A few ISPs were caught out.


 On 12 August 2014 18:21, Jack Park jackp...@topicquests.org wrote:

 I just hear second hand that the outage is pretty large.

 On Tue, Aug 12, 2014 at 10:14 AM, Jack Park jackp...@topicquests.org
 wrote:
  It appears that liquidweb (their host) has some problems.  I and a
  friend in Canada can open it on my cell phone, but nobody I know
  around here can raise it on some other networks.
 
  On Tue, Aug 12, 2014 at 9:58 AM, Antonio Augusto Santos
  mkha...@gmail.com wrote:
  Working fine for me (in Brazil).
 
 
  On Tuesday, August 12, 2014 1:57:48 PM UTC-3, Jack Park wrote:
 
  Just curious:
  on all browsers here in silicon valley, I cannot raise any
  elasticsearch.org
 
  Is it just me (or comcast?)
  Other websites appear fine.
 




Re: Rescore Sorting

2014-08-13 Thread Shawn Ritchie
OK,

So I managed to fix this issue, but I'm not entirely sure if I found a bug 
or if the functionality is intended to work like this.

Basically, if I remove the sort the query works as intended; if I use any 
kind of sort, the rescore query is totally ignored and sorting is done on 
the original query score.





Empty input for _suggest

2014-08-13 Thread vineeth mohan
Hi ,

When using _suggest, I want to handle an empty request.
That is, for empty text, the response should be the top N suggestions.

Is this possible?

Thanks
   Vineeth



Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread joergpra...@gmail.com
Try this to search notice or info severity.

my $results = $e->search(
    index => 'logstash-2014.08.13',
    body  => {
        query => {
            bool => {
                should => [
                    { match => { severity => 'notice' } },
                    { match => { severity => 'info' } }
                ]
            }
        }
    }
);


Jörg


On Wed, Aug 13, 2014 at 12:01 PM, Log Muncher railroaderslam...@gmail.com
wrote:

 Hi,

 Simple question, but there seems to be a lack of detailed examples for
 using the otherwise very useful Search::Elasticsearch CPAN module !

 I'm getting syslog data into elasticsearch via fluentd.

 What I'd like to do now is run a perl search that will give me results for
 notice, emerg and crit events.  As a test (seeing as I don't get many
 emerg/crit events !), I've tried the  below, but it only seems to pick up
 notice events and doesn't return any info events !

 Help welcome !

 Thanks.

 Tim

 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch-new();

 my $results = $e-search(
index = 'logstash-2014.08.13',
body  = {
query = {
 bool = {
 must = {match = { severity = 'notice'},match
 = { severity = 'info'}}
 }
 }
}
 );

 print Dumper($results);







Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Log Muncher
Well, the Perl module certainly doesn't complain about the syntax, but it 
still doesn't manage to output anything other than the notice severity?

$ perl test.pl  | fgrep severity
'severity' => 'notice'
'severity' => 'notice',
'severity' => 'notice',
'severity' => 'notice',
'severity' => 'notice',
'severity' => 'notice',
'severity' => 'notice'
'severity' => 'notice',
'severity' => 'notice',
'severity' => 'notice',


$ cat test.pl 
#!/usr/bin/perl

use 5.014;
use strict;
use warnings;
use autodie;

use Data::Dumper;
use Search::Elasticsearch;

my $e = Search::Elasticsearch->new();

my $results = $e->search(
    index => 'logstash-2014.08.13',
    body  => {
        query => {
            #match => { severity => 'notice'}
            bool => {
                should => [
                    { match => { severity => 'notice' } },
                    { match => { severity => 'info' } }
                ]
            }
        }
    }
);

print Dumper($results);
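
(One guess, not an answer from the thread: search only returns the top 10
hits by default, so if notice events vastly outnumber info events the first
page can easily be all notice. Raising size, or filtering instead of scoring,
makes this easier to check; a sketch of an alternative body, with the size
value made up:)

my $results = $e->search(
    index => 'logstash-2014.08.13',
    body  => {
        size  => 50,
        query => {
            constant_score => {
                filter => { terms => { severity => [ 'notice', 'info' ] } }
            }
        }
    }
);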







On Wednesday, 13 August 2014 11:40:42 UTC+1, Jörg Prante wrote:

 Try this to search notice or info severity.

 my $results = $e-search(
index = 'logstash-2014.08.13',
body  = {
query = {
 bool = {
 should =  [
 { match = { severity = 'notice'} },
 { match = { severity = 'info'} }
 ]
 }
 }
}
 );


 Jörg


 On Wed, Aug 13, 2014 at 12:01 PM, Log Muncher railroad...@gmail.com 
 javascript: wrote:

 Hi,

 Simple question, but there seems to be a lack of detailed examples for 
 using the otherwise very useful Search::Elasticsearch CPAN module !

 I'm getting syslog data into elasticsearch via fluentd.

 What I'd like to do now is run a perl search that will give me results 
 for notice, emerg and crit events.  As a test (seeing as I don't get many 
 emerg/crit events !), I've tried the  below, but it only seems to pick up 
 notice events and doesn't return any info events !

 Help welcome !

 Thanks.

 Tim

 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch->new();

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             bool => {
                 must => { match => { severity => 'notice' }, match => { severity => 'info' } }
             }
         }
     }
 );

 print Dumper($results);

  



Re: Curiosity : elasticsearch frontend

2014-08-13 Thread joergpra...@gmail.com
Very helpful! So awesome you released it to the public with open source
license. Thank you so much

Jörg


On Wed, Aug 13, 2014 at 10:42 AM, Arthur BABEY arthur.babey...@gmail.com
wrote:

 Hi everyone,

 I'm currently a trainee at Pages Jaunes. During my internship I worked on an
 application to make data accessible to non-technical staff. That is how
 Curiosity was born.

 The goal of this tool is to provide a simple way to build complex queries
 and aggregations, save those queries, and share them with other users.
 Curiosity comes with a template system that you can use to personalize
 your results list and aggregations (there are default templates, such as a
 pie chart).

 The tool is stable and powerful, but not yet very user-friendly. I worked hard
 to provide good documentation, but my English is weak and explaining things is
 not my strong point. Development, on the other hand, is. So if you have some
 time to try Curiosity, I will be very happy if you give me some feedback.

 I am open to criticism, and maybe if you liked the tool and want new
 features i will be proud to develop them.

 If you have any questions about functionality or code, ask me, i will try
 to give you the best answer.

 You can find Curiosity on Github
 https://github.com/pagesjaunes/curiosity (it's fully open source) and
 the documentation is here http://pagesjaunes.github.io/curiosity/.

 Hope your eyes didn't bleed too much.
 Have a good day or night.

 Arthur



A few questions about node types + usage

2014-08-13 Thread Alex
Hello I would like some clarification about node types and their usage. 

We will have 3 client nodes and 6 data nodes. The 6 1TB data nodes can also 
be masters (discovery.zen.minimum_master_nodes set to 4). We will use 
Logstash and Kibana. Kibana will be used 24/7 by between a couple and a handful 
of people.

Some questions:

   1. Should incoming Logstash write requests be sent to the cluster in 
   general (using the *cluster* setting in the *elasticsearch* output) or 
   specifically to the client nodes or to the data nodes (via load balancer)? 
   I am unsure what kind of node is best for handling writes.
   
   2. If client nodes exist in the cluster are Kibana requests 
   automatically routed to them? Do I need to somehow specify to Kibana which 
   nodes to contact?
   
   3. I have heard different information about master nodes and the 
   minimum_master_nodes setting. I've heard that you should have an odd number 
   of master-eligible nodes, but I fail to see why the parity of the number of 
   masters matters as long as minimum_master_nodes is set to at least N/2 + 1 
   (see the sketch after this list). Does it really need to be odd?
   
   4. I have been advised that the client nodes will use a huge amount of 
   memory (which makes sense due to the nature of the Kibana facet queries). 
   64GB per client node was recommended but I have no idea if that sounds 
   right or not. I don't have the ability to actually test it right now so any 
   more guidance on that would be helpful.
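
For what it's worth, a minimal sketch of the quorum arithmetic from point 3, assuming the 6 master-eligible data nodes described above (the comment just spells out the N/2 + 1 rule):

# elasticsearch.yml on the master-eligible nodes
# 6 master-eligible nodes: floor(6 / 2) + 1 = 4
discovery.zen.minimum_master_nodes: 4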

I'd be so grateful to hear from you even if you only know something about 
one of my queries.

Thank you for your time,
Alex



Re: CSV River showing unexpected results in a 3 node cluster

2014-08-13 Thread Sree
Hi David,

Thank you very much. It worked.

On Tuesday, 12 August 2014 18:39:28 UTC+5:30, David Pilato wrote:

 I think you could set in elasticsearch.yml:

 node.river: _none_ 

 On nodes you don't want to allocate any river.



 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr


 On 12 August 2014 at 15:01:27, Sree (srssr...@gmail.com) wrote:

 Hi all, 

 I have a 3 node cluster. One is the master and the others are eligible to become 
 master if the master node fails. I installed the CSV River plugin on the master 
 node. The CSV files that need to be processed are also on the master node. When I 
 run the CSV river from the master, it tries to execute the indexing on the other 
 nodes and fails to find the CSV files, because the CSV files are on the master.

 Any pointers on this? Is this expected behaviour?

 Thank you,

 Srijith


Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Log Muncher
Aahh.. newbie mistake !  I didn't realise the results were limited by 
default.   ;-)

Thanks !
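
For reference, the same query with an explicit size (10 is the default; the 100 below is just an illustration). In curl form it would look like this; with Search::Elasticsearch the size key goes into the same body hash:

curl -XGET 'http://localhost:9200/logstash-2014.08.13/_search' -d '{
  "size": 100,
  "query": {
    "bool": {
      "should": [
        { "match": { "severity": "notice" } },
        { "match": { "severity": "info" } }
      ]
    }
  }
}'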

On Wednesday, 13 August 2014 12:09:43 UTC+1, Jörg Prante wrote:

 A reason may be that your result set size is too small for containing both 
 severity values. You could either try a larger result set size, or boost 
 the info clause so you get docs with info before notice.

 Jörg


 On Wed, Aug 13, 2014 at 12:51 PM, Log Muncher railroad...@gmail.com wrote:

 Well, the Perl module certainly doesn't complain about the syntax, but 
 it still doesn't manage to output anything other than the notice severity?

 $ perl test.pl  | fgrep severity
 'severity' => 'notice'
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice'
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',


 $ cat test.pl 
 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch->new();

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             # match => { severity => 'notice' }
             bool => {
                 should => [
                     { match => { severity => 'notice' } },
                     { match => { severity => 'info' } }
                 ]
             }
         }
     }
 );

 print Dumper($results);







 On Wednesday, 13 August 2014 11:40:42 UTC+1, Jörg Prante wrote:

 Try this to search notice or info severity.

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             bool => {
                 should => [
                     { match => { severity => 'notice' } },
                     { match => { severity => 'info' } }
                 ]
             }
         }
     }
 );


 Jörg


 On Wed, Aug 13, 2014 at 12:01 PM, Log Muncher railroad...@gmail.com 
 wrote:

 Hi,

 Simple question, but there seems to be a lack of detailed examples for 
 using the otherwise very useful Search::Elasticsearch CPAN module !

 I'm getting syslog data into elasticsearch via fluentd.

 What I'd like to do now is run a perl search that will give me results 
 for notice, emerg and crit events.  As a test (seeing as I don't get many 
 emerg/crit events !), I've tried the  below, but it only seems to pick up 
 notice events and doesn't return any info events !

 Help welcome !

 Thanks.

 Tim

 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch->new();

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             bool => {
                 must => { match => { severity => 'notice' }, match => { severity => 'info' } }
             }
         }
     }
 );

 print Dumper($results);

  



Re: Curiosity : elasticsearch frontend

2014-08-13 Thread Arthur BABEY
I'm very happy that you found Curiosity useful. 
We released Curiosity as an open source project because we think that is the 
best way for a project to thrive, and for the community to help us imagine 
new cool features.

Arthur



Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Log Muncher
Would this be the correct syntax ?

{match => { severity => { query => 'info', boost => 20 }}}


Even with the aggressive boost, I'm still getting notice as the 
prioritised results?
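
For reference, a sketch of where such a boost clause would sit in the full query (curl form; the size and boost values are just illustrative). Note that boosting only reorders documents that actually match: if the index holds far more notice than info events, a small result window can still be filled entirely with notice hits.

curl -XGET 'http://localhost:9200/logstash-2014.08.13/_search' -d '{
  "size": 50,
  "query": {
    "bool": {
      "should": [
        { "match": { "severity": "notice" } },
        { "match": { "severity": { "query": "info", "boost": 20 } } }
      ]
    }
  }
}'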




On Wednesday, 13 August 2014 12:09:43 UTC+1, Jörg Prante wrote:

 A reason may be that your result set size is too small for containing both 
 severity values. You could either try a larger result set size, or boost 
 the info clause so you get docs with info before notice.

 Jörg


 On Wed, Aug 13, 2014 at 12:51 PM, Log Muncher railroad...@gmail.com wrote:

 Well, the Perl module certainly doesn't complain about the syntax, but 
 it still doesn't manage to output anything other than the notice severity?

 $ perl test.pl  | fgrep severity
 'severity' => 'notice'
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice'
 'severity' => 'notice',
 'severity' => 'notice',
 'severity' => 'notice',


 $ cat test.pl 
 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch->new();

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             # match => { severity => 'notice' }
             bool => {
                 should => [
                     { match => { severity => 'notice' } },
                     { match => { severity => 'info' } }
                 ]
             }
         }
     }
 );

 print Dumper($results);







 On Wednesday, 13 August 2014 11:40:42 UTC+1, Jörg Prante wrote:

 Try this to search notice or info severity.

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             bool => {
                 should => [
                     { match => { severity => 'notice' } },
                     { match => { severity => 'info' } }
                 ]
             }
         }
     }
 );


 Jörg


 On Wed, Aug 13, 2014 at 12:01 PM, Log Muncher railroad...@gmail.com 
 wrote:

 Hi,

 Simple question, but there seems to be a lack of detailed examples for 
 using the otherwise very useful Search::Elasticsearch CPAN module !

 I'm getting syslog data into elasticsearch via fluentd.

 What I'd like to do now is run a perl search that will give me results 
 for notice, emerg and crit events.  As a test (seeing as I don't get many 
 emerg/crit events !), I've tried the  below, but it only seems to pick up 
 notice events and doesn't return any info events !

 Help welcome !

 Thanks.

 Tim

 #!/usr/bin/perl

 use 5.014;
 use strict;
 use warnings;
 use autodie;

 use Data::Dumper;
 use Search::Elasticsearch;

 my $e = Search::Elasticsearch->new();

 my $results = $e->search(
     index => 'logstash-2014.08.13',
     body  => {
         query => {
             bool => {
                 must => { match => { severity => 'notice' }, match => { severity => 'info' } }
             }
         }
     }
 );

 print Dumper($results);

  



Re: Curiosity : elasticsearch frontend

2014-08-13 Thread Alfredo Serafini
Great! it looks very promising! I'll try it as soon as I can, thanks :-)

Il giorno mercoledì 13 agosto 2014 10:42:48 UTC+2, Arthur BABEY ha scritto:

 Hi everyone, 

 I'm currently a trainee at Pages Jaunes. During my internship I worked on an 
 application to make data accessible to non-technical staff. That is how 
 Curiosity was born. 

 The goal of this tool is to provide a simple way to build complex queries 
 and aggregations, save those queries, and share them with other users. 
 Curiosity comes with a template system that you can use to personalize 
 your results list and aggregations (there are default templates, such as a 
 pie chart).

 The tool is stable and powerful, but not yet very user-friendly. I worked hard 
 to provide good documentation, but my English is weak and explaining things is 
 not my strong point. Development, on the other hand, is. So if you have some 
 time to try Curiosity, I will be very happy if you give me some feedback. 

 I am open to criticism, and maybe if you liked the tool and want new 
 features i will be proud to develop them. 

 If you have any questions about functionality or code, ask me, i will try 
 to give you the best answer.

 You can find Curiosity on Github 
 https://github.com/pagesjaunes/curiosity (it's fully open source) and 
 the documentation is here http://pagesjaunes.github.io/curiosity/.

 Hope your eyes didn't bleed too much. 
 Have a good day or night.

 Arthur




Feature request? Ignore allow_explicit_index when accessing root /_bulk URL

2014-08-13 Thread Иван Кадочников
Hello,

When url-based access control is used for bulk requests

rest.action.multi.allow_explicit_index: false

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/url-access-control.html
It forbids explicitly setting the index in the request body regardless of 
the bulk url used.

Would it be possible to allow setting explicit indexes when the URL 
accessed is the /_bulk root, with no index specified in the URL?

This way if a user is allowed to access /_bulk, he can work as if 
allow_explicit_index is false, while if a user is only allowed to access 
specific {index}/_bulk urls, he is effectively contained. 

With the current rules, the only way to allow bulk access with explicit 
index to one user is to set allow_explicit_index to true and thus allow 
full access to everybody with bulk access.

Maybe this feature is not that high-priority, I see that access control in 
general does not seem to be the focus of elasticsearch. But if this is an 
easy change, would this work?
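
To make the current behaviour concrete, a sketch (the index name, type and documents are made up for illustration): with the setting below, the index may only come from the URL, so the per-index form is the one a restricted user would be limited to.

# elasticsearch.yml
rest.action.multi.allow_explicit_index: false

# allowed: the index comes from the URL, the action lines omit "_index"
curl -XPOST 'http://localhost:9200/logs-2014.08.13/_bulk' -d '
{ "index": { "_type": "event", "_id": "1" } }
{ "message": "example document" }
'

# rejected: explicit "_index" in the request body
curl -XPOST 'http://localhost:9200/_bulk' -d '
{ "index": { "_index": "logs-2014.08.13", "_type": "event", "_id": "2" } }
{ "message": "example document" }
'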

Thanks,
Ivan



Query problem

2014-08-13 Thread Luc Evers
 
  I would like to use Elasticsearch as a NoSQL database + search engine for data 
coming from text files (router configs) and databases.
  First I converted a router config to a JSON file, which I indexed.

  Mapping:

{
  "configs" : {
    "mappings" : {
      "test" : {
        "properties" : {
          "ConfLength" : {
            "type" : "string"
          },
          "NVRAM" : {
            "type" : "string"
          },
          "aaa" : {
            "type" : "string"
          },
          "enable" : {
            "type" : "string"
          },
          "hostname" : {
            "type" : "string"
          },
          "lastChange" : {
            "type" : "string"
          },
          "logging" : {
            "type" : "string"
          },
          "model" : {
            "type" : "string"
          },
          "policy-map" : {
            "type" : "string"
          }
        }
      }
    }
  }
}


Document:

{
  "_index" : "configs",
  "_type" : "test",
  "_id" : "7",
  "_score" : 1,
  "_source" : {
    "hostname" : [
      "hostname test-1234"
    ]
  }
},


Example of a simple search:  search a hostname.

If I start a query:

curl -XGET 'http://127.0.0.1:9200/configs/_search?q=hostname test-1234'
curl: (52) Empty reply from server

No response.

If I start a second query without "hostname" I get an answer:

curl -XGET 'http://127.0.0.1:9200/configs/_search?q=test-1234;'
OKE

Analyser: standard

Why can a search find "test-1234" but not "hostname test-1234"?
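
One thing worth checking, independent of the analysis question: the space in that query string is not URL-encoded, which alone can produce the empty reply. A sketch of the same search with the phrase quoted and URL-encoded and scoped to the hostname field, plus the equivalent request-body form:

curl -XGET 'http://127.0.0.1:9200/configs/_search?q=hostname:%22hostname%20test-1234%22'

curl -XGET 'http://127.0.0.1:9200/configs/_search' -d '{
  "query": { "match": { "hostname": "hostname test-1234" } }
}'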















Re: Query problem

2014-08-13 Thread David Pilato
Having no answer is not good. I think something goes wrong here. May be you 
should see something in logs.

That said, if you don't want to break your string as tokens at index time, you 
could set index:not_analyzed for fields you don't want to analyze.

But, you should read this part of the book: 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/analysis-intro.html#analysis-intro
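
A minimal sketch of that mapping change for the hostname field (existing field mappings can't be changed in place, so this assumes the index is recreated and the data reindexed):

curl -XPUT 'http://127.0.0.1:9200/configs' -d '{
  "mappings": {
    "test": {
      "properties": {
        "hostname": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'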

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 13 août 2014 à 14:39:20, Luc Evers (lucev...@gmail.com) a écrit:

     
  I would like to use Elasticsearch as a NoSQL database + search engine for data coming 
from text files (router configs) and databases.
  First I converted a router config to a JSON file, which I indexed.

  Mapping:

{
  "configs" : {
    "mappings" : {
      "test" : {
        "properties" : {
          "ConfLength" : {
            "type" : "string"
          },
          "NVRAM" : {
            "type" : "string"
          },
          "aaa" : {
            "type" : "string"
          },
          "enable" : {
            "type" : "string"
          },
          "hostname" : {
            "type" : "string"
          },
          "lastChange" : {
            "type" : "string"
          },
          "logging" : {
            "type" : "string"
          },
          "model" : {
            "type" : "string"
          },
          "policy-map" : {
            "type" : "string"
          }
        }
      }
    }
  }
}


Document:

{
  "_index" : "configs",
  "_type" : "test",
  "_id" : "7",
  "_score" : 1,
  "_source" : {
    "hostname" : [
      "hostname test-1234"
    ]
  }
},


Example of a simple search:  search a hostname.

If I start a query:   

curl -XGET 'http://127.0.0.1:9200/configs/_search?q=hostname test-1234'
curl: (52) Empty reply from server

No response.

If I start a second query without "hostname" I get an answer:

curl -XGET 'http://127.0.0.1:9200/configs/_search?q=test-1234;'
OKE

Analyser: standard

Why can a search find "test-1234" but not "hostname test-1234"? 















Re: Feature request? Ignore allow_explicit_index when accessing root /_bulk URL

2014-08-13 Thread Mark Walkom
That'd be worth entering in here -
https://github.com/elasticsearch/elasticsearch/issues :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 13 August 2014 22:37, Иван Кадочников fizmat@gmail.com wrote:

 Hello,

 When url-based access control is used for bulk requests

 rest.action.multi.allow_explicit_index: false


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/url-access-control.html
 It forbids explicitly setting the index in the request body regardless of
 the bulk url used.

 Would it be possible to allow setting explicit indexes when the URL
 accessed is the /_bulk root, with no index specified in the URL?

 This way if a user is allowed to access /_bulk, he can work as if
 allow_explicit_index is false, while if a user is only allowed to access
 specific {index}/_bulk urls, he is effectively contained.

 With the current rules, the only way to allow bulk access with explicit
 index to one user is to set allow_explicit_index to true and thus allow
 full access to everybody with bulk access.

 Maybe this feature is not that high-priority, I see that access control in
 general does not seem to be the focus of elasticsearch. But if this is an
 easy change, would this work?

 Thanks,
 Ivan



Re: Feature request? Ignore allow_explicit_index when accessing root /_bulk URL

2014-08-13 Thread Ivan Kadochnikov

Ok, done.
I was not sure if I should go right to github or ask here first =)

On 08/13/2014 04:46 PM, Mark Walkom wrote:
That'd be worth entering in here - 
https://github.com/elasticsearch/elasticsearch/issues :)


Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com mailto:ma...@campaignmonitor.com
web: www.campaignmonitor.com http://www.campaignmonitor.com


On 13 August 2014 22:37, Иван Кадочников fizmat@gmail.com 
mailto:fizmat@gmail.com wrote:


Hello,

When url-based access control is used for bulk requests

rest.action.multi.allow_explicit_index: false


http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/url-access-control.html
It forbids explicitly setting the index in the request body
regardless of the bulk url used.

Would it be possible to allow setting explicit indexes when the
URL accessed is the /_bulk root, with no index specified in the URL?

This way if a user is allowed to access /_bulk, he can work as if
allow_explicit_index is false, while if a user is only allowed to
access specific {index}/_bulk urls, he is effectively contained.

With the current rules, the only way to allow bulk access with
explicit index to one user is to set allow_explicit_index to true
and thus allow full access to everybody with bulk access.

Maybe this feature is not that high-priority, I see that access
control in general does not seem to be the focus of elasticsearch.
But if this is an easy change, would this work?

Thanks,
Ivan


Moving Index/Shards from One Node to Another

2014-08-13 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi,

Let's say I have a 3 node cluster and I deploy one index SPECIFIC to every 
data node in the cluster. 
So, index1 goes to node1, index2 goes to node2, etc. using the 
routing.allocation settings based on node.zone etc. config properties. 
There may be 5-6 shards per index, but no replicas. All three indices, 
index 1/2/3 will have the same mapping schemas.
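
For context, a sketch of the kind of allocation filtering being described, assuming a node attribute called zone (the names are illustrative):

# elasticsearch.yml on the node that should hold index1
node.zone: zone1

# pin index1 to that zone
curl -XPUT 'http://localhost:9200/index1/_settings' -d '{
  "index.routing.allocation.include.zone": "zone1"
}'

With that in place, relaxing or changing the include filter is usually how shards get moved onto other nodes, rather than copying data directories by hand.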

Now, if the following scenarios occur:

1.] One node, node2, goes down:
How can I get the node 2's index, index2, live on the other two data nodes?
Can I just copy the data directory to the other nodes? Since there is no 
mapping like index2 defined on those nodes, will I have to first create the 
mapping there? 
Can I move half the shards to each remaining node?

2.] Assume one more node is now added to this cluster:
Can I copy the mapping schema to the new node and selectively copy 1-2 
shards each from the existing 3 data nodes so that I can rebalance the 
cluster 3-4 shards per index per node?

I am not sure if there is this level of control and how it is exposed. 
Please let me know. Thanks,

Thanks,
Sandeep



Mysterious sudden load increase

2014-08-13 Thread Martin Forssen
Hello,

We are running an es-cluster with 13 nodes, 10 data and 3 master, on Amazon 
hi1.4xlarge machines. The cluster contains almost 10T of data (including 
one replica). It is running Elasticsearch 1.1.1 on Oracle java  1.7.0_25.

Our problem is that every now and then the cpu load suddenly increases on 
one of the data nodes. The load average can suddenly jump from about 4 up 
to 10-16, and once it has jumped up it stays there. Then after a couple of 
days another node is also affected and so on. Eventually most nodes in the 
cluster are affected and we have to restart them. A restart of the Java 
process brings the load back to normal.

We are not experiencing any abnormal levels of garbage collection on the 
affected nodes.

I did a Java stack dump on one of the affected nodes and one thing which 
stood out was that it had a number of threads with state IN_JAVA; the 
non-loaded nodes had no such threads. The stack dump for these threads 
invariably looks something like this:

Thread 23022: (state = IN_JAVA)
 - java.util.HashMap.getEntry(java.lang.Object) @bci=72, line=446 (Compiled 
frame; information may be imprecise)
 - java.util.HashMap.get(java.lang.Object) @bci=11, line=405 (Compiled 
frame)
 - 
org.elasticsearch.search.scan.ScanContext$ScanFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext,
 
org.apache.lucene.util.Bits) @bci=8, line=156 (Compiled frame)
 - 
org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext,
 
org.apache.lucene.util.Bits) @bci=6, line=45 (Compiled frame)
 - 
org.apache.lucene.search.FilteredQuery$1.scorer(org.apache.lucene.index.AtomicReaderContext,
 
boolean, boolean, org.apache.lucene.util.Bits) @bci=34, line=130 (Compiled 
frame)
 - org.apache.lucene.search.IndexSearcher.search(java.util.List, 
org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) 
@bci=68, line=618 (Compiled frame)
 - 
org.elasticsearch.search.internal.ContextIndexSearcher.search(java.util.List, 
org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) 
@bci=225, line=173 (Compiled frame)
 - 
org.apache.lucene.search.IndexSearcher.search(org.apache.lucene.search.Query, 
org.apache.lucene.search.Collector) @bci=11, line=309 (Interpreted frame)
 - 
org.elasticsearch.search.scan.ScanContext.execute(org.elasticsearch.search.internal.SearchContext)
 
@bci=54, line=52 (Interpreted frame)
 - 
org.elasticsearch.search.query.QueryPhase.execute(org.elasticsearch.search.internal.SearchContext)
 
@bci=174, line=119 (Compiled frame)
 - 
org.elasticsearch.search.SearchService.executeScan(org.elasticsearch.search.internal.InternalScrollSearchRequest)
 
@bci=49, line=233 (Interpreted frame)
 - 
org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.search.internal.InternalScrollSearchRequest,
 
org.elasticsearch.transport.TransportChannel) @bci=8, line=791 (Interpreted 
frame)
 - 
org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.transport.TransportRequest,
 
org.elasticsearch.transport.TransportChannel) @bci=6, line=780 (Interpreted 
frame)
 - 
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run() 
@bci=12, line=270 (Compiled frame)
 - 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
 
@bci=95, line=1145 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 
(Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)

Does anybody know what we are experiencing, or have any tips on how to 
further debug this?

/MaF



Can plugin be written for TCP transport?

2014-08-13 Thread John Smith
Hi, I have been looking at the various transport plugins. Correct me if I am 
wrong, but those are for the HTTP REST interface... Can plugins be written for 
the node transport?

Basically this leads to securing ES. My ES is definitely not public and I know 
I can use reverse proxies or one of the HTTP plugins... But what about 
clients/programs connecting directly as nodes?

Basically I need user auth and some form of ACL. SSL is secondary. I also need 
to be able to audit user access. I am dealing with credit card data, so I need 
to know 100% who is accessing the data.

So...
What are some good steps to secure my ES cluster!?



Re: How to Not index a field

2014-08-13 Thread Nikolas Everett
I'm not sure it's the right way to do it, but if you set dynamic to false and
then just send the field, it'll be stored but not indexed.
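
An alternative sketch, assuming a 1.x string field and made-up index/type names, is to say it explicitly in the mapping; the field then stays in _source (so it can be returned) but is not searchable:

curl -XPUT 'http://localhost:9200/myindex/_mapping/mytype' -d '{
  "properties": {
    "field2": { "type": "string", "index": "no" }
  }
}'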


On Wed, Aug 13, 2014 at 9:35 AM, Sam2014 sabdall...@gmail.com wrote:

 Is it possible in ElasticSearch?

 Assume I have a doc { field1:value1, field2:value2 ...}

 One of the fields field2 I would like not to index, basically just store
 its content in ElasticSearch and retrieve it when need it.



Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Clinton Gormley
Hiya

 Simple question, but there seems to be a lack of detailed examples for 
using the otherwise very useful Search::Elasticsearch CPAN module !

The idea was that the API of the module maps very closely to all of the 
REST APIs in Elasticsearch, so that anything that works with raw curl 
statements should be easy to translate into requests with Search::ES.

Btw, you can always see the equivalent curl statement output to STDERR with 
the following:

$e = Search::Elasticsearch->new( trace_to => 'Stderr' )
 

Would this be the correct syntax ?

 {match => { severity => { query => 'info', boost => 20 }}}


 Even with the agressive boost, I'm still getting notice as the 
 prioritised results ?


That is the correct syntax. Perhaps try just searching for "info" to see 
if you actually have matching results?






Re: Can plugin be written for TCP transport?

2014-08-13 Thread joergpra...@gmail.com
You can write a Java app to authorize access with JAAS and use a SOCKS
proxy to connect to an ES cluster in a private subnet. That is all a matter
of network configuration, there is nothing that requires the effort of an
extra ES plugin.

Jörg


On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.dev@gmail.com wrote:

 Hi I have been looking at the various transport plugins. Correct me if I
 am wrong but those are for the http rest interface... Can plugins be
 written for the node transport?

 Bassically this leads to securing ES. My ES is definitely not public and I
 know i can use reverse proxies or one of the http plugins... But what about
 client/programs connecting directly as nodes?

 Bassically I need user auth and some form of acl. SSL is secondary. Also
 need to be able to audit the user access. Dealing with credit card data. So
 I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?



Access to AbstractAggregationBuilder.name

2014-08-13 Thread Phil Wills
Hello,

In the Java API AbstractAggregationBuilder's name property is protected. Is 
there a particular reason it can't be public, or have an accessor added, or 
is this something you'd consider a PR for?

Not having access is making things more complicated than I'd like.

Thanks,

Phil



Re: How to Not index a field

2014-08-13 Thread Sam2014
Is there a link/example of how to do this somewhere?

On Wednesday, August 13, 2014 9:40:16 AM UTC-4, Nikolas Everett wrote:

 I'm not sure the right way to do it but if you set dynamic to false and 
 then just send the field it'll be stored but not indexed.


  On Wed, Aug 13, 2014 at 9:35 AM, Sam2014 sabda...@gmail.com wrote:

 Is it possible in ElasticSearch?

 Assume I have a doc { field1:value1, field2:value2 ...}

 One of the fields field2 I would like not to index, basically just 
 store its content in ElasticSearch and retrieve it when need it. 



Debug message of Elasticsearch when running

2014-08-13 Thread Yuheng Du
Hi guys,

I am running a single node Elasticsearch instance, connected to two 
rabbitmq queues as input using logstash.

I kept getting debug messages in ES terminal:

[2014-08-13 10:26:13,867][DEBUG][action.search.type   ] [Jebediah 
Guthrie] [vehicles][0], node[9BIZ4GpLQYu8933RiTs3DQ], [P], s[STARTED]: 
Failed to execute [org.elasticsearch.action.search.SearchRequest@46b3261b] 
lastShard [true]
org.elasticsearch.search.SearchParseException: [vehicles][0]: 
from[-1],size[-1]: Parse Failure [Failed to parse source 
[{facets:{0:{date_histogram:{field:@timestamp,interval:12h},global:true,facet_filter:{fquery:{query:{filtered:{query:{query_string:{query:BOT-GW}},filter:{bool:{must:[{range:{@timestamp:{from:1405347973579,to:1407939973579}}}],size:20,query:{filtered:{query:{query_string:{query:*}},filter:{bool:{must:[{range:{@timestamp:{from:1405347973579,to:1407939973579}}}],sort:[{_score:{order:desc,ignore_unmapped:true}},{@timestamp:{order:desc,ignore_unmapped:true}}]}]]


Can I know why and is it normal?

I restarted the Kibana interface and found out that some of the indexes 
were getting deleted. Like logstash-2014-08.07 to logstash-2014-08.12.
Can I know why some indexes are getting deleted?

Thanks.

best,

Yuheng



Re: Can plugin be written for TCP transport?

2014-08-13 Thread John Smith
That's what I was thinking...

1- I would like this Java app to use the node client, because I like the 
fact that there is no extra hop and there is automatic failover to the next node.
2- I figure it would be a firewall/SOCKS setting to only allow the Java app 
to connect to ES. But even then, anyone can create a node client on the 
same machine and pull data anonymously.

I know anyone can log in to a machine at any time and read the data regardless, 
and that's OK; the data is supposed to be read, but at least 
you know who read it and when. That's not the issue... Security is a best 
effort, but the issue is the audit process and how well you can check that 
all your eggs are there.

Even if I do exactly as you said, subnet plus SOCKS proxy, someone can 
still go to that machine, create their own node client and bypass the Java 
app with no direct trace. This will probably never happen, but all it takes 
is one angry employee.






On Wednesday, 13 August 2014 09:50:25 UTC-4, Jörg Prante wrote:

 You can write a Java app to authorize access with JAAS and use a SOCKS 
 proxy to connect to an ES cluster in a private subnet. That is all a matter 
 of network configuration, there is nothing that requires the effort of an 
 extra ES plugin.

 Jörg


  On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.d...@gmail.com wrote:

 Hi I have been looking at the various transport plugins. Correct me if I 
 am wrong but those are for the http rest interface... Can plugins be 
 written for the node transport?

 Bassically this leads to securing ES. My ES is definitely not public and 
 I know i can use reverse proxies or one of the http plugins... But what 
 about client/programs connecting directly as nodes?

 Bassically I need user auth and some form of acl. SSL is secondary. Also 
 need to be able to audit the user access. Dealing with credit card data. So 
 I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?



Return selected fields from aggregation?

2014-08-13 Thread project2501
The old facet DSL was very nice and easy to understand. I could declare 
only which fields I wanted returned.

how is this done with aggregations? The docs do not say.

I am only interested in the aggregation metrics not all the document 
results.

I tried setting size:0 but that DOES NOT EVEN WORK.
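
For what it's worth, on the 1.x line the usual way to get only the aggregation results is search_type=count, which skips the hits entirely; a top-level "size": 0 in the request body should behave the same way (it typically only fails when it ends up nested somewhere other than the top level). A sketch, with made-up index and field names:

curl -XGET 'http://localhost:9200/myindex/_search?search_type=count' -d '{
  "aggs": {
    "severities": { "terms": { "field": "severity" } }
  }
}'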

Any help appreciated.

Thank you,
D



Re: how to get char_filter to work?

2014-08-13 Thread IronMan2014
Ivan,

A follow-up question: as I mentioned earlier, storing HTML and applying a 
char_filter doesn't really work, especially with highlighted fields coming 
back with weird HTML display. 
So I am thinking of stripping the HTML before indexing, so there is no HTML in 
the index or the source, and adding an extra field like html_content which is 
meant to store the HTML version and not be indexed. 
Do you see any problems with my approach? One I can see is a bigger index size. 
What do you recommend as an ideal solution? I am still confused, as I 
thought this would be a common problem.
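
A sketch of one way to lay that out, with made-up index/type/analyzer names: one analyzed field that has the tags stripped at index time via the html_strip char_filter, and a separate html_content field that is kept in _source but not indexed. Note that the raw HTML still lives in _source either way, so most of the on-disk growth comes from that rather than from the inverted index.

curl -XPUT 'http://localhost:9200/docs' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "html_text": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"],
          "char_filter": ["html_strip"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "content":      { "type": "string", "analyzer": "html_text" },
        "html_content": { "type": "string", "index": "no" }
      }
    }
  }
}'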

On Friday, August 8, 2014 8:16:09 PM UTC-4, IronMan wrote:

 Thanks again. I wasn't expecting it to remove what's between the tags. I 
 believe I understand the behavior and maybe its the case where I was greedy 
 and expecting ElasticSearch to do it all.
 Here is a scenario that I was looking for: Assume I am looking to get an 
 excerpt of text (Extracted text from a document), Elastic Search query will 
 give me excerpt with html tags, but the tags are out of context, so I would 
 have liked to be to display this excerpt with no html tags, I know I can 
 probably strip the tags after the fact, but that's what I was trying to 
 avoid.  In other words, in a perfect world, I would have liked 2 versions 
 of the document, the original html one and another stripped one. When I 
 need to query things like excerpts, I would query the stripped one, and 
 when I needed the html, I would query the source. Hopefully I didn't make 
 this more confusing.

 On Friday, August 8, 2014 4:58:03 PM UTC-4, Ivan Brusic wrote:

 The tokens that appear in the analyze API are the ones that are put into 
 the inverted index. When you search for one of the terms that is not an 
 HTML tag, there will be a match. What I don't understand after reading in 
 detail your original, is exactly what behavior you are expecting.

 You indexed the phrase
  <html>trying out <b>Elasticsearch</b>, This is an html test</html>

  but you expected a query for the term "html" not to match. However, the 
  word "html" is clearly in the content. The HTML stripper will not remove 
  the contents in between the tags, just the tags themselves. The analyze API 
  should show you the correct terms.

 Lucene has more control over what information you can retrieve, but the 
 only way to get the analyzed token stream back from Elasticsearch is to use 
 the analyze API on the field. Most people do not want an analyzed token 
 stream, just the original field.

 -- 
 Ivan


 On Fri, Aug 8, 2014 at 12:01 PM, IronMike sabda...@gmail.com wrote:

 Also, Here is a link for someone who had the same problem, I am not sure 
 if there was a final answer to that one. 
 http://grokbase.com/t/gg/elasticsearch/126r4kv8tx/problem-with-standard-html-strip
 ,
 I have to admit that I am a bit confused now about this topic. I 
 understand analyzers will tokenize the sentence and strip html in the case 
 of the html_strip, and _analyze works fine using the analyzer, what I am 
 failing to understand, is how can I get the results of these tokens. Isn't 
 the whole idea to be able to search for them tokens eventually?

 If not, whats the solution of what I would think is a common scenario, 
 having to index html documents, where html tags don't need to be indexed, 
 while keeping the original html for presentational purpose? Any ideas 
 (Besides having to strip html tags manually before indexing?


 On Friday, August 8, 2014 1:02:07 PM UTC-4, IronMike wrote:

 Thanks for explaining. So, is there a way to be able to get non html 
 from the index? I thought I read that it was possible to index without the 
 html tags while keeping source intact. So, how would I get at the index 
 with non html tags if you will?

 On Friday, August 8, 2014 12:52:37 PM UTC-4, Ivan Brusic wrote:

 The field is derived from the source and not generated from the tokens.

 If we indexed the sentence The quick brown foxes jumped over the lazy 
 dogs with the english analyzer, the tokens would be

 http://localhost:9200/_analyze?text=The%20quick%
 20brown%20foxes%20jumped%20over%20the%20lazy%20dogsanalyzer=english

 quick brown fox jump over lazi dog

 After applying stopwords and stemming, the tokens do not form a 
 sentence that looks like the original.

 -- 
 Ivan


 On Fri, Aug 8, 2014 at 9:42 AM, IronMike sabda...@gmail.com wrote:

 Ivan,

 The search results I am showing is for the field title not for the 
 source. I thought I could query the field not the source and look at it 
 with no html while the source was intact. Did I misunderstand?


 On Friday, August 8, 2014 12:36:16 PM UTC-4, Ivan Brusic wrote:

 The analyzers control how text is parsed/tokenized and how terms are 
 indexed in the inverted index. The source document remains untouched.

 -- 
 Ivan


 On Fri, Aug 8, 2014 at 9:24 AM, IronMike sabda...@gmail.com wrote:

  I also used Clint's example and tried to map it to a document and 
 search the field, but still getting html in query results... Here is 
 my 

Large Scale elastic Search Logstash collection system

2014-08-13 Thread Robert Gardam
Hello 


We have a 10 node elasticsearch cluster which is receiving roughly 10k/s 
of log lines from our application.

Each elasticsearch node has 132GB of memory with a 48GB heap size. The disk 
subsystem is not great, but it seems to be keeping up. (This could be an 
issue, but I'm not sure that it is.)

The logs path is: 

app server - redis (via logstash) - logstash filters (3 dedicated boxes) 
- elasticsearch_http 

 
We currently bulk import from logstash at 5k documents per flush to keep up 
with the volume of data that comes in. 

Here are the es non standard configs.

indices.memory.index_buffer_size: 50%
index.translog.flush_threshold_ops: 5
# Refresh tuning.
index.refresh_interval: 15s
# Field Data cache tuning
indices.fielddata.cache.size: 24g
indices.fielddata.cache.expire: 10m
#Segment Merging Tuning
index.merge.policy.max_merged_segment: 15g
# Thread Tuning
threadpool:
  bulk:
    type: fixed
    queue_size: -1

We have not had this cluster stay up for more than a week, and it also 
seems to crash for no obvious reason. 

It seems like one node starts having issues and then it takes the entire 
cluster down. 

Does anyone from the community have any experience with this kind of setup?

Thanks in Advance,
Rob



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Unallocated shards with empty nodes

2014-08-13 Thread Christopher Ambler
Nobody?

I've seen many people posting with similar issues - has nobody encountered 
this?

Anyone from ES care to comment?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/92917452-a079-45fe-9f34-a200a855c8f6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Quick Kibana exclude terms question

2014-08-13 Thread digitalx00
Hi all...

What's the format for the exclude terms in the Terms panel?  I'm trying 
to not show all IPs that have 172.16 in them, and I'm not having much 
luck.  I've tried:

172.16
172.16.*
172.16.1.112

If I specify it with the full IP it works, but I'd like to wildcard this if 
possible.  Thank you.
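
For what it's worth, at the raw Elasticsearch level the terms aggregation does 
accept a regex exclude, roughly like the sketch below (the field name src_ip is 
just a placeholder); whether the Kibana Terms panel passes a pattern like this 
through to the underlying facet is exactly what I'm unsure about.

{
  "aggs": {
    "top_ips": {
      "terms": {
        "field": "src_ip",
        "exclude": "172\\.16\\..*"
      }
    }
  }
}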

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3681d873-0213-495c-a0df-2b946c436306%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: New version of Kibana in the works?

2014-08-13 Thread Jay Swan
When I took the Core Elasticsearch class last week, someone from 
Elasticsearch said that Kibana 4.0 is in the works but is at least a couple 
of months away from release. There wasn't any real detail on features, 
other than to say that it is a complete rewrite that tries to expose as 
much native Elasticsearch query functionality as possible.

I was also told that most security-related features will be released in a 
commercial security module for Elasticsearch that will have a non-trivial 
cost associated with it. I don't know if this includes role-based access 
control or not.


On Tuesday, August 12, 2014 10:44:55 AM UTC-6, Antonio Augusto Santos wrote:

 This one is for the devs, and Rashid in particular: is there any new version 
 of Kibana in the works?
 I'm asking this because I'm about to start a project in my company for log 
 management, and there are some requirements for it (user separation, event 
 correlation, histograms to compare two values, and so on).

 So, any chance of these functionalities landing in Kibana 4.0? ;)


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ca643d8f-2b03-4e4d-930c-5d9a0a6a70eb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: New version of Kibana in the works?

2014-08-13 Thread Antonio Augusto Santos
Thanks Jay!
That's very interesting info.

Hope something official comes out. And that Kibana 4 is open source as
well...


On Wed, Aug 13, 2014 at 12:27 PM, Jay Swan sanjuans...@gmail.com wrote:

 When I took the Core Elasticsearch class last week, someone from
 Elasticsearch said that Kibana 4.0 is in the works but is at least a couple
 of months away from release. There wasn't any real detail on features,
 other than to say that it is a complete rewrite that tries to expose as
 much native Elasticsearch query functionality as possible.

 I was also told that most security-related features will be released in a
 commercial security module for Elasticsearch that will have a non-trivial
 cost associated with it. I don't know if this includes role-based access
 control or not.



 On Tuesday, August 12, 2014 10:44:55 AM UTC-6, Antonio Augusto Santos
 wrote:

 This one is for the devs, and Rashid in special: there is any new version
 of Kibana in the works?
 I'm asking this because I'm about to start a project in my company for
 log management, and there are some requisites to it (user separation, event
 correlation, histogram to compare two values, and so on).

 So, any changes of these functionalities landing on Kibana 4.0? ;)

  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/I7um1mX4GSk/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/ca643d8f-2b03-4e4d-930c-5d9a0a6a70eb%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/ca643d8f-2b03-4e4d-930c-5d9a0a6a70eb%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGz5QREtDi5QtBhpNpsE2aQGVh8NiydaT4Zq%3D8EQ9Zth_Vt-iQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Elastic + Kibana Server Specs Recommendation

2014-08-13 Thread AK
Hi,

I recently launched ELK and I'm receiving about 3,000,000 - 8,000,000 docs 
per day (~5GB).
I'm running on AWS on a small server, and after a week of data collection 
the system becomes very, very slow, mainly when I am looking for data older 
than 2 days.
Do you have a recommendation for servers in terms of CPU, memory and 
IOPS, and Elasticsearch settings such as shards?

Thanks
AK




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/349f33f6-aad8-4089-a482-22eaf4dd4cb4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


how to use cluster.routing.allocation.allow_rebalance ?

2014-08-13 Thread Gokul
Hi, 
I am trying to ensure that re-allocation of shards is prevented whenever
one of the nodes in the cluster goes down. Based on the documentation I read
online, I use the following settings on the cluster:

{
  "persistent": {
    "cluster.routing.allocation.enable": "none",
    "cluster.routing.allocation.disable_allocation": true
  }
}

However, this prevents new indices from being allocated as well. This is not
OK for me, so my requirement is to be able to do both of the following
things:
1. Be able to create indices dynamically. 
2. Prevent re-allocation of shards when a node in the cluster goes down. 

Based on the documentation, I understood that setting
cluster.routing.allocation.allow_rebalance to indices_all_active would do
both things. But this didn't happen, so my understanding is clearly wrong.
What does it mean to set allow_rebalance to indices_all_active?
When going over the mailing list, I also found the concept of dynamic
settings. I assume this means that there are some values that I can set
while the cluster is live, and some settings that I can only set in the
configuration file and which take effect on restarting Elasticsearch. Is
allow_rebalance one such setting?
Please correct me if I am on the wrong path altogether.
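
If it helps, allow_rebalance appears to be one of the dynamic cluster settings, 
so I have been applying it to the live cluster with something along these lines 
(the value is just the one discussed above):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.allow_rebalance": "indices_all_active"
  }
}'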

I see that I have raised multiple questions in one thread; however, all of
them are part of a related use case for me. Can you please advise me on these. 

Thanks,



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/how-to-use-cluster-routing-allocation-allow-rebalance-tp4061785.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1407924508318-4061785.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: How would you compare ES, Lucene with Enterprise Search ?

2014-08-13 Thread Shawn Johnson
Otis,
I know this is old, but what do you mean by taller?

On Friday, May 24, 2013 12:28:17 PM UTC-4, Otis Gospodnetic wrote:

 Hi,

 Short answer: I don't know.
 Medium answer: I'm sure each vendor would claim superiority or, if losing 
 in perf, would bring up functionality their tool has that others do not.
 True answer: you would really have to test it yourself.  There are a 
 million ways to do comparative benchmarks and so many different use cases 
 that it's impossible to test them all and provide truthful reports.

 What I can say though is that OSS search engines like ES and Solr can be 
 *very* fast if you know how to use them.  For example, Sematext had a 
 client recently whose initial queries against a 500K doc index took 90 
 seconds.  We changed the structure of the index in a way that may have 
 seemed crazy to some and ended up with a 10x taller index that was 
 returning results in just a few seconds on a single machine.  So since 
 open-source engines are free to use, you should simply try them out and see 
 if they are fast enough for you, for your use case.  I'll bet my left 
 pinkie that you'll find they'll be plenty fast.  And if they are not, you 
 can get help on this ML or from companies that provide ES consulting 
 services.

 Otis
 --
 ELASTICSEARCH Performance Monitoring - http://sematext.com/spm 
 http://sematext.com/spm/index.html



 On Thursday, May 23, 2013 5:10:09 AM UTC-4, Paul wrote:

 Hi Otis

 Probably not the best place to ask here; maybe Quora.

 Apart from the cost and support, what about performance?

 On Thursday, May 23, 2013 3:29:11 AM UTC+8, Otis Gospodnetic wrote:

 Hi Paul,

 Hm, that's a pretty big question.  For one, open-source engines are 
 cheaper. :)
 But maybe you have specific requirements or features you want to inquire 
 about?

 Otis
 --
 ELASTICSEARCH Performance Monitoring - 
 http://sematext.com/spm/index.html
 Search Analytics - http://sematext.com/search-analytics/index.html



 On Tuesday, May 21, 2013 10:17:45 PM UTC-4, Paul wrote:

 Hi, 

 How would you compare community-based search engines (ES, Apache 
 Lucene) with Commercial Enterprise Search (IBM, EMC, Oracle Endeca - MDEX 
 Engine, etc) ?

 Thanks.
 Paul



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0407ad67-07da-4d9f-9ee5-f5190aef2816%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Large Scale elastic Search Logstash collection system

2014-08-13 Thread joergpra...@gmail.com
Because you set queue_size: -1 in the bulk thread pool, you explicitly
allowed the node to crash.

You should use reasonable resource limits. Default settings, which are
reasonable, are sufficient in most cases.
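
For example, a bounded bulk queue could look roughly like the sketch below; the 
queue_size value is only an illustrative starting point, not a recommendation, 
and if I remember correctly the thread pool settings can also be updated on a 
live cluster:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "threadpool.bulk.queue_size": 500
  }
}'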

Jörg


On Wed, Aug 13, 2014 at 5:18 PM, Robert Gardam robert.gar...@fyber.com
wrote:

 Hello


 We have a 10 node elasticsearch cluster which is receieving roughly 10k/s
 worth of logs lines from our application.

 Each elasticsearch node has 132gb of memory - 48gb heap size, the disk
 subsystem is not great, but it seems to be keeping up. (This could be an
 issue, but i'm not sure that it is)

 The logs path is:

 app server - redis (via logstash) - logstash filters (3 dedicated boxes)
 - elasticsearch_http


 We currently bulk import from logstash at 5k documents per flush to keep
 up with the volume of data that comes in.

 Here are the es non standard configs.

 indices.memory.index_buffer_size: 50%
 index.translog.flush_threshold_ops: 5
 # Refresh tuning.
 index.refresh_interval: 15s
 # Field Data cache tuning
 indices.fielddata.cache.size: 24g
 indices.fielddata.cache.expire: 10m
 #Segment Merging Tuning
 index.merge.policy.max_merged_segment: 15g
 # Thread Tuning
 threadpool:
 bulk:
 type: fixed
 queue_size: -1

 We have not had this cluster stay up for more than a week, but it also
 seems to crash for no real reason.

 It seems like one node starts having issues and then it takes the entire
 cluster down.

 Does anyone from the community have any experience with this kind of setup?

 Thanks in Advance,
 Rob



  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoEU84GhzjUuuo69K2xeu1N2-nQPCJXXCj3wu20Mz0R4VA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch 1.2.4 and 1.3.2 released

2014-08-13 Thread Kevin Kluge
Hi All.

We're happy to announce that Elasticsearch 1.2.4 and 1.3.2 have been
released!   These are bug fix releases.

The blog post at [1] describes the release content at a high level. The
release notes at [2] and [3] give details and a direct link to download.

Please download and try it out.   Feedback, bug reports and patches are all
welcome.

Kevin

[1] http://www.elasticsearch.org/blog/elasticsearch-1-3-2-released/
[2] http://www.elasticsearch.org/downloads/1-2-4/
[3] http://www.elasticsearch.org/downloads/1-3-2

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAB1XBh8_hkpaWFot7EcEsUFz-QQBn1AtS2XU1NGPeWYohdRZaQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Aggregate a count of matched words among documents

2014-08-13 Thread Travis Bullock
Hi everyone. I have a set of documents that represent web site pages, and 
I'd like to query the total *word* matches for a term within a specific 
site. For example, find all documents from the site www.example.com and 
count the total number of occurrences of words matching example among the 
documents. 

I don't want the number of document hits - I want to know how many times a 
word is mentioned within the site's content.

Is this possible? Can someone give me a rough idea of where to start?
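
One direction I have been looking at, in case it helps frame the question: term 
frequencies are exposed to scripts through the _index variable, so a sketch like 
the one below might sum the per-document frequency of the term across all 
matching pages. The field names site and body and the term example are just 
placeholders from my description above, and the script language may need to be 
set explicitly depending on the version.

{
  "query": {
    "term": { "site": "www.example.com" }
  },
  "aggs": {
    "total_mentions": {
      "sum": {
        "script": "_index['body']['example'].tf()"
      }
    }
  }
}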

Thanks,
Travis

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6d98122a-653e-4898-8eed-869c7a3e93c7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can plugin be written for TCP transport?

2014-08-13 Thread John Smith
It would be cool if we could use something like the Jetty plugin, disable 
client access over the TCP transport, and only use the TCP port for 
cluster-related work.

On Wednesday, 13 August 2014 11:07:45 UTC-4, John Smith wrote:

 That's what I was thinking...

 1- I would like this java app to use the node client, because I like the 
 fact that there's no extra hop and automatic failover to the next node.
 2- I figured it would be a firewall/socks setting to only allow the java 
 app to connect to ES. But again, anyone can go create a node client on 
 the same machine and pull data anonymously.

 I know anyone can log in to a machine at any time and read the data 
 regardless, and that's ok; the data is supposed to be read, but at least 
 you know who read it and when. That's not an issue... Security is a best 
 effort, but the issue is the audit process and how well you can check if 
 all your eggs are there.

 Even if I do exactly as you said, subnet plus socks proxy, someone can 
 still go to that machine, create their own node client and bypass the java 
 app with no direct trace. This will probably never happen, but all it takes 
 is one angry employee.






 On Wednesday, 13 August 2014 09:50:25 UTC-4, Jörg Prante wrote:

 You can write a Java app to authorize access with JAAS and use a SOCKS 
 proxy to connect to an ES cluster in a private subnet. That is all a matter 
 of network configuration, there is nothing that requires the effort of an 
 extra ES plugin.

 Jörg


 On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.d...@gmail.com wrote:

 Hi, I have been looking at the various transport plugins. Correct me if I 
 am wrong, but those are for the http rest interface... Can plugins be 
 written for the node transport?

 Basically this leads to securing ES. My ES is definitely not public and 
 I know I can use reverse proxies or one of the http plugins... But what 
 about clients/programs connecting directly as nodes?

 Basically I need user auth and some form of acl. SSL is secondary. I also 
 need to be able to audit user access. I am dealing with credit card data, so 
 I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?

 --
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/1ef17a07-bd72-4eee-a6b9-93ff8d0e7980%40googlegroups.com
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c7a7d241-575e-480c-afe9-e2ef69043160%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic + Kibana Server Specs Recommendation

2014-08-13 Thread Jay Swan
For Elasticsearch, try m3.xlarge and set ES_HEAP_SIZE to 7 or 8GB. You may 
also want to have more than one node in your cluster.

You might also want to split Logstash off onto a separate instance. It is 
CPU intensive but not particularly RAM intensive. Set the -w {n} flag in 
the startup script to allow Logstash to run multiple filter threads across 
multiple cores. You might start with an m3.large for this, use -w 2, and 
see how it goes.
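
A rough sketch of what I mean, with the values only as starting points:

# on each Elasticsearch node
export ES_HEAP_SIZE=8g
bin/elasticsearch

# on a separate Logstash instance, two filter workers
bin/logstash agent -f logstash.conf -w 2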

On Wednesday, August 13, 2014 9:38:10 AM UTC-6, AK wrote:

 Hi,

 I recently launched ELK and I'm receiving about 3,000,000 - 8,000,000 docs 
 per day (~ 5GB)
 I'm running on AWS on a small server, and after a week of data collection 
 the system becomes very very slow, mainly when I am looking for data older 
 than 2 days.
 Do you have a recommendation for servers in points such as cpu, memory and 
 iops and elstic settings like shards.

 Thanks
 AK






-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/90398be0-4804-44d7-9f8e-e033daa7050b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Large Scale elastic Search Logstash collection system

2014-08-13 Thread Robert Gardam
Hi,
The reason this is set is because without it we reject messages and 
therefore don't have all the log entries.

I'm happy to be told this isn't required, but I'm pretty sure it is. We are 
constantly bulk indexing large numbers of events.



On Wednesday, August 13, 2014 6:09:46 PM UTC+2, Jörg Prante wrote:

 Because you set queue_size: -1 in the bulk thread pool, you explicitly 
 allowed the node to crash.

 You should use reasonable resource limits. Default settings, which are 
 reasonable, are sufficient in most cases.

 Jörg


 On Wed, Aug 13, 2014 at 5:18 PM, Robert Gardam robert...@fyber.com wrote:

 Hello 


 We have a 10 node elasticsearch cluster which is receieving roughly 10k/s 
 worth of logs lines from our application.

 Each elasticsearch node has 132gb of memory - 48gb heap size, the disk 
 subsystem is not great, but it seems to be keeping up. (This could be an 
 issue, but i'm not sure that it is)

 The logs path is: 

 app server - redis (via logstash) - logstash filters (3 dedicated 
 boxes) - elasticsearch_http 

  
 We currently bulk import from logstash at 5k documents per flush to keep 
 up with the volume of data that comes in. 

 Here are the es non standard configs.

 indices.memory.index_buffer_size: 50%
 index.translog.flush_threshold_ops: 5
 # Refresh tuning.
 index.refresh_interval: 15s
 # Field Data cache tuning
 indices.fielddata.cache.size: 24g
 indices.fielddata.cache.expire: 10m
 #Segment Merging Tuning
 index.merge.policy.max_merged_segment: 15g
 # Thread Tuning
 threadpool:
 bulk:
 type: fixed
 queue_size: -1

 We have not had this cluster stay up for more than a week, but it also 
 seems to crash for no real reason. 

 It seems like one node starts having issues and then it takes the entire 
 cluster down. 

 Does anyone from the community have any experience with this kind of 
 setup?

 Thanks in Advance,
 Rob



  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/be02ed79-31de-4002-9144-d124370d0c31%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Large Scale elastic Search Logstash collection system

2014-08-13 Thread joergpra...@gmail.com
If Elasticsearch rejects bulk actions, this is serious and you should
examine the cluster to find out why. Slow disks, cluster health, or capacity
problems all come to mind. But if you ignore the underlying problem and
merely disable bulk resource control instead, you open the gate wide to
unpredictable node crashes, and at some point you won't be able to control
the cluster at all.

To reduce the number of active bulk requests per timeframe you could, for
example, increase the number of actions per bulk request. Or simply increase
the number of nodes. Or think about the shard/replica organization while
indexing - it can be an advantage to bulk index into an index with replica
level 0 and increase the replica level later.
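
Since the replica count is a dynamic index setting, a rough sketch of the last
point would be to bulk index into an index created with 0 replicas and then,
once the load is done, raise it (the index name here is only an example):

curl -XPUT 'localhost:9200/logstash-2014.08.13/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'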

Jörg


On Wed, Aug 13, 2014 at 6:50 PM, Robert Gardam robert.gar...@fyber.com
wrote:

 Hi,
 The reason this is set is because without it we reject messages and there
 fore don't have all the log entries.

 I'm happy to be told this isn't required, but i'm pretty sure it is. We
 are constantly bulk indexing large numbers of events.



 On Wednesday, August 13, 2014 6:09:46 PM UTC+2, Jörg Prante wrote:

 Because you set queue_size: -1 in the bulk thread pool, you explicitly
 allowed the node to crash.

 You should use reasonable resource limits. Default settings, which are
 reasonable, are sufficient in most cases.

 Jörg


 On Wed, Aug 13, 2014 at 5:18 PM, Robert Gardam robert...@fyber.com
 wrote:

 Hello


 We have a 10 node elasticsearch cluster which is receieving roughly
 10k/s worth of logs lines from our application.

 Each elasticsearch node has 132gb of memory - 48gb heap size, the disk
 subsystem is not great, but it seems to be keeping up. (This could be an
 issue, but i'm not sure that it is)

 The logs path is:

 app server - redis (via logstash) - logstash filters (3 dedicated
 boxes) - elasticsearch_http


 We currently bulk import from logstash at 5k documents per flush to keep
 up with the volume of data that comes in.

 Here are the es non standard configs.

 indices.memory.index_buffer_size: 50%
 index.translog.flush_threshold_ops: 5
 # Refresh tuning.
 index.refresh_interval: 15s
 # Field Data cache tuning
 indices.fielddata.cache.size: 24g
 indices.fielddata.cache.expire: 10m
 #Segment Merging Tuning
 index.merge.policy.max_merged_segment: 15g
 # Thread Tuning
 threadpool:
 bulk:
 type: fixed
 queue_size: -1

 We have not had this cluster stay up for more than a week, but it also
 seems to crash for no real reason.

 It seems like one node starts having issues and then it takes the entire
 cluster down.

 Does anyone from the community have any experience with this kind of
 setup?

 Thanks in Advance,
 Rob



  --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%
 40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/be02ed79-31de-4002-9144-d124370d0c31%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/be02ed79-31de-4002-9144-d124370d0c31%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFUPQBUvrsFYtQcWZ0W%3DnLZz_LasQ1T9p0XTx%3DqoesNMw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Get Shard Info From Cluster/Nodes/Index

2014-08-13 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi,

Can I, using the Cluster API or some other Java way, find the shards that are 
allocated per cluster - node - index?

I would like to check which shards are deployed on a physical node, and 
query only that shard to find what was indexed on it.

I would be using a _routing value, and using this query I want to check 
which routing value went to which shard.
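
For context, from the REST side I can see roughly what I want with the two 
calls below (index name and routing value are placeholders): _cat/shards shows 
which node each shard lives on, and _search_shards shows which shard a given 
routing value maps to. What I am after is the equivalent through the Java 
Cluster API.

curl 'localhost:9200/_cat/shards/myindex?v'
curl 'localhost:9200/myindex/_search_shards?routing=user42&pretty'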

Thanks,
Sandeep

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/808f15d3-f552-4fb9-a404-8645faaded79%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Large Scale elastic Search Logstash collection system

2014-08-13 Thread Robert Gardam
I appreciate your answers. I think IO could be a contributing factor. I'm 
thinking of splitting the index into an hourly index with no replicas for 
bulk importing and then switching replicas on afterwards. 

I think the risk of losing data would be too high if it was any longer 
than that. Also, does the async replication from the logstash side of things 
cause unknown issues?



On Wednesday, August 13, 2014 7:08:05 PM UTC+2, Jörg Prante wrote:

 If Elasticsearch rejects bulk actions, this is serious and you should 
 examine the cluster to find out why this is so. From slow disks, cluster 
 health, or capacity problems, everything comes to mind. But if you ignore 
 problem solution and merely disable bulk resource control instead, you open 
 the gate wide to unpredictable node crashes, and you won't be able to 
 control the cluster at a certain point.

 To reduce the number of active bulk requests per timeframe, for example, 
 you could increase the bulk request actions per request. Or simply increase 
 the number of nodes. Or think about the shard/replica organization while 
 indexing - it can be an advantage to bulk index to replica level 0 index 
 only and increase the replica level later.

 Jörg


 On Wed, Aug 13, 2014 at 6:50 PM, Robert Gardam robert...@fyber.com wrote:

 Hi,
 The reason this is set is because without it we reject messages and there 
 fore don't have all the log entries.

 I'm happy to be told this isn't required, but i'm pretty sure it is. We 
 are constantly bulk indexing large numbers of events.



 On Wednesday, August 13, 2014 6:09:46 PM UTC+2, Jörg Prante wrote:

 Because you set queue_size: -1 in the bulk thread pool, you explicitly 
 allowed the node to crash.

 You should use reasonable resource limits. Default settings, which are 
 reasonable, are sufficient in most cases.

 Jörg


 On Wed, Aug 13, 2014 at 5:18 PM, Robert Gardam robert...@fyber.com 
 wrote:

 Hello 


 We have a 10 node elasticsearch cluster which is receieving roughly 
 10k/s worth of logs lines from our application.

 Each elasticsearch node has 132gb of memory - 48gb heap size, the disk 
 subsystem is not great, but it seems to be keeping up. (This could be an 
 issue, but i'm not sure that it is)

 The logs path is: 

 app server - redis (via logstash) - logstash filters (3 dedicated 
 boxes) - elasticsearch_http 

  
 We currently bulk import from logstash at 5k documents per flush to 
 keep up with the volume of data that comes in. 

 Here are the es non standard configs.

 indices.memory.index_buffer_size: 50%
 index.translog.flush_threshold_ops: 5
 # Refresh tuning.
 index.refresh_interval: 15s
 # Field Data cache tuning
 indices.fielddata.cache.size: 24g
 indices.fielddata.cache.expire: 10m
 #Segment Merging Tuning
 index.merge.policy.max_merged_segment: 15g
 # Thread Tuning
 threadpool:
 bulk:
 type: fixed
 queue_size: -1

 We have not had this cluster stay up for more than a week, but it also 
 seems to crash for no real reason. 

 It seems like one node starts having issues and then it takes the 
 entire cluster down. 

 Does anyone from the community have any experience with this kind of 
 setup?

 Thanks in Advance,
 Rob



  -- 
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%
 40googlegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/d04a643e-990b-40b0-b230-2ba560f08eea%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/be02ed79-31de-4002-9144-d124370d0c31%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/be02ed79-31de-4002-9144-d124370d0c31%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/29de6d50-0798-490d-903f-631e4d47a7a5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can plugin be written for TCP transport?

2014-08-13 Thread joergpra...@gmail.com
With the configuration settings transport.type and
transport.service.type

Example:

transport.type:
org.xbib.elasticsearch.transport.syslog.netty.SyslogNettyTransportModule
transport.service.type:
org.xbib.elasticsearch.transport.syslog.SyslogTransportService

you can implement your own transport layer.

With this mechanism you can write plugins to create an audit trail of all
clients or nodes that connect to your cluster and you can log what they
did, for later revision. Or, you can add a registry for clients, add JAAS,
...

For example, I played with a modified netty transport layer to dump all
parsed transport actions between nodes to a remote host syslog, including
the action names and the channel connection information of local and remote
host/port.

On such a custom transport layer implementation, you can add even more low
level logic. If you do not want certain nodes or clients to connect, you
could a) use zen unicast for manual configuration of permitted nodes or
clients and/or b) reject all network actions from unknown/unregistered
clients, independent of discovery.

Note, manipulating the transport layer is not a free lunch and is not always
fun. Performance may degrade, other things may break, etc.

Jörg



On Wed, Aug 13, 2014 at 5:07 PM, John Smith java.dev@gmail.com wrote:

 That's what I was thinking...

 1- I would like this java app to use the node client, cause I like that
 fact that there no extra hop and automatic failover to next node.
 2- I figure it would be a firewall setting/socks to only allow the java
 app to connect to ES. But again here anyone can go create a node client on
 the same machine and pull data anonymously.

 I know any one person can log in a machine at any time and any person can
 read regardless and it's ok, the data is supposed to be read but at least
 you know who read it and when. That's not an issue... Security is a best
 effort, but the issue is the audit process and how well you can check if
 all your eggs are there.

 Even if I do exactly as you said, subnet plus socks proxy, someone can
 still go to that machine create their own node client and bypass the java
 app with no direct trace. This will probably never happen, but all it takes
 is one angry employ.






 On Wednesday, 13 August 2014 09:50:25 UTC-4, Jörg Prante wrote:

 You can write a Java app to authorize access with JAAS and use a SOCKS
 proxy to connect to an ES cluster in a private subnet. That is all a matter
 of network configuration, there is nothing that requires the effort of an
 extra ES plugin.

 Jörg


 On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.d...@gmail.com wrote:

 Hi I have been looking at the various transport plugins. Correct me if I
 am wrong but those are for the http rest interface... Can plugins be
 written for the node transport?

 Bassically this leads to securing ES. My ES is definitely not public and
 I know i can use reverse proxies or one of the http plugins... But what
 about client/programs connecting directly as nodes?

 Bassically I need user auth and some form of acl. SSL is secondary. Also
 need to be able to audit the user access. Dealing with credit card data. So
 I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?

 --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/1ef17a07-bd72-4eee-a6b9-93ff8d0e7980%
 40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/648c4a70-ff8b-43c8-b818-4f942a02daf5%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/648c4a70-ff8b-43c8-b818-4f942a02daf5%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGXBVE4K0jGusJO-Ta0vcNZRZxiOxbfX4-E6_n%2BSLHMZg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: How to find null_value in query_string like we can in missing filter

2014-08-13 Thread pulkitsinghal
Can anyone help with this comparison between missing fields in filter vs. 
query?
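
The closest I have gotten so far is combining the _missing_ syntax with an 
explicit match on the placeholder configured as null_value, along these lines; 
the field name status and the placeholder _null_ are just examples from my own 
mapping, not anything built in:

{
  "query": {
    "query_string": {
      "query": "_missing_:status OR status:_null_"
    }
  }
}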

On Sunday, August 3, 2014 12:18:05 PM UTC-5, pulkitsinghal wrote:

 I'm using elasticsearch v0.90.5

 With a missing filter, we can track missing fields:

 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html
 and make sure that a null_value also counts as missing.

 How can we do the same in a query_string?

 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_field_names
 Based on my test so far, the non-existent field counts as missing but the 
 null_value field counts as present.

 How should I write my query?


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/67d6303a-2694-4097-bf6d-c8ffcaaa9a58%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can plugin be written for TCP transport?

2014-08-13 Thread John Smith
Hi thanks Jorg. 

Just to be sure: does creating a plugin and overriding transport.type and 
transport.service.type let us create a custom transport for the TCP 9300 
transport, or for the HTTP 9200 transport?


I don't have a problem with using 

a subnet and firewall,
and
Nginx or a dedicated plugin,
or
a custom application

to secure ES from the outside world. In fact that's pretty good.

We run a web layer in DMZ and ES on a second layer. So only DMZ has access 
to ES.

The question is how do you secure ES from the inside world? Anyone who has 
access to the subnet and to the ES cluster can just log in and install their 
own jobs that use port 9300 and do whatever they want with ES. Even 
a simple authentication would be better than nothing here.

If what you said above allows us to replace the 9300 transport, then awesome. 
The issue is that Elasticsearch has built a really nice, fast Ferrari but opted 
out of providing even a simple door-locking mechanism. Using nginx is like 
telling your neighbor to keep an eye on your car while you are away for the 
weekend and trusting they will do right.

On Wednesday, 13 August 2014 14:21:01 UTC-4, Jörg Prante wrote:

 With the configuration settings transport.type and 
 transport.service.type

 Example:

 transport.type: 
 org.xbib.elasticsearch.transport.syslog.netty.SyslogNettyTransportModule
 transport.service.type: 
 org.xbib.elasticsearch.transport.syslog.SyslogTransportService

 you can implement your own transport layer.

 With this mechanism you can write plugins to create an audit trail of all 
 clients or nodes that connect to your cluster and you can log what they 
 did, for later revision. Or, you can add a registry for clients, add JAAS, 
 ...

 For example, I played with a modified netty transport layer to dump all 
 parsed transport actions between nodes to a remote host syslog, including 
 the action names and the channel connection information of local and remote 
 host/port.

 On such a custom transport layer implementation, you can add even more low 
 level logic. If you do not want certain nodes or clients to connect, you 
 could a) use zen unicast for manual configuration of permitted nodes or 
 clients and/or b) reject all network actions from unknown/unregistered 
 clients, independent of discovery.

 Note, manipulating transport layer is not free lunch and is not always 
 fun. Performance may degrade, other things may break etc.

 Jörg



 On Wed, Aug 13, 2014 at 5:07 PM, John Smith java.d...@gmail.com wrote:

 That's what I was thinking...

 1- I would like this java app to use the node client, cause I like that 
 fact that there no extra hop and automatic failover to next node.
 2- I figure it would be a firewall setting/socks to only allow the java 
 app to connect to ES. But again here anyone can go create a node client on 
 the same machine and pull data anonymously.

 I know any one person can log in a machine at any time and any person can 
 read regardless and it's ok, the data is supposed to be read but at least 
 you know who read it and when. That's not an issue... Security is a best 
 effort, but the issue is the audit process and how well you can check if 
 all your eggs are there.

 Even if I do exactly as you said, subnet plus socks proxy, someone can 
 still go to that machine create their own node client and bypass the java 
 app with no direct trace. This will probably never happen, but all it takes 
 is one angry employ.






 On Wednesday, 13 August 2014 09:50:25 UTC-4, Jörg Prante wrote:

 You can write a Java app to authorize access with JAAS and use a SOCKS 
 proxy to connect to an ES cluster in a private subnet. That is all a matter 
 of network configuration, there is nothing that requires the effort of an 
 extra ES plugin.

 Jörg


 On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.d...@gmail.com wrote:

 Hi I have been looking at the various transport plugins. Correct me if 
 I am wrong but those are for the http rest interface... Can plugins be 
 written for the node transport?

 Bassically this leads to securing ES. My ES is definitely not public 
 and I know i can use reverse proxies or one of the http plugins... But 
 what 
 about client/programs connecting directly as nodes?

 Bassically I need user auth and some form of acl. SSL is secondary. 
 Also need to be able to audit the user access. Dealing with credit card 
 data. So I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?

 --
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/1ef17a07-bd72-4eee-a6b9-93ff8d0e7980%
 40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are 

Too many open files /var/lib/elasticsearch doesn't exist on the nodes

2014-08-13 Thread José Andrés
Can someone elaborate as to why the exception below is thrown?

[2014-08-11 11:27:41,189][WARN ][cluster.action.shard ] [mycluster] 
[all][4] received shard failed for [all][4], node[MLNWasasasWi2V58hRA93mg], 
[P], s[INITIALIZING], indexUUID [KSScFASsasas6yTEcK7HqlmA], reason [Failed 
to start shard, message [IndexShardGatewayRecoveryException[[all][4] failed 
recovery]; nested: EngineCreationFailureException[[all][4] failed to open 
reader on writer]; nested: 
FileSystemException[/var/lib/elasticsearch/mycluster/nodes/0/indices/all/4/index/_m4bz_es090_0.tim: 
Too many open files]; ]]

/var/lib/elasticsearch doesn't exist on the nodes.

Thank you very much for your feedback.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/16968aac-dbfe-44c4-b3fe-d25af0dfbe37%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Native(Java) script performance

2014-08-13 Thread avacados
Are there any documents available about the performance of native scripts? 
I am facing a search performance issue due to a native script. General 
guidelines for optimizing script performance would be helpful.

Here are the specific details of where I am facing the performance issue.

I am using a script filter to find documents which overlap a given date range 
(i.e. find documents overlapping [01-Aug, 02-Nov], where each document contains 
two fields, start_time and end_time). startRange and endRange are params.

My script filter is:

===
{
  "script": {
    "script": "xyz",
    "params": {
      "startRange": 1407939675,   // Timestamp in milliseconds
      "endRange": 1410531675      // Timestamp in milliseconds
    },
    "lang": "native",
    "_cache": true
  }
},

===
My Native(Java) script code 



ScriptDocValues XsDocValue = (ScriptDocValues) doc().get("start_time");
long XsLong = 0L;
if (XsDocValue != null && !XsDocValue.isEmpty()) {
    // start_time of the document, as a long
    XsLong = ((ScriptDocValues.Longs) XsDocValue).getValue();
}

ScriptDocValues XeDocValue = (ScriptDocValues) doc().get("end_time");
long XeLong = 0L;
if (XeDocValue != null && !XeDocValue.isEmpty()) {
    // end_time of the document, as a long
    XeLong = ((ScriptDocValues.Longs) XeDocValue).getValue();
}

// match when the document's [start_time, end_time] overlaps [startRange, endRange]
if ((endRange >= XsLong) && (startRange <= XeLong)) {
    return true;
}

===
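
One alternative I am considering, in case it is relevant: the same overlap 
condition can be written without a script at all, as two plain range filters 
inside a bool filter. The values below are the same example params as above.

{
  "filter": {
    "bool": {
      "must": [
        { "range": { "start_time": { "lte": 1410531675 } } },
        { "range": { "end_time": { "gte": 1407939675 } } }
      ]
    }
  }
}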

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e6f056a0-4e47-4e04-abd2-e74f27bd2e68%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


slowlog not populating after deletion

2014-08-13 Thread Tim Hopper
This morning, I enabled slowlogs on a bunch of indices in my cluster by 
issuing something like this 
https://gist.github.com/tdhopper/a44cc4200b9c09aea389. Because I set the 
threshold at 0 and we're doing lots of reads and writes, I got large logs 
rather quickly. I decided to raise the thresholds and delete the logs 
manually from all four machines on my cluster; I did this without 
restarting the cluster. After doing this, no new *slowlog.log files were 
created. I tried 'touch'ing the appropriate paths, but those files were not 
populated. 
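
Roughly speaking, the kind of command I mean is the per-index dynamic settings 
form, e.g. the following, where the index name is just a placeholder:

curl -XPUT 'localhost:9200/my_index/_settings' -d '{
  "index.search.slowlog.threshold.query.warn": "0ms",
  "index.indexing.slowlog.threshold.index.warn": "0ms"
}'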

I have since set the slowlog thresholds manually in my ElasticSearch.yml 
file

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 0s
index.search.slowlog.threshold.fetch.warn: 10s
index.search.slowlog.threshold.fetch.info: 5s
index.search.slowlog.threshold.fetch.debug: 2s
index.search.slowlog.threshold.fetch.trace: 0s
index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 0s


and restarted the cluster. The log files still were not populated. 

I again issued a curl command to set the thresholds to 0ms, and I also set 
'additivity.index.search.slowlog' and 'additivity.index.indexing.slowlog' 
to 'true' for the cluster (though I'm not entirely clear on what those do). 
Still no logs. I have also tried fiddling with the log level with a command 
like this https://gist.github.com/tdhopper/2dc0d2aa039f3dd598ab.

I've uncovered two SO questions that seem to describe similar problems and 
have not been resolved: 
http://stackoverflow.com/questions/23195280/elasticsearch-slow-log-wont-write-to-log-file
 and 
http://stackoverflow.com/questions/23899327/elasticsearch-wont-log-slow-queries-anymore

Have I done something wrong that is preventing these logs from appearing? 

I am using 1.3.0.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54d753c1-e9a5-4116-ac5c-0c8edd3a392b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can plugin be written for TCP transport?

2014-08-13 Thread John Smith
I think what a lot of people want is more like this, if I understand 
correctly: something that runs above the 9300 transport, so you still get all 
the goodies/zen discovery etc., plus added authentication. Except this only 
works with Found...



On Wednesday, 13 August 2014 15:23:05 UTC-4, John Smith wrote:

 Hi thanks Jorg. 

 Just to be sure, creating a plugin and overriding: transport.type and 
 transport.service.type this allow us to create custom transport for TCP 
 9300 transport or the HTTP 9200 transport?


 I don't have a problem with using 

 Subnet and firewall
 And
 Nginx or specified plugin.
 Or
 Custom application

 To secure ES from outside world. In fact that's pretty good.

 We run a web layer in DMZ and ES on a second layer. So only DMZ has access 
 to ES.

 The question is how do you secure ES from the inside world? Anyone who has 
 access to the subnet and to the ES cluster can just login and install their 
 own jobs that can use port 9300 and do what ever they want with ES. Even 
 a simple authentication would be better then nothing here.

 If what you said above allows us to replace 9300 transport, then awesome. 
 The issue is Elasticsearch has built the really nice fast Ferari but opted 
 out of providing even a simple door locking mechanism. Using nginx is like 
 telling your neighbor to keep an eye on your car while you are away on the 
 weekend and trusting they will do right.

 On Wednesday, 13 August 2014 14:21:01 UTC-4, Jörg Prante wrote:

 With the configuration settings transport.type and 
 transport.service.type

 Example:

 transport.type: 
 org.xbib.elasticsearch.transport.syslog.netty.SyslogNettyTransportModule
 transport.service.type: 
 org.xbib.elasticsearch.transport.syslog.SyslogTransportService

 you can implement your own transport layer.

 With this mechanism you can write plugins to create an audit trail of all 
 clients or nodes that connect to your cluster and you can log what they 
 did, for later revision. Or, you can add a registry for clients, add JAAS, 
 ...

 For example, I played with a modified netty transport layer to dump all 
 parsed transport actions between nodes to a remote host syslog, including 
 the action names and the channel connection information of local and remote 
 host/port.

 On such a custom transport layer implementation, you can add even more 
 low level logic. If you do not want certain nodes or clients to connect, 
 you could a) use zen unicast for manual configuration of permitted nodes or 
 clients and/or b) reject all network actions from unknown/unregistered 
 clients, independent of discovery.

 Note, manipulating transport layer is not free lunch and is not always 
 fun. Performance may degrade, other things may break etc.

 Jörg



 On Wed, Aug 13, 2014 at 5:07 PM, John Smith java.d...@gmail.com wrote:

 That's what I was thinking...

 1- I would like this java app to use the node client, cause I like that 
 fact that there no extra hop and automatic failover to next node.
 2- I figure it would be a firewall setting/socks to only allow the java 
 app to connect to ES. But again here anyone can go create a node client on 
 the same machine and pull data anonymously.

 I know any one person can log in a machine at any time and any person 
 can read regardless and it's ok, the data is supposed to be read but at 
 least you know who read it and when. That's not an issue... Security is a 
 best effort, but the issue is the audit process and how well you can check 
 if all your eggs are there.

 Even if I do exactly as you said, subnet plus socks proxy, someone can 
 still go to that machine create their own node client and bypass the java 
 app with no direct trace. This will probably never happen, but all it takes 
 is one angry employ.






 On Wednesday, 13 August 2014 09:50:25 UTC-4, Jörg Prante wrote:

 You can write a Java app to authorize access with JAAS and use a SOCKS 
 proxy to connect to an ES cluster in a private subnet. That is all a 
 matter 
 of network configuration, there is nothing that requires the effort of an 
 extra ES plugin.

 Jörg


 On Wed, Aug 13, 2014 at 3:38 PM, John Smith java.d...@gmail.com 
 wrote:

 Hi I have been looking at the various transport plugins. Correct me if 
 I am wrong but those are for the http rest interface... Can plugins be 
 written for the node transport?

 Bassically this leads to securing ES. My ES is definitely not public 
 and I know i can use reverse proxies or one of the http plugins... But 
 what 
 about client/programs connecting directly as nodes?

 Bassically I need user auth and some form of acl. SSL is secondary. 
 Also need to be able to audit the user access. Dealing with credit card 
 data. So I need to know 100% who is accessing the data.

 So...
 What are some good steps to secure my ES cluster!?

 --
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 

Re: Kibana not showing all events

2014-08-13 Thread cablenightmare
Well, I shouldn't say no events. What I mean is that I expected the 
histogram to be filled with events, and those events don't show up until I 
drill down.

On Wednesday, August 13, 2014 12:54:04 PM UTC-7, cableni...@gmail.com wrote:

 Hi,

 I am using the ELK stack. I have an event that is logged 5 times an hour. 
 I use Kibana to query the event by type going back 24 hours (see k1.png), 6 
 hours (see k2.png), and 1 hour (see k3.png). 

 The display at 24 hours and 6 hours is confusing to me because it appears 
 there are no events, but when I drill down (so to speak) to the 1 hour, the 
 events show up as expected. For example, in k3.png you can see there are 
 events from 8AM to 9AM as expected, but those events don't show up in the 
 histogram for 24 hours and 6 hours.

 Thanks for your assistance.

 Matt


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/075a58a8-ad1c-493b-95bb-f367a596344c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Too many open files /var/lib/elasticsearch doesn't exist on the nodes

2014-08-13 Thread Adrien Grand
It seems that this particular node is complaining about too many open
files. This usually happens if you have very low limits on your operating
system and/or if you have many shards on a single node. When this happens,
things degrade pretty badly as there is no way to open new files anymore.

Please see
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html#file-descriptors
for more information.
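A quick sketch of how to check both sides of this (the limit values and paths are
examples only):

  # limit of the shell/user running elasticsearch
  ulimit -n

  # what each node actually got (look for max_file_descriptors)
  curl -s 'http://localhost:9200/_nodes/process?pretty'

  # raise it persistently, e.g. in /etc/security/limits.conf
  elasticsearch  -  nofile  65535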


On Wed, Aug 13, 2014 at 9:23 PM, José Andrés japmycr...@gmail.com wrote:

 Can someone elaborate as to why the exception below is thrown?

 [2014-08-11 11:27:41,189][WARN ][cluster.action.shard ] [mycluster]
 [all][4] received shard failed for [all][4], node[MLNWasasasWi2V58hRA93mg],
 [P], s[INITIALIZING], indexUUID [KSScFASsasas6yTEcK7HqlmA], reason [Failed
 to start shard, message [IndexShardGatewayRecoveryException[[all][4] failed
 recovery]; nested: EngineCreationFailureException[[all][4] failed to open
 reader on writer]; nested:
 FileSystemException[/var/lib/elasticsearch/mycluster/nodes/0/indices/all/4/index/_
 m4bz_es090_0.tim: Too many open files]; ]] 2:11 /var/lib/elasticsearch
 doesn't exist on the nodes

 Thank you very much for your feedback.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/16968aac-dbfe-44c4-b3fe-d25af0dfbe37%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/16968aac-dbfe-44c4-b3fe-d25af0dfbe37%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
Adrien Grand

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j5n0x_zBEYEP1B3a1uf0vCFGoCphZw%3DCZwJF7nuF24ERQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


size = 0 w/ query_then_fetch vs count search_type

2014-08-13 Thread Andrew Ochsner
Hi:

For some reason I remember someone saying (it might have been during a 
training or a git issue) that Elasticsearch would optimize a query_then_fetch 
search with size=0 into a count search.  But now I can't find it anywhere on 
the interwebs or in a quick search through the code.   Does anyone know if this 
is the case, or if it really is more important to specify the count search_type 
rather than size=0? (In this instance, it's in the context of wanting to do 
an aggregation and I don't care about the hits.)
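For reference, the two variants being compared would look roughly like this 
(the index name and aggregation are made up for the example):

  curl -XGET 'http://localhost:9200/myindex/_search?search_type=count' -d '{
    "aggs": { "by_type": { "terms": { "field": "type" } } }
  }'

  curl -XGET 'http://localhost:9200/myindex/_search' -d '{
    "size": 0,
    "aggs": { "by_type": { "terms": { "field": "type" } } }
  }'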

Thanks in advance
Andy O

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b349267e-c01b-4fcb-9b2d-c80a14e5015f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Best URL for load balancer HTTP health check

2014-08-13 Thread Matt Hughes
With a standard LB in front of an N-node cluster, what's the best URL in 
the ES API to check the health of a particular node (so as to know when to 
remove it, at least temporarily)?

There is the node info API:

curl -XGET 'http://localhost:9200/_nodes'

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-info.html

But that's returning node info for all the nodes in the cluster that 
localhost is a member of.  I want something like /_nodes/_current.  Or 
maybe the fact that a particular node can even process any GET request 
(e.g., http://localhost:9200/_cluster/health) signifies that it is 
responsive enough and can take requests?  

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/73d376f4-5aee-420b-8a5e-dd2a7deb632c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Best URL for load balancer HTTP health check

2014-08-13 Thread Mark Walkom
If you just curl http://IP:9200 you will get a response; if it's not a 200
then chances are it's not part of the cluster and something is wrong.
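A minimal sketch of what the LB health check could call (IP and port are
placeholders); anything other than a 200 would mark the node as down:

  curl -s -o /dev/null -w '%{http_code}\n' 'http://10.0.0.1:9200/'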

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 14 August 2014 07:35, Matt Hughes hughes.m...@gmail.com wrote:

 With a standard LB in front of an N-node cluster, what's the best URL in
 the ES API to check the health of a particular node (so as to know to
 remove it at least temporarily).

 There is the node info API:

 curl -XGET 'http://localhost:9200/_nodes'


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-info.html

 But that's returning node info for all the nodes in the cluster that
 localhost is a member of.  I want something like /_nodes/_current.  Or
 maybe the fact that a particular node can even process any GET request
 (e.g., http://localhost:9200/_cluster/health) signifies that it is
 responsive enough and can take requests?

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/73d376f4-5aee-420b-8a5e-dd2a7deb632c%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/73d376f4-5aee-420b-8a5e-dd2a7deb632c%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624Y%3D%3DD0jQfwTjgCh7o69OPmmUrFZ0kLioceuOQ0siSj%3D4g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Get Shard Info From Cluster/Nodes/Index

2014-08-13 Thread Mark Walkom
The _cat/shards API will tell you this -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cat-shards.html
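For example (the index name is just an illustration):

  curl -XGET 'http://localhost:9200/_cat/shards?v'
  curl -XGET 'http://localhost:9200/_cat/shards/myindex?v'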

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 14 August 2014 03:19, 'Sandeep Ramesh Khanzode' via elasticsearch 
elasticsearch@googlegroups.com wrote:

 Hi,

 Can I, using the Cluster API or some other Java way, find the shards that are
 allocated to a cluster - node - index?

 I would like to check which shards are deployed to a physical node, and
 query only that shard to find what was indexed on it.

 I would be using a _routing value, and using this query, I want to check
 which routing value went to which shard.

 Thanks,
 Sandeep

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/808f15d3-f552-4fb9-a404-8645faaded79%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/808f15d3-f552-4fb9-a404-8645faaded79%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624ZQcMFxNTDN6jTRp4REXRsjTCQgT4ZVkJ%3Dvug%2B9r3K5kA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: impact of stored fields on performance

2014-08-13 Thread Ashish Mishra
That sounds possible.  We are using spindle disks.  I have ~36Gb free for 
the filesystem cache, and the previous data size (without the added field) 
was 60-65Gb per node.  So it's likely that 50% of queries were previously 
addressed out of the FS cache, even more if queries are unevenly 
distributed.
Data size is now 200Gb/node.  So only ~18% of queries could hit the cache 
and the rest would incur seek times.

Hmm... given this knowledge, is there a way to mitigate the effect without 
moving everything to SSD?  Only a minority of queries return the stored 
field and it is not indexed.  Ideally, it would be stored in separate 
(colocated) files from the indexed fields.  That way, most queries would be 
unaffected and only those returning the value incur the seek cost.

I imagine indexes with _source enabled would see similar effects.

Is a parent-child relationship a good way to achieve the scenario above? 
 The parent can contain indexed fields and the child has stored fields.
Not sure if this just introduces new problems.


On Wednesday, August 13, 2014 1:16:49 AM UTC-7, Adrien Grand wrote:

 OK, so quite large pages. Another question would be how much memory you 
 have on each node, how much is given to elasticsearch (ES_HEAP_SIZE) and 
 how large is the data/ directory.

 For example if you used to have ${ES_HEAP_SIZE} + ${size of data} < 
 ${total memory of the machine}, it meant that your whole index could fit in 
 the filesystem cache, which is very fast (that would explain why you got 
 such good response times of 40-60ms in spite of having a size of 200). But 
 if it is greater now, it would mean that the disk needs to often perform 
 actual seeks (on magnetic storage, that would be around 5 to 10ms per seek) 
 which can highly degrade the latency.


 On Tue, Aug 12, 2014 at 11:33 PM, Ashish Mishra laughin...@gmail.com 
 javascript: wrote:

 The query size parameter is 200.  
 Actual hit totals vary widely, generally around 1000-1.  A minority 
 are much lower.  About 10% of queries end up with just 1 or 0 hits.



 On Tuesday, August 12, 2014 6:31:29 AM UTC-7, Adrien Grand wrote:

 Hi Ashish,

 How many documents do your queries typically retrieve? (the value of the 
 `size` parameter)


 On Tue, Aug 12, 2014 at 12:48 AM, Ashish Mishra laughin...@gmail.com 
 wrote:

 I recently added a binary type field to all documents with mapping 
 store: true.  The field contents are large and as a result the on-disk 
 index size rose by 3x, from 2.5Gb/shard to ~8Gb/shard.  

 After this change I've seen a big jump in query latency.  Searches 
 which previously took 40-60ms now take 800ms and longer.  This is the case 
 even for queries which *don't* return the binary field.
 I tried optimizing the index down to max_num_segments=1, but query 
 latency remains high.

 Is this expected?  Obviously queries returning the new field will take 
 a hit (since field data needs to be loaded from disk).  But I would've 
 expected other queries should not be much affected.

 Is the problem that larger file sizes make memory-mapping and the FS 
 cache less efficient?  Or are stored fields still getting loaded from disk 
 even when not included in the fields term?
  
 -- 
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/6ef50cab-3004-490b-bc2d-ea7e71a824a5%
 40googlegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/6ef50cab-3004-490b-bc2d-ea7e71a824a5%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




 -- 
 Adrien Grand
  
  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/1105d739-114e-4047-994e-aba8e27066b3%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/1105d739-114e-4047-994e-aba8e27066b3%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




 -- 
 Adrien Grand
  

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ccb6caca-bee2-4628-8dda-7edad4db4a4c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Moving Index/Shards from One Node to Another

2014-08-13 Thread Mark Walkom
Mappings aren't saved to a specific node, so you don't need to worry about
that.
You may be able to physically copy the data across and have it recover, but
you should test this.

Without more information I'd question why you want to do this. ES handles
allocation itself perfectly fine and not running replicas is risky,
especially if you are forcing the entire index onto one instance.
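If the pinning was done with per-index allocation filters (as the original
question suggests), a sketch of that setting, which can later be changed to
let ES move the shards around itself rather than copying directories by hand
(index name and zone value are placeholders):

  curl -XPUT 'http://localhost:9200/index2/_settings' -d '{
    "index.routing.allocation.include.zone": "zone2"
  }'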

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 13 August 2014 23:08, 'Sandeep Ramesh Khanzode' via elasticsearch 
elasticsearch@googlegroups.com wrote:

 Hi,

 Let's say I have a 3-node cluster and I deploy one index SPECIFIC to every
 data node in the cluster.
 So, index1 goes to node1, index2 goes to node2, etc. using the
 routing.allocation settings based on node.zone etc. config properties.
 There may be 5-6 shards per index, but no replicas. All three indices,
 index 1/2/3 will have the same mapping schemas.

 Now, if the following scenarios occur:

 1.] One node, node2, goes down:
 How can I get node 2's index, index2, live on the other two data nodes?
 Can I just copy the data directory to the other nodes? Since there is no
 mapping like index2 defined on those nodes, will I have to first create the
 mapping there?
 Can I move half the shards to each remaining node?

 2.] Assume one more node is now added to this cluster:
 Can I copy the mapping schema to the new node and selectively copy 1-2
 shards each from the existing 3 data nodes so that I can rebalance the
 cluster 3-4 shards per index per node?

 I am not sure if there is this level of control and how it is exposed.
 Please let me know. Thanks,

 Thanks,
 Sandeep

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/9a5eca15-401d-4aae-ad62-ef39a78d863f%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/9a5eca15-401d-4aae-ad62-ef39a78d863f%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YNKYL3x4tm_9QJBVR4iHw4rW50tYLpMqAoX%2BOeSS63zg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Shard data inconsistencies

2014-08-13 Thread Paul Smith
Hi Aaron, late to this party for sure, sorry.  I feel your pain, this is
happening for us, and I've seen reports of it occurring across versions,
but with very little information to go on I don't think progress has been
made.  I actually don't think there's an issue raised for it.  Perhaps that
should be a first step.

We call this problem a Flappy Item because the item appears and disappears
in search results depending on whether the search hits the primary or the
replica shard.  It flaps back and forth.

The only way to repair the problem is to rebuild the replica shard.  You
can disable _all_ replicas and then re-enable them, and the primary shard
will be used as the source and it will work.  That's if you can live with
the lack of redundancy for that length of time.
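A rough sketch of that rebuild (the index name is a placeholder): drop the
replicas, wait for the cluster to settle, then add them back so they are
rebuilt from the primaries.

  curl -XPUT 'http://localhost:9200/myindex/_settings' -d '{ "index.number_of_replicas": 0 }'
  curl -XPUT 'http://localhost:9200/myindex/_settings' -d '{ "index.number_of_replicas": 1 }'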

Alternatively we have found that issuing a Move command to relocate the
replica shard off the current host and on to another, also causes ES to
generate a new replica shard using the primary as the source, and that
corrects the problem.
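A sketch of such a move command (index, shard number and node names are
placeholders):

  curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
    "commands": [ { "move": { "index": "myindex", "shard": 3, "from_node": "nodeA", "to_node": "nodeB" } } ]
  }'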

A caveat we've found with this approach, at least with the old version of ES
we're sadly still using (0.19... hmm), is that after the move the cluster will
likely want to rebalance, and the shard allocation after rebalance can from
time to time put the replica back where it was.  ES on that original node
then goes Oh look, here's the same shard I had earlier, let's use that.
Which means you're back to square one.  You can force _all_ replica
shards to move by coming up with a move command that shuffles them around,
and that definitely does work, but obviously takes longer for large
clusters.

In terms of tooling around this, I offer you these:

Scrutineer - https://github.com/Aconex/scrutineer - can detect differences
between your source of truth (db?) and your index (ES).  This does pick up
the case where the replica is reporting an item that should have been
deleted.

Flappy Item Detector - https://github.com/Aconex/es-flappyitem-detector -
given a set of suspect IDs can check the primary vs replica to confirm/deny
it being one of these cases.  There is also support to issue basic move
commands with some simple logic to attempt to rebuild that replica.

Hope that helps.

cheers,

Paul Smith


On 8 August 2014 01:14, aaron atdi...@gmail.com wrote:

 I've noticed on a few of my clusters that some shard replicas will be
 perpetually inconsistent w/ other shards. Even when all of my writes are
 successful and use write_consistency = ALL and replication = SYNC.

 A GET by id will return 404/missing for one replica but return the
 document for the other two replicas. Even after refresh, the shard is never
 repaired.

 Using ES 0.90.7.

 Is this a known defect? Is there a means to detect, prevent, or at least
 detect  repair when this occurs?

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/164c3362-1ed4-4e90-8bb6-283543a20cf9%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/164c3362-1ed4-4e90-8bb6-283543a20cf9%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHfYWB7XhaB-ZkJqE8%3DfNu%2BZdNzGC%2BPx%3Dv61OGTzQABbHNZfSg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


how to avoid memory swap on windows server?

2014-08-13 Thread Andrew Gui
From elasticsearch's website
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html),
it recommends disabling memory swapping for better performance. But the 
approaches are for Linux only. We are planning to run elasticsearch on 
Windows servers. Can we simply disable the page file on Windows? Any tips on 
how to do this?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/392e286b-7881-448e-828b-1f33091b2681%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


where to set indices.fielddata.cache.size and indices.fielddata.cache.expire

2014-08-13 Thread panfei
I have a server with 48GB memory and set ES_HEAP_SIZE to 32GB.

1. In this case I want to know where to set indices.fielddata.cache.size
and indices.fielddata.cache.expire, and what the proper values for these two
settings are in my case.

currently I set them in the elasticsearch.yml:

indices.fielddata.cache.size: 16GB
indices.fielddata.cache.expire: 1m

2. I also noted another two settings, indices.cache.filter.size and
indices.cache.filter.expire, and I want to know the difference between
them and their default values.

Thanks.

-- 
不学习,不知道

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BJstLBKWDx8mhWinhr31%2BX%3DXT5gWGnKep%2BG1h%3D9J8gF0QN8JA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: where to set indices.fielddata.cache.size and indices.fielddata.cache.expire

2014-08-13 Thread panfei
PS: I am using Elasticsearch 1.2.1


2014-08-14 9:09 GMT+08:00 panfei cnwe...@gmail.com:

 I have a server with 48GB memory and set ES_HEAP_SIZE to 32GB.

 1. In this case I want to know, where to set indices.fielddata.cache.size
 and indices.fielddata.cache.expire and what is the proper value for the two
 configurations in my case.

 currently I set them in the elasticsearch.yml:

 indices.fielddata.cache.size: 16GB
 indices.fielddata.cache.expire: 1m

 2. I also noted another two settings: indices.cache.filter.size and
 indices.cache.filter.expire, I want to know what's the difference between
 them and the default value of them.

 Thanks.

 --
 不学习,不知道




-- 
不学习,不知道

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BJstLCQbJuCjC9DDu8bQPr2YaRZeCg%2BF5ChnfjyXaDNrW6LKg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Automatic partial shutdown of cluster

2014-08-13 Thread Krishnan
Hi All,

I have a cluster of one master-only node and 7 data-only nodes. Some nodes 
are getting shut down automatically. Could you please advise what the 
possible cause of this behaviour is, or how I can investigate the issue?

For your perusal, the log entries are below:

Data node logs :

[2014-08-13 17:10:22,882][INFO ][action.admin.cluster.node.shutdown] 
[search7] shutting down in [200ms]
[2014-08-13 17:10:23,102][INFO ][action.admin.cluster.node.shutdown] 
[search7] initiating requested shutdown...
[2014-08-13 17:10:23,106][INFO ][node ] [search7] 
stopping ...
[2014-08-13 17:10:52,290][INFO ][node ] [search7] 
stopped
[2014-08-13 17:10:52,290][INFO ][node ] [search7] 
closing ...
[2014-08-13 17:10:52,323][INFO ][node ] [search7] closed

Server node logs :
[2014-08-13 17:10:21,879][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: requested, shutting down 
[[QowbqobdTyqrMZ00EVSvbw]] in [1s]
[2014-08-13 17:10:21,991][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: requested, shutting down 
[[hRJliGmSSTuXX1Kmmjz94Q]] in [1s]
[2014-08-13 17:10:22,206][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: requested, shutting down 
[[dqnFXTV9RoKeJpSy7zNqPA]] in [1s]
[2014-08-13 17:10:22,901][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: done shutting down 
[[QowbqobdTyqrMZ00EVSvbw]]
[2014-08-13 17:10:22,999][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: done shutting down 
[[hRJliGmSSTuXX1Kmmjz94Q]]
[2014-08-13 17:10:23,290][INFO ][action.admin.cluster.node.shutdown] 
[search1] [partial_cluster_shutdown]: done shutting down 
[[dqnFXTV9RoKeJpSy7zNqPA]]

Thank You,
Krishna

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1cf8f1ca-6388-4ffa-b4aa-805fc4254b8e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Automatic partial shutdown of cluster

2014-08-13 Thread Mark Walkom
Something is shutting your node down using the API.

You might want to look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-shutdown.html#_disable_shutdown
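If nothing legitimate should be calling that API, the linked page also
describes disabling it; a sketch of the elasticsearch.yml setting:

  action.disable_shutdown: true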

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 14 August 2014 14:23, Krishnan krishna.d...@gmail.com wrote:

 Hi All,

 I have cluster of one master only nodes and 7 data only nodes. Some nodes
 are getting shutdown automatically. Could you please advise what is the
 possible cause of this behaviour or how do I investigate the issue.

 For you perusal the log entry is below :

 Data node logs :

 [2014-08-13 17:10:22,882][INFO ][action.admin.cluster.node.shutdown]
 [search7] shutting down in [200ms]
 [2014-08-13 17:10:23,102][INFO ][action.admin.cluster.node.shutdown]
 [search7] initiating requested shutdown...
 [2014-08-13 17:10:23,106][INFO ][node ] [search7]
 stopping ...
 [2014-08-13 17:10:52,290][INFO ][node ] [search7]
 stopped
 [2014-08-13 17:10:52,290][INFO ][node ] [search7]
 closing ...
 [2014-08-13 17:10:52,323][INFO ][node ] [search7]
 closed

 Server node logs :
 [2014-08-13 17:10:21,879][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: requested, shutting down
 [[QowbqobdTyqrMZ00EVSvbw]] in [1s]
 [2014-08-13 17:10:21,991][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: requested, shutting down
 [[hRJliGmSSTuXX1Kmmjz94Q]] in [1s]
 [2014-08-13 17:10:22,206][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: requested, shutting down
 [[dqnFXTV9RoKeJpSy7zNqPA]] in [1s]
 [2014-08-13 17:10:22,901][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: done shutting down
 [[QowbqobdTyqrMZ00EVSvbw]]
 [2014-08-13 17:10:22,999][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: done shutting down
 [[hRJliGmSSTuXX1Kmmjz94Q]]
 [2014-08-13 17:10:23,290][INFO ][action.admin.cluster.node.shutdown]
 [search1] [partial_cluster_shutdown]: done shutting down
 [[dqnFXTV9RoKeJpSy7zNqPA]]

 Thank You,
 Krishna

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/1cf8f1ca-6388-4ffa-b4aa-805fc4254b8e%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/1cf8f1ca-6388-4ffa-b4aa-805fc4254b8e%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624beY-NaK15DqeBDM3N5Ayfe%2BJZMTpBdkG3_Z6NNcjAkdQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Timeout notification from cluster service

2014-08-13 Thread Artur S. (Digit)
I see exactly the same in my logs:
[2014-08-14 04:57:43,913][DEBUG][action.index ] [data-es-001] 
observer: timeout notification from cluster service. timeout setting [1m], 
time since start [1m]
[2014-08-14 04:57:56,869][DEBUG][action.index ] [data-es-001] 
observer timed out. notifying listener. timeout setting [1m], time since 
start [1m]


Those lines have kept appearing since we upgraded to 1.3.1. Also, since the 
upgrade last week our cluster has died 3 times already. I'm not sure if that's 
related...

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1c76479d-6c28-4d73-bab2-99e9a450f79c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch query on array of documents

2014-08-13 Thread K.Samanth Kumar Reddy
Hi,

I have been using elasticsearch for the last 6 months. It is an awesome framework. 
I am unable to search on an array of documents.

Example indexed documents:

{
  "cust_name": "samanth",
  "address": [
    { "address1": "india hyderabad" },
    { "address1": "india delhi" },
    { "address1": "india chennai" },
    { "address1": "india bangalore" }
  ]
}

{
  "cust_name": "Rama",
  "address": [
    { "address1": "india Kolkata" },
    { "address1": "india bangalore" }
  ]
}

I want to search on the address1 field.

Can anybody help me on this?
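(For illustration only, a minimal sketch of such a search, assuming the
address objects use the default non-nested object mapping and that the
index/type names are placeholders:

  curl -XGET 'http://localhost:9200/customers/customer/_search' -d '{
    "query": { "match": { "address.address1": "bangalore" } }
  }'
)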

Thanks & Regards,
Samanth

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d3ffa90d-f49c-43dc-b128-d9c38a13a6f0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.