Re: howto: food for dogs == dogfood

2015-04-28 Thread Itamar Syn-Hershko
Synonyms
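
A minimal sketch of what a synonym-based setup could look like, assuming the `synonym` token filter is applied through a custom analyzer at both index and search time. The filter name `product_synonyms` and the analyzer name `product_text` are hypothetical, and the rule below is only an illustration of mapping the phrase onto the indexed term; verify the behaviour against your own Elasticsearch version before relying on it.

```python
import json


def build_synonym_settings(rules):
    """Return index settings with a synonym filter and an analyzer that uses it.

    The names "product_synonyms" and "product_text" are made up for this sketch.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "product_synonyms": {
                        "type": "synonym",
                        "synonyms": list(rules),
                    }
                },
                "analyzer": {
                    "product_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so the rules match regardless of case.
                        "filter": ["lowercase", "product_synonyms"],
                    }
                },
            }
        }
    }


# Map the multi-word phrase users type onto the term that was indexed.
settings = build_synonym_settings(["food for dogs => dogfood"])
print(json.dumps(settings, indent=2))
```

The resulting JSON would be sent as the body when creating the index, and the analyzer then referenced from the field mapping.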

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Apr 28, 2015 at 5:33 PM, Maarten Roosendaal mroosendaa...@gmail.com
 wrote:

 Hi,

 We have users typing things like "food for dogs", and we've indexed the data
 as "dogfood". What is the best strategy to get a match with
 Elasticsearch's filters and/or analyzers?

 Thanks,
 Maarten

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/c35ceba0-f5af-47f2-821f-384e4b3272bf%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/c35ceba0-f5af-47f2-821f-384e4b3272bf%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: What is the correct _primary_first syntax? What is the relevant debug logger ?

2015-04-28 Thread Itamar Syn-Hershko
Use the preference parameter: ?preference=_primary_first

See
http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html

There is no verbose mode at the moment.
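
As a small illustration, the request URL can be assembled like this. It is only a sketch: the host `http://localhost:9200` is an assumption, `myindex` is taken from the question, and `search_template_url` is a hypothetical helper, not part of any client library.

```python
from urllib.parse import urlencode


def search_template_url(base_url, index, preference="_primary_first"):
    """Build a search-template URL that passes `preference` as a query parameter."""
    query = urlencode({"preference": preference})
    return f"{base_url}/{index}/_search/template?{query}"


# The host is an assumption; the index name comes from the question.
print(search_template_url("http://localhost:9200", "myindex"))
```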


On Mon, Apr 27, 2015 at 8:53 AM, Itai Frenkel itaifren...@live.com wrote:

 Hello,

 What is the correct syntax of using _primary_first in search and search
 template queries?

 GET myindex/_search/template?preference=_primary_first

 or

 GET myindex/_search/template?routing=_primary_first

 Is there any verbose mode that can log the list of shards that were
 actually accessed?

 thanks,
 Itai



Re: inner_hits and highlighting

2015-04-28 Thread Itamar Syn-Hershko
I think I've heard the ES team discourage extensive use of this
aggregation type, mainly because it is highly expensive. Adding
highlighting support to it will more than double its cost, and I'd
personally vote against it.


On Tue, Apr 28, 2015 at 8:17 PM, Nikolas Everett nik9...@gmail.com wrote:

 If it's not in the issues, it's unlikely that it's planned. If it isn't
 planned, I think filing an issue is a good thing - just be super clear about
 what you want to do, with examples in curl/gist form. If it is planned, maybe
 add your proposed usage to the issue.

 Nik

 On Tue, Apr 28, 2015 at 11:26 AM, Ian Battersby ian.batter...@gmail.com
 wrote:

 I've been playing with the new *experimental* inner_hits functionality
 released in 1.5.0, mainly with child/parent related documents. It seems to
 work really well, but I notice that highlighting doesn't seem to be supported
 on content/fields within inner_hits; a quick scan of the code base seems to
 confirm this. Does anyone know if this is already under consideration for a
 future release?

 Thanks, Ian.



Re: Using serialized doc_value instead of _source to improve read latency

2015-04-20 Thread Itamar Syn-Hershko
This is how _source works. doc_values don't make sense in this regard -
what you are looking for is a stored field, with the transform
script writing to it. Loading stored fields (even one field per hit) may be
slower than loading and parsing _source, though.

I'd just put this logic in the indexer, though. It will definitely help
with other things as well, such as nasty huge mappings.

Alternatively, find a way to avoid IO completely. How about using ES for
search and something like Riak for loading the actual data, if IO costs are
so noticeable?


On Mon, Apr 20, 2015 at 11:18 PM, Itai Frenkel itaifren...@live.com wrote:

 Hi,

 We are having a performance problem in which, for each hit, Elasticsearch
 parses the entire _source and then generates a new JSON document with only the
 requested _source fields. To overcome this issue we would like to use a
 mapping transform script that serializes the requested query fields (which
 are known in advance) into a doc_value. Does that make sense?

 The actual problem with the transform script is a SecurityException that
 does not allow using any JSON serialization mechanism. A binary
 serialization would also be OK.


 Itai



Re: Using serialized doc_value instead of _source to improve read latency

2015-04-20 Thread Itamar Syn-Hershko
What if all those fields are collapsed into one, as you suggest, but that
one field is projected out of _source (think non-indexed JSON in a string
field)? Do you see a noticeable performance gain then?

What if that field is set to be stored (and loaded using fields, not via
_source)? What is the performance gain then?
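
To make the stored-field variant concrete, here is a rough sketch using the 1.x-era syntax that was current when this thread was written (a `string` field with `"index": "no"` and `"store": true`, requested through `fields`). The field name `result_json` and the type name `business` are hypothetical, and newer Elasticsearch releases use different type names and request keys, so treat this as an illustration only.

```python
import json

FIELD = "result_json"  # hypothetical field holding the pre-serialized result


def stored_field_mapping(doc_type, field=FIELD):
    """Mapping for a field that is stored but not indexed (1.x-era syntax)."""
    return {
        doc_type: {
            "properties": {
                field: {"type": "string", "index": "no", "store": True}
            }
        }
    }


def search_with_stored_field(query, field=FIELD):
    """Search body that asks for the stored field instead of _source."""
    return {"query": query, "fields": [field]}


print(json.dumps(stored_field_mapping("business"), indent=2))
print(json.dumps(search_with_stored_field({"match_all": {}}), indent=2))
```

Timing this against plain _source filtering on real data is the only way to tell whether it actually helps.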

Fielddata, and the doc_values optimization on top of it, will not help you
here: those data structures aren't used for sending data out, only
for aggregations and sorting. Also, using fielddata will require indexing
those fields; it is apparent that you are not looking to do that.


On Tue, Apr 21, 2015 at 12:14 AM, Itai Frenkel itaifren...@live.com wrote:

 Itamar,

 1. The _source field includes many fields that are only being indexed, and
 many fields that are only needed in the search result. _source includes
 them both. Projecting the requested fields out of _source is too CPU
 intensive to do at search time for each result, especially if the size
 is big.
 2. I agree that adding another NoSQL store could solve this problem; however,
 it is currently out of scope, as it would require syncing data with another
 data store.
 3. Wouldn't a big stored field bloat the Lucene index size? Even if
 not, aren't not_analyzed fields destined to be (or already)
 doc_values?




Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-15 Thread Itamar Syn-Hershko
I believe the biggest impact would be on responsiveness.

As another long-timer, and someone who often responds to questions, I will
probably cease to do that if the forum moves to Discourse, simply
because it lacks push-style notifications on new questions. Right now,
if a question title in my inbox catches my eye I'll quickly read it and
respond. I'm quite sure this pattern (which I'm sure I'm not the only one
relying on) will go away once you move to Discourse, and the forum's
responsiveness with it.

Just my 2 cents.


On Mon, Apr 13, 2015 at 8:13 PM, Leslie Hawthorn leslie.hawth...@elastic.co
 wrote:

 Thanks for your feedback, Ivan.

 There's no plan to remove threads from the forums, so information would
 always be archived there as well.

 Does that impact your thoughts on moving to Discourse?

 Folks, please keep the feedback coming!

 Cheers,
 LH

 On Sat, Apr 11, 2015 at 12:09 AM, Ivan Brusic i...@brusic.com wrote:

 As one of the oldest and most frequent users (before my sabbatical) of
 the mailing list, I just wanted to say that I never had an issue with it.
 It works. As long as I could continue using only email, I am happy.

 For realtime communication, there is the IRC channel. I prefer the
 mailing list since everything is archived.

 Ivan
  On Apr 2, 2015 5:36 PM, leslie.hawthorn leslie.hawth...@elastic.co
 wrote:

 Hello everyone,

 As we’ve begun to scale up development on three different open source
 projects, we’ve found Google Groups to be a difficult solution for dealing
 with all of our needs for community support. We’ve got multiple mailing
 lists going, which can be confusing for new folks trying to figure out
 where to go to ask a question.

 We’ve also found our lists are becoming noisy in the “good problem to
 have” kind of way. As we’ve seen more user adoption, and across such a wide
 variety of use cases, we’re getting widely different types of questions
 asked. For example, I can imagine that folks not using our Python client
 would rather not be distracted with emails about it.

 There are also a few other strikes against Groups as a tool, such as the
 fact that it is no longer a supported product by Google, it provides no API
 hooks and it is not available for users in China.

 We’ve evaluated several options and we’re currently considering
 shuttering the elasticsearch-user and logstash-users Google Groups in favor
 of a Discourse forum. You can read more about Discourse at
 http://www.discourse.org

 We feel Discourse will allow us to provide a better experience for all
 of our users for a few reasons:

 * More fine grained conversation topics = less noise and better targeted
 discussions. e.g. we can offer a forum for each language client, individual
 logstash plugin or for each city to plan user group meetings, etc.

 * Facilitates discussions that are not generally happening on list now,
 such as best practices by use case or tips for moving from development to
 production

 * Easier for folks who are purely end users - and less used to getting
 peer support on a mailing list - to get help when they need it

 Obviously, Discourse does not function the exact same way as a mailing
 list - however, email interaction with Discourse is supported and will
 continue to allow you to participate in discussions over email (though
 there are some small issues related to in-line replies. [0])

 We’re working with the Discourse team now as part of evaluating this
 transition, and we know they’re working to resolve this particular issue.
 We’re also still determining how Discourse will handle our needs for both
 user and list archive migration, and we’ll know the precise details of how
 that would work soon. (We’ll share when we have them.)

 The final goal would be to move Google Groups to read-only archives, and
 cut over to Discourse completely for community support discussions.

 We’re looking at making the cut over in ~30 days from today, but
 obviously that’s subject to the feedback we receive from all of you. We’re
 sharing this information to set expectations about time frame for making
 the switch. It’s not set in stone. Our highest priority is to ensure
 effective migration of our list archives and subscribers, which may mean a
 longer time horizon for deploying Discourse, as well.

 In the meantime, though, we wanted to communicate early and often and
 get your feedback. Would this change make your life better? Worse? Meh?

 Please share your thoughts with us so we can evaluate your feedback. We
 don’t take this switch lightly, and we want to understand how it will
 impact your overall workflow and experience.

 We’ll make regular updates to the list responding to incoming feedback
 and be completely transparent about how our thought processes evolve based
 on it.

 Thanks in advance!

 [0] - https

Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-15 Thread Itamar Syn-Hershko
Fair play, I will check that out. I assume you can reply to that email to
respond?


On Wed, Apr 15, 2015 at 8:53 PM, Glen Smith g...@smithsrock.com wrote:

 * it lacks the push style notifications on new questions*


 https://lh3.googleusercontent.com/-enGiohVrmdk/VS6lNPnQMbI/A70/fc-SzjQxRlk/s1600/Screen%2BShot%2B2015-04-15%2Bat%2B1.49.18%2BPM.png
 That doesn't seem to be correct to me. Does "send me an email for every
 new post" not cover what you want?





Re: Should I use elasticsearch as a core for faceted navigation-heavy website?

2015-03-24 Thread Itamar Syn-Hershko
Short answer: yes. With a properly sharded and scaled-out environment, and
ES 1.4 or newer, you should be able to get those numbers.
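
For illustration, a faceted request of the kind described in the question might be built as below: exact-match filters for the facets already selected, plus one terms aggregation per facet to list the remaining options. This is a sketch only, using the 1.x-era `filtered` query; the field names `city_id`, `business_type` and `services` are hypothetical, and the sizes are arbitrary.

```python
import json


def faceted_search(selected, facet_fields, size=20, facet_size=50):
    """Build a search body with term filters and one terms aggregation per facet."""
    must = [{"term": {field: value}} for field, value in sorted(selected.items())]
    if must:
        # Filters are cacheable and skip scoring, which suits exact-match facets.
        query = {"filtered": {"filter": {"bool": {"must": must}}}}
    else:
        query = {"match_all": {}}
    return {
        "size": size,
        "query": query,
        "aggs": {
            field: {"terms": {"field": field, "size": facet_size}}
            for field in facet_fields
        },
    }


body = faceted_search(
    {"city_id": 1234},                        # hypothetical selected facet
    ["business_type", "services"],            # hypothetical facets to count
)
print(json.dumps(body, indent=2))
```

Measuring this request shape against a realistic data set is the only reliable way to confirm the latency target.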


On Tue, Mar 24, 2015 at 5:38 PM, Dmitry dmitry.bit...@gmail.com wrote:

 Hello,

 I'm evaluating Elasticsearch for use in a new project. I can see that it has
 all the search features we need. The problem is that after reading the
 documentation and forum I still can't tell whether Elasticsearch is a suitable
 technology for us performance-wise. I'd be very grateful for your opinion on that.


 We're building a directory of businesses, similar to Yelp. We have 5m
 businesses, and the main feature of our site is faceted search on different
 facets: geography (tens of thousands of geo objects), business type (several
 thousand options), additional services offered by the business (hundreds)
 and so on. So for each request (combination of search parameters) we need
 to get the search results, but also the options available in each facet
 (for example, which business types are located in the selected geography)
 so the user can narrow down the search (example:
 http://take.ms/oAZan). Full-text search (by business name, for example) is
 used in a very small percentage of requests; the bulk of requests are exact
 matches on one or several facets.

 Based on our similar project we expect 1-5m requests per day. Requests
 are highly diversified: no single page (combination of search
 params) constitutes more than 0.1% of total requests. We expect to
 answer a request in 200-300ms, so I guess a request to Elasticsearch should
 take no more than 100ms.

 On our similar project we use a big lookup table in the database with all
 possible combinations of params mapped to a search result count. For each
 request we generate all possible combinations of parameters that refine the
 current search and then check the lookup table to see if they have any results.


 My questions are:

 Is Elasticsearch suitable for our purposes? Specifically, are
 aggregations meant to be used in a large number of low-latency requests, or
 are they more of an analytical feature, where response time is not that
 important? I ask because in discussions of aggregation and faceting
 performance here and elsewhere, response times in the 1-10s range are mentioned,
 which is OK for analytics and infrequent searches, but obviously not OK for
 us.

 How hard is it to get the performance we need (50 rps, 100ms response time for
 search+facets) on reasonable hardware, taking into account the large number
 of possible facet combinations and the high diversification of requests? What
 kind of hardware should we expect to need for our load? I understand that
 these are vague questions, but I just need an approximation. Is it more
 like 1 commodity server with a simple configuration, or more
 like a cloud of 10 servers and extensive tuning? For example, our lookup
 table solution works on 1 commodity server with 16GB of RAM with an almost
 default setup.


 Thank you for your responses,
 Dmitry



Re: Supported Operating System

2015-03-21 Thread Itamar Syn-Hershko
I'd recommend using a Linux-based system, and not running it in a VM, for
various reasons relating to resource management.


On Sun, Mar 22, 2015 at 1:20 AM, Gil Peleg gilpele...@gmail.com wrote:

 Hey,

 I was wondering if there are any guidelines as to the preferred OS for ES? Are
 there any that are not supported?
 I currently run Windows Server 2008 R2 on a project I am working on and
 was wondering if there are any issues with using ES on it.
 Assuming Linux is the preferred option, would it
 be better to run it in a virtual machine on top of my current Windows OS? Would
 that give better performance than installing it directly on the Windows
 Server?

 Thanks,
 Gil



Re: Elasticsearch - Not require an exact match

2015-03-19 Thread Itamar Syn-Hershko
You should use this then:
http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html
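
A rough sketch of how that filter could be wired into an analyzer, together with a tiny stand-in that mimics only its letter/digit splitting so the effect on "testing123" is visible. The names `split_words` and `title_text` are hypothetical, and `approximate_split` is a simplification written for this example, not the filter's real implementation.

```python
import json
import re


def analysis_settings():
    """Index settings with a word_delimiter filter (names are hypothetical)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "split_words": {
                        "type": "word_delimiter",
                        # Keep the original token so exact searches still match.
                        "preserve_original": True,
                    }
                },
                "analyzer": {
                    "title_text": {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["split_words", "lowercase"],
                    }
                },
            }
        }
    }


def approximate_split(token):
    """Mimic letter/digit splitting only; the real filter handles many more cases."""
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    return [token] + [part for part in parts if part != token]


print(json.dumps(analysis_settings(), indent=2))
print(approximate_split("testing123"))  # ['testing123', 'testing', '123']
```

With "testing" emitted as its own token at index time, a plain search for "testing" can match the document.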


On Thu, Mar 19, 2015 at 2:26 PM, James m...@employ.com wrote:

 Currently I have an item in my elasticsearch index with the title:
 *testing123*

 When I search for it, I can only get it returned if I search *testing123*
 exactly. However, I want to be able to search *testing* and have it
 returned too.

 How can I have it so the search must start with that term but also not be
 an exact match?

 Any help would be appreciated.



Re: Elasticsearch - Not require an exact match

2015-03-19 Thread Itamar Syn-Hershko
This boils down to Lucene fundamentals, in particular what search tokens
are created and then searched. I've explained this in depth here:
https://www.youtube.com/watch?v=QI566fe9Svs


On Thu, Mar 19, 2015 at 2:33 PM, James m...@employ.com wrote:

 Thank you for the reply. I thought it was more about making my search
 query not require an exact match, rather than splitting up the words I am
 searching against?

 On 19 March 2015 at 12:30, Itamar Syn-Hershko ita...@code972.com wrote:

 You should use this then:
 http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Thu, Mar 19, 2015 at 2:26 PM, James m...@employ.com wrote:

 Currently I have an item in my elasticsearch index with the title:
 *testing123*

 When I search for it, I can only get it returned if I search
 *testing123* exactly. However, I want to be able to search *testing*
 and have it returned too.

 How can I have it so the search must start with that term but also not
 be an exact match?

 Any help would be appreciated.



Re: Courier Fetch error, maybe due to lack of @timestamp?

2015-03-17 Thread Itamar Syn-Hershko
Like the error suggests: "No mapping found for [@timestamp] in order to
sort on".

Kibana expects a @timestamp field - make sure to push that in your source
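
As an illustration of pushing that field yourself, here is a minimal Python
sketch. The helper name with_timestamp and the sample document are
hypothetical; the only thing taken from this thread is that each document
should carry an @timestamp field. The ISO 8601 UTC format mirrors what
Logstash normally writes, as far as I know.

```python
import json
from datetime import datetime, timezone

def with_timestamp(doc, when=None):
    """Return a copy of doc carrying an ISO 8601 @timestamp in UTC."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamped = dict(doc)
    # Millisecond precision with a trailing Z, e.g. 2015-03-17T21:19:00.000Z
    stamped["@timestamp"] = when.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return stamped

event = with_timestamp(
    {"message": "hello"},
    datetime(2015, 3, 17, 21, 19, 0, tzinfo=timezone.utc),
)
print(json.dumps(event, sort_keys=True))
```

Documents indexed into a logstash-* index by anything other than Logstash
would need something along these lines before they are sent.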

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Mar 17, 2015 at 11:19 PM, David Reagan jer...@gmail.com wrote:

 I keep getting an error like this: Courier Fetch: 5 of 270 shards
 failed. in Kibana 4.0.1.

 After some Googling, I think it has something to do with @timestamp not
 existing for some of my data. But I'm not sure, because
 https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion
 was solved by not searching the kibana indexes. I'm only searching my
 logstash indexes. And I'm still getting that error.

 In kibana 4 I went to Settings-Indices and made sure I only have
 logstash-* listed under Index Patterns.

 I did recently update the template to what was in the logstash git HEAD.

 See http://pastebin.com/w7PmHxXS for my /var/log/elasticsearch/index.log
 output. As well as the template I'm using. It's at the bottom of the paste.

 I did check with curl -XGET 'http://localhost:9200/_cat/shards?pretty=true'
 to see if any shards had issues. They all had STARTED as their status.

 Any suggestions?



Re: Courier Fetch error, maybe due to lack of @timestamp?

2015-03-17 Thread Itamar Syn-Hershko
@timestamp is generated automatically by Logstash; any documents not added
by Logstash will not have it.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Wed, Mar 18, 2015 at 12:51 AM, David Reagan jer...@gmail.com wrote:

 @timestamp has always been applied automatically. Only time I've ever
 touched it is when I've adjusted the date to what the log message holds,
 rather than when the log message is processed by logstash.

 So, I have no idea where it comes from, or how I could have turned it off
 on something.

 Is that in the template?

 --David Reagan

 On Tue, Mar 17, 2015 at 2:24 PM, Itamar Syn-Hershko ita...@code972.com
 wrote:

 Like the error suggests: "No mapping found for [@timestamp] in order to
 sort on".

 Kibana expects a @timestamp field - make sure to push that in your source

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Mar 17, 2015 at 11:19 PM, David Reagan jer...@gmail.com wrote:

 I keep getting an error like this: Courier Fetch: 5 of 270 shards
 failed. in Kibana 4.0.1.

 After some Googling, I think it has something to do with @timestamp not
 existing for some of my data. But I'm not sure, because
 https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion
 was solved by not searching the kibana indexes. I'm only searching my
 logstash indexes. And I'm still getting that error.

 In kibana 4 I went to Settings-Indices and made sure I only have
 logstash-* listed under Index Patterns.

 I did recently update the template to what was in the logstash git HEAD.

 See http://pastebin.com/w7PmHxXS for my
 /var/log/elasticsearch/index.log output. As well as the template I'm using.
 It's at the bottom of the paste.

 I did check with curl -XGET '
 http://localhost:9200/_cat/shards?pretty=true' to see if any shards had
 issues. They all had STARTED as their status.

 Any suggestions?



Re: doc_values in index template for new generated indexes

2015-03-17 Thread Itamar Syn-Hershko
http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html#_enabling_doc_values
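
For the archive, a rough sketch of what such a template body could look like,
built as a Python dict and printed as JSON. The index pattern "logstash-*",
the dynamic template name, and the choice to map every string as
not_analyzed are my assumptions, not something from the question; as far as
I know, in 1.x doc_values on strings only applies to not_analyzed fields, so
check the page above before using this.

```python
import json

# Hypothetical template body for PUT /_template/<some_name> (1.x style).
template = {
    "template": "logstash-*",  # assumed pattern for the daily indexes
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {
                    "strings_as_doc_values": {  # arbitrary name
                        "match_mapping_type": "string",
                        "mapping": {
                            "type": "string",
                            "index": "not_analyzed",
                            "doc_values": True,
                        },
                    }
                }
            ]
        }
    },
}

print(json.dumps(template, indent=2))
```

A template only affects indexes created after it is stored, which fits the
daily index case: existing indexes keep their current mappings.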

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Mar 17, 2015 at 5:35 AM, chris85l...@googlemail.com wrote:

 Hello,

 We have an Elasticsearch setup where we are using the default values, so
 no doc_values. How can I add doc_values: true to the index template so that
 the newly generated daily indexes use this feature?

 Thank you in advance!

 Cheers
 Chris



Re: Running Kibana 4 on index

2015-03-16 Thread Itamar Syn-Hershko
If there is a @timestamp field, Kibana will use it to treat the documents
as time-based. Either way, you need to type the name of the index
explicitly and override the logstash-* pattern that is suggested by default.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Mar 16, 2015 at 6:44 PM, Moshe Recanati re.mo...@gmail.com wrote:

 Yes, latest ES.
 How do I make sure it's time-based events?
 On Mar 16, 2015 6:37 PM, Itamar Syn-Hershko ita...@code972.com wrote:

 Are you using Kibana 4 with the latest Elasticsearch?

 Basically, in Kibana 4 you need to make sure you uncheck "Index contains
 time-based events", then type the name of the index and click Create.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Mon, Mar 16, 2015 at 6:31 PM, Moshe Recanati re.mo...@gmail.com
 wrote:

 Hi,
 I've Elastic with 2 simple indexes (chatmessages and sessions).
 I tried to run Kibana but although it can see the indexes I get the
 following message:


 Indices and aliases that were found, but did not match the pattern:
 .kibana
 chatmessages
 sessions

 Let me know what needs to be done in order to solve it.

 Thank you,
 Moshe



Re: metaphone3

2015-03-16 Thread Itamar Syn-Hershko
I believe there are licensing issues involved

You have metaphone available in core and here
https://github.com/elastic/elasticsearch-analysis-phonetic

Also see
https://github.com/elastic/elasticsearch-analysis-phonetic/issues/16

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Mar 16, 2015 at 3:28 PM, kianmob...@gmail.com wrote:

 Is there any way to add Metaphone 3 to Elasticsearch as a phonetic token
 filter? There is a licence for Metaphone3 here:

 https://code.google.com/p/google-refine/source/browse/trunk/main/src/com/google/refine/clustering/binning/Metaphone3.java




Re: Running Kibana 4 on index

2015-03-16 Thread Itamar Syn-Hershko
Are you using Kibana 4 with the latest Elasticsearch?

Basically, in Kibana 4 you need to make sure you uncheck "Index contains
time-based events", then type the name of the index and click Create.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Mar 16, 2015 at 6:31 PM, Moshe Recanati re.mo...@gmail.com wrote:

 Hi,
 I've Elastic with 2 simple indexes (chatmessages and sessions).
 I tried to run Kibana but although it can see the indexes I get the
 following message:


 Indices and aliases that were found, but did not match the pattern:
 .kibana
 chatmessages
 sessions

 Let me know what needs to be done in order to solve it.

 Thank you,
 Moshe



Re: Strange exception in Elasticsearch 1.4.3

2015-03-13 Thread Itamar Syn-Hershko
This looks like a bug in elasticsearch-analysis-combo, I'd post it as an
issue there

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Fri, Mar 13, 2015 at 1:35 PM, Angel Cross niegi...@gmail.com wrote:

 Hello.
 Recently in our test system we started to notice the following exception.
 Googling and investigation of setup itself didn't make it more clear. Still
 have no idea why is this happening. Maybe somebody already faced the issue
 and knows the reason? Or have any ideas?


 java.lang.IllegalArgumentException: State contains AttributeImpl of type org.apache.lucene.analysis.tokenattributes.PayloadAttributeImpl that is not in in this AttributeSource
     at org.apache.lucene.util.AttributeSource.restoreState(AttributeSource.java:313)
     at org.apache.lucene.analysis.ComboTokenStream.incrementToken(ComboTokenStream.java:106)
     at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:618)
     at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359)
     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318)
     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:457)
     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1511)
     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1246)
     at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:594)
     at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:522)
     at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:425)
     at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:439)
     at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:150)
     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512)
     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:724)

 Elasticsearch configuration:
 elasticsearch 1.4.3
 plugins:
 - elasticsearch-analysis-baseform
   (https://github.com/jprante/elasticsearch-analysis-baseform), version 1.4.0
 - elasticsearch-analysis-kuromoji
   (https://github.com/elasticsearch/elasticsearch-analysis-kuromoji), version 2.4.2
 - elasticsearch-analysis-combo
   (https://github.com/yakaz/elasticsearch-analysis-combo/), version 1.5.1
 - elasticsearch-analysis-decompound
   (https://github.com/jprante/elasticsearch-analysis-decompound), version for 1.0.0RC1
 - elasticsearch-analysis-icu
   (https://github.com/elasticsearch/elasticsearch-analysis-icu), version 2.4.2
 - elasticsearch-analysis-smartcn
   (https://github.com/elasticsearch/elasticsearch-analysis-smartcn), version 2.4.3
 - elasticsearch-head (http://mobz.github.io/elasticsearch-head/), the
   latest version

 The server (1 machine) runs 2 nodes. One of them is a data node, the other
 is a tribe node. The nodes run on different ports and differ in
 configuration.
 The server OS is RedHat 6.5.

 The exception appears when we try to reindex a document containing nested
 documents. Indexing happens via bulk requests, so this is not an update but
 actually another index request for an existing document with the same id.
 This exception doesn't appear on another CentOS machine or on another RedHat
 machine with a similar setup. We reinstalled Elasticsearch on the test
 machine; still no difference.

 Thanks, Liuba




Re: Strange exception in Elasticsearch 1.4.3

2015-03-13 Thread Itamar Syn-Hershko
Probably a version mismatch, as that analyzer seems to only support 1.3.8

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Fri, Mar 13, 2015 at 3:06 PM, Itamar Syn-Hershko ita...@code972.com
wrote:

 This looks like a bug in elasticsearch-analysis-combo, I'd post it as an
 issue there

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Fri, Mar 13, 2015 at 1:35 PM, Angel Cross niegi...@gmail.com wrote:

 Hello.
 Recently in our test system we started to notice the following exception.
 Googling and investigation of setup itself didn't make it more clear. Still
 have no idea why is this happening. Maybe somebody already faced the issue
 and knows the reason? Or have any ideas?


 java.lang.IllegalArgumentException: State contains AttributeImpl of type org.apache.lucene.analysis.tokenattributes.PayloadAttributeImpl that is not in in this AttributeSource
     at org.apache.lucene.util.AttributeSource.restoreState(AttributeSource.java:313)
     at org.apache.lucene.analysis.ComboTokenStream.incrementToken(ComboTokenStream.java:106)
     at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:618)
     at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359)
     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318)
     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:457)
     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1511)
     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1246)
     at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:594)
     at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:522)
     at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:425)
     at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:439)
     at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:150)
     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512)
     at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:724)

 Elasticsearch configuration:
 elasticsearch 1.4.3
 plugins:
 - elasticsearch-analysis-baseform
   (https://github.com/jprante/elasticsearch-analysis-baseform), version 1.4.0
 - elasticsearch-analysis-kuromoji
   (https://github.com/elasticsearch/elasticsearch-analysis-kuromoji), version 2.4.2
 - elasticsearch-analysis-combo
   (https://github.com/yakaz/elasticsearch-analysis-combo/), version 1.5.1
 - elasticsearch-analysis-decompound
   (https://github.com/jprante/elasticsearch-analysis-decompound), version for 1.0.0RC1
 - elasticsearch-analysis-icu
   (https://github.com/elasticsearch/elasticsearch-analysis-icu), version 2.4.2
 - elasticsearch-analysis-smartcn
   (https://github.com/elasticsearch/elasticsearch-analysis-smartcn), version 2.4.3
 - elasticsearch-head (http://mobz.github.io/elasticsearch-head/), the
   latest version

 The server (1 machine) runs 2 nodes. One of them is a data node, the other
 is a tribe node. The nodes run on different ports and differ in
 configuration.
 The server OS is RedHat 6.5.

 The exception appears when we try to reindex a document containing nested
 documents. Indexing happens via bulk requests, so this is not an update but
 actually another index request for an existing document with the same id.
 This exception doesn't appear on another CentOS machine or on another RedHat
 machine with a similar setup. We reinstalled Elasticsearch on the test
 machine; still no difference.

 Thanks, Liuba




Re: Sanitize a text for indexing

2015-03-12 Thread Itamar Syn-Hershko
See
http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html
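
A rough sketch of how that filter could be wired in, written as a Python dict
holding index settings, plus a client-side check. The names
drop_huge_tokens and comments, the keyword tokenizer, and the cap of 8000
characters are my own choices for illustration. Lucene's limit is 32766
bytes of UTF-8 per term while the length filter counts characters, so the
cap is kept well below the limit to leave room for multi-byte characters.

```python
MAX_TERM_BYTES = 32766  # Lucene's hard limit for a single term, in UTF-8 bytes

# Hypothetical analysis settings: tokens longer than 8000 characters are dropped.
settings = {
    "analysis": {
        "filter": {
            "drop_huge_tokens": {"type": "length", "min": 1, "max": 8000}
        },
        "analyzer": {
            "comments": {
                "type": "custom",
                "tokenizer": "keyword",
                "filter": ["lowercase", "drop_huge_tokens"],
            }
        },
    }
}

def is_immense(term):
    """True if this term would exceed Lucene's per-term byte limit."""
    return len(term.encode("utf-8")) > MAX_TERM_BYTES

print(is_immense("a" * 32766))   # False
print(is_immense("\u00e9" * 20000))  # True, each character is 2 bytes
```

The is_immense check can also be run on the crawler side to log or skip
offending values before they are sent for indexing.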

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Mar 12, 2015 at 10:52 AM, Bernhard Berger 
bernhardberger3...@gmail.com wrote:

 Hi,

 while indexing various comments from Facebook I sometimes get Exceptions:

 IllegalArgumentException: Document contains at least one immense term...

 Is it possible to sanitize a text for indexing in Elasticsearch so it doesn't 
 throw these Exceptions? Maybe there is a Filter to remove too-long Unicode 
 terms?

 For details about the failing documents, see my (unanswered) Stackoverflow 
 question: 
 http://stackoverflow.com/questions/28941570/remove-long-unicode-terms-from-string-in-java
 (I fear to break another Elasticsearch-based (Maillist) crawler, so I better 
 don't write the failing doc text here ;-) )



Re: How to install a plugin from a jar file

2015-03-05 Thread Itamar Syn-Hershko
Probably a bug in the plugin script which just looks at the folders under
/plugins

Did you put an es-plugin.properties file in your jar as a resource?
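
A quick way to check the jar from the outside, sketched in Python. As far as
I know, Elasticsearch 1.x looks for a descriptor named es-plugin.properties
on the classpath, containing a plugin=<fully qualified class> line; the
function name plugin_class is made up for this example.

```python
import zipfile

DESCRIPTOR = "es-plugin.properties"  # descriptor name assumed for ES 1.x

def plugin_class(jar):
    """Return the class named by the plugin= line, or None if it is missing."""
    with zipfile.ZipFile(jar) as zf:
        if DESCRIPTOR not in zf.namelist():
            return None
        for line in zf.read(DESCRIPTOR).decode("utf-8").splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "plugin":
                return value.strip()
    return None
```

Calling plugin_class with the path to the jar under the plugins directory
returns None when the descriptor is missing or has no plugin= line, which
would be consistent with the node starting up with an empty plugin list.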

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Mar 5, 2015 at 2:54 PM, Oranit Dror ora...@gmail.com wrote:

 Hi,

 I have written a plugin, but I cannot get ElasticSearch 1.4.4 to use it.
 Specifically, I packed the plugin as a jar file and placed the jar file
 under my ELASTIC_SEARCH_DIR/plugins/plugin name directory. However,
 when I am starting ElasticSearch, the list of installed plugins is empty:

 [INFO ][plugins  ] [Ant-Man] loaded [],sites []

 I should also note that when I run the 'plugin' command line with the
 list option (i.e. bin\plugin.bat -l), it does list my plug-in.

 Any advice?

 thank you,
 Oranit.



Re: Elastic search front end

2015-03-03 Thread Itamar Syn-Hershko
You just need to create a Lucene QueryParser (implementing QueryBuilder)
and register it like so:
https://github.com/elasticsearch/elasticsearch/issues/3264#issuecomment-20247436

However, Elasticsearch provides a very good and expressive query DSL - so
I'd rather look at doing this on your search facade, and generate a verbose
query JSON to send to Elasticsearch. Many things that you have to support
in Solr via custom query parsers can be done using the provided query DSL
with Elasticsearch because JSON is way better than LocalParams etc

Alternatively, NLP and POS tagging could also be done at the analysis
level. I'd look at doing that using TeeSinkTokenFilter.
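To illustrate the search-facade idea, here is a minimal sketch that rewrites the raw search-bar string and wraps it in a query DSL body with highlighting. The field name `body`, the function name, and the trivial lowercase/whitespace rewrite are placeholders for whatever application-specific processing you need:

```python
import json


def build_search_body(user_query, field="body"):
    """Turn a raw search-bar string into a query DSL body with highlighting.

    `field` is a hypothetical field name; replace it with your own mapping.
    """
    # Placeholder for application-specific rewriting (NLP, synonyms, ...).
    rewritten = " ".join(user_query.lower().split())
    return {
        "query": {"match": {field: {"query": rewritten, "operator": "and"}}},
        # Ask Elasticsearch to return highlighted fragments for the same field.
        "highlight": {"fields": {field: {}}},
    }


def to_json(body):
    """Serialize the body for sending to the _search endpoint."""
    return json.dumps(body)
```

The facade then sends `to_json(build_search_body(...))` to `_search` and renders the `highlight` section of each hit, so neither Kibana nor a custom query parser plugin is needed.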

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Mar 3, 2015 at 12:12 PM, Oranit Dror ora...@gmail.com wrote:

 Hi,

 I will be glad to get some more information on your suggestion to write my
 own QueryParser as a plugin. To be more specific, I would like that this
 parser will do some Natural Language processing on the full text query
 string, supplied by the user, in the front-end search bar. In fact, in SolR
 I have implemented such a parser (as a QParserPlugin subclass). The output
 of the parser plugin should be a new string that I would like to give to
 ElasticSearch.

 Additionally, before displaying the returned results, I would like to add
 my own code for selecting the text that I would like to highlight. In SolR,
 I have implemented a class that extends the
 DefaultSolrHighlighter class.

 thank you,
 Oranit.

 On Monday, March 2, 2015 at 10:46:32 PM UTC+2, Itamar Syn-Hershko wrote:

 You can write your own QueryParser as a plugin but that sounds like an
 overkill. If all you need is to display some highlighted results, it's easy
 enough to do in any language and I'd say you don't really need Kibana for
 that

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Mon, Mar 2, 2015 at 10:27 PM, Oranit Dror ora...@gmail.com wrote:

 Hi,

 I am new to ElasticSearch and have a newbie question: I want to have a
 user-friendly front-end to the data with a free text search bar. In this
 search bar the user inputs a query string, which I would like to parse and
 transform to a new string (application-dependent) that will be used on
 ElasticSearch. I then want to highlight the matching search terms in the
 results. I have implemented a similar application in Solr.

 I thought of using Kibana's Discover page. Is there a way to hook into
 Kibana and/or ElasticSearch, so I can transform the user's query string
 before it is sent to ES and highlight the results?

 Regards,
 Oranit



Re: how test plugin in eclipse in elasticsearch

2015-02-17 Thread Itamar Syn-Hershko
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/using-elasticsearch-test-classes.html

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Feb 17, 2015 at 5:23 PM, Ali Lotfdar ali.lotfda...@gmail.com
wrote:

 Hi All,

 I found this topic in previous topics but I need some help too.

 I want to know whether it is possible to test my plugin before installing it
 in ES and, if so, how (for example, by feeding it some sample data and
 checking the result).
 Could you please let me know how this can be done, and how I can debug it
 using a main method?

 Thanks,
 Ali



Re: elasticsearch deploy on ec2

2015-02-16 Thread Itamar Syn-Hershko
Can you describe your deployment process? The cluster can't _always_ be red
- it should be green when you first deploy.

Other than that, check the obvious - that AWS security groups are properly
defined for those machines (all of them under the same named security group).

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:11 PM, Eliran Shlomo eli...@whipclip.com wrote:

 Hi,

 I'm trying to deploy a new Elasticsearch environment on AWS, but the cluster
 is always red and I get the following message:

 {
   "error": "ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]",
   "status": 503
 }

 I'm trying to deploy 3 master nodes, 3 data nodes, and 2 client nodes.
 When I check cluster health:
 {
   "cluster_name": "stress_new",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 3,
   "active_primary_shards": 0,
   "active_shards": 0,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 0
 }

 Please advise.

 This is the configuration I used, with changes per node role (node.master and
 node.data):

 cluster.name: stress_new
 plugin.mandatory: cloud-aws
 cloud.aws.access_key: 
 cloud.aws.secret_key: *
 cloud.aws.region: us-west-2
 discovery.type: ec2
 discovery.ec2.groups: stress_new_elasticsearch
 discovery.ec2.host_type: private_ip
 discovery.ec2.ping_timeout: 30s
 discovery.ec2.tag.elasticsearch: stress_new

 node.name: 172.**

 node.master: false
 node.data: false

 index.number_of_shards: 1
 index.number_of_replicas: 0
 path.data: /mnt/elasticsearch

 path.logs: /var/log/elasticsearch

 bootstrap.mlockall: true

 http.enabled: true

 gateway.recover_after_nodes: 8
 gateway.expected_nodes: 8
 discovery.zen.minimum_master_nodes: 3
 discovery.zen.ping.timeout: 10s
 discovery.zen.ping.multicast.enabled: false

 index.search.slowlog.threshold.query.warn: 500ms
 index.search.slowlog.threshold.query.info: 200ms
 index.search.slowlog.threshold.query.debug: 100ms
 index.search.slowlog.threshold.query.trace: 50ms

 index.search.slowlog.threshold.fetch.warn: 500ms
 index.search.slowlog.threshold.fetch.info: 200ms
 index.search.slowlog.threshold.fetch.debug: 100ms
 index.search.slowlog.threshold.fetch.trace: 50ms

 index.indexing.slowlog.threshold.index.warn: 500ms
 index.indexing.slowlog.threshold.index.info: 200ms
 index.indexing.slowlog.threshold.index.debug: 1000ms
 index.indexing.slowlog.threshold.index.trace: 50ms

 script.disable_dynamic: false
 script.native.socialScoreCalc.type: *.SocialScriptFactory
 script.default_lang: native

 action.disable_delete_all_indices: true
 action.auto_create_index: .marvel-*
 indices.fielddata.cache.size: 30%
 indices.fielddata.cache.expire: 15s
 marvel.agent.enabled: false
 allow_leading_wildcard: false
 script.groovy.sandbox.enabled: false




Re: elasticsearch deploy on ec2

2015-02-16 Thread Itamar Syn-Hershko
Wait a second, you should use gateway.expected_data_nodes: 3 and
gateway.expected_master_nodes: 3 instead of what you have there now. Also
min master nodes should be 2 in your case.
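The value 2 follows from the usual quorum rule for discovery.zen.minimum_master_nodes: a strict majority of the master-eligible nodes. A small sketch of that arithmetic (the function name is made up for illustration):

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1
```

With 3 master-eligible nodes this gives 2. Setting it to 3, as in the posted config, means losing a single master node leaves the cluster unable to elect a master.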

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:22 PM, Eliran Shlomo eli...@whipclip.com wrote:

 Hi,
 The cluster has been red from the very first moment.
 The servers are under the same security group, and inside the security
 group I allow any/any between the servers.



Re: elasticsearch deploy on ec2

2015-02-16 Thread Itamar Syn-Hershko
Master eligible nodes and Data nodes need to have this setting

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:31 PM, Eliran Shlomo eli...@whipclip.com wrote:

 Wait a second, you should use gateway.expected_data_nodes: 3 and
 gateway.expected_master_nodes: 3 instead of what you have there now. Also
 min master nodes should be 2 in your case.

 Should those settings be in the configuration of all nodes, or only in the
 external gateway (client)?



Re: elasticsearch deploy on ec2

2015-02-16 Thread Itamar Syn-Hershko
Remove the number of nodes setting; if that doesn't help, start looking at
the logs. I have seen clusters on AWS that took some time to discover and
stabilize, so it may also be that.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:56 PM, Eliran Shlomo eli...@whipclip.com wrote:

 Hi, made the changes.
 There is no change in the cluster status, and the servers return the same
 response:

 {
   "cluster_name": "stress_new",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 3,
   "active_primary_shards": 0,
   "active_shards": 0,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 0
 }

 GET _status

 {
   "error": "ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]",
   "status": 503
 }


 On Monday, February 16, 2015 at 8:53:11 PM UTC+2, Itamar Syn-Hershko wrote:

 Master eligible nodes and Data nodes need to have this setting

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Mon, Feb 16, 2015 at 8:31 PM, Eliran Shlomo eli...@whipclip.com
 wrote:

 Wait a second, you should use gateway.expected_data_nodes: 3 and
 gateway.expected_master_nodes: 3 instead of what you have there now. Also
 min master nodes should be 2 in your case.

 Should those settings be in the configuration of all nodes, or only in
 the external gateway (client)?



Re: Adding timestamp property

2015-02-16 Thread Itamar Syn-Hershko
Kibana requires the timestamp field to be named @timestamp so the internal
_timestamp field isn't going to work - I'm pretty sure that's still the
case for Kibana 4 as well
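One simple way around this is to add an explicit @timestamp field to each document before indexing. A minimal sketch, assuming the document's existing `created` value is a suitable event time (the function name and the fallback to the current UTC time are illustrative choices):

```python
from datetime import datetime, timezone


def with_kibana_timestamp(doc, source_field="created"):
    """Return a copy of doc that carries an explicit @timestamp field.

    Uses doc[source_field] when present (an assumption about your data),
    otherwise falls back to the current UTC time in ISO 8601 format.
    """
    out = dict(doc)
    if "@timestamp" not in out:
        out["@timestamp"] = out.get(source_field) or datetime.now(timezone.utc).isoformat()
    return out
```

The original document is left untouched, so the same helper can be applied just before each index request.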

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 6:22 PM, Roy Zanbel r...@jfrog.com wrote:

 Also no..
 Will take an hour tomorrow to start fresh, will delete the index and start
 over again.
 Will keep you posted.
 {
   "_index": "aql",
   "_type": "item",
   "_id": "AUuR4GgLMJioTmulRq4u",
   "_version": 1,
   "found": true,
   "_source": {
     "path": "Desktop/Desktop/Desktop",
     "depth": 4,
     "size": 477,
     "downloads": 0,
     "created": "2014-11-04T17:26:01.435+02:00",
     "repo": "archive-local",
     "name": "Desktop-Desktop.pom",
     "type": "file",
     "updated": "2014-11-04T17:25:55.822+02:00"
   }
 }

 Thanks for the quick response.

 BR,
 Roy.

 On Saturday, February 14, 2015 at 3:30:28 PM UTC+2, Roy Zanbel wrote:

 Hi,

 I'm new to Elasticsearch and have a simple question that I had a hard time
 finding an answer to online.
 I wish to add a timestamp field and later use it in Kibana.
 This is what my settings/mappings look like:
 {
   "aql": {
     "mappings": {
       "item": {
         "_timestamp": {
           "enabled": true,
           "store": true
         },
         "properties": {}
       }
     },
     "settings": {
       "index": {
         "item": {
           "_timestamp": {
             "enabled": true,
             "store": true
           }
         },
         "creation_date": 1423908699031,
         "number_of_shards": 5,
         "number_of_replicas": 1,
         "version": {
           "created": 1040299
         },
         "uuid": "JqNaClL1Q5-ucG6NI1bvOA"
       }
     }
   }
 }

 After posting to new indices, I would like to see a timestamp option to
 filter events in Kibana.

 Thanks in advance.

 BR,
 Roy.



Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Itamar Syn-Hershko
Yes - can you try using the bulk API? Also, are you running on a cloud
server?
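For reference, a bulk request body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. A small sketch of building such a body (the index and type names are up to you, and the helper name is made up):

```python
import json


def bulk_body(index, doc_type, docs):
    """Build the newline-delimited body for a POST to the _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line: index this document into the given index and type.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

Sending a few hundred events per request this way usually costs far fewer round trips than one index request per event.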

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi,

 I wrote that program and ran it, and it did manage to keep a steady rate
 of about 1,000 events per minute even when Kibana's total events per
 minute dropped from 60,000 to 6,000. However, when Kibana's total events
 per minute dropped to zero, my program got a "connection refused"
 exception. I ran netstat -s and found that every time Kibana's line hit
 zero, the number of RX-DRP increased. At that point I realized that I
 forgot to mention that this server has a 10GbE NIC. Is it possible that
 the packets are being dropped because some buffer is filling up? If so,
 how can I test it and verify that this is actually the case? If it is,
 how can I solve it?

 Thanks,
 Yuval.
 On Wednesday, February 11, 2015, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi.

 When you say see how the file behaves I'm not quite sure what you mean
 by that... As I mentioned earlier, it's not that events do not appear at
 all but instead, the RATE at which they come decreases, so how can I
 measure the events rate in a file? I thought that there's another way that
 I can test this: I'll write a quick-and-dirty program that will send an
 event to the ELK via TCP every 12ms which should result in events rate of
 about 5,000 events per minute and I'll let you know if the events rate
 continues to drop or not...


 Thanks,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 I'd start by using logstash with input tcp and output fs and see how the
 file behaves. Same for the fs inputs - see how their files behave. And take
 it from there.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Great! How can I check that?


 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 The graphic you sent suggests the issue is with logstash, since the
 @timestamp field is populated by logstash and is the one used to display
 the date histogram graphics in Kibana. I would start there: maybe
 SecurityOnion buffers writes, etc. Then check the logstash shipper
 process stats.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi.

 Absolutely (but since I also worked in the helpdesk dept. in the past, I
 certainly understand why it is important to ask those "Are you sure it's
 plugged in?" questions...). One of the logs is coming from SecurityOnion,
 which logs (via bro-conn) all the connections, so it must be sending data
 24x7x365.

 Thanks for the quick reply,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 Are you sure your logs are generated linearly without bursts?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi,

 We just installed an ELK server and configured logstash to match the data
 that we send to it. Until last month it seemed to be working fine, but
 since then we have seen very strange behavior in Kibana: the event-over-time
 histogram shows the event rate at the normal level for about half an hour,
 then drops to about 20% of the normal rate, continues to drop slowly for
 about two hours, and then stops. After a minute or two it returns to
 normal for the next half an hour or so, and the same behavior repeats.
 Needless to say, both /var/log/logstash and /var/log/elasticsearch show
 nothing since the service started, and by using tcpdump we can verify
 that events keep coming in at the same rate all the time. I attached our
 logstash configuration, the /var/logstash/logstash.log, the
 /var/log/elasticsearch/clustername.log, and a screenshot of our Kibana
 with no filter applied so that you can see the weird behavior.

 Is there someone or somewhere that we can turn to for help on the
 subject?


 Thanks a lot,
 Yuval.


Re: Master Node vs. Data Node Architecture

2015-02-12 Thread Itamar Syn-Hershko
Depending on why the node goes down, going mid-way with dedicated master
nodes is sometimes the solution.

And if this is due to massive use of aggregations, doc-values may be the
answer (or larger heap, but that's costlier)
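As a sketch of what enabling doc-values looks like, assuming the 1.x-era mapping syntax current at the time of this thread (where string fields need to be not_analyzed for doc values to apply). The type and field names in the usage note are hypothetical:

```python
def doc_values_mapping(doc_type, fields):
    """Build a mapping that stores the given fields as on-disk doc values.

    `fields` maps field name -> Elasticsearch type, e.g. {"host": "string"}.
    """
    props = {}
    for name, ftype in fields.items():
        spec = {"type": ftype, "doc_values": True}
        if ftype == "string":
            # Assumption: 1.x only supports doc values on not_analyzed strings.
            spec["index"] = "not_analyzed"
        props[name] = spec
    return {doc_type: {"properties": props}}
```

For example, `doc_values_mapping("event", {"host": "string", "bytes": "long"})` gives a mapping body for fields that are aggregated on, which moves their fielddata off the heap. Existing indexes would need reindexing for the change to take effect.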

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 11:40 PM, Mark Walkom markwal...@gmail.com wrote:

 Except that is overkill when you only have 3 nodes.

 How much data do you have in the cluster?

 On 13 February 2015 at 01:15, Itamar Syn-Hershko ita...@code972.com
 wrote:

 See this:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

 Basically, the recommended pattern talks about isolating
 responsibilities. A node should either be a data node, master-eligible
 node, or an external gateway to the cluster (client node)

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Thu, Feb 12, 2015 at 4:08 PM, Eric eric.luel...@gmail.com wrote:

 Hello,

 Currently I have a 3 node ElasticSearch cluster. Each node is a RHEL VM
 with 16 gig RAM. The basic config is:

 - All nodes can be master and are data nodes.
 - 3 shards and 1 replica
 - 6 different indexes

 I'm starting to run into issues with ElasticSearch bogging down on
 searches and sometimes freezing completely at night. I've dedicated 9
 gig to heap size and it says I'm using ~60% of the heap RAM and about 70%
 of the overall heap. So even though I'm using quite a bit of the heap, I'm
 not maxed out. I've attached a screenshot of the exact stats from Elastic
 HQ. I'm averaging around 10,000 events/sec coming into the cluster from 6
 different Logstash instances on another server.

 My question is what can I do to help the stability and speed of my
 cluster. Currently I'm having issues with 1 node going down and it taking
 everything else down. The HA portion isn't working very well. I'm debating
 about either adding 1 more node with the exact same stats or adding 2 more
 smaller VMs that will act as master nodes only. I didn't know which one was
 recommended or where I would get the biggest bang for the buck.

 Any information would be greatly appreciated.

 Thanks,
 Eric



Re: Master Node vs. Data Node Architecture

2015-02-12 Thread Itamar Syn-Hershko
See this:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

Basically, the recommended pattern talks about isolating responsibilities.
A node should either be a data node, master-eligible node, or an external
gateway to the cluster (client node)
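In terms of settings, the three roles come down to two booleans in elasticsearch.yml. A small sketch that renders the lines for each role (the role labels and helper name are made up for illustration; the setting names are the node.master and node.data keys used elsewhere in these threads):

```python
ROLE_SETTINGS = {
    # Holds shards and serves searches, never elected master.
    "data": {"node.master": False, "node.data": True},
    # Dedicated master-eligible node, holds no data.
    "master": {"node.master": True, "node.data": False},
    # Client node: routes requests only, neither master nor data.
    "client": {"node.master": False, "node.data": False},
}


def render_role(role):
    """Render the elasticsearch.yml lines for one of the roles above."""
    settings = ROLE_SETTINGS[role]
    return "\n".join(f"{key}: {str(value).lower()}" for key, value in settings.items())
```

For a small 3-node cluster, leaving every node as both master-eligible and data (both settings true) is still a reasonable default; the split mainly pays off as the cluster grows.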

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 4:08 PM, Eric eric.luel...@gmail.com wrote:

 Hello,

 Currently I have a 3 node ElasticSearch cluster. Each node is a RHEL VM
 with 16 gig RAM. The basic config is:

 - All nodes can be master and are data nodes.
 - 3 shards and 1 replica
 - 6 different indexes

 I'm starting to run into issues with ElasticSearch bogging down on searches
 and sometimes freezing completely at night. I've dedicated 9 gig to heap
 size and it says I'm using ~60% of the heap RAM and about 70% of the
 overall heap. So even though I'm using quite a bit of the heap, I'm not
 maxed out. I've attached a screenshot of the exact stats from Elastic HQ.
 I'm averaging around 10,000 events/sec coming into the cluster from 6
 different Logstash instances on another server.

 My question is what can I do to help the stability and speed of my
 cluster. Currently I'm having issues with 1 node going down and it taking
 everything else down. The HA portion isn't working very well. I'm debating
 about either adding 1 more node with the exact same stats or adding 2 more
 smaller VMs that will act as master nodes only. I didn't know which one was
 recommended or where I would get the biggest bang for the buck.

 Any information would be greatly appreciated.

 Thanks,
 Eric



Re: Elasticsearch + attachment plugin + Kibana + couchbase

2015-02-12 Thread Itamar Syn-Hershko
The XDCR plugin indexes the data using an envelope document. Long story
short, make sure you use the latest XDCR plugin, as older versions are missing
lots of important functions, and use templates and dynamic templates with
proper field paths for this to work correctly:

http://code972.com/blog/2015/02/80-elasticsearch-one-tip-a-day-managing-index-mappings-like-a-pro
http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 3:59 PM, Nadav Hashimshony nad...@gmail.com wrote:

 Hi,

 I'm new to the group; I hope I'll find what I need and share my experience as
 I go along.

 I'm using ES with the attachment plugin in order to store and search files.
 When I set the mapping correctly and insert the file data Base64-encoded,
 I'm able to query my data via Kibana.

 My problem is this:

 If I create the index + mapping in ES, then insert the data into Couchbase
 and use XDCR to replicate it to ES, I can't query the data with Kibana.
 It looks like the mapping of the index created in ES doesn't correctly index
 the data it gets from Couchbase.

 Has anyone encountered such an issue?

 Thank you,

 Nadav.



Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Itamar Syn-Hershko
Yes, make sure the disk is local and not a shared one with added latency
(e.g. a SAN). Also, an SSD will probably fix all your pains.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 3:28 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Sort of... The ELK is running as a VM on a dedicated ESXi. Are there
 special configurations I should do in such a case?

 Thanks,
 Yuval.

 On Thursday, February 12, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 Yes - can you try using the bulk API? Also, are you running on a cloud
 server?
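
 For readers unfamiliar with the bulk API mentioned here, the sketch below shows the newline-delimited body it expects: one action line followed by one source line per document. The index name "logs" and type "event" in the usage note are placeholders, not names from this thread.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build the newline-delimited body expected by the _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch what to do with the next line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder index/type names and events, for illustration only.
    print(build_bulk_body("logs", "event", [{"msg": "a"}, {"msg": "b"}]))
```

 Sending many events in one such request, rather than one request per event, is the point of the suggestion; the batch size to use is something to measure, not something this sketch decides.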

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi,

 I wrote that program and ran it, and it did manage to keep a steady rate
 of about 1,000 events per minute even when Kibana's total events per
 minute dropped from 60,000 to 6,000. However, when
 Kibana's total events per minute dropped to zero, my program got a
 connection refused exception. I ran netstat -s and found that every
 time Kibana's line hit zero, the number of RX-DRP increased. At that
 point I realized that I forgot to mention that this server has a 10GbE
 NIC. Is it possible that the packets are being dropped because some
 buffer is filling up? If so, how can I test it and verify that this is
 actually the case? If it is, how can I solve it?
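
 One way to watch the receive-drop counter over time, sketched under the assumption that the server runs Linux and exposes /proc/net/dev: parse the per-interface counters and compare two samples. The function names are hypothetical, and this only confirms that drops are happening; it does not say which buffer is responsible.

```python
def parse_rx_drops(proc_net_dev_text):
    """Return {interface: rx_drop_count} from the contents of /proc/net/dev."""
    drops = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # skip the two header lines
        name, counters = line.split(":", 1)
        fields = counters.split()
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        drops[name.strip()] = int(fields[3])
    return drops


def drop_delta(before, after):
    """Per-interface increase in dropped packets between two samples."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    import time

    with open("/proc/net/dev") as handle:
        first = parse_rx_drops(handle.read())
    time.sleep(10)
    with open("/proc/net/dev") as handle:
        second = parse_rx_drops(handle.read())
    print(drop_delta(first, second))
```

 Running this in a loop and lining the deltas up with the dips in the Kibana histogram would show whether the two really coincide.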

 Thanks,
 Yuval.
 On Wednesday, February 11, 2015, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi.

 When you say "see how the file behaves" I'm not quite sure what you
 mean by that... As I mentioned earlier, it's not that events do not appear
 at all; rather, the RATE at which they arrive decreases, so how can I
 measure the event rate in a file? I thought of another way
 to test this: I'll write a quick-and-dirty program that will send an
 event to the ELK via TCP every 12 ms, which should result in an event rate of
 about 5,000 events per minute, and I'll let you know if the event rate
 continues to drop or not...
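
 The arithmetic behind that figure is 60,000 ms per minute divided by 12 ms per event, which gives 5,000 events per minute. A minimal sketch of such a test sender follows; the function names, the event text, and the host and port in the usage example are all placeholders rather than details from the thread.

```python
import socket
import time


def events_per_minute(interval_ms):
    """Expected event rate when one event is sent every interval_ms."""
    return 60000.0 / interval_ms


def send_events(host, port, interval_ms, count):
    """Send `count` newline-terminated test events over one TCP connection."""
    interval_s = interval_ms / 1000.0
    sock = socket.create_connection((host, port))
    try:
        for i in range(count):
            sock.sendall(("test event %d\n" % i).encode("utf-8"))
            time.sleep(interval_s)
    finally:
        sock.close()


if __name__ == "__main__":
    print(events_per_minute(12))
    # Placeholder host and port; point this at the logstash tcp input.
    # send_events("localhost", 5000, 12, 1000)
```

 Because the loop sleeps for the full interval after each send, the real rate will come out a little under the nominal figure.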


 Thanks,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 I'd start by using logstash with input tcp and output fs and see how
 the file behaves. Same for the fs inputs - see how their files behave. And
 take it from there.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Great! How can I check that?


 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 The graphic you sent suggests the issue is with logstash, since the
 @timestamp field is populated by logstash and is the one that is used
 to display the date histogram graphics in Kibana. I would start there,
 i.e. check whether SecurityOnion buffers writes etc., and then check the
 logstash shipper process stats.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi.

 Absolutely (but since in the past I also worked at
 the helpdesk dept., I certainly understand why it is important to ask those
 "Are you sure it's plugged in?" questions...). One of the logs is coming
 from SecurityOnion, which logs (via bro-conn) all the connections, so it must
 be sending data 24x7x365.

 Thanks for the quick reply,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko 
 ita...@code972.com wrote:

 Are you sure your logs are generated linearly without bursts?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi,

 We just installed an ELK server and configured the logstash
 configuration to match the data that we send to it and until last month it
 seems to be working fine but since then we see very strange behavior in the
 Kibana, the event over time histogram shows the event rate at the normal
 level for about a half an hour, then drops to about 20% of the normal rate
 and then it continues to drop slowly for about two hours and then stops and
 after a minute or two it returns to normal for the next half an hour or so
 and the same behavior repeats. Needless to say that both the
 /var/log/logstash and /var/log/elasticsearch both show nothing since the
 service started and by using tcpdump we can verify that events keep coming
 in at the same rate all time. I attached our logstash configuration, the
 /var/logstash/logstash.log

Re: A strange behavior we've encountered on our ELK

2015-02-12 Thread Itamar Syn-Hershko
There's a good writeup on the subject by Mike, by the way; you should read it:
http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member


Re: Elasticsearch + attachment plugin + Kibana + couchbase

2015-02-12 Thread Itamar Syn-Hershko
Like I said, the mapping needs to be in place before the XDCR plugin begins
the replication, so you need to put a template with this mapping that will
override XDCR's.
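
A minimal sketch of what such a template body could look like, assuming the Elasticsearch 1.x index template layout (a "template" pattern, an "order", and "mappings"). The "storage*" pattern and the order value are guesses for illustration, and the type name and field paths here simply reuse the mapping quoted in this thread; they still have to be adjusted to the structure the XDCR plugin actually writes.

```python
import json


def build_index_template(index_pattern, doc_type, properties, order=20):
    """Build a body for PUT /_template/<name> (Elasticsearch 1.x layout)."""
    return {
        "template": index_pattern,  # which newly created indices this applies to
        "order": order,             # higher order overrides lower-order templates
        "mappings": {doc_type: {"properties": properties}},
    }


# The attachment mapping quoted in this thread.
FILE_MAPPING = {
    "file": {
        "type": "attachment",
        "path": "full",
        "fields": {"content_type": {"type": "string", "store": True}},
    }
}

if __name__ == "__main__":
    # "storage*" and order=20 are illustrative assumptions.
    body = build_index_template("storage*", "files", FILE_MAPPING)
    print(json.dumps(body, indent=2))
```

Templates are applied only when an index is created, so the template has to be stored before the index exists and before replication starts.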

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 4:59 PM, Nadav Hashimshony nad...@gmail.com wrote:

 Thank you for the response.

 I am using a mapping; I created the following index mapping:
 PUT /storage/files/_mapping
 {
   "files": {
     "properties": {
       "file": {
         "type": "attachment",
         "path": "full",
         "fields": {
           "content_type": {
             "type": "string",
             "store": true
           }
         }
       }
     }
   }
 }

 When I insert data via ES and query it, all is fine.
 The problem is when data is inserted into Couchbase.

 Nadav



Re: Elasticsearch + attachment plugin + Kibana + couchbase

2015-02-12 Thread Itamar Syn-Hershko
Yes. Just make sure the template reflects the actual document structure;
as I said, XDCR wraps your document in an envelope document.
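
To illustrate what "reflects the actual document structure" means in practice, here is a small sketch that nests a flat mapping under the envelope's payload field. It assumes the envelope stores the original JSON under a field named "doc"; that name is an assumption, so check a replicated document with a GET before relying on it.

```python
def wrap_properties(properties, envelope_field="doc"):
    """Nest a flat `properties` mapping under the envelope's payload field."""
    # envelope_field="doc" is an assumption about the plugin's envelope layout.
    return {envelope_field: {"properties": properties}}


def envelope_path(field, envelope_field="doc"):
    """Field path to use in queries once documents are wrapped."""
    return "%s.%s" % (envelope_field, field)


if __name__ == "__main__":
    print(wrap_properties({"file": {"type": "attachment"}}))
    print(envelope_path("file"))
```

The same prefix applies on the query side: a field that was addressed as file when indexing directly would be addressed through the envelope path once the data arrives via XDCR.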

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 5:12 PM, Nadav Hashimshony nad...@gmail.com wrote:

 OK, just to be clear.

 The steps I followed were:
 1. Create the index with the mapping.
 2. Define the XDCR to replicate my bucket to the index in ES.
 3. Insert data into Couchbase.
 4. Try to query with Kibana.

 What you suggest is to add another step BEFORE step 1:
 0. Create a template that includes my mapping.
 1. Create the index in ES,
 and so on...

 Did I get it right?

 Thanks.
 Nadav.



Re: Elasticsearch + attachment plugin + Kibana + couchbase

2015-02-12 Thread Itamar Syn-Hershko
Yes, that too :)

Also, if it's time-based data, you will not be able to use Kibana's date
filtering etc., because it lacks the @timestamp field. Basically, the XDCR
Elasticsearch plugin was built around the XDCR / Couchbase realm and not
around Elasticsearch's. Unfortunately this means many ES features are
unavailable or hard to use, e.g.
https://github.com/couchbaselabs/elasticsearch-transport-couchbase/issues/63
https://github.com/couchbaselabs/elasticsearch-transport-couchbase/issues/64

I can help fix this in the XDCR plugin if you'd like - ping me privately
and we can work something out (or I can convince you to avoid using the
XDCR replication).

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 5:18 PM, Nadav Hashimshony nad...@gmail.com wrote:

 OK, I'll try.

 This envelope document, is it something I need to be concerned about
 when I'm querying via Kibana?


Re: Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?

2015-02-11 Thread Itamar Syn-Hershko
Try using assembly binding redirects; if that doesn't work, then the answer is no.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Wed, Feb 11, 2015 at 12:38 PM, Martin Widmer swissm...@gmail.com wrote:

 Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?
 How?

 Thanks for your advice.

 Martin



Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Itamar Syn-Hershko
Are you sure your logs are generated linearly without bursts?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi,

 We just installed an ELK server and configured the logstash configuration
 to match the data that we send to it and until last month it seems to be
 working fine but since then we see very strange behavior in the Kibana, the
 event over time histogram shows the event rate at the normal level for
 about a half an hour, then drops to about 20% of the normal rate and then
 it continues to drop slowly for about two hours and then stops and after a
 minute or two it returns to normal for the next half an hour or so and the
 same behavior repeats. Needless to say that both the /var/log/logstash and
 /var/log/elasticsearch both show nothing since the service started and by
 using tcpdump we can verify that events keep coming in at the same rate all
 time. I attached our logstash configuration, the
 /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and
 a screenshot of our Kibana with no filter applied so that you can see the
 weird behavior that we see.

 Is there someone/somewhere that we can turn to to get some help on the
 subject?


 Thanks a lot,
 Yuval.



Re: Dumping raw data in custom format

2015-02-10 Thread Itamar Syn-Hershko
Use the scan/scroll API with different queries (filter by document type
etc), from a custom tool written in Java. This will be the fastest.
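
A minimal sketch of how such a tool could split the work, shown in Python for brevity even though the tool itself would be written in Java. It only builds the initial scan request URLs, one per shard, so that separate workers can each scroll through their own slice. It assumes the `_shards` value of the `preference` parameter is available in your Elasticsearch version; `shard_scan_urls`, the host and the index name are made up for illustration.

```python
from urllib.parse import urlencode


def shard_scan_urls(base_url, index, num_shards, scroll="5m", size=1000):
    """Build one scan/scroll start URL per shard (hypothetical helper)."""
    urls = []
    for shard in range(num_shards):
        params = urlencode({
            "search_type": "scan",               # scan: no scoring, no sorting
            "scroll": scroll,                    # keep-alive of the scroll context
            "size": size,                        # hits per shard per scroll page
            "preference": "_shards:%d" % shard,  # assumption: restricts to one shard
        })
        urls.append("%s/%s/_search?%s" % (base_url, index, params))
    return urls
```

Each worker would take one URL, send a match_all (or a narrower query) to it, and then keep calling the scroll endpoint with the returned scroll id until no hits come back.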

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Feb 10, 2015 at 7:41 PM, Andrew McFague redmu...@gmail.com wrote:

 Forgot to mention--the data set size is around 1.6 billion documents.

 On Tuesday, February 10, 2015 at 9:29:39 AM UTC-8, Andrew McFague wrote:

 I have a use case where I'd like to be able to dump *all* the documents
 in ES to a specific output format.  However, using scan or any other
 consistent view is relatively slow.  Using the scan query with a
 match_all, it processes items at a rate of around 80,000 a second--but
 that means it will still take over 5 hours to dump.  It also means it can't
 be parallelized across machines, which effectively stops scaling.

 I've also looked at things like Knapsack, Elastidump, etc., but these
 still don't give me the ability to parallelize the work, and they're not
 particularly fast.  They also don't allow me to manipulate it to the
 specific format I want (it's not JSON, and requires some organization of
 the data).

 So I have a few ideas, which may or may not be possible:

1. Retrieve shard-specific data from ElasticSearch (i.e., Give me
all the data for Shard X).  This would allow me to divide the task up into
/at least/ S tasks, where S is the number of segments, but there doesn't
seem to be an API that exposes this.
2. Get snapshots of each shard from disk.  This would also allow me
to divide up the work, but would also require a framework on top to
coordinate which segments have been retrieved, etc.
3. Hadoop.  However, launching an entire MR cluster just to dump data
sounds like overkill.

 The first option gives me the most flexibility and would require the
 least amount of work on my part, but there doesn't seem to be any way to
 dump all the data for a specific shard via the API.  Is there any sort of
 API or flag that provides this, or otherwise provides a way to partition
 the data to different consumers?

 The second would also (presumably) give me the ability to subdivide tasks
 out per worker, and would also allow these to be done offline.  I was able
 to write a sample program that uses Lucene to do this, but this adds the
 additional complexity of coordinating work across the various hosts in the
 cluster, as well as requiring an intermediate step where I transfer the
 common files to another host to combine them.  This isn't a terrible
 problem to have--but does require additional infrastructure to organize.

 The third is not desirable because it's an incredible amount of
 operational load without a clear tradeoff, since we don't already have a
 map reduce cluster on hand.

 Thanks for any tips or suggestions!

 Andrew



Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Itamar Syn-Hershko
The graph you sent suggests the issue is with Logstash, since the
@timestamp field is populated by Logstash and is the one used to draw the
date histogram in Kibana. I would start there: check whether SecurityOnion
buffers its writes, for example, and then check the Logstash shipper
process stats.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi.

 Absolutely (but since I also worked at the helpdesk dept. in the past,
 I certainly understand why it is important to ask those "Are you sure it's
 plugged in?" questions...). One of the logs is coming from SecurityOnion,
 which logs (via bro-conn) all the connections, so it must be sending data
 24x7x365.

 Thanks for the quick reply,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 Are you sure your logs are generated linearly without bursts?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi,

 We just installed an ELK server and configured Logstash to match the data
 that we send to it. Until last month it seemed to be working fine, but since
 then we see very strange behavior in Kibana: the events-over-time histogram
 shows the event rate at the normal level for about half an hour, then drops
 to about 20% of the normal rate, continues to drop slowly for about two
 hours, and then stops. After a minute or two it returns to normal for the
 next half an hour or so, and the same behavior repeats. Needless to say,
 /var/log/logstash and /var/log/elasticsearch both show nothing since the
 service started, and using tcpdump we can verify that events keep coming in
 at the same rate all the time. I attached our Logstash configuration, the
 /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and
 a screenshot of our Kibana with no filter applied so that you can see the
 weird behavior.

 Is there someone/somewhere that we can turn to to get some help on the
 subject?


 Thanks a lot,
 Yuval.




 --

 Regards,

 *Yuval Khalifa*

 CTO
 Information Systems | Migdal Agencies.
 Mobile: 052-3336098
 Office: 03-7966565
 Fax: 03-7976565
 Blog: http://www.artifex.co.il


Re: A strange behavior we've encountered on our ELK

2015-02-10 Thread Itamar Syn-Hershko
I'd start by running Logstash with a tcp input and a file output and see how
the output file behaves. Do the same for the file inputs and see how their
files behave, then take it from there.
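
One way to "see how the file behaves" is to count events per minute in the file that Logstash writes and compare the shape with the Kibana histogram. The sketch below is only an illustration: it assumes the file output writes one JSON event per line with an ISO 8601 `@timestamp` field, and `events_per_minute` is a made-up helper name.

```python
import json
from collections import Counter


def events_per_minute(lines):
    """Count events per minute from JSON lines carrying an @timestamp field."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        # For timestamps like 2015-02-10T17:29:03.000Z the first 16
        # characters (up to the minute) identify the bucket.
        counts[event["@timestamp"][:16]] += 1
    return counts
```

Passing an open file object works, since iterating over it yields lines. If the per-minute counts in the file are flat while Kibana shows the dips, the problem is more likely downstream of Logstash; if the file shows the same dips, look at the inputs.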

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Great! How can I check that?


 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 The graph you sent suggests the issue is with Logstash, since the
 @timestamp field is populated by Logstash and is the one used to draw the
 date histogram in Kibana. I would start there: check whether SecurityOnion
 buffers its writes, for example, and then check the Logstash shipper
 process stats.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com wrote:

 Hi.

 Absolutely (but since I also worked at the helpdesk dept. in the past,
 I certainly understand why it is important to ask those "Are you sure it's
 plugged in?" questions...). One of the logs is coming from SecurityOnion,
 which logs (via bro-conn) all the connections, so it must be sending data
 24x7x365.

 Thanks for the quick reply,
 Yuval.

 On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com
 wrote:

 Are you sure your logs are generated linearly without bursts?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com
 wrote:

 Hi,

 We just installed an ELK server and configured Logstash to match the data
 that we send to it. Until last month it seemed to be working fine, but since
 then we see very strange behavior in Kibana: the events-over-time histogram
 shows the event rate at the normal level for about half an hour, then drops
 to about 20% of the normal rate, continues to drop slowly for about two
 hours, and then stops. After a minute or two it returns to normal for the
 next half an hour or so, and the same behavior repeats. Needless to say,
 /var/log/logstash and /var/log/elasticsearch both show nothing since the
 service started, and using tcpdump we can verify that events keep coming in
 at the same rate all the time. I attached our Logstash configuration, the
 /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and
 a screenshot of our Kibana with no filter applied so that you can see the
 weird behavior.

 Is there someone/somewhere that we can turn to to get some help on the
 subject?


 Thanks a lot,
 Yuval.




 --

 Regards,

 *Yuval Khalifa*

 CTO
 Information Systems | Migdal Agencies.
 Mobile: 052-3336098
 Office: 03-7966565
 Fax: 03-7976565
 Blog: http://www.artifex.co.il


Re: renaming a nodes in a cluster

2015-02-09 Thread Itamar Syn-Hershko
No, but you will have to restart them.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 9:41 PM, Crista Shawler ecs...@gmail.com wrote:

 I would like to rename a couple of the nodes in my cluster.  Are there any
 issues with doing this?



Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes

2015-02-09 Thread Itamar Syn-Hershko
Please refer to
www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:24 PM, Itamar Syn-Hershko ita...@code972.com
wrote:

 Yes, you are using string properties on a date mapping field.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Mon, Feb 9, 2015 at 12:23 PM, Paul Kavanagh pkavan...@shopkeep.com
 wrote:

 I think you have something there. I have come up with this:

 curl -XPUT localhost:9200/_template/template_1 -d '
 {
   "template" : "logstash-*",
   "order" : 0,
   "settings" : {
     "number_of_shards" : 15
   },
   "mappings" : {
     "dynamic_templates": [
       {"apiservice_logstash": {
           "match": "apiservice.logstash.@fields.parameters.*",
           "match_mapping_type": "dateOptionalTime",
           "mapping": {
             "type": "string",
             "analyzer": "english"
           }
         }
       }
     ]
   }
 }
 '

 However... When I try to post it, Elasticsearch throws:
 {"error":"ElasticsearchIllegalArgumentException[Malformed mappings
 section for type [dynamic_templates], should include an inner object
 describing the mapping]","status":400}

 I've tried a few things, but it doesn't seem to like my mappings block
 for some reason.

 Any idea why?


 On Friday, February 6, 2015 at 11:41:49 AM UTC, Itamar Syn-Hershko wrote:

 You mean something like dynamic templates?
 http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh pkav...@shopkeep.com
 wrote:

 Hi all,
 We're having a MapperParsingException problem with some field values
 when we use the JSON Filter for Logstash to explode out a JSON
 document to Elasticsearch fields.

 In 99.9% of cases, certain of these fields are either blank, or contain
 dates in the format of yyyy-mm-dd. This allows ES to dynamically map this
 field to type dateOptionalTime.

 However, we occasionally see non-standard date formats in these fields,
 which our main service can handle fine, but which throw a
 MapperParsingException in Elasticsearch, such as here:



 [2015-02-06 10:46:50,679][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [Failed to start shard, message [
 RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from
 [logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92
 RReiLffQjSL3I-g][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][
 inet[ip-xxx-xxx-xxx-82.ec2.internal/xxx.xxx.xxx.82:9300]]{
 max_local_storage_nodes=1, aws_availability_zone=us-east-1e, aws_az=us-
 east-1e} into [logging-production-elasticsearch-ip-xxx-xxx-xxx-148][
 GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-xxx-xxx-xxx
 -148][inet[ip-xxx.xxx.xxx.148.ec2.internal/xxx.xxx.xxx.148:9300]]{
 max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-
 east-1c}]; nested: RemoteTransportException[[logging-production-
 elasticsearch-ip-xxx-xxx-xxx-82][inet[/xxx.xxx.xxx.82:9300]][internal:
 index/shard/recovery/start_recovery]]; nested: RecoveryEngineException
 [[logstash-2015.02.06][2] Phase[2] Execution failed]; nested:
 RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-
 xxx-148][inet[/xxx.xxx.xxx.148:9300]][internal:index/shard/recovery/
 translog_ops]]; nested: MapperParsingException[failed to parse [
 apiservice.logstash.@fields.parameters.start_time]]; nested:
 MapperParsingException[failed to parse date field [Feb 5 2015 12:00 AM
 ], tried both date format [dateOptionalTime], and timestamp number with
 locale []]; nested: IllegalArgumentException[Invalid format: Feb 5
 2015 12:00 AM]; ]]

 2015-02-06 10:46:53,685][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-
 xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-
 elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/
 xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone
 =us-east-1c, aws_az=us-east-1c} marked shard as initializing, but
 shard is marked as failed, resend shard failure]


 Our planned solution

Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes

2015-02-09 Thread Itamar Syn-Hershko
Yes, you are using string properties on a date mapping field.
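
For what it's worth, the "Malformed mappings section" error quoted below usually means that `dynamic_templates` sits directly under `mappings`, where Elasticsearch expects a type name with the mapping object inside it. The sketch below builds a template body with that extra level, in Python for convenience. It is a guess at a working shape, not a tested fix: the `_default_` type name, the use of `path_match` instead of `match`, and dropping `match_mapping_type` altogether (so every field under that path becomes a string) are my assumptions, and the exact path to match depends on how your documents are nested. `build_template` is a made-up helper.

```python
import json


def build_template(index_pattern="logstash-*", shards=15):
    """Return an index template body with dynamic_templates nested under a type."""
    return {
        "template": index_pattern,
        "order": 0,
        "settings": {"number_of_shards": shards},
        "mappings": {
            # Assumption: apply to every type via the _default_ mapping.
            "_default_": {
                "dynamic_templates": [
                    {"apiservice_logstash": {
                        # Assumption: path_match matches the full dotted path.
                        "path_match": "apiservice.logstash.@fields.parameters.*",
                        "mapping": {"type": "string", "analyzer": "english"},
                    }}
                ]
            }
        },
    }


# Serialize the body, e.g. to pass to curl -d or an HTTP client.
template_json = json.dumps(build_template(), indent=2)
```

The resulting JSON can be PUT to `_template/template_1` in the same way as the original attempt. Note that a template only affects indices created after it is stored.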

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:23 PM, Paul Kavanagh pkavan...@shopkeep.com
wrote:

 I think you have something there. I have come up with this:

 curl -XPUT localhost:9200/_template/template_1 -d '
 {
   "template" : "logstash-*",
   "order" : 0,
   "settings" : {
     "number_of_shards" : 15
   },
   "mappings" : {
     "dynamic_templates": [
       {"apiservice_logstash": {
           "match": "apiservice.logstash.@fields.parameters.*",
           "match_mapping_type": "dateOptionalTime",
           "mapping": {
             "type": "string",
             "analyzer": "english"
           }
         }
       }
     ]
   }
 }
 '

 However... When I try to post it, Elasticsearch throws:
 {"error":"ElasticsearchIllegalArgumentException[Malformed mappings section
 for type [dynamic_templates], should include an inner object describing the
 mapping]","status":400}

 I've tried a few things, but it doesn't seem to like my mappings block for
 some reason.

 Any idea why?


 On Friday, February 6, 2015 at 11:41:49 AM UTC, Itamar Syn-Hershko wrote:

 You mean something like dynamic templates?
 http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh pkav...@shopkeep.com
 wrote:

 Hi all,
 We're having a MapperParsingException problem with some field values
 when we use the JSON Filter for Logstash to explode out a JSON
 document to Elasticsearch fields.

 In 99.9% of cases, certain of these fields are either blank, or contain
 dates in the format of yyyy-mm-dd. This allows ES to dynamically map this
 field to type dateOptionalTime.

 However, we occasionally see non-standard date formats in these fields,
 which our main service can handle fine, but which throw a
 MapperParsingException in Elasticsearch, such as here:



 [2015-02-06 10:46:50,679][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [Failed to start shard, message [
 RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from [
 logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92RReiLffQjSL3I
 -g][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[ip-xxx-xxx-
 xxx-82.ec2.internal/xxx.xxx.xxx.82:9300]]{max_local_storage_nodes=1,
 aws_availability_zone=us-east-1e, aws_az=us-east-1e} into [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][
 logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx.xxx.xxx
 .148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1,
 aws_availability_zone=us-east-1c, aws_az=us-east-1c}]; nested:
 RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-
 xxx-82][inet[/xxx.xxx.xxx.82:9300]][internal:index/shard/recovery/
 start_recovery]]; nested: RecoveryEngineException[[logstash-2015.02.06][
 2] Phase[2] Execution failed]; nested: RemoteTransportException[[logging
 -production-elasticsearch-ip-xxx-xxx-xxx-148][inet[/xxx.xxx.xxx.148:9300
 ]][internal:index/shard/recovery/translog_ops]]; nested:
 MapperParsingException[failed to parse [apiservice.logstash.@fields.p
 arameters.start_time]]; nested: MapperParsingException[failed to parse
 date field [Feb 5 2015 12:00 AM], tried both date format [
 dateOptionalTime], and timestamp number with locale []]; nested:
 IllegalArgumentException[Invalid format: Feb 5 2015 12:00 AM]; ]]

 2015-02-06 10:46:53,685][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-
 xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-
 elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/
 xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=
 us-east-1c, aws_az=us-east-1c} marked shard as initializing, but shard
 is marked as failed, resend shard failure]


 Our planned solution was to create a template for Logstash indices that
 will set these fields to string. But as the field above isn't the only
 culprit, and more may be added over time, it makes more sense to create a
 template to map all fields under apiservice.logstash.@fields.parameters.*
 to be string. (We never need to query on user entered data, but it's great
 to have

Re: Performance Limitation with ELK stack

2015-02-09 Thread Itamar Syn-Hershko
Logstash is CPU bound, so SSD won't help. It's a JRuby implementation. Try to
see if you can have multiple Logstash shippers on the same logs. Having a
redis / kafka server as a middle tier is also a general practice. If that
is not feasible then yes, my advice to you would be to roll your own.
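
If you do roll your own shipper, the core of it is batching events into bulk requests. Below is a rough Python sketch of building the newline-delimited body that the `_bulk` endpoint expects (an action line followed by the document, one pair per event). `bulk_body`, the index name and the type name are placeholders of mine, and the HTTP call, retries and handling of bulk rejections are left out.

```python
import json


def bulk_body(events, index, doc_type):
    """Build a newline-delimited _bulk request body that indexes each event."""
    lines = []
    for event in events:
        # Action line: index this document into the given index and type.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(event))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

A shipper would read a batch of already-JSON access log lines (1,000 at a time, say, to mirror the flush_size in the configuration below), parse them, and POST the result of `bulk_body(batch, "tracking-logstash", "tracking")` to `/_bulk`.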

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:15 PM, Hagai T hagai@gmail.com wrote:

 Hi,

 We were able to identify the bottleneck, which seems to be the Logstash
 service.
 The Elasticsearch cluster (3 ES servers) is able to handle 40,000
 documents per second when fed by a Java client that we wrote using the SDK
 with bulk inserts.
 The client (written for load testing) generates JSON and sends it to
 Elasticsearch for further processing.

 We ran the same test with Logstash, which reads the JSON format from an
 Apache access log on a general purpose SSD, and managed to achieve a
 maximum of 4,000 requests per second.
 With 2 Logstash servers we achieved 8,000 req per second.

 Getting rid of the filtering section in the Logstash configuration file
 helped us get to this number. With filtering we achieved only 1,500-2,000
 req per sec.
 I also tried moving the log file to ephemeral storage but didn't get any
 improvement.
 We don't have any resource problems on the Logstash server (I/O / CPU), so
 it seems like a limit in the file input module or in Logstash itself.

 I was able to test Logstash performance by creating a huge 2GB log file and
 starting Logstash to send its content.
 I also tried with smaller files (4-5MB each) but performance didn't
 get any better.

 Does it sound reasonable for you guys that I got to a limit of 4,000 req
 per second with one Logstash?
 If you have any suggestions of how to proceed from here I will be more
 than happy to hear that.

 If we can't get more from one Logstash, we'll have to develop our own Java
 service to do that instead.

 *Apache Access log file output example - (already in JSON Format):*
 { timestamp:2015-02-09T10:07:48+,
 bq_timestamp:2015-02-09T10:07:48, client_ip:52.2.11.111,
 client_port:80, latency_ms:57, latency_sec:0,
 elb_status_code:200,
 request:/il.html?e=fpAdOpportunityw=wfl_dosevid=1vname=compName_PMecpm=8adid=1814157media_file_type=MEDIA_FILE_TYPEmedia_file_url=MEDIA_FILE_URLcurrent_url=%0Ahttp%3A%2F%
 2Fu-sd.gga.tv
 %2Fa%2Fh%2FJvf82UX3%2Beff48Z%2fwU20swbapQoWau_%3Fcb%3D5605126933660359000%26pet%3Dpreroll%26pageUrl%3Dhttp%253A%252F%252F3ffese.com%26eov%3Deov%0A%09current_main_vast_url=MAIN_VAST_URLerror_code=ERROR_CODEerror_message=ERROR_MESSAGEq9=
 dsdase.comapid=dose.comd=Convertdevice=6719csize=300X250token=14123669cb=260174713417pc=PLAYCOUNT,
 request_path:/il.html, referer:-, user_agent:Mozilla/5.0
 (redhat-x86_64-linux-gnu) Siege/3.0.8 }


 *Logstash configuration file (for the testing I ran it with root without
 any limitations):*
 input {
   file {
     path => "/var/log/httpd/aaa.d.com.logstash-acc.log.[0-9]*"
     codec => json
     type => "tracking"
     discover_interval => 1
     sincedb_path => "/opt/logstash/httpd-sincedb"
     sincedb_write_interval => 1
   }
 }

 output {
   elasticsearch {
     workers => 1
     host => "aaa..com"
     index => "%{request_path}-logstash-%{+-MM-dd}"
     flush_size => 1000
     cluster => "video"
     codec => json
   }
 }


 Your help is appreciated!
 Thanks!



 On Thursday, February 5, 2015 at 1:56:34 PM UTC+2, Itamar Syn-Hershko
 wrote:

 I'd recommend you use ephemeral SSD - 2+ factor replicas and proper use
 of the snapshot/restore API will provide you HA and DR guarantees.

 The rejections you are seeing are due to slow I/O operations, because the
 disk is not local. There is a way to have a bigger queue but I'd advise
 against that and instead go with a local fast disk.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Thu, Feb 5, 2015 at 1:51 PM, Hagai T haga...@gmail.com wrote:

 Hi Itamar, thank you for the reply.

 This is 15k inserts in total and not for one host in the cluster.
 Yes, we have 15 shards within one index. Shards are spread across the
 nodes equally (automatically by the Elasticsearch cluster).
 We currently use general purpose SSD and not ephemeral storage.

 In addition, I see a lot of thread pool bulk rejections from the
 Elasticsearch side.


 On Thursday, February 5, 2015 at 1:33:05 PM UTC+2, Itamar Syn-Hershko
 wrote:

 What is the question?...

 15k inserts per sec per node is actually quite nice.

 Is your index sharded? If you write to one index only, you write to
 a maximum of x nodes, where x is the number of shards

Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
You can update mappings cluster-wide (just post the mapping definition to
server:9200/*), but you will need to specify the field names explicitly

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 9:30 PM, Joel Baranick jbaran...@gmail.com wrote:

 Is there a way to turn doc_values on cluster wide and override any index
 specific settings?



Re: Force search on a local node?

2015-02-07 Thread Itamar Syn-Hershko
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html
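
The value on that page which matches the question below is `_local`, which asks the receiving node to prefer its own shard copies when it has them. A small sketch of building such a request URL in Python follows; `local_search_url`, the host, the index name and the query are placeholders of mine.

```python
from urllib.parse import urlencode


def local_search_url(base_url, index, query):
    """Build a URI search URL that prefers shard copies on the receiving node."""
    params = urlencode({"q": query, "preference": "_local"})
    return "%s/%s/_search?%s" % (base_url, index, params)
```

With one primary and two replicas on a three-node cluster every node holds a copy, so each request sent this way should be served without an extra hop. Note that `_local` is a preference, not a guarantee.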

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 9:15 PM, codemasterg gtotsl...@gmail.com wrote:

 Hi -

 I am new to Elasticsearch and have what I hope is a basic question for a
 simple configuration.  Assume I have 3 node cluster with a single index and:

   - 1 primary shard
   - 2 replicas of the primary shard

 The majority of requests will be searches with relatively few index
 updates.

 All requests are distributed by a network load balancer across the three
 nodes.  Since each node has a copy of the index and the requests are being
 spread across the cluster  by the network load balancer, my intuition is
 that a local search (i.e. execute a search on the node that received the
 request) will perform best.  In other words, I do not want Elasticsearch to
 round-robin each search request from the node received to another node; I
 want the node that received the request to search its local copy of the
 index.

 My question: Is there a way to make Elasticsearch search against only the
 shard on the node that received the request (and avoid a network hop to
 another shard)?

 Thanks very much.



Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
If the indexes have already been created you will have to be creative to
find the fields that need updating - I'm not familiar with a plugin that can
do that. A simple client-side tool that grabs all mappings from the
/_mapping endpoint, changes them and sends them back should do
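
A rough sketch of the "change it" step of such a tool. It assumes the
properties dict has already been fetched from /_mapping, and that doc values
apply to numeric, date and not_analyzed string fields (the ELIGIBLE set is my
assumption, check it against your version). Whether an existing field accepts
the change without reindexing also depends on the version.

```python
import copy

# Field types assumed eligible for doc values (an assumption, not a complete list).
ELIGIBLE = {"long", "integer", "short", "byte", "double", "float", "date"}


def enable_doc_values(properties):
    """Return a copy of a 'properties' dict with doc_values set where eligible."""
    updated = copy.deepcopy(properties)
    for field in updated.values():
        if "properties" in field:
            # Object field: recurse into its sub-fields.
            field["properties"] = enable_doc_values(field["properties"])
        elif field.get("type") in ELIGIBLE or (
            field.get("type") == "string" and field.get("index") == "not_analyzed"
        ):
            field["doc_values"] = True
    return updated


# Hypothetical mapping fragment, for illustration only.
example = {
    "user": {"type": "string", "index": "not_analyzed"},
    "body": {"type": "string"},
    "stats": {"properties": {"count": {"type": "long"}}},
}
print(enable_doc_values(example))
```

The function leaves analyzed strings alone and does not modify its input, so
the original mapping is still around for comparison before sending anything
back.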

For indexes that weren't created yet you can use index templates

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 10:04 PM, Joel Baranick jbaran...@gmail.com wrote:

 Got it. What I was hoping for would be a way to force doc_values to be the
 only way for fielddata to be stored for all mappings in the entire cluster
 without having to update each index. Could this be done with a plugin?



Re: Doc Values

2015-02-07 Thread Itamar Syn-Hershko
You don't need a plugin to handle index creation - use index
templates + dynamic templates for this, e.g.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html#dynamic-templates
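
A minimal sketch of what such a template body could look like, built as a
Python dict. The template name (strings_as_doc_values) and the "*" pattern are
made up, and the keys follow the 1.x-era template format as I understand it.
It makes every new string field not_analyzed, which may not be what you want
for full-text fields.

```python
import json


def doc_values_template(pattern="*"):
    """Build an index template body that enables doc_values on new string fields."""
    return {
        "template": pattern,  # which index names the template applies to
        "mappings": {
            "_default_": {
                "dynamic_templates": [
                    {
                        "strings_as_doc_values": {
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                                "doc_values": True,
                            },
                        }
                    }
                ]
            }
        },
    }


print(json.dumps(doc_values_template(), indent=2))
```

The printed JSON would be the body of a PUT to /_template/<some_name>; it only
affects indexes created after the template is stored.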

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 11:56 PM, Joel Baranick jbaran...@gmail.com wrote:

 Thanks. I will look into if I can create a plugin which will automatically
 enable doc_values whenever an index is created or updated.  This seems like
 it could be very useful for multitenant clusters.



Re: Paid help with ES/ELK?

2015-02-07 Thread Itamar Syn-Hershko
I'm available for Elasticsearch consulting, feel free to ping me privately

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 11:04 PM, Steve Johnson st...@filethis.com wrote:


 I’ll keep you up to date es5z via this thread.

  I’ve gotten one response so far with no real info attached, and haven’t
 followed up yet.  I will check with sites like elance at some point.

 Steve

 On Feb 7, 2015, at 3:06 AM, es5z wrote:

 I'm wondering the same thing actually. Have you tried freelancer websites
 like elance and the others?

 On Friday, February 6, 2015 at 8:22:27 PM UTC+1, Steve Johnson wrote:

 I hope a posting like this is not taboo in this forum...

 We are struggling to understand how to properly configure an ELK stack
 for our production environment.  We think we have things set up pretty much
 right, and then ES throws us a curve ball.  We've had a couple of things
 happen over the last few days that are simply baffling to us.  We've decided
 we need the help of someone who really knows ES.

 Support companies all seem to want to sell only long-term contracts. We
 need short-term help.  We are therefore thinking that we need to find an
 individual ES expert who we can pay on an hourly basis to help us set up
 our ES cluster and learn how it works and how to maintain it.

 If anyone reading this fits this description, or knows of some other
 person or organization that does, please contact me at elastic at
 filethis dot c0m.  If you're offering your services directly, please
 let me know as much as you can about your experience with ES, including the
 number of years you've worked with it and the sizes of the clusters you've
 worked with.

 TIA for all help!

 Steve




Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes

2015-02-06 Thread Itamar Syn-Hershko
You mean something like dynamic templates?
http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings
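
A minimal sketch of a dynamic template that maps everything under one path to
string, using path_match. The template name (parameters_as_strings) and the
logstash-* index pattern are assumptions, and the path is copied from the
message below; if its leading segment is really the type name it may need to
be dropped from the pattern.

```python
import json


def string_template_for(path_pattern, index_pattern="logstash-*"):
    """Build an index template body mapping all fields under a path to string."""
    return {
        "template": index_pattern,
        "mappings": {
            "_default_": {
                "dynamic_templates": [
                    {
                        "parameters_as_strings": {
                            "path_match": path_pattern,
                            "mapping": {"type": "string", "index": "not_analyzed"},
                        }
                    }
                ]
            }
        },
    }


body = string_template_for("apiservice.logstash.@fields.parameters.*")
print(json.dumps(body, indent=2))
```

With a rule like this in place, a value such as Feb 5 2015 12:00 AM should be
indexed as a plain string instead of being probed as a date, though only for
indexes created after the template is stored.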

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh pkavan...@shopkeep.com
wrote:

 Hi all,
 We're having a MapperParsingException problem with some field values when
 we use the JSON Filter for Logstash to explode out a JSON
 document to Elasticsearch fields.

 In 99.9% of cases, certain of these fields are either blank, or contain
 dates in the format of yyyy-mm-dd. This allows ES to dynamically map this
 field to type dateOptionalTime.

 However, we occasionally see non-standard date formats in these fields,
 which our main service can handle fine, but which throw a
 MapperParsingException in Elasticsearch - such as here:



 [2015-02-06 10:46:50,679][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [Failed to start shard, message [
 RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from [
 logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92RReiLffQjSL3I-g
 ][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[ip-xxx-xxx-xxx-
 82.ec2.internal/xxx.xxx.xxx.82:9300]]{max_local_storage_nodes=1,
 aws_availability_zone=us-east-1e, aws_az=us-east-1e} into [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][
 logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx.xxx.xxx.
 148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1,
 aws_availability_zone=us-east-1c, aws_az=us-east-1c}]; nested:
 RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx-
 82][inet[/xxx.xxx.xxx.82:9300]][internal:index/shard/recovery/
 start_recovery]]; nested: RecoveryEngineException[[logstash-2015.02.06][2]
 Phase[2] Execution failed]; nested: RemoteTransportException[[logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148][inet[/xxx.xxx.xxx.148:9300]][
 internal:index/shard/recovery/translog_ops]]; nested:
 MapperParsingException[failed to parse [apiservice.logstash.@fields.
 parameters.start_time]]; nested: MapperParsingException[failed to parse
 date field [Feb 5 2015 12:00 AM], tried both date format [dateOptionalTime
 ], and timestamp number with locale []]; nested: IllegalArgumentException[
 Invalid format: Feb 5 2015 12:00 AM]; ]]

 [2015-02-06 10:46:53,685][WARN ][cluster.action.shard ] [logging-
 production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2]
 received shard failed for [logstash-2015.02.06][2], node[
 GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-
 QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-xxx-
 xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-
 xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/xxx.xxx.xxx.148:9300
 ]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-
 east-1c} marked shard as initializing, but shard is marked as failed,
 resend shard failure]


 Our planned solution was to create a template for Logstash indices that
 will set these fields to string. But as the field above isn't the only
 culprit, and more may be added over time, it makes more sense to create a
 template to map all fields under apiservice.logstash.@fields.parameters.*
 to be string. (We never need to query on user-entered data, but it's great
 to have it logged for debugging.)

 Is it possible to do this with a template? I could not find a way to do
 this via the template documentation on the ES site.

 Any guidance would be great!

 Thanks,
 -Paul



Re: ES crashes when parsing fails due to mapping failure

2015-02-05 Thread Itamar Syn-Hershko
It's not crashing, it is just a log that says the document insert was
rejected

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 5, 2015 at 2:56 PM, as...@singular.net wrote:

 We have automatic mapping turned on for our logstash indexes. Every now and
 then our system logs a record that has a wrong (out of the ordinary)
 field data type.
 For example, a field that's been automatically mapped as a number
 is occasionally logged as a string.
 This causes ES to crash with the following stack trace:

 at org.elasticsearch.search.SearchService.parseSource(SearchService.java:681)
 at org.elasticsearch.search.SearchService.createContext(SearchService.java:537)
 at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:509)
 at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:264)
 at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:231)
 at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:228)
 at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:559)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.NumberFormatException: For input string: "1.0"
 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 at java.lang.Long.parseLong(Long.java:441)
 at java.lang.Long.parseLong(Long.java:483)
 at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseLongValue(NumberFieldMapper.java:357)
 at org.elasticsearch.index.mapper.core.LongFieldMapper.termQuery(LongFieldMapper.java:185)
 at org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuerySingle(MapperQueryParser.java:257)
 at org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuery(MapperQueryParser.java:168)
 at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:487)
 at org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuery(MapperQueryParser.java:287)
 at org.apache.lucene.queryparser.classic.QueryParserBase.handleQuotedTerm(QueryParserBase.java:875)
 at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:464)
 at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:259)
 at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:183)
 at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:263)
 at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:183)
 at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:172)
 at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:123)
 at org.apache.lucene.queryparser.classic.MapperQueryParser.parse(MapperQueryParser.java:882)
 at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:223)
 at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:277)
 at org.elasticsearch.index.query.FQueryFilterParser.parse(FQueryFilterParser.java:66)
 at org.elasticsearch.index.query.QueryParseContext.executeFilterParser(QueryParseContext.java:343)
 at org.elasticsearch.index.query.QueryParseContext.parseInnerFilter(QueryParseContext.java:324)
 at org.elasticsearch.index.query.BoolFilterParser.parse(BoolFilterParser.java:92)
 at org.elasticsearch.index.query.QueryParseContext.executeFilterParser(QueryParseContext.java:343)
 at org.elasticsearch.index.query.QueryParseContext.parseInnerFilter(QueryParseContext.java:324)
 at org.elasticsearch.index.query.FilteredQueryParser.parse(FilteredQueryParser.java:74)
 at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:277)
 at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:382)
 at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:281)
 at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:276)
 at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
 at org.elasticsearch.search.SearchService.parseSource(SearchService.java:665)

 Is there a way to tell ES not to crash when failing to parse a field? I
 realize I can override the mapping, and so on, but regardless I'm also
 interested in getting ES to run reliably without crashing on rare inputs.

 Thanks,
 Assaf


Re: Terms facet changing date to long

2015-02-05 Thread Itamar Syn-Hershko
By "terms panel", do you mean Kibana? Take a look at Kibana 4; it does this
automatically in most places
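
If you need to do the conversion yourself on the client side, here is a small
sketch. It assumes the facet terms are epoch milliseconds in UTC; the function
name and the sample value are mine, not taken from the output quoted below.

```python
from datetime import datetime, timezone


def epoch_millis_to_date(millis):
    """Format an epoch-milliseconds value as a UTC yyyy-mm-dd string."""
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


# 1404086400000 is midnight UTC on 2014-06-30.
print(epoch_millis_to_date(1404086400000))  # 2014-06-30
```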

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 5, 2015 at 7:29 PM, Chris Neal chris.n...@derbysoft.net wrote:

 Please excuse the bump of my own question. :)  After almost 8 months, I
 still have this question!  Just wanted to get it in front of people's eyes
 again.

 Is there a way to have date fields stored in ES displayed in a terms panel
 as nicely formatted dates instead of epoch time?

 Very much appreciated!
 Chris

 On Mon, Jun 30, 2014 at 3:21 PM, Chris Neal chris.n...@derbysoft.net
 wrote:

 Hello all,

 The issue is I have a terms panel in Kibana that I want to group events
 by a date field from each record (Not the @timestamp field).  The terms
 panel is taking my nicely formatted dates (2014-07-31) and turning them
 into longs since UTC (140356800).  I did a quick test by creating a new
 index, giving it a mapping, then running both a search and a facet query,
 and sure enough, the facet query returns the long format instead of the
 date format!  I tried two types of dates, just to see if that made a
 difference.  It did not.


 =
 #Create mapping for index
 PUT /test_index_jerry/test/_mapping
 {
   "test": {
     "properties": {
       "date1": {
         "type": "date",
         "format": "dateOptionalTime"
       },
       "date2": {
         "type": "date",
         "format": "date"
       }
     }
   }
 }

 #Put some data
 POST /test_index_jerry/test
 {
   "date1": "2014-06-30",
   "date2": "2014-06-30"
 }

 #Execute a basic query
 GET /test_index_jerry/test/_search
 {
   "query": {
     "match_all": {}
   }
 }

 # It returns dates in date format
 {
   "took": 0,
   "timed_out": false,
   "_shards": {
     "total": 2,
     "successful": 2,
     "failed": 0
   },
   "hits": {
     "total": 1,
     "max_score": 1,
     "hits": [
       {
         "_index": "test_index_jerry",
         "_type": "test",
         "_id": "VUOeBuiUTGeqBS2Zl8--lg",
         "_score": 1,
         "_source": {
           "date1": "2014-06-30",
           "date2": "2014-06-30"
         }
       }
     ]
   }
 }

 #Execute a terms facet
 GET /test_index_jerry/test/_search
 {
   "facets": {
     "terms": {
       "terms": {
         "field": "date1",
         "size": 10,
         "order": "count",
         "exclude": []
       }
     }
   }
 }

 #Now we have longs
 {
   "took": 1,
   "timed_out": false,
   "_shards": {
     "total": 2,
     "successful": 2,
     "failed": 0
   },
   "hits": {
     "total": 1,
     "max_score": 1,
     "hits": [
       {
         "_index": "test_index_jerry",
         "_type": "test",
         "_id": "VUOeBuiUTGeqBS2Zl8--lg",
         "_score": 1,
         "_source": {
           "date1": "2014-06-30",
           "date2": "2014-06-30"
         }
       }
     ]
   },
   "facets": {
     "terms": {
       "_type": "terms",
       "missing": 0,
       "total": 1,
       "other": 0,
       "terms": [
         {
           "term": 140408640,
           "count": 1
         }
       ]
     }
   }
 }
 ===

 Is there some way I can get the term to stay in date formatted buckets?
 I also tried the date histogram facet, but it returned longs as well.

 Very much appreciate the help :)
 Thanks,
 Chris




Re: Performance Limitation with ELK stack

2015-02-05 Thread Itamar Syn-Hershko
What is the question?...

15k inserts per sec per node is actually quite nice.

Is your index sharded? If you write to one index only, you write to a
maximum of x nodes, where x is the number of shards of that index. Since
shards of the same index can co-exist on one node, check whether your
writes are actually spread across nodes.
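
To make the arithmetic concrete, a tiny sketch. It assumes you already know
which node holds each primary shard (for instance from the cluster state);
the node names and layout below are made up.

```python
def write_spread(primary_locations, data_nodes):
    """Compare how many nodes hold primaries with the best case.

    primary_locations: one node name per primary shard of the index.
    data_nodes: total number of data nodes in the cluster.
    """
    nodes_used = len(set(primary_locations))
    upper_bound = min(len(primary_locations), data_nodes)
    return {"nodes_used": nodes_used, "upper_bound": upper_bound}


# Hypothetical layout: 3 primaries, two of them on the same node, 5 data nodes.
print(write_spread(["node-1", "node-1", "node-2"], 5))
```

If nodes_used is below upper_bound, some nodes take more than their share of
the primary writes while others take none for that index.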

Use local disks - never EBS, and if you really care about writing speeds
use SSDs.

Other than that, Mike did an excellent write up on the subject :
http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 5, 2015 at 1:26 PM, Hagai T hagai@gmail.com wrote:

 Hi Guys,

 We use ElasticSearch as our tracking system of our products in a dynamic
 to track performance.
 Searching of this data is being used by small group of users (10-12) in
 the company to measure performance.

 In our current environment, we see limit of 15,000 documents being
 inserted per second without the ability to scale.

 Some information on the current setup and  the flow:

 *- Tracking servers*
 8 x Apache servers behind Amazon ELB which serve empty html files, so that
 the parameters given are tracked and written to the Apache access log.
 On each server we also have Logstash, which is configured to read this access
 log file and send data to the Elasticsearch cluster.

 *- Elasticsearch Cluster: *
 4 x r3.2xlarge (61.0 GB RAM, 8 cores) - contains one Elasticsearch
 process - 30GB Heap size
 1 x r3.4xlarge (122.0 GB RAM, 16 cores) - contains two Elasticsearch
 processes each with 30GB Heap size.


 Additional information on the Cluster:
 https://gist.github.com/hagait/61e3fac181ff413a8b8c#file-gistfile1-txt

 Cluster health:
 https://gist.github.com/hagait/d4f16d8f7b724f85b0ee#file-gistfile1-txt

 Logstash Configuration:
 https://gist.github.com/hagait/23f4b2bc614a4c4acbb6

 Elasticsearch configuration:
 https://gist.github.com/hagait/ba3684048abe2f9219b8

 Thank you for the support!
 Regards,
 Hagai




Re: Performance Limitation with ELK stack

2015-02-05 Thread Itamar Syn-Hershko
I'd recommend you use ephemeral SSD - 2+ factor replicas and proper use of
the snapshot/restore API will provide you HA and DR guarantees.

The rejections you are seeing are due to slow I/O operations, because the
disk is not local. There is a way to have a bigger queue but I'd advise
against that and instead go with a local fast disk.
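
To see which nodes are rejecting, a small sketch that sums the bulk
rejections per node. It assumes the response shape of the node stats API as I
remember it (nodes -> id -> thread_pool -> bulk -> rejected); the sample below
is hand-written, not real cluster output.

```python
def bulk_rejections(nodes_stats):
    """Map node name -> number of rejected bulk requests."""
    rejected = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        bulk = node.get("thread_pool", {}).get("bulk", {})
        # Fall back to the node id when no name is present.
        rejected[node.get("name", node_id)] = bulk.get("rejected", 0)
    return rejected


# Hand-written sample in the assumed response shape.
sample = {
    "nodes": {
        "abc": {"name": "node-1", "thread_pool": {"bulk": {"rejected": 120}}},
        "def": {"name": "node-2", "thread_pool": {"bulk": {"rejected": 0}}},
    }
}
print(bulk_rejections(sample))
```

Rejections concentrated on one or two nodes would point at uneven shard
placement or a slow disk on those nodes rather than a cluster-wide limit.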

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Thu, Feb 5, 2015 at 1:51 PM, Hagai T hagai@gmail.com wrote:

 Hi Itamar, thank you for the reply.

 This is 15k inserts in total, not per host in the cluster.
 Yes, we have 15 shards within one index. Shards are spread equally across
 the nodes (automatically by the Elasticsearch cluster).
 We currently use general purpose SSD and not Ephemeral storage.

 In addition, I see a lot of thread pool bulk rejections from the
 Elasticsearch side.


 On Thursday, February 5, 2015 at 1:33:05 PM UTC+2, Itamar Syn-Hershko
 wrote:

 What is the question?...

 15k inserts per sec per node is actually quite nice.

 Are your index sharded? If you write to one index only, you write to
 maximum of x nodes where x is the number of shards of that index. Since
 shards of the same index can co-exist on one node, check if you are
 spanning writes.

 Use local disks - never EBS, and if you really care about writing speeds
 use SSDs.

 Other than that, Mike did an excellent write up on the subject :
 http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Lucene.NET committer and PMC member

 On Thu, Feb 5, 2015 at 1:26 PM, Hagai T haga...@gmail.com wrote:

 Hi Guys,

 We use ElasticSearch as our tracking system of our products in a dynamic
 to track performance.
 Searching of this data is being used by small group of users (10-12) in
 the company to measure performance.

 In our current environment, we see limit of 15,000 documents being
 inserted per second without the ability to scale.

 Some information on the current setup and  the flow:

 *- Tracking servers*
 8 x Apache servers behind Amazon ELB which serve empty html files so it
 tracks the parameters given and writes it to Apache access log.
 on each server, we also have Logstash which configured to read this
 access log file and send data to Elasticsearch cluster.

 *- Elasticsearch Cluster: *
 4 x r3.2xlarge (61.0 GB RAM, 8 cores) - contains one Elasticsearch
 process - 30GB Heap size
 1 x r3.4xlarge (122.0 GB RAM, 16 cores) - contains two Elasticsearch
 processes each with 30GB Heap size.


 Additional information on the Cluster:
 https://gist.github.com/hagait/61e3fac181ff413a8b8c#file-gistfile1-txt

 Cluster health:
 https://gist.github.com/hagait/d4f16d8f7b724f85b0ee#file-gistfile1-txt

 Logstash Configuration:
 https://gist.github.com/hagait/23f4b2bc614a4c4acbb6

 Elasticsearch configuration:
 https://gist.github.com/hagait/ba3684048abe2f9219b8

 Thank you for the support!
 Regards,
 Hagai




Re: Is there ever a reason to store _id?

2015-02-04 Thread Itamar Syn-Hershko
Setting fields to stored in Elasticsearch is generally not required and is
bad practice, since all fields are extracted from _source when they are
required, and _source benefits from block compression and more.

There are only a very few edge cases where this feature becomes helpful:
when you do not want to save the _source and instead enable stored for a
few fields (usually several small ones out of many).

HTH

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Wed, Feb 4, 2015 at 2:57 PM, Andrew White and...@datarank.com wrote:

 I wanted to give this a friendly bump and follow up with my experience.

 After doing some light testing I can't see a reason to ever store _id.
 Doing so inflated the index size and response objects but offered no
 improvements on scanning. So, at least for my case, it doesn't seem to make
 sense. Perhaps there is another use case I am missing.

 Thanks,
 Andrew White

 On Wednesday, January 21, 2015 at 7:11:38 AM UTC-6, Andrew White wrote:

 According to the documentation on _id
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-id-field.html
 it is possible to store _id but it never gives a reason why that would be
 useful.

 I have a use case where I am exporting all ids from ES using scan/scroll
 with no query. If I set the fields parameter to nothing/blank I get back
 the _id automatically. I assume this happens by parsing the _uid. If I
 store the _id I get back the _id in both the metadata section of the
 document and the fields property which seems redundant.

 I am a little unsure what ES does when a request with no fields and no
 query comes in. I assume it's scanning something (what?) and then fetching
 the metadata from somewhere (where?). If what it's scanning and what it's
 fetching from are the same thing then storing the _id seems moot.

 So, is there any performance advantage to storing the _id for scan/scroll
 requests, or in any specific case?



Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields

2015-02-03 Thread Itamar Syn-Hershko
Here's a working gist:

https://gist.github.com/synhershko/3d915a7819145f2d7a1f

You need to double-escape the backslashes - not sure if this is by design or
not, but that works now.
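
To make the double escaping concrete, here is a small Python sketch (it
does not reproduce the gist). It assumes that the backslash is the escape
character of the wildcard syntax, so a literal backslash in the field value
is doubled in the pattern, and JSON encoding then doubles each of those
again. The helper name wildcard_query is made up for the example.

```python
import json


def wildcard_query(field, literal_pattern):
    """Build a wildcard query body, doubling literal backslashes.

    Assumption: the wildcard syntax treats a backslash as an escape
    character, so a literal backslash has to be written twice.
    """
    escaped = literal_pattern.replace("\\", "\\\\")
    return {"wildcard": {field: escaped}}


# The value as it is indexed, with single backslashes and a * wildcard.
pattern = r"\Windows Azure Caching:Client(w3wp_*)\Failure Exceptions"

# json.dumps escapes every backslash once more for the JSON string literal.
body = json.dumps(wildcard_query("CounterName", pattern))
print(body)
```

The printed request body therefore contains four backslashes wherever the
indexed value has one.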

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Feb 3, 2015 at 7:56 PM, Ali Kheyrollahi alios...@gmail.com wrote:

 Wildcard does not work either.
 {"wildcard":{"CounterName":"\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions"}}

 And regardless, Regexp does not work, so in its own right it is a bug.
 Can you please help open the issue on GitHub? Already have an issue which
 was closed:

 https://github.com/elasticsearch/kibana/issues/2698


 On Tuesday, 3 February 2015 13:42:11 UTC, Itamar Syn-Hershko wrote:

 Thinking of it, I'm not sure why you are using regexp here - can you just
 use wildcard query instead?
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Tue, Feb 3, 2015 at 12:00 PM, Ali Kheyrollahi alio...@gmail.com
 wrote:

 No it doesn't which has been my experience:

 {"regexp":{"CounterName":"\\Windows Azure Caching:Client\\(w3wp_.*\\)\\Failure Exceptions"}}
 or
 {"regexp":{"CounterName":"\\Windows Azure Caching\\:Client\\(w3wp_.*\\)\\Failure Exceptions"}}

 None of them work



Re: Persisting Aggregations

2015-02-03 Thread Itamar Syn-Hershko
The aggregations framework doesn't allow for persisting results, mainly
because it is targeted at real-time data that can still change, but it does
support caching as of 1.4. That is, if you issue the same query and
aggregations request again and again, you will be served directly from
cache, as long as the data hasn't changed.

That is to say, if you care about performance, the caching layer should be
the answer. If you need other things (point in time view of data, further
processing, etc) you will need to store the results back to ES or other
storage as a document.
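
A minimal Python sketch of the "store the results back as a document"
option follows. It only reshapes an aggregation result into a document that
could then be indexed with whatever client is in use; the function name
aggregation_snapshot, the document layout, and the sample response are all
assumptions made for the example.

```python
import json
from datetime import datetime, timezone


def aggregation_snapshot(agg_name, agg_result, taken_at=None):
    """Turn one bucket aggregation result into a point-in-time document."""
    taken_at = taken_at or datetime.now(timezone.utc)
    return {
        "aggregation": agg_name,
        "taken_at": taken_at.isoformat(),
        "buckets": [
            {"key": b["key"], "doc_count": b["doc_count"]}
            for b in agg_result.get("buckets", [])
        ],
    }


# Hypothetical fragment of a search response containing a terms aggregation.
response = {
    "aggregations": {
        "by_role": {
            "buckets": [
                {"key": "web", "doc_count": 12},
                {"key": "worker", "doc_count": 3},
            ]
        }
    }
}

doc = aggregation_snapshot(
    "by_role",
    response["aggregations"]["by_role"],
    taken_at=datetime(2015, 2, 3, tzinfo=timezone.utc),
)
print(json.dumps(doc, indent=2))
```

The resulting document can be indexed into a dedicated index (or any other
store) and queried later without re-running the aggregation.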

HTH

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Feb 3, 2015 at 11:21 AM, AndrewK kenworth...@gmail.com wrote:

 I've not yet used the aggregations framework, but one question that has
 come up recently with contacts and prospective clients is how best to
 persist aggregations in ElasticSearch for repeated use.

 If I have understood the documentation correctly, the aggregation
 framework does a pretty good job of using shard caching to make
 repeated-or-similar queries as efficient as possible, but it would -
 presumably - be even better if static results (i.e. which will hardly
 ever - or never - change) could be persisted in some way (in a dedicated
 index, for example).

 Is this possible internally (i.e. to GET an aggregation result and POST
 it in one call) or would one simply have to extract the desired data and
 then post it oneself?

 Regards, Andrew



Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields

2015-02-02 Thread Itamar Syn-Hershko
Can you try executing a simple term query in JSON using that query bar?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 2, 2015 at 11:57 PM, Ali Kheyrollahi alios...@gmail.com wrote:

 Thanks for responding.

 It is *surely* not_analyzed - hence my frustration. Here is the mapping


 {
   "my_index": {
     "mappings": {
       "my_type": {
         "properties": {
           "@timestamp": {
             "type": "date",
             "format": "dateOptionalTime"
           },
           "CounterName": {
             "type": "string",
             "index": "not_analyzed"
           },
           "CounterValue": {
             "type": "double"
           },
           "DeploymentId": {
             "type": "string",
             "index": "not_analyzed"
           },
           "EventTickCount": {
             "type": "long"
           },
           "PartitionKey": {
             "type": "string",
             "index": "not_analyzed"
           },
           "Role": {
             "type": "string",
             "index": "not_analyzed"
           },
           "RoleInstance": {
             "type": "string",
             "index": "not_analyzed"
           },
           "RowKey": {
             "type": "string",
             "index": "not_analyzed"
           }
         }
       }
     }
   }
 }


 On Monday, 2 February 2015 13:20:49 UTC, Itamar Syn-Hershko wrote:

 It looks like your field is analyzed and you are trying to query it
 assuming it's not_analyzed (i.e. one string). Hard to say without seeing
 your index mapping.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Lucene.NET committer and PMC member

 On Mon, Feb 2, 2015 at 3:08 PM, Ali Kheyrollahi alio...@gmail.com
 wrote:

 Any help please??

 On Saturday, 31 January 2015 09:56:38 UTC, Ali Kheyrollahi wrote:

 Hi,

 I really haven't found a consistent way to use query window in Discover
 or Visualize tabs. My results become hit and miss and inconsistent.

 So I am searching for types of my_type  and I have a field called
 CounterName and I am looking for \Windows Azure
 Caching:Client(w3wp_2392)\Total Local Cache Hits

 Funny thing is searching for verbatim value does not work:
 CounterName\Windows Azure Caching:Client(w3wp_2392)\Total Local Cache
 Hits
 And I have to escape only backslashes (well I am using double quotes so
 it is literal, no?) and not brackets or colon:
 CounterName\\Windows Azure Caching:Client(w3wp_2392)\\Total Local
 Cache Hits

 Now, the 2392 number here is variable (pid on the box) so I am trying
 to look for \Windows Azure Caching:Client(w3wp_*)\Total Local Cache
 Hits and I have tried all these to no avail:

 CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache
 Hits
 CounterName:\\Windows Azure Caching:Client(w3wp_\*)\\Total Local Cache
 Hits
 CounterName:\Windows Azure Caching:Client(w3wp_*\Total Local Cache
 Hits (nothing comes back)

 And also tried regex:

 CounterName:/\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache
 Hits/
 CounterName:/\Windows Azure Caching:Client(w3wp_.*)\\Total Local Cache
 Hits/
 ...

 With many different combinations of replacing reserved chars with ?.

 What am I doing wrong?

 Thanks




Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields

2015-02-02 Thread Itamar Syn-Hershko
inline

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Feb 3, 2015 at 1:32 AM, Ali Kheyrollahi alios...@gmail.com wrote:

 This *works* (exact value)

 {"term":{"CounterName":"\\Windows Azure Caching:Client(w3wp_5412)\\Failure Exceptions"}}


As expected


 But NOT this:
 {"term":{"CounterName":"Caching"}}
 Nor
 {"term":{"CounterName":"\\Windows Azure Caching:Client(w3wp_.*)\\Failure Exceptions"}}
 Or this
 {"term":{"CounterName":"\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions"}}


As expected too - a term query takes the entire string and looks for
documents containing exactly that term. .* has no meaning in this context;
it's just a different string than the original, hence no hits.
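
The behaviour can be mimicked with a toy model in Python. This is not how
Elasticsearch is implemented; analyze_standard is a crude stand-in for an
analyzer (lowercase, split on anything that is not a letter, digit or
underscore), and all function names are made up for the illustration.

```python
import re


def analyze_standard(text):
    """Crude stand-in for an analyzer: lowercase and split into word tokens."""
    return [t for t in re.split(r"[^0-9a-z_]+", text.lower()) if t]


def index_terms(value, analyzed):
    """Terms that end up in the index for one field value."""
    # not_analyzed: the whole value is indexed as a single term.
    return set(analyze_standard(value)) if analyzed else {value}


def term_query_matches(query, value, analyzed):
    """A term query matches only if the query string equals an indexed term."""
    return query in index_terms(value, analyzed)


value = r"\Windows Azure Caching:Client(w3wp_5412)\Failure Exceptions"
with_regex = r"\Windows Azure Caching:Client(w3wp_.*)\Failure Exceptions"

print(term_query_matches(value, value, analyzed=False))       # exact value
print(term_query_matches("Caching", value, analyzed=False))   # part of the value
print(term_query_matches(with_regex, value, analyzed=False))  # .* is just text
print(term_query_matches("caching", value, analyzed=True))    # analyzed field
```

Only the first and the last check succeed: the not_analyzed field holds one
big term, while an analyzed field would hold the individual lowercased words.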



 And *not even* this
 {"regexp":{"CounterName":"\\Windows Azure Caching:Client(w3wp_.*)\\Failure Exceptions"}}
 or
 {"regexp":{"CounterName":"\\Windows Azure Caching:Client(w3wp_.+)\\Failure Exceptions"}}
 or
 {"regexp":{"CounterName":"\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions"}}


I believe you should escape the parentheses; they are being parsed as a
regex grouping. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
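
A small Python sketch of that escaping follows. The RESERVED set is my
reading of the reserved characters listed in the regexp syntax page linked
above and should be checked against it; escape_regexp is a made-up helper,
not part of any client library.

```python
import json

# Assumed set of reserved regexp characters; verify against the linked docs.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_regexp(text):
    """Backslash-escape every reserved character in a literal fragment."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


# Literal parts of the counter name, around the variable process id.
prefix = r"\Windows Azure Caching:Client(w3wp_"
suffix = r")\Failure Exceptions"

# Only the literal parts are escaped; the .* in the middle stays a regex.
pattern = escape_regexp(prefix) + ".*" + escape_regexp(suffix)
body = json.dumps({"regexp": {"CounterName": pattern}})

print(pattern)
print(body)
```

The first line printed is the regular expression itself; the second is the
JSON request body, where each backslash is doubled once more by the JSON
encoding.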




Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields

2015-02-02 Thread Itamar Syn-Hershko
It looks like your field is analyzed and you are trying to query it
assuming it's not_analyzed (i.e. one string). Hard to say without seeing
your index mapping.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Feb 2, 2015 at 3:08 PM, Ali Kheyrollahi alios...@gmail.com wrote:

 Any help please??

 On Saturday, 31 January 2015 09:56:38 UTC, Ali Kheyrollahi wrote:

 Hi,

 I really haven't found a consistent way to use query window in Discover
 or Visualize tabs. My results become hit and miss and inconsistent.

 So I am searching for types of my_type  and I have a field called
 CounterName and I am looking for \Windows Azure
 Caching:Client(w3wp_2392)\Total Local Cache Hits

 Funny thing is searching for verbatim value does not work:
 CounterName\Windows Azure Caching:Client(w3wp_2392)\Total Local Cache
 Hits
 And I have to escape only backslashes (well I am using double quotes so
 it is literal, no?) and not brackets or colon:
 CounterName\\Windows Azure Caching:Client(w3wp_2392)\\Total Local Cache
 Hits

 Now, the 2392 number here is variable (pid on the box) so I am trying to
 look for \Windows Azure Caching:Client(w3wp_*)\Total Local Cache Hits and
 I have tried all these to no avail:

 CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache
 Hits
 CounterName:\\Windows Azure Caching:Client(w3wp_\*)\\Total Local Cache
 Hits
 CounterName:\Windows Azure Caching:Client(w3wp_*\Total Local Cache
 Hits (nothing comes back)

 And also tried regex:

 CounterName:/\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache
 Hits/
 CounterName:/\Windows Azure Caching:Client(w3wp_.*)\\Total Local Cache
 Hits/
 ...

 With many different combinations of replacing reserved chars with ?.

 What am I doing wrong?

 Thanks





Re: Implementing search as you type example

2015-02-01 Thread Itamar Syn-Hershko
For implementing good autocomplete I recommend looking at the completion
suggester. It is much faster and has more capabilities, and it was built
specifically for that purpose.

See http://www.elasticsearch.org/blog/you-complete-me/ and
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html

You can then complement it with the phrase suggester to recommend spelling
corrections and the like.

Edge n-grams are less than ideal for this use case given the above tools.
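
For orientation, here is a rough Python sketch of what the pieces of a
completion suggester setup look like as request bodies, following the 1.x
examples in the pages linked above. The field name name_suggest, the
suggestion name name-suggest, and the to_document helper are assumptions
for the example; please check the exact options against the documentation
for your version.

```python
import json

# Mapping with a dedicated completion field next to the regular name field.
mapping = {
    "my_type": {
        "properties": {
            "name": {"type": "string"},
            "name_suggest": {"type": "completion"},
        }
    }
}


def to_document(name):
    """Build a document whose suggestions can start at any word of the name."""
    words = name.split()
    return {
        "name": name,
        "name_suggest": {
            # Inputs are the strings a user may start typing.
            "input": [name] + words[1:],
            # Output is what is shown for any of those inputs.
            "output": name,
        },
    }


def suggest_request(text):
    """Body for a suggest request against the completion field."""
    return {"name-suggest": {"text": text, "completion": {"field": "name_suggest"}}}


doc = to_document("Brown foxes")
print(json.dumps(mapping))
print(json.dumps(doc))
print(json.dumps(suggest_request("bro")))
```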

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Sun, Feb 1, 2015 at 7:12 PM, Craig Ching craigch...@gmail.com wrote:

 Hi,

 I'm trying to implement the search as you type example from
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html

 Can someone see what I'm doing wrong?

 curl -XDELETE localhost:9200/my_index
 echo
 curl -XPUT localhost:9200/my_index -d '
 {
   "settings": {
     "number_of_shards": 1,
     "analysis": {
       "filter": {
         "autocomplete_filter": {
           "type": "edge_ngram",
           "min_gram": 1,
           "max_gram": 20
         }
       },
       "analyzer": {
         "autocomplete": {
           "type": "custom",
           "tokenizer": "standard",
           "filter": [
             "lowercase",
             "autocomplete_filter"
           ]
         }
       }
     }
   }
 }'
 echo
 curl -XPUT localhost:9200/my_index/_mapping/my_type -d '
 {
   "my_type": {
     "properties": {
       "name": {
         "type": "string",
         "analyzer": "autocomplete"
       }
     }
   }
 }'
 echo
 curl localhost:9200/my_index/my_type/_bulk -d '
 { "index": { "_id": 1}}
 { "name": "Brown foxes"}
 { "index": { "_id": 2}}
 { "name": "Yellow furballs" }
 '
 echo
 curl localhost:9200/my_index/my_type/_search -d '
 {
   "query": {
     "match": {
       "name": "brown fo"
     }
   }
 }'
 echo
 curl localhost:9200/my_index/my_type/_validate/query?explain -d '
 {
   "query": {
     "match": {
       "name": "brown fo"
     }
   }
 }'
 echo
 curl localhost:9200/my_index/my_type/_search -d '
 {
   "query": {
     "match": {
       "name": {
         "query": "brown fo",
         "analyzer": "standard"
       }
     }
   }
 }'
 echo
 curl localhost:9200/my_index/my_type/_validate/query?explain -d '
 {
   "query": {
     "match": {
       "name": {
         "query": "brown fo",
         "analyzer": "standard"
       }
     }
   }
 }'
 echo
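
The difference between the two search variants in the script above can be
modelled with a toy Python sketch. It is a simplification (whitespace
tokenizing, any shared term counts as a match) and not Elasticsearch code;
the helper names are made up. It shows why analyzing the query with the
same edge n-gram analyzer matches more documents than analyzing it with a
standard-like analyzer.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """All prefixes of a token between min_gram and max_gram characters."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]


def autocomplete_analyze(text):
    """Lowercase, split on whitespace, then expand every token to edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token))
    return terms


def standard_analyze(text):
    """Lowercase and split on whitespace only."""
    return text.lower().split()


docs = {1: "Brown foxes", 2: "Yellow furballs"}
# Index time: every name is expanded to its edge n-grams.
index = {doc_id: set(autocomplete_analyze(name)) for doc_id, name in docs.items()}


def search(query_terms):
    """Ids of documents sharing at least one term with the query."""
    return [doc_id for doc_id in sorted(index) if set(query_terms) & index[doc_id]]


hits_autocomplete = search(autocomplete_analyze("brown fo"))
hits_standard = search(standard_analyze("brown fo"))
print(hits_autocomplete)
print(hits_standard)
```

With the n-gram analyzer on the query side, the single letter "f" is one of
the query terms, so "Yellow furballs" matches as well; with the
standard-like analyzer only "Brown foxes" matches.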





Re: synonym dictionaries of person names

2015-02-01 Thread Itamar Syn-Hershko
Was it raw POS-tagged data or just raw data? Can you share the code /
process you used?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Thu, Jan 29, 2015 at 3:34 PM, Mark Harwood 
mark.harw...@elasticsearch.com wrote:

 I've built one before from raw data but you need:
 1) a *lot* of data
 2) a unique ID per person
 3) some noise/variation in the names recorded for each person

 The input is of this form:

 personID   recorded_name
 ========   =============
 1          Rob
 1          Robert
 1          Bob
 2          Dave
 2          David
 2          Alice
 ...

 The output is a weighted graph of name variants, e.g. Robert == Bob with a
 strong confidence rating.
 Using this I know not just real names but also typos, e.g. that Janes is
 more likely to be James than Jane (a common typo due to key locations
 on the keyboard).
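
One possible way to build such a graph is sketched below in Python. This is
not the code described in the message, only a guess at the general idea:
names recorded for the same person ID are linked, and the edge weight is
the number of people for whom both names were seen. The third person in the
sample data is invented to show a weight above one.

```python
from collections import defaultdict
from itertools import combinations


def variant_graph(records):
    """Count, for every pair of names, how many people were recorded with both."""
    names_by_person = defaultdict(set)
    for person_id, name in records:
        names_by_person[person_id].add(name)

    weights = defaultdict(int)
    for names in names_by_person.values():
        # Sorting gives each unordered pair one canonical (a, b) key.
        for a, b in combinations(sorted(names), 2):
            weights[(a, b)] += 1
    return dict(weights)


records = [
    (1, "Rob"), (1, "Robert"), (1, "Bob"),
    (2, "Dave"), (2, "David"), (2, "Alice"),
    (3, "Bob"), (3, "Robert"),  # invented extra person
]

graph = variant_graph(records)
for pair, weight in sorted(graph.items(), key=lambda kv: -kv[1]):
    print(pair, weight)
```

With enough data, pairs caused by noise (such as Alice and Dave here) keep
a low weight, while genuine variants accumulate a high one.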




 On Thursday, January 29, 2015 at 5:28:33 AM UTC, David Kemp wrote:

 I am looking for synonym dictionaries of person names that I can use with
 the Elasticsearch synonym analyser.
 e.g. dictionaries that map Ted to Edward, and Bill to William.
 I am curious to know what others are using.
 So far I have found these two possible sources:

 https://code.google.com/p/nickname-and-diminutive-names-lookup/downloads/list
 https://github.com/DallanQ/Names/wiki/Name-variant-files

 And perhaps
 http://www.behindthename.com

 Thanks,
 David



Re: Kibana - IIS 7.5

2015-01-26 Thread Itamar Syn-Hershko
You may want to give this a try:
https://github.com/synhershko/KibanaDotNet/tree/owin

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Mon, Jan 26, 2015 at 3:58 PM, GWired garrettcjohn...@gmail.com wrote:

 I was able to get Kibana setup on my localhost and did a generic entry to
 allow everything into the elasticsearch.yml

 http.cors.allow-origin: /.*/

 Now I'm trying to get it to run on my remote server running IIS 7.5 on
 port 8080.

 The page loads, but only the top bar appears and nothing else. Any ideas?



Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?

2015-01-20 Thread Itamar Syn-Hershko
Famous last words :)

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Jan 20, 2015 at 11:11 AM, Mark Harwood 
mark.harw...@elasticsearch.com wrote:

  it doesn't seem like this would address the IDF

 Trust me, I wrote it.


 On Tuesday, January 20, 2015 at 12:16:44 AM UTC, kasper...@yahoo.com
 wrote:

 Thanks Mark. Sounds like this issue affects a lot of people.

 I looked at your suggestion about FLT, and the ignore_tf parameter should
 help, however unless I'm missing something, it doesn't seem like this would
 address the IDF, and results could be biased. But I will experiment.

 Ultimately I think what my particular use case requires is a scorer that
 only uses edit distance (when querying with fuzziness) and field boosts,
 but no TF / IDF.


 On Monday, January 19, 2015 at 3:15:47 PM UTC-8, Mark Harwood wrote:

 This issue rounds up a bunch of related issues that have been raised
 previously: https://github.com/elasticsearch/elasticsearch/issues/9103

 For now try FuzzyLikeThis
 (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-flt-query.html#query-dsl-flt-query)
 It blends More Like This and fuzzy functionality but includes the
 adjustments to IDF that I think make more sense than the other
 implementations with their bias towards rewarding scarcity.


 On Monday, January 19, 2015 at 6:48:49 PM UTC, kasper...@yahoo.com
 wrote:

 I have the same problem, where some results with higher edit distance
 are ranked higher than other results that are closer in terms of edit
 distance.

 I suspect it does have to do with document frequency, as you think
 Adrien. In my case I want to ignore document frequency completely. Any
 suggestion to achieve this?

 I'm a taker of any solution as this looks like a show stopper for us,
 so even a workaround would help.

 I can try to create this other rewrite method you mentioned if you
 could point me in the right direction.

 Thanks

 On Thursday, January 15, 2015 at 7:44:57 AM UTC-8, Adrien Grand wrote:

 This is because the score takes two factors into account: the document
 frequency and the edit distance. Quite likely in your case, even though
 Boss is closer than Bose, Bose has a much lower document frequency which
 helped it eventually get a better score. I guess we should have another
 rewrite method that would not take freqs into account (or somehow merge
 them) to avoid that issue.
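
The effect can be illustrated with a deliberately simplified Python model.
It multiplies an inverse document frequency term by a boost that shrinks
with the edit distance; real Lucene scoring has more factors, and the
document counts below are invented purely for the illustration.

```python
import math


def idf(doc_freq, num_docs):
    """Classic TF/IDF-style inverse document frequency."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def fuzzy_boost(edits, query_len, term_len):
    """Boost that decreases as the edit distance grows."""
    return 1.0 - edits / min(query_len, term_len)


def toy_score(doc_freq, num_docs, edits, query_len, term_len):
    """Toy score: rarity of the matched term times closeness to the query."""
    return idf(doc_freq, num_docs) * fuzzy_boost(edits, query_len, term_len)


# Invented numbers: "boss" is common in the index, "bose" is rare.
boss = toy_score(doc_freq=500, num_docs=1000, edits=0, query_len=4, term_len=4)
bose = toy_score(doc_freq=5, num_docs=1000, edits=1, query_len=4, term_len=4)

print(round(boss, 3), round(bose, 3))
```

In this model the rare term wins even though it is one edit away, because
its much higher inverse document frequency outweighs the reduced boost.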

 On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner eylon@gmail.com
 wrote:

 Any ideas?






 --
 Adrien Grand



Re: Elasticsearch at Google Cloud Engine

2015-01-20 Thread Itamar Syn-Hershko
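This requires the transport port 9300 (TCP, not UDP) to be open on the cloud
for you, and your client code to set the cluster name correctly. Also note
that a node client has to discover the cluster; if it relies on multicast
discovery (which does use UDP), that typically does not work across a cloud
network.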
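
To check from the client machine whether the transport port is reachable at
all, a plain TCP connect test is often enough. The Python sketch below
assumes the transport protocol runs over TCP; the helper name tcp_port_open
is made up, and the demo talks to a throwaway local listener rather than to
a real cluster.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo against a temporary local listener instead of a real node.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

print(tcp_port_open("127.0.0.1", demo_port))
listener.close()
```

Against a real deployment the call would be tcp_port_open(ip, 9300) with
the address of one of the nodes; a False result points at a firewall or
network rule rather than at the client code.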

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Jan 20, 2015 at 1:52 PM, Klausen Schaefersinho 
klaus.schaef...@gmail.com wrote:

 Hi,


 I have just used the click-to-deploy feature to set up a small Elasticsearch
 cluster. That seems to have worked fine and I can connect to the cluster
 over REST. For instance, curl http://ip:9200 will return


 {
   "status" : 200,
   "name" : "elasticsearch-8tqw",
   "cluster_name" : "my_elasticsearch-cluster",
   "version" : {...},
   "tagline" : "You Know, for Search"
 }


 So I assume the cluster name is my_elasticsearch-cluster. However, if I
 try to connect to the cluster using the Java node client, the client takes
 really long to join the cluster, and if I try to perform a health check, just
 to check if I am really connected, I get the following exception:

 org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
     at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$4.onTimeout(TransportMasterNodeOperationAction.java:164)
     at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
     at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:497)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:745)


 Why does this happen, and how can I solve it? Is there something not
 correctly configured in my network or should I use the transport client?


 Thanks!

 Klaus



Re: How can I sort results by _id?

2015-01-15 Thread Itamar Syn-Hershko
No, an ID has to be a string
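
Since the _id is compared as a string, the odd ordering described in the
quoted messages below is what lexicographic comparison produces. A small
sketch of the difference follows; the sample values are made up, and the
zero-padding helper is just one possible workaround, not something the thread
prescribes (indexing the number in a separate numeric field and sorting on
that is another).

```python
# Made-up numeric-looking ids, stored as strings the way _id is.
ids = ["9", "10", "123", "199989", "2"]

# String comparison goes character by character.
lexicographic = sorted(ids)

# Comparing the parsed integers gives the order the poster expected.
numeric = sorted(ids, key=int)


def pad(doc_id, width=10):
    """Left-pad with zeros so string order matches numeric order."""
    return doc_id.zfill(width)


padded = sorted(pad(i) for i in ids)

# A string range such as "1" < x < "2" also matches "199989".
in_string_range = "1" < "199989" < "2"
```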

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Jan 15, 2015 at 12:12 PM, Jason Zhang moc...@gmail.com wrote:

 Can I specify its type as integer in _mapping? Because the _id I use is
 rewritten.

 On Thursday, January 15, 2015 at 6:07:22 PM UTC+8, Adrien Grand wrote:

 This is because the _id is a string field, so comparison is based on the
 lexicographical order, not numeric.

 On Thu, Jan 15, 2015 at 11:04 AM, Jason Zhang moc...@gmail.com wrote:

 What confuses me is that the 'sorted' results are still partly unordered.

 Also, if I query:

 {  range: {
 _id: {
   gt: 1,
   lt: 1}}}

 the results contain _id: 199989.

 On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote:

 Making it index:not_analyzed should work, what is the issue with the
 results?

 Note that loading the _id in fielddata is typically very costly since
 the _id field is typically unique per document.

 On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote:

 I use a query dsl like:

 {
   filter: {
 exists: { field: info }
   },
   sort: { _id: desc }
 }

 And the _id here is an integer like '123'.

 But the result is like:

 {
   took: 50,
   ...
   hits: {
 ...
 hits: [
   {
 ...
 sort: [ null ]
   }]
   }
 }

 Also, I've tried adding _id: { index: not_analyzed } to the
 _mapping.
 This time the sort section returns values. But I find the results
 are still partly unordered.

 Can I sort results by _id? How?

 Thank you.





 --
 Adrien Grand





 --
 Adrien Grand



Re: tuning elasticsearch node client non-heap memory consumption

2015-01-13 Thread Itamar Syn-Hershko
Why would you want that?

Locking heap memory usage is done by Elasticsearch on data nodes to reduce
GC rounds, mainly because it loads a lot of data that is best managed by ES
itself.

On client nodes you don't need that (and if you did, you wouldn't be using
such small heap sizes)
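
For reference, the arithmetic behind the question can be redone from the node
stats quoted below. This is only a sketch: the figures are copied from that
output, and what the remainder consists of (typically thread stacks, direct
buffers and the JVM's own native allocations) is an assumption, since the
stats themselves do not break it down.

```python
# Values copied from the node stats quoted in this thread.
stats = {
    "resident_in_bytes": 676884480,
    "heap_committed_in_bytes": 389283840,
    "non_heap_committed_in_bytes": 44564480,
}

# Resident memory that is not part of the committed Java heap.
outside_heap = stats["resident_in_bytes"] - stats["heap_committed_in_bytes"]

# What is left after subtracting the JVM's reported non-heap pools.
unaccounted = outside_heap - stats["non_heap_committed_in_bytes"]

print("outside heap: %.1f MB" % (outside_heap / 1e6))
print("not covered by JVM non-heap pools: %.1f MB" % (unaccounted / 1e6))
```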

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Jan 13, 2015 at 12:37 PM, Itai Frenkel itaifren...@live.com wrote:

 Hello,

 We are running a node client on each machine with a small JVM heap (-Xms384m
 -Xmx384m -Xss256k), which is suitable for our use case.
 There is, however, another ~287MB of memory outside the heap (676-389=287).
 How can this extra non-heap memory usage be configured? What is it used
 for?
 Below are the relevant node stats.

 Regards,
 Itai

 process: {
 open_file_descriptors: 340,
 mem: {
   resident_in_bytes: 676884480,
   share_in_bytes: 23248896,
   total_virtual_in_bytes: 1696899072
 }
   },
 jvm: {
 mem: {
   heap_used_in_bytes: 44794784,
   heap_used_percent: 11,
   heap_committed_in_bytes: 389283840,
   heap_max_in_bytes: 389283840,
   non_heap_used_in_bytes: 44208640,
   non_heap_committed_in_bytes: 44564480,
   pools: {
 young: {
   used_in_bytes: 13765016,
   max_in_bytes: 107479040,
   peak_used_in_bytes: 107479040,
   peak_max_in_bytes: 107479040
 },
 survivor: {
   used_in_bytes: 5086896,
   max_in_bytes: 13369344,
   peak_used_in_bytes: 13369344,
   peak_max_in_bytes: 13369344
 },
 old: {
   used_in_bytes: 25942872,
   max_in_bytes: 268435456,
   peak_used_in_bytes: 25942872,
   peak_max_in_bytes: 268435456
 }
   }
 },



Re: Join between two different sources using Kibana 4

2015-01-12 Thread Itamar Syn-Hershko
You either use parent / child
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/parent-child.html

Or index denormalized data in the first place

Elasticsearch isn't meant to be used with the same data models as relational
databases
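
As an illustration of the second option, the join can be done once at index
time, so that each job document already carries its matching license events.
The sketch below uses the sample records from the question quoted below; the
month/day/year date format, the function names and the license_events field
name are assumptions made for the example.

```python
from datetime import datetime

FMT = "%m/%d/%Y %H:%M"  # assumed format of the sample timestamps

jobs = [
    {"jobid": 1234, "user": "john", "host": "myhost", "exitstatus": -3002,
     "starttime": "01/01/2015 01:01", "finishtime": "01/01/2015 01:15"},
]
licenses = [
    {"user": "john", "host": "myhost", "time": "01/01/2015 01:05",
     "feature": "AAA", "result": "DENIED"},
    {"user": "john", "host": "myhost", "time": "01/01/2015 01:07",
     "feature": "BBB", "result": "APPROVED"},
]


def parse(ts):
    return datetime.strptime(ts, FMT)


def denormalize(job, license_events):
    """Return a copy of the job with its matching license events embedded."""
    start, finish = parse(job["starttime"]), parse(job["finishtime"])
    matching = [
        {"feature": ev["feature"], "result": ev["result"], "time": ev["time"]}
        for ev in license_events
        if ev["user"] == job["user"]
        and ev["host"] == job["host"]
        and start <= parse(ev["time"]) <= finish
    ]
    doc = dict(job)
    doc["license_events"] = matching  # hypothetical field name
    return doc


docs = [denormalize(job, licenses) for job in jobs]
```

Each resulting document can then be indexed as-is and filtered or aggregated
without a query-time join.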

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Mon, Jan 12, 2015 at 9:36 PM, Gregory Touretsky 
gregory.touret...@intel.com wrote:

 Hi,

 What would be the right way to join two data sources using the
 Kibana 4 interface?
 Assume 2 data sources:
 1. source=jobs,  fields = {jobid, user, host, exitstatus,
 starttime,finishtime}
 Sample record:
  type = jobs;  jobid = 1234; user = john; host = myhost; exitstatus =
 -3002; starttime = 01/01/2015 01:01; finishtime = 01/01/2015  01:15
 2. source=license, fields = {host, user, time, feature, result}
 Sample records:
  type = license;  user = john; host = myhost; time = 01/01/2015 01:05;
 feature = AAA; result = DENIED
  type = license;  user = john; host = myhost; time = 01/01/2015 01:07;
 feature = BBB; result = APPROVED

 I’d like to create a dashboard in Kibana 4 which would show a joint table
 combining both sources.
 Using pseudo-SQL code, it should do something like:

 select
 jobs.jobid,jobs.user,jobs.host,license.feature,license.result,count(license.time)
 from jobs
 LEFT JOIN license
 WHERE jobs.exitstatus=-3002 AND license.user=jobs.user AND
 license.host=jobs.host AND license.time>=jobs.starttime AND
 license.time<=jobs.finishtime
 GROUP BY jobs.jobid,jobs.user,jobs.host

 Thanks in advance,
Gregory



Re: Searching with Elasticsearch.Net

2015-01-09 Thread Itamar Syn-Hershko
If all you need is querying, I'd highly recommend looking at
https://github.com/CenturyLinkCloud/ElasticLINQ for .NET

I also have my own stab at a .NET client library for Elasticsearch here:
https://github.com/synhershko/NElasticsearch /
https://www.nuget.org/packages/NElasticsearch/1.0.14 (still WIP)
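
Independent of the client library, the request body from the question quoted
below can be sketched like this. Note that the quoted request was sent with
from:10 while the response reports total:10, which is probably why the hits
array came back empty; starting at page 0 returns the first page. The helper
names here are made up for the example, and the response handling assumes the
usual hits.hits layout of a search response.

```python
def build_search_body(text, page=0, size=10):
    """Build a query_string search against _all for the given page."""
    return {
        "from": page * size,  # page 0 starts at the first hit
        "size": size,
        "query": {
            "query_string": {"default_field": "_all", "query": text}
        },
    }


def iter_sources(response):
    """Yield (id, source) pairs from a parsed search response dict."""
    for hit in response.get("hits", {}).get("hits", []):
        yield hit["_id"], hit.get("_source", {})


# Hand-written sample response, only to show the iteration.
sample = {"hits": {"total": 1, "hits": [
    {"_id": "1", "_source": {"name": "Garrett"}},
]}}
for doc_id, source in iter_sources(sample):
    print(doc_id, source)
```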

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Fri, Jan 9, 2015 at 7:54 PM, Garrett Johnson garrettcjohn...@gmail.com
wrote:

 { query = this.textBox1.Text,
   default_field = "_all"
 },

 This seems to send it.  Still don't know how to get the results.

 It's like it should be in a foreach(hit in hits) {get fields}  but I have
 nothing and documentation isn't helping.


 On Friday, January 9, 2015 at 12:43:36 PM UTC-5, Garrett Johnson wrote:

 Hi All,

 I would like to use Elasticsearch.Net (NEST requires types and I do not
 want strong types) to do a simple _all term search.  I can do this using
 the plugin elasticsearch head and I retrieve the appropriate documents.
 Here is some simple code I wrote just to say hey give me all that match.

 var node = new Uri("http://myhost:9200");
 var config = new ConnectionConfiguration(node);
 var exposed = config.ExposeRawResponse(true);
 var client = new ElasticsearchClient(config);

 var search = new
 {
     size = 10,
     from = 1 * 10, // skips the first 10 hits
     query = new { query_string = new { query = this.textBox1.Text } },
 };

 var searchResponse = client.Search("jdbc", search);

 This returns these results:

 {StatusCode: 200,
  Method: POST,
  Url: http://u4vmeqlditapp01:9200/jdbc/_search,
  Request: {"size":10,"from":10,"query":{"query_string":{"query":"Garrett"}}},

  Response: {"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":10,"max_score":1.1672286,"hits":[]}}}

 But no documents.

 Here is the JSON I'm trying to replicate:



 {
   "query": {
     "bool": {
       "must": [
         {
           "query_string": {
             "default_field": "_all",
             "query": "Garrett"
           }
         }
       ],
       "must_not": [ ],
       "should": [ ]
     }
   },
   "from": 0,
   "size": 25000,
   "sort": [ ],
   "facets": { }
 }


 I'm pretty sure it is because the query doesn't have default_field
 set to _all, but I don't know how to set that.  I've tried several string
 concatenations to no avail; it just searched for them as literal text.

 Does anyone have any ideas?  I simply want to search all types for a single
 string.

 Garrett







Re: Bucket query results | top hits performance

2015-01-06 Thread Itamar Syn-Hershko
Can you share the query and example results please?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Jan 6, 2015 at 10:11 PM, Michael Irani irani.mich...@gmail.com
wrote:

 Hello,
 I'm working on a corpus of size approximately 10 million documents. The
 issue I'm running into right now is that the top scoring documents that
 come back from my query are essentially all the same result. I'm trying to
 find a way to get back unique results.

 I've looked into modeling the data differently with nested objects or
 parent-child relationships, but neither layout seems to fit the bill. The
 nested model won't work because some of the documents have too many closely
 related objects. On the flip side there are also too many unique documents
 for the parent-child relationship to fit.

 I then tried the top hits aggregation and it's exactly what I'm looking
 for, except the running time of the query is approximately 30x slower than
 the query without the aggregation. Are there known performance issues with
 top hits? Any ideas on what I should use to make these queries? Here's
 the aggregation piece:
 "aggs": {
   "top-fingerprints": {
     "terms": {
       "field": "fingerprint",
       "size": 50
     },
     "aggs": {
       "top_tag_hits": {
         "top_hits": {
           "size": 1,
           "_source": {
             "include": [
               "title"
             ]
           }
         }
       }
     }
   }
 }


 Thanks,
 Michael



Re: Failed stopping 'elasticsearch-service-x64' service

2015-01-04 Thread Itamar Syn-Hershko
Does it also happen when you uninstall the JDBC river?

Also, I'd highly recommend using Linux servers for Elasticsearch instances
and not Windows ones

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Mon, Jan 5, 2015 at 2:25 AM, Garrett Johnson garrettcjohn...@gmail.com
wrote:

 Log entries:
 [2015-01-04 18:13:56,185][INFO ][node ] [Bucky III]
 stopping ...
 [2015-01-04 18:13:56,202][INFO ][river.jdbc.JDBCRiver ] river closed
 [jdbc/users]
 [2015-01-04 18:13:56,203][INFO ][river.jdbc.JDBCRiver ] river closed
 [jdbc/product2]
 [2015-01-04 18:13:56,342][INFO ][node ] [Bucky III]
 stopped
 [2015-01-04 18:13:56,342][INFO ][node ] [Bucky III]
 closing ...
 [2015-01-04 18:13:56,355][INFO ][node ] [Bucky III]
 closed


 Windows Server 2008R2

 ElasticSearch 1.4.2
 Plugins ElasticSearch Head, jdbc river 1.4.0.6

 Microsoft jdbc driver.

 Thanks,

 Garrett


 On Saturday, January 3, 2015 10:10:42 AM UTC-6, Costin Leau wrote:

 Do you see anything in the logs? Can you try removing and reinstalling
 the service? What's your OS/configuration?

 On 1/2/15 10:32 PM, Garrett Johnson wrote:
  By "on its own" I mean running service stop or using services.msc and
 clicking restart on the service.  Both attempts produce the
  same error.
 
  On Friday, January 2, 2015 2:31:28 PM UTC-6, Garrett Johnson wrote:
 
  I'm getting this error every time I try to start and stop the
 elastic search windows service.
 
  Takes a couple of minutes then fails.  I can kill the task in task
 manager and then restart but cannot get it to
  stop on its own.
 

 --
 Costin



Re: Question about highlight query.

2015-01-01 Thread Itamar Syn-Hershko
A bit off-topic, but what I'd really like to see is the ability to perform
highlighting asynchronously, that is - first get the search results from
Elasticsearch, process them, and get the highlighted snippets on a second
wave, asynchronously.

The main problem with highlighting currently is that it is slow - because
of hackish recursive algorithms and mandatory I/O access. I'd like to avoid
doing 2-step searches (one search for the results, the other one to
artificially propagate the highlights to the UI on a second wave) - I
wonder if we can come up with a way to have ES propagate them
asynchronously for us?
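
For context, the 2-step workaround mentioned above looks roughly like the
following when written as two request bodies. This is a sketch only: the
function names are made up, and it assumes the filtered query and ids filter
of the 1.x query DSL. The first body fetches results without highlighting;
the second re-runs the query restricted to the ids already shown and asks
only for highlights.

```python
def results_body(query, size=10):
    """First wave: plain search, no highlighting."""
    return {"size": size, "query": query}


def highlight_body(query, ids, fields):
    """Second wave: same query limited to known ids, highlights only."""
    return {
        "size": len(ids),
        "_source": False,  # the documents were already fetched
        "query": {
            "filtered": {
                "query": query,
                "filter": {"ids": {"values": list(ids)}},
            }
        },
        "highlight": {"fields": {name: {} for name in fields}},
    }


query = {"match": {"headline": "rights agreement"}}
first = results_body(query)
second = highlight_body(query, ["a1", "b2"], ["headline"])
```

The UI can render the first response immediately and fill in snippets when
the second one arrives, at the cost of running the query twice.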

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 31, 2014 at 5:38 PM, Nikolas Everett nik9...@gmail.com wrote:

 Highlighting isn't a nice pretty thing - it's kind of hacky.  There are
 three highlighters built in
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html
 to Elasticsearch and they all work differently.  You should try all of them
 and see if they do what you want.  They all come at the problem from a
 different perspective and have their own idiosyncrasies.  I maintain a
 highlighter plugin https://github.com/wikimedia/search-highlighter as well
 that you can use as a fourth option.  It combines many of the implementation
 strategies that the other ones use and attempts to give you more options, so
 it might do what you need.

 Nik

 On Tue, Dec 23, 2014 at 12:44 PM, Yang Liu yl...@nyu.edu wrote:

 No one knows anything about this? I'd really appreciate any help you can
 offer.


 On Monday, December 22, 2014 5:27:57 PM UTC-5, Yang Liu wrote:

 Hi, guys,
 I have a question about highlight query in ES.
 *Below is my query,*
 {
   _source: [

  .
   ],
   highlight: {
 fields: {
   FDS_ATTACHMENTS: {
 type: plain
   },
   FDS_ATTACHMENTS.no_stem: {
 type: plain
   },
   FDS_ATTACHMENTS.with_case: {
 type: plain
   },
   headline: {
 type: plain
   },
   headline.no_stem: {
 type: plain
   },
   headline.with_case: {
 type: plain
   }
 },
 fragment_size: 500,
 highlight_query: {
   bool: {
 must: [
   {
 bool: {
   minimum_should_match: 1,
   should: [
 {
   span_near: {
 clauses: [
   {
 span_term: {
   FDS_ATTACHMENTS.no_stem: rights
 }
   },
   {
 span_term: {
   FDS_ATTACHMENTS.no_stem: agreement
 }
   }
 ],
 in_order: true,
 slop: 0
   }
 }
   ]
 }
   },
   {
 bool: {
   minimum_should_match: 1,
   should: [
 {
   span_near: {
 clauses: [
   {
 span_term: {
   FDS_ATTACHMENTS.no_stem: rights
 }
   },
   {
 span_term: {
   FDS_ATTACHMENTS.no_stem: agreement
 }
   },
   {
 span_term: {
   FDS_ATTACHMENTS.no_stem: merger
 }
   }
 ],
 in_order: false,
 slop: 5
   }
 }
   ]
 }
   }
 ]
   }
 },
 number_of_fragments: 50,
 post_tags: [
   </font>
 ],
 pre_tags: [
   <font color=red>
 ],
 require_field_match: true
   },
   query: {
 filtered: {
   filter: {
 range: {
   story_datetime: {
 gte: 20141221t00,
 lte: 20141222t235959
   }
 }
   },
   query: {
 bool: {
   must: [
 {
   bool: {
 minimum_should_match: 1,
 should: [
   {
 span_near: {
   clauses: [
 {
   span_term: {
 FDS_ATTACHMENTS.no_stem: rights
   }
 },
 {
   span_term: {
 FDS_ATTACHMENTS.no_stem: agreement

Re: Running elasticsearch 1.4.2 and kibana 4 as service

2014-12-23 Thread Itamar Syn-Hershko
Elasticsearch has packages which will do this for you on DEB- and RPM-based
Linux distributions:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html

For Kibana 4 you'll need to use init.d and /sbin/service; the specifics
are going to depend on the distribution and the tools you have installed

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Dec 23, 2014 at 11:02 PM, Ram Maram ram.mara...@gmail.com wrote:

 Hi,

 Right now I am running kibana 3 and elasticsearch 1.3.2 for our ELK stack,
 I would like to use kibana 4 and elasticsearch 1.4.2.

 Can someone please let me know how to install kibana 4 and elasticsearch
 1.4.2 as a service on linux?

 I was able to run them manually but I couldn't figure how to run them as a
 service.

 Thanks,

 Ram



Re: Running elasticsearch 1.4.2 and kibana 4 as service

2014-12-23 Thread Itamar Syn-Hershko
It's basic Linux administration stuff, see
http://arstechnica.com/civis/viewtopic.php?p=2147913&sid=16c526bdb60201e802cf7f6b8bc598e2#p2147913
for example (and the rest of the instructions on chkconfig). Just update
the script to point at your Kibana files.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Dec 23, 2014 at 11:28 PM, Ram Maram ram.mara...@gmail.com wrote:

 Thank you Itamar for your quick response. My distribution is Red Hat Linux
 6.x and the tools I have installed are logstash, java, elasticsearch.

 Can you guide me on how to create the init file for Kibana 4, or can I host
 it on Apache?

 Thanks,

 Ram



Re: Running elasticsearch 1.4.2 and kibana 4 as service

2014-12-23 Thread Itamar Syn-Hershko
I'd actually prefer to install from repositories as they take care of
placing things in the right place and create a user to run ES under

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Dec 23, 2014 at 11:45 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Use https://github.com/elasticsearch/elasticsearch-servicewrapper to run
 ES as a service under RHEL 6.

 Jörg



Re: When to use fields and when to use source filtering

2014-12-22 Thread Itamar Syn-Hershko
Fields are used to pull data from stored fields, whereas source filtering
targets _source. At the moment both fall back on each other, so the
difference is in the order of precedence. I believe I've heard there are
plans to deprecate fields completely; I wonder if someone from ES could
confirm?
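
For illustration, this is roughly how the two request styles look side by
side. It is only a sketch assuming the 1.x search API; the field names
(title, meta.*) and the helper function names are made up for the example.

```python
import json

def fields_request(query, names):
    # "fields" asks for stored fields by name.
    return {"query": query, "fields": list(names)}

def source_filtering_request(query, include, exclude=()):
    # "_source" filters the original JSON document that was indexed.
    return {
        "query": query,
        "_source": {"include": list(include), "exclude": list(exclude)},
    }

match_all = {"match_all": {}}

# Both bodies would be sent to the _search endpoint of an index.
print(json.dumps(fields_request(match_all, ["title"])))
print(json.dumps(source_filtering_request(match_all, ["title", "meta.*"], ["meta.internal"])))
```

Source filtering accepts wildcard patterns and returns the values in their
original JSON shape, which is usually what you want unless the fields are
explicitly stored.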

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Mon, Dec 22, 2014 at 2:28 PM, Shelef shlaf...@gmail.com wrote:

 I read about two ways to filter the fields returned by Elasticsearch:
 fields
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
 and source filtering
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html.
 When should I use which?



Re: Custom _source compression / compaction to reduce disk usage

2014-12-15 Thread Itamar Syn-Hershko
I'm pretty sure you'll lose cross-document compression that way, which is
highly noticeable with lots of documents around 3k in size.
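
To show the effect I mean, here is a rough sketch that compresses the same
set of documents one by one and then in small blocks. Everything in it is an
assumption for illustration: zlib stands in for whatever codec is really
used, the block size of 5 documents is arbitrary, and the documents are
synthetic (same field names, different values).

```python
import json
import random
import string
import zlib

rng = random.Random(42)

# Hypothetical schema: every document shares the same 150 field names.
FIELD_NAMES = [
    "".join(rng.choice(string.ascii_lowercase) for _ in range(12))
    for _ in range(150)
]

def make_doc():
    # Same keys in every document, different values (a few KB of JSON).
    doc = {name: rng.randint(0, 10**6) for name in FIELD_NAMES}
    return json.dumps(doc).encode("utf-8")

def size_compressed_individually(docs):
    # Each document is compressed on its own, as with a pre-encoded field.
    return sum(len(zlib.compress(doc)) for doc in docs)

def size_compressed_in_blocks(docs, docs_per_block=5):
    # Several documents share one compression context.
    total = 0
    for start in range(0, len(docs), docs_per_block):
        block = b"".join(docs[start:start + docs_per_block])
        total += len(zlib.compress(block))
    return total

docs = [make_doc() for _ in range(200)]
raw_size = sum(len(doc) for doc in docs)
individual = size_compressed_individually(docs)
blocked = size_compressed_in_blocks(docs)
print(f"raw={raw_size} per-document={individual} per-block={blocked}")
```

The repeated field names are only paid for once per block instead of once
per document, which is where the saving comes from.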

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Mon, Dec 15, 2014 at 10:56 PM, Eran Duchan pav...@gmail.com wrote:

 Thanks for the pointers. I just realized I can disable _source and store
 a field with the encoded data (D'oh). If I find anything semi-intelligent
 during my tests, I'll report back.



Re: Performance issues when flagging a document in Elasticsearch

2014-12-10 Thread Itamar Syn-Hershko
To Lucene / Elasticsearch the difference is pretty much insignificant as long
as you use filters. If you will have more than a few such flags, you should
prefer not_analyzed fields with string values to represent them over
dedicated boolean fields.
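
A minimal sketch of what I mean, assuming 1.x-style mappings and filtered
queries. The field name flags and the values isFoo / isBar come from your
question; the helper names are made up.

```python
import json

# One not_analyzed string field holds all flags of a document.
FLAGS_MAPPING = {
    "properties": {"flags": {"type": "string", "index": "not_analyzed"}}
}

def has_flag(flag):
    # Exact match against one value of the flags array.
    return {"term": {"flags": flag}}

def filtered(filter_clause):
    # Wrap a filter in a filtered query so no scoring is involved.
    return {"query": {"filtered": {"filter": filter_clause}}}

foo_or_bar = filtered({"terms": {"flags": ["isFoo", "isBar"]}})
foo_and_bar = filtered({"bool": {"must": [has_flag("isFoo"), has_flag("isBar")]}})
foo_not_bar = filtered(
    {"bool": {"must": [has_flag("isFoo")], "must_not": [has_flag("isBar")]}}
)

for body in (foo_or_bar, foo_and_bar, foo_not_bar):
    print(json.dumps(body))
```

Adding a new flag later means indexing a new string value; the mapping
stays the same.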

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 10, 2014 at 10:22 AM, Dror Atariah dror...@gmail.com wrote:

 Assume that I want to be able to flag documents in an index according to
 their attributes: isFoo and isBar [1]. As far as I understand, there are
 two approaches:

 1) Use dedicated fields for the flags: If the document is a Foo then add a
 field named isFoo. Similarly, for isBar.
 2) Use a flags field that will be an array of strings. In this case, if
 the document is Foo then flags will contain the string isFoo.

 What are the pros and cons in terms of space and runtime complexities?

 Bear in mind the following example queries. Consider the case where one
 wants to check the attributes of the documents in the index. In particular,
 if I want to find the documents that are either Foo *or* Bar, I can either:
 (a) In case (1): Use a Boolean should filter that surrounds two
 exists filters checking whether either isFoo or isBar exists.
 (b) In case (2): Use a single exists filter that checks the existence of
 the field flags.

 A different case, is if I want to find the documents that are both Foo
 *and* Bar:
 (a) In case (1): Like before, replace the should with a must.
 (b) In case (2): Surround two terms filters with a must Boolean one.

 Lastly, finding the documents that are Foo but *not* Bar.

 The bottom line: in case (1), all queries boil down to a mixture of
 Boolean, exists and missing filters. In case (2), one has to process the
 strings in the array of strings named flags. My intuition is that method (1)
 is faster. In terms of space complexity, I believe there is
 no difference.
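
 To make case (1) concrete, the three filters could be sketched like this.
 It is only an illustration assuming 1.x-style bool, exists and missing
 filters; the helper names are made up.

```python
import json

def exists(field):
    # Matches documents that have any value in the given field.
    return {"exists": {"field": field}}

def missing(field):
    # Matches documents that have no value in the given field.
    return {"missing": {"field": field}}

# One clause per dedicated flag field.
foo_or_bar = {"bool": {"should": [exists("isFoo"), exists("isBar")]}}
foo_and_bar = {"bool": {"must": [exists("isFoo"), exists("isBar")]}}
foo_not_bar = {"bool": {"must": [exists("isFoo"), missing("isBar")]}}

for clause in (foo_or_bar, foo_and_bar, foo_not_bar):
    print(json.dumps(clause))
```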

 I'm looking forward to your insights!
 Dror

 [1]: Obviously, there could be way more flags...



Re: Performance issues when flagging a document in Elasticsearch

2014-12-10 Thread Itamar Syn-Hershko
Basically, you will have to maintain more filters. Also, Lucene supports only
up to a certain number of fields; it wasn't designed to handle an unlimited
number of them.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 10, 2014 at 10:35 AM, Dror Atariah dror...@gmail.com wrote:

 @Itamar: Can you please elaborate on the matter? Why/how is the number
 of fields relevant here?



Re: Performance issues when flagging a document in Elasticsearch

2014-12-10 Thread Itamar Syn-Hershko
I imagine the types of graphs you could come up with will differ
significantly, to start with
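
As a rough illustration of that difference (a sketch only, assuming 1.x
aggregations; the aggregation names and helper are made up): with a single
flags field one terms aggregation gives a bucket per flag, while dedicated
fields need one filter aggregation each.

```python
import json

# Single "flags" field: one terms aggregation, one bucket per flag value.
flags_agg = {"size": 0, "aggs": {"by_flag": {"terms": {"field": "flags"}}}}

def per_field_aggs(flag_fields):
    # Dedicated fields: one filter aggregation per flag field.
    return {
        "size": 0,
        "aggs": {
            name: {"filter": {"exists": {"field": name}}} for name in flag_fields
        },
    }

print(json.dumps(flags_agg))
print(json.dumps(per_field_aggs(["isFoo", "isBar"])))
```

In the second form the request has to be regenerated whenever a flag is
added; in the first, new flags simply show up as new buckets.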

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 10, 2014 at 11:03 AM, Dror Atariah dror...@gmail.com wrote:

 Is there any difference, or are there any implications, if aggregations are
 also needed?


Re: Migrate ES cluster to use doc_values

2014-12-07 Thread Itamar Syn-Hershko
You will need to reindex, see:

http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/doc-values.html#_enabling_doc_values
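
Roughly, the new index gets a mapping with doc_values switched on per field
and the data is then reindexed into it. A small sketch of deriving such a
mapping from the old one, assuming the "doc_values": true setting described
in the guide linked above; the field names and the helper are hypothetical.

```python
import copy
import json

def with_doc_values(mapping, fields):
    """Return a copy of the mapping with doc_values enabled on the given fields."""
    new_mapping = copy.deepcopy(mapping)
    for name in fields:
        new_mapping["properties"][name]["doc_values"] = True
    return new_mapping

# Hypothetical existing mapping; string fields must be not_analyzed
# for doc_values to apply.
old_mapping = {
    "properties": {
        "tag": {"type": "string", "index": "not_analyzed"},
        "views": {"type": "long"},
    }
}

new_mapping = with_doc_values(old_mapping, ["tag", "views"])
print(json.dumps(new_mapping, indent=2))
```

To verify the change, fetch the mapping of the new index and check that the
fields report doc_values.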

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Sun, Dec 7, 2014 at 3:25 PM, Yoav Melamed yo...@exelate.com wrote:

 Hello,

 We want to migrate our ES cluster to use version 1.4.1 with doc_values.
 We have 20 nodes with 4TB data.
 What is the best practice? Can we just change the mapping and restart the
 cluster?
 How can we make sure the change was done?

 Thanks,

 Yoav Melamed



Re: downgrading from 1.4 to1.3

2014-12-04 Thread Itamar Syn-Hershko
Classic CORS error - maybe * is blocked by ES. Haven't had to deal with
this myself (yet) so can't help you here. All in all just a small rough
edge to smooth, not a clusterfuck.

A quick solution would be to install K3 as a site plugin and use it
internally (don't expose it to the web)
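
If you want to narrow it down first: the browser message says that
Content-Type is not listed in the Access-Control-Allow-Headers header of
whatever answered the preflight request. A small sketch of that check,
assuming you copy the response headers from the browser's network tab into
a dict; the function name and the sample header values are made up.

```python
def header_allowed(preflight_headers, request_header):
    """Check whether a CORS preflight response lists the given request header."""
    allowed = ""
    for name, value in preflight_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "access-control-allow-headers":
            allowed = value
    tokens = {token.strip().lower() for token in allowed.split(",") if token.strip()}
    return request_header.lower() in tokens

# Hypothetical preflight responses.
failing = {"Access-Control-Allow-Origin": "*",
           "Access-Control-Allow-Headers": "X-Requested-With, Content-Length"}
passing = {"Access-Control-Allow-Origin": "*",
           "Access-Control-Allow-Headers": "X-Requested-With, Content-Type"}

print(header_allowed(failing, "Content-Type"))
print(header_allowed(passing, "Content-Type"))
```

Note that an Access-Control-Allow-Origin header alone says nothing about
which request headers are accepted.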

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Dec 4, 2014 at 3:20 AM, Jack Judge jackjudg...@gmail.com wrote:

 Well you're right there's JS errors, CORS related;

 XMLHttpRequest cannot load
 http://10.5.41.120:9200/logstash-2014.12.04/_search. Request header field
 Content-Type is not allowed by Access-Control-Allow-Headers.

 In my elasticsearch.yml I've got this on all nodes,

 http.cors.allow-origin: /.*/
 http.cors.enabled: true

 Which google leads me to believe should open it up for anything. K3 is
 fronted by apache and a bit more googling prompted me to add this to the
 Directory section of httpd.conf

 Header set Access-Control-Allow-Origin *

 Still getting the same errors :(
 I'm at a loss to know what else to do now.



 On Wednesday, 3 December 2014 15:48:28 UTC-8, Itamar Syn-Hershko wrote:

 I'm not aware of compat issues with K3 and ES 1.4 other than
 https://github.com/elasticsearch/kibana/issues/1637 . I'd check for
 javascript errors, and try to see what's going on under the hood, really.
 When you have more data about this, you can either quickly resolve, or open
 a concrete bug :)




Re: downgrading from 1.4 to1.3

2014-12-03 Thread Itamar Syn-Hershko
I'm pretty sure you can't, due to the different Lucene versions. I wouldn't
even try - just export and re-index.

I would be more than happy to hear what went wrong for you with
upgrading.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Dec 3, 2014 at 10:53 PM, Jack Judge jackjudg...@gmail.com wrote:

 Upgrading to 1.4 has been a clusterfuck for us. It's broken pretty much
 everything we rely on. I need to go back to 1.3, can I use the snapshot
 feature?
 Will a snapshot taken on a 1.4 cluster restore to a separate 1.3 cluster ?

 I'm really just interested in the data, I'd like to reapply my own
 mappings as the data is ingested into 1.3
 Should I be looking at a third party script or will the snapshot/restore
 features of elastic search be adequate ?



Re: downgrading from 1.4 to1.3

2014-12-03 Thread Itamar Syn-Hershko
I'm a bit confused. Are you downgrading just because of Kibana compat
issues? That seems to me like killing a fly with a bazooka.

Enabling CORS and using K3 dashboards seems like the better solution to me,
for now. K4 isn't even officially released yet. As for data disappearing,
I'm sure it didn't, and a relaxed debugging session can help you confirm that.

As for export-import - yes, knapsack is a great option, but it does make
sense that Joerg hasn't updated it yet, as it's not officially maintained.
Writing your own export-import tool is an easy option; I'd also look at
https://github.com/elasticsearch/stream2es ,
https://github.com/taskrabbit/elasticsearch-dump
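
The core of a home-grown export-import tool is small. A sketch of the
conversion step, assuming you read hits from a scan/scroll search on the
source cluster and post the result to the _bulk endpoint of the target; the
function name and the sample hit (including the type logs) are made up.

```python
import json

def hits_to_bulk_lines(hits, target_index=None):
    """Turn search hits into newline-delimited bulk API index actions."""
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index or hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))        # action line
        lines.append(json.dumps(hit["_source"]))  # document line
    # The bulk format requires every line, including the last, to end in "\n".
    return "".join(line + "\n" for line in lines)

# Hypothetical hit in the shape returned by a search.
sample_hits = [
    {"_index": "logstash-2014.12.04", "_type": "logs", "_id": "1",
     "_source": {"message": "hello"}},
]

print(hits_to_bulk_lines(sample_hits), end="")
```

The output can also be written to a file and carried over, which helps when
the two clusters cannot talk to each other directly.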

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Dec 4, 2014 at 12:12 AM, Jack Judge jackjudg...@gmail.com wrote:




 I will be more than happy to hear about what went wrong for you with
 upgrading?


 Well Kibana 4 is unusable for us, the lack of auto refresh killed it for
 us.
 Most of the time K4 simply doesn't work even for browsing small sets, I'd
 say 2 times in 3 we get the 30 ms timeout error, is there a solution to
 this yet ?
 And when we finally do get any results it's as slow as a pig on crutches.

 I need to go back to my old K3 dashboards, but after fighting thru the
 CORS features, I find they're all blank. There's no errors, just no data
 and I desperately need them.
 I also need my old Packetbeat dashboards.

 From my googling I think the quickest way to get them back is to downgrade
 ES from 1.4 to 1.3, or am I wrong ?

 So, I need a way to export the data from the indices to a new cluster. My
 systems are in a secure environment; I can't use anything that needs to
 connect out to the internet during compile / install time, so the node.js
 / npm stuff is locked out for me.
 I tried the knapsack plugin but it doesn't seem to work when installed into
 an ES 1.4 cluster.



Re: downgrading from 1.4 to1.3

2014-12-03 Thread Itamar Syn-Hershko
I'm not aware of compat issues with K3 and ES 1.4 other than
https://github.com/elasticsearch/kibana/issues/1637 . I'd check for
javascript errors, and try to see what's going on under the hood, really.
When you have more data about this, you can either quickly resolve, or open
a concrete bug :)

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Dec 4, 2014 at 1:44 AM, Jack Judge jackjudg...@gmail.com wrote:

 Well you say just, but at the moment Kibana is our only view into the ES
 cluster, so yes it's a dealbreaker for us.

 After enabling CORS, and what an unexpected knock about of pure fun that
 was, I still can't use the K3 dashboards, they're blank, no data, no errors
 just empty dashboards :(
 Data isn't disappearing, I can still see it via the head plugin

 So what's the quickest way of being able to see my data via the K3 boards
 ? Is it exporting out into a new cluster ? Or is there a way to make them
 work with the ES 1.4 cluster ?

 JJ



Re: This version of Kibana requires at least Elasticsearch 1.4.0.Beta1 but using 1.4.1

2014-12-03 Thread Itamar Syn-Hershko
https://github.com/elasticsearch/kibana/issues/1637

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Dec 4, 2014 at 1:49 AM, David Montgomery davidmontgom...@gmail.com
wrote:

 I am using Kibana 4.0.0-BETA2

 On Wednesday, December 3, 2014 7:30:03 PM UTC+8, Mark Walkom wrote:

 What version of Kibana?

 On 3 December 2014 at 21:48, David Montgomery davidmo...@gmail.com
 wrote:

 Hi,


 This version of Kibana requires at least Elasticsearch 1.4.0.Beta1

 SetupError@http://monitor-development-east.test.com:5601/index.js?_b=3998:42905:51
 checkEsVersion/@http://monitor-development-east.test.com:5601/index.js?_b=3998:43091:14
 qFactory/defer/deferred.promise.then/wrappedCallback@http://monitor-development-east.test.com:5601/index.js?_b=3998:20764:15
 qFactory/ref/.then/@http://monitor-development-east.test.com:5601/index.js?_b=3998:20850:11
 $RootScopeProvider/this.$get/Scope.prototype.$eval@http://monitor-development-east.test.com:5601/index.js?_b=3998:21893:9
 $RootScopeProvider/this.$get/Scope.prototype.$digest@http://monitor-development-east.test.com:5601/index.js?_b=3998:21705:15
 $RootScopeProvider/this.$get/Scope.prototype.$apply@http://monitor-development-east.test.com:5601/index.js?_b=3998:21997:13
 done@http://monitor-development-east.test.com:5601/index.js?_b=3998:17570:34
 completeRequest@http://monitor-development-east.test.com:5601/index.js?_b=3998:17784:7
 createHttpBackend//xhr.onreadystatechange@http://monitor-development-east.test.com:5601/index.js?_b=3998:17727:1

 I am using 1.4.1.  Clearly kibana is not working.  Why?



 Here is my ES server
 {
   "status" : 200,
   "name" : "Controller",
   "cluster_name" : "elasticsearch",
   "version" : {
     "number" : "1.4.1",
     "build_hash" : "89d3241d670db65f994242c8e8383b169779e2d4",
     "build_timestamp" : "2014-11-26T15:49:29Z",
     "build_snapshot" : false,
     "lucene_version" : "4.10.2"
   },
   "tagline" : "You Know, for Search"
 }


 Thanks



Re: Java client - setTimeout vs actionGet(timeout)

2014-11-30 Thread Itamar Syn-Hershko
IIRC the Java API doesn't have any default client-side timeout for search
requests; it's an opt-in feature.
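
To make the distinction discussed in this thread concrete, here is a minimal plain-Java analogy. It is not the Elasticsearch client API: the class and method names are hypothetical, and the "server-side" budget is counted in documents rather than wall-clock time so the sketch stays deterministic. It only illustrates the behaviour described in the quoted replies: a server-side timeout is best effort and can return partial results, while a client-side timeout simply stops waiting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; none of these names exist in Elasticsearch. */
final class TimeoutSketch {

    /** Whatever was collected so far, plus a flag saying the budget ran out. */
    static final class Result {
        final List<String> hits;
        final boolean timedOut;

        Result(List<String> hits, boolean timedOut) {
            this.hits = hits;
            this.timedOut = timedOut;
        }
    }

    /**
     * "Server-side" timeout: stop collecting once the budget is spent and
     * return the partial hits together with timedOut = true.
     */
    static Result searchWithBudget(List<String> docs, int budget) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (hits.size() >= budget) {
                return new Result(hits, true);
            }
            hits.add(doc);
        }
        return new Result(hits, false);
    }

    /** "Client-side" timeout: the caller stops waiting and gets no result at all. */
    static boolean clientGaveUp(CompletableFuture<Result> pending, long millis) {
        try {
            pending.get(millis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result partial = searchWithBudget(Arrays.asList("a", "b", "c"), 2);
        System.out.println(partial.hits + " timedOut=" + partial.timedOut);
        System.out.println(clientGaveUp(new CompletableFuture<>(), 20));
    }
}
```

As suggested in the quoted reply, keeping the server-side limit lower than the client-side one gives the server a chance to answer with partial results before the client gives up.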

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Sun, Nov 30, 2014 at 11:37 PM, Nikolas Everett nik9...@gmail.com wrote:

 Default for server side timeout is none and I don't know client side
 timeout. I imagine it is a long time.
 On Nov 30, 2014 1:46 PM, Ron Sher ron.s...@gmail.com wrote:

 Thanks for the info.

 Do you know what the defaults are?

 On Sunday, November 30, 2014 5:53:49 PM UTC+2, Nikolas Everett wrote:

 Timeouts are server side and best effort. I believe actionGet(timeout)
 is client side.

 I use the http client but use both and set the server side timeout to
 lower than the client side timeout.

 The server side timeout should return partial results if possible.
 On Nov 30, 2014 10:41 AM, Ron Sher ron@gmail.com wrote:

 Hi all,

 I want to make sure the search query doesn't exceed some limit.

 I've seen the option to use a setTimeout vs actionGet(timeout).

 Can someone please explain the difference?

 Also, I've read somewhere that there's a default connection timeout.
 Can that be used instead? If so, how?

 Thanks for your help,

 Ron



Re: char_filter for German

2014-11-29 Thread Itamar Syn-Hershko
You may find the approach I give at the end of this talk helpful:
https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucene

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Nov 18, 2014 at 12:30 PM, Krešimir Slugan kresimir.slu...@gmail.com
 wrote:

  Hi,

 To handle German in search, I have to be able to return the same
 results whether the user searches for, e.g., über, uber or ueber.

 I would do this at index time, where I would have über in the data. But
 if I use just the asciifolding filter, I lose the information that this was a
 word with an umlaut and I can't get the ueber token. If I use char_filter, it is
 applied before analysis and I would not be able to get uber.

 Is it possible to preserve original with char filter or apply it after the
 analysis?

 Cheers,

 Kresimir



Re: char_filter for German

2014-11-29 Thread Itamar Syn-Hershko
Why do you need it as ueber? What I usually do is end up with [über,
uber] at the same position, possibly marking the first as the
original. Seeing Jurgen's response, I seem to be on the right path...
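
As a rough illustration of the "several variants at the same position" idea, here is a plain-Java sketch. It is not a Lucene token filter: `UmlautExpander` and its methods are hypothetical names, and it additionally emits the ueber form asked about in the quoted question. It only shows which variants a token-filter chain would need to produce for one input token.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Lucene or Elasticsearch. */
final class UmlautExpander {

    /** ASCII-style folding: decompose, then drop combining marks ("über" -> "uber"). */
    static String fold(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").replace("\u00df", "ss");
    }

    /** German transliteration of umlauts and sharp s ("über" -> "ueber"). */
    static String transliterate(String token) {
        String composed = Normalizer.normalize(token, Normalizer.Form.NFC);
        return composed
                .replace("\u00e4", "ae").replace("\u00f6", "oe").replace("\u00fc", "ue")
                .replace("\u00c4", "Ae").replace("\u00d6", "Oe").replace("\u00dc", "Ue")
                .replace("\u00df", "ss");
    }

    /** Original first, then the distinct variants that would share its position. */
    static List<String> expand(String token) {
        Set<String> variants = new LinkedHashSet<>();
        variants.add(token);
        variants.add(fold(token));
        variants.add(transliterate(token));
        return new ArrayList<>(variants);
    }

    public static void main(String[] args) {
        System.out.println(expand("\u00fcber"));
    }
}
```

Because the original token is kept first and duplicates are dropped, a word without umlauts expands to itself only, so plain ASCII terms do not bloat the index.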

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Sat, Nov 29, 2014 at 9:21 PM, Krešimir Slugan kresimir.slu...@gmail.com
wrote:

 Which token filter can I use to replace words like über with ueber?

 On Saturday, November 29, 2014 8:16:14 PM UTC+1, Itamar Syn-Hershko wrote:

 What I'm saying is don't use char_filter, and use the token filters chain
 to achieve that

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/

 On Sat, Nov 29, 2014 at 9:02 PM, Krešimir Slugan kresimi...@gmail.com
 wrote:

 Hi Itamar,

 I don't think this solves my problem. I'm aware that you can preserve
 original with ASCIIfolding but since char_filter is applied
 before ASCIIfolding then there would not be any umlauts to fold :) If I
 could apply char_filter on the end that would be ok, or preserve original
 with char_filter.

 Best,

 Kresimir

 On Saturday, November 29, 2014 5:41:11 PM UTC+1, Itamar Syn-Hershko
 wrote:

 You may find the approach I give in the end of this talk helpful:
 https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucene

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/

 On Tue, Nov 18, 2014 at 12:30 PM, Krešimir Slugan kresimi...@gmail.com
  wrote:

  Hi,

 To handle German in search, I have to be able to return the same
 results whether the user searches for, e.g., über, uber or ueber.

 I would do this at index time, where I would have über in the
 data. But if I use just the asciifolding filter, I lose the information that
 this was a word with an umlaut and I can't get the ueber token. If I use
 char_filter, it is applied before analysis and I would not be able to get
 uber.

 Is it possible to preserve original with char filter or apply it after
 the analysis?

 Cheers,

 Kresimir



Re: 3 Node Cluster With Nodes Out of Sync

2014-11-26 Thread Itamar Syn-Hershko
If this is replicas only, you should be able to set replica count to 0 and
then after a while back to 2 again

If this is sharded, then no, you'll have to reindex from scratch.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Nov 26, 2014 at 10:26 AM, Yosi Haran y...@my6sense.com wrote:

 Alright, we'll try upgrading. Thanks :)

 Meanwhile, any advice on how to fix an inconsistency once it is found? Is
 there an API to forcefully sync nodes, or at least reindex from a
 specific node?

 On Tuesday, November 25, 2014 8:44:44 PM UTC+2, Itamar Syn-Hershko wrote:

 I suggest you upgrade to 1.4 and try again - see
 http://www.elasticsearch.org/guide/en/elasticsearch/resiliency/current/index.html

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/

 On Tue, Nov 25, 2014 at 7:29 PM, Yosi Haran yo...@my6sense.com wrote:

 1.0.0

 On Tuesday, November 25, 2014 6:41:36 PM UTC+2, Itamar Syn-Hershko wrote:

 minimum_master_nodes still doesn't protect you from all possible
 failure scenarios, see http://aphyr.com/posts/317-call-me-maybe-elasticsearch

 What version are you running?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/

 On Tue, Nov 25, 2014 at 6:37 PM, Yosi Haran yo...@my6sense.com wrote:

 Hi Guys,

 We are running a 3 node cluster, and each node returns a different
 number of documents when issued a direct HTTP _count call.

 The cluster holds about 150K documents and the differences range from
 30~50 documents, but are still troubling.

 This shouldn't be a split brain problem, since we have set:
 discovery.zen.minimum_master_nodes: 2
 We also have a client node, but since client nodes are eligible to
 be master, I understand that they shouldn't affect the master election
 process.

 Any Ideas about why and how this is happening?

 Thanks!



Re: 3 Node Cluster With Nodes Out of Sync

2014-11-25 Thread Itamar Syn-Hershko
minimum_master_nodes still doesn't protect you from all possible failure
scenarios, see http://aphyr.com/posts/317-call-me-maybe-elasticsearch

What version are you running?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Nov 25, 2014 at 6:37 PM, Yosi Haran y...@my6sense.com wrote:

 Hi Guys,

 We are running a 3 node cluster, and each node returns a different number
 of documents when issued a direct HTTP _count call.

 The cluster holds about 150K documents and the differences range from
 30~50 documents, but are still troubling.

 This shouldn't be a split brain problem, since we have set:
 discovery.zen.minimum_master_nodes: 2
 We also have a client node, but since client nodes are eligible to be
 master, I understand that they shouldn't affect the master election process.

 Any Ideas about why and how this is happening?

 Thanks!



Re: Performance issue while indexing lot of documents

2014-11-06 Thread Itamar Syn-Hershko
It may be worth looking at two things:

1. Using the latest Elasticsearch version (1.4). A lot of work went into
optimizing these types of scenarios on the server side.

2. Disabling refresh / flush. I assume this is an ETL process, and as such
this could greatly help.
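
For point 2, one way to do this is through the update index settings API, setting `index.refresh_interval` to `-1` for the duration of the load and restoring it afterwards. The sketch below only builds the request path and body in plain Java; `RefreshSettings` is a hypothetical helper name, and sending the PUT request is left to whatever HTTP client you already use. The `1s` value used for restoring assumes the index was on the default interval.

```java
/** Hypothetical helper that only builds the request; it does not talk to a cluster. */
final class RefreshSettings {

    /** Path of the update index settings endpoint for one index. */
    static String settingsPath(String index) {
        return "/" + index + "/_settings";
    }

    /** JSON body setting index.refresh_interval; "-1" disables periodic refresh. */
    static String refreshIntervalBody(String interval) {
        return "{\"index\":{\"refresh_interval\":\"" + interval + "\"}}";
    }

    public static void main(String[] args) {
        // Before the bulk load: PUT this body to the path to switch refresh off.
        System.out.println("PUT " + settingsPath("twitter2") + " " + refreshIntervalBody("-1"));
        // After the load: restore the interval (assumed here to be the 1s default).
        System.out.println("PUT " + settingsPath("twitter2") + " " + refreshIntervalBody("1s"));
    }
}
```

Remember that documents indexed while refresh is disabled do not become searchable until a refresh happens, so this suits batch loads rather than live traffic.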

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Thu, Nov 6, 2014 at 4:01 PM, Moshe Recanati re.mo...@gmail.com wrote:

 hi Thomas,
 I fixed the code per your suggestion and now start a new prepared bulk
 request every 1000 documents (code below).
 However, the time to add documents is still increasing.

 Please let me know what's wrong. Thank you in advance.

 Moshe


 Output:
 Going to add 100
 processed 1000 records from -1000 until  0 at 704
 processed 1000 records from 0 until  1000 at 3068
 processed 1000 records from 1000 until  2000 at 1030
 processed 1000 records from 2000 until  3000 at 1654
 processed 1000 records from 3000 until  4000 at 1798
 processed 1000 records from 4000 until  5000 at 1808
 processed 1000 records from 5000 until  6000 at 580
 processed 1000 records from 6000 until  7000 at 354
 processed 1000 records from 7000 until  8000 at 731
 processed 1000 records from 8000 until  9000 at 496
 processed 1000 records from 9000 until  10000 at 822
 processed 1000 records from 10000 until  11000 at 564
 processed 1000 records from 11000 until  12000 at 588
 processed 1000 records from 12000 until  13000 at 690
 processed 1000 records from 13000 until  14000 at 774
 processed 1000 records from 14000 until  15000 at 1528
 processed 1000 records from 15000 until  16000 at 1028
 processed 1000 records from 16000 until  17000 at 966
 processed 1000 records from 17000 until  18000 at 1397
 processed 1000 records from 18000 until  19000 at 962
 processed 1000 records from 19000 until  20000 at 3573
 processed 1000 records from 20000 until  21000 at 1332
 processed 1000 records from 21000 until  22000 at 1282
 processed 1000 records from 22000 until  23000 at 1746
 processed 1000 records from 23000 until  24000 at 1411
 processed 1000 records from 24000 until  25000 at 1742
 processed 1000 records from 25000 until  26000 at 2540
 processed 1000 records from 26000 until  27000 at 2217
 processed 1000 records from 27000 until  28000 at 1203
 processed 1000 records from 28000 until  29000 at 1714
 processed 1000 records from 29000 until  30000 at 1595
 processed 1000 records from 30000 until  31000 at 1809
 processed 1000 records from 31000 until  32000 at 2305
 processed 1000 records from 32000 until  33000 at 1604
 processed 1000 records from 33000 until  34000 at 2208
 processed 1000 records from 34000 until  35000 at 1989
 processed 1000 records from 35000 until  36000 at 1939
 processed 1000 records from 36000 until  37000 at 1826
 processed 1000 records from 37000 until  38000 at 1716
 processed 1000 records from 38000 until  39000 at 1957
 processed 1000 records from 39000 until  40000 at 1665
 processed 1000 records from 40000 until  41000 at 1743
 processed 1000 records from 41000 until  42000 at 2166
 processed 1000 records from 42000 until  43000 at 2450
 processed 1000 records from 43000 until  44000 at 3342
 processed 1000 records from 44000 until  45000 at 2632
 processed 1000 records from 45000 until  46000 at 2795
 processed 1000 records from 46000 until  47000 at 3129
 processed 1000 records from 47000 until  48000 at 3290
 processed 1000 records from 48000 until  49000 at 3973
 processed 1000 records from 49000 until  50000 at 3297
 processed 1000 records from 50000 until  51000 at 3500
 processed 1000 records from 51000 until  52000 at 4328
 processed 1000 records from 52000 until  53000 at 3913
 processed 1000 records from 53000 until  54000 at 3636
 processed 1000 records from 54000 until  55000 at 3971
 processed 1000 records from 55000 until  56000 at 5851
 processed 1000 records from 56000 until  57000 at 4150
 processed 1000 records from 57000 until  58000 at 4557
 processed 1000 records from 58000 until  59000 at 4534
 processed 1000 records from 59000 until  60000 at 4918
 processed 1000 records from 60000 until  61000 at 3839
 processed 1000 records from 61000 until  62000 at 4297
 processed 1000 records from 62000 until  63000 at 4516
 processed 1000 records from 63000 until  64000 at 4782
 processed 1000 records from 64000 until  65000 at 4581

 Code:
 // Start an embedded node and obtain a client from it.
 Node node = NodeBuilder.nodeBuilder().node();
 Client client = node.client();
 try
 {
     // Create the target index; this throws if it already exists.
     CreateIndexRequestBuilder createIndexRequestBuilder =
         client.admin().indices().prepareCreate("twitter2");
     createIndexRequestBuilder.execute().actionGet();
 }
 catch (Exception e)
 {
     e.printStackTrace();
 }
 BulkRequestBuilder bulkRequest = client.prepareBulk();
 int numOfDocs = 100;
 long startTime = System.currentTimeMillis();
 System.out.println("Going to add " + numOfDocs);
 // either use client#prepare, or use Requests# to directly build
 // index/delete requests

Re: Enabling doc_values for _timestamp and _parent fields

2014-10-21 Thread Itamar Syn-Hershko
Yes

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Oct 21, 2014 at 10:47 AM, Costya Regev cos...@totango.com wrote:

 Hi,

 It's not clear from the documentation. Can doc_values be set for
 _timestamp and _parent fields?

 Thanks,
 Costya, Totango



Re: ES-to-ES river?

2014-10-21 Thread Itamar Syn-Hershko
I personally recommend https://github.com/elasticsearch/stream2es

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Oct 21, 2014 at 3:24 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 You can also try the knapsack plugin, where you can archive index data,
 but also move index data around, between indices and across clusters.

 https://github.com/jprante/elasticsearch-knapsack

 Jörg

 On Tue, Oct 21, 2014 at 3:57 PM, raidex ralg...@gmail.com wrote:

 Hi all,

 Is there a reason why a ES-to-ES river hasn't been implemented? I need to
 implement a fast copy mechanism to move data between indices (same cluster
 and across clusters) and seems to me like a river is the right mechanism. I
 am planning to write my own, but I want to check if it is a reasonable
 approach. -- Thanks.



Re: Hot backup strategy for Elasticsearch

2014-10-15 Thread Itamar Syn-Hershko
No - you should definitely use snapshot and restore, as it's the most
stable and efficient way to do backups.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Oct 15, 2014 at 1:12 AM, skm replyson...@gmail.com wrote:

 Hello List,

 Going through the current documentation I found that snapshot/restore
 mechanism is one type of backup strategy that we can use for ES clusters.
 Any other recommendations?

 Using the following

 1.elasticsearch-
 version : {
 number : 1.3.4,

 2. AWS-cloud-plugin
 3. curator


 curator snapshot --repository mys3_repository --all-indices  (weekend)
 curator snapshot --repository mys3_repository --most-recent 1 (every week
 day)

 The above would be run as cron jobs from one of the nodes in the cluster.

 Let me know of recommendations for hot backup for elastic search cluster.

 Thanks,
 skm




Re: NotFilter dude

2014-10-15 Thread Itamar Syn-Hershko
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-not-filter.html

You should probably switch to a bool filter with a should clause instead of
an and filter.
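
A minimal sketch of what such a bool filter could look like for the query quoted below, assuming the excluded conditions go into the filter's `must_not` clause (the 1.x bool filter accepts `must`, `should` and `must_not`). The Java below only assembles the JSON text; `BoolFilterSketch` and its methods are hypothetical names, and which fields belong in `must_not` is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles the JSON text of a bool filter. */
final class BoolFilterSketch {

    /** One terms filter; the value is inserted as-is, so quote strings yourself. */
    static String terms(String field, String value) {
        return "{\"terms\":{\"" + field + "\":[" + value + "]}}";
    }

    /** One named clause ("must", "should" or "must_not") holding a list of filters. */
    static String clause(String name, List<String> filters) {
        return "\"" + name + "\":[" + String.join(",", filters) + "]";
    }

    /** A bool filter with the given required and excluded filters; empty clauses are omitted. */
    static String boolFilter(List<String> must, List<String> mustNot) {
        List<String> clauses = new ArrayList<>();
        if (!must.isEmpty()) {
            clauses.add(clause("must", must));
        }
        if (!mustNot.isEmpty()) {
            clauses.add(clause("must_not", mustNot));
        }
        return "{\"bool\":{" + String.join(",", clauses) + "}}";
    }

    public static void main(String[] args) {
        System.out.println(boolFilter(
                Arrays.asList(terms("Product.id", "6"), terms("Version.jp", "true")),
                Arrays.asList(terms("Document.last_status", "4"))));
    }
}
```

The resulting object would replace the `and` filter inside the `filtered` query, and several exclusions can simply be listed together under `must_not`.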

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Oct 14, 2014 at 9:26 PM, Waldemar Neto waldema...@gmail.com wrote:

 Hello all!
 I have criteria with *AND*, *OR* and *NOT* operators, but *NOT* is
 a single filter. What is the best way to set multiple *NOT*s?

 Here is my query with *AND*; inside the *AND* I need a *NOT* :D

 {
   "highlight": {
     "fields": {
       "*": {
         "fragment_size": 150,
         "number_of_fragments": 1,
         "pre_tags": ["<b>"],
         "post_tags": ["</b>"]
       }
     }
   },
   "facets": {
     "documents": {
       "terms": {
         "field": "primary_field",
         "size": 5
       }
     }
   },
   "fields": [
     "Document.id",
     "Document.name",
     "Document.updated",
     "DocumentTag.name",
     "Document.approval_number",
     "Document.approval_number_us",
     "Document.approval_number_jp",
     "Version.status",
     "Document.rate",
     "Document.last_status"
   ],
   "sort": ["_type"],
   "size": 10,
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "and": {
           "filters": [
             { "terms": { "Product.id": [6] } },
             { "terms": { "Version.jp": [true] } },
             { "terms": { "Version.jp": [true] } },
             { "terms": { "Document.last_status": [4] } }
           ]
         }
       }
     }
   }
 }



Re: Hot backup strategy for Elasticsearch

2014-10-15 Thread Itamar Syn-Hershko
Snapshots are incremental. See
http://www.elasticsearch.org/blog/introducing-snapshot-restore/
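
Because of that, the same snapshot call can be issued every day without
splitting it into "full" and "most recent" runs. A rough sketch in Python
that only builds the request (it does not send anything). The helper name
snapshot_request, the host and the snapshot naming scheme are assumptions
for illustration; the repository name is the one from the quoted message
and the endpoint is the snapshot API, PUT /_snapshot/<repository>/<snapshot>.

```python
import json
from datetime import date
from urllib.parse import quote


def snapshot_request(host, repository, day, indices=None):
    """Return (method, url, body) for creating one snapshot named after the given date."""
    # Hypothetical naming scheme; snapshot names have to be lowercase.
    name = "snapshot-" + day.strftime("%Y.%m.%d")
    url = "{}/_snapshot/{}/{}?wait_for_completion=false".format(
        host.rstrip("/"), quote(repository, safe=""), quote(name, safe=""))
    # An empty body snapshots all indices; otherwise pass a comma-separated list.
    body = {"indices": ",".join(indices)} if indices else {}
    return "PUT", url, json.dumps(body)


method, url, body = snapshot_request(
    "http://localhost:9200", "mys3_repository", date(2014, 10, 15))
print(method, url, body)
```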

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Wed, Oct 15, 2014 at 5:15 PM, skm replyson...@gmail.com wrote:

 Thank you for the response!

 For large amounts of data (TBs), how does the snapshot backup strategy
 usually work? Do full snapshots every week followed by most-recent
 snapshots work well? Would the most-recent snapshot be redundant if there
 is no new data in the last 24 hours?

 Thanks,
 skm


 On Wednesday, October 15, 2014 12:54:13 AM UTC-7, Itamar Syn-Hershko wrote:

 No - you should definitely use snapshot and restore, as it's the most
 stable and efficient way to do backups there is.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/

 On Wed, Oct 15, 2014 at 1:12 AM, skm reply...@gmail.com wrote:

 Hello List,

 Going through the current documentation I found that the snapshot/restore
 mechanism is one backup strategy that we can use for ES clusters.
 Any other recommendations?

 Using the following

 1. elasticsearch-
 "version" : {
 "number" : "1.3.4",

 2. AWS-cloud-plugin
 3. curator


 curator snapshot --repository mys3_repository --all-indices  (weekend)
 curator snapshot --repository mys3_repository --most-recent 1 (every
 week day)

 The above would be run as cron jobs from one of the nodes in the cluster.

 Let me know of any recommendations for hot backup of an Elasticsearch cluster.

 Thanks,
 skm




Re: running on EC2 S3 vs EBS

2014-10-13 Thread Itamar Syn-Hershko
Yes, you don't want to use anything other than local storage for
Elasticsearch. Not EBS and definitely not S3. You can use the
snapshot/restore API to continuously back up to S3 and get all the data
protection you need.
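
As a rough illustration, registering an S3 repository for those snapshots
might look like the request built below. This is only a sketch: the helper
name s3_repository_request, the host, the repository name, the bucket name
and the region are placeholders, and it assumes the cloud-aws plugin is
installed so that the "s3" repository type is available. Nothing is sent
over the network here.

```python
import json


def s3_repository_request(host, repo_name, bucket, region=None, base_path=None):
    """Return (method, url, body) for registering an S3 snapshot repository."""
    settings = {"bucket": bucket}
    if region:
        settings["region"] = region
    if base_path:
        # Optional key prefix inside the bucket.
        settings["base_path"] = base_path
    url = "{}/_snapshot/{}".format(host.rstrip("/"), repo_name)
    return "PUT", url, json.dumps({"type": "s3", "settings": settings})


# All values below are placeholders.
method, url, body = s3_repository_request(
    "http://localhost:9200", "mys3_repository", "my-es-backups", region="us-east-1")
print(method, url)
print(body)
```

Once the repository exists, snapshots taken into it are stored in the
bucket and can be restored onto a fresh cluster.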

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Oct 14, 2014 at 12:17 AM, Matthias Johnson openno...@gmail.com
wrote:

 We've begun deploying to AWS EC2. I've seen references in the group about
 the S3 gateway and it being deprecated. That seems to be confirmed by
 looking at the docs, which don't seem to list the S3 Gateway specifically
 after 0.90.x.

 We are also using the elasticsearch-cloud-aws plugin
 https://github.com/elasticsearch/elasticsearch-cloud-aws, which does a
 nice job of helping with auto discovery. It also shows settings for using S3.

 After some reading my understanding is that the plugin is basically just
 snapshots that are stored in S3. Is that understanding correct? Is this
 much different from the original gateway?

 That suggests that unless we take frequent snapshots we would run a risk
 of data loss if the entire cluster went down (right now we are using
 instance storage). Is that right?

 Would switching to EBS give us better protection against data loss, since
 the data is stored on a more permanent basis, as well as improved recovery
 after the entire cluster goes down?

 Are there any good guides on configuring this sort of setup with
 cloudformation and templates and/or tying EBS volumes for ES use to
 machines when a cluster is resurrected?

 @matthias
