Re: howto: food for dogs == dogfood
Synonyms.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Lucene.NET committer and PMC member

On Tue, Apr 28, 2015 at 5:33 PM, Maarten Roosendaal mroosendaa...@gmail.com wrote:

Hi, We have users typing queries like "food for dogs", and we've indexed the data as "dogfood". What is the best strategy to get a match using Elasticsearch's filters and/or analyzers? Thanks, Maarten

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZuZqC78O%2Bz_QwBTEfWK-MDWDPH19W_TiL_SOTApBsny6A%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
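A minimal sketch of the synonym approach suggested above, using a synonym token filter applied at index time (the index, type, and field names here are made up; the body would go to e.g. PUT /products):

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "dog_synonyms": {
          "type": "synonym",
          "synonyms": ["food for dogs, dog food, dogfood"]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "dog_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "title": { "type": "string", "analyzer": "synonym_analyzer" }
      }
    }
  }
}
```

With all variants mapped to the same tokens at index time, a query for "food for dogs" against the title field can match documents that only contain "dogfood".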
Re: What is the correct _primary_first syntax? What is the relevant debug logger ?
?preference=_primary_first - see http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html. There is no verbose mode at the moment.

On Mon, Apr 27, 2015 at 8:53 AM, Itai Frenkel itaifren...@live.com wrote:

Hello, What is the correct syntax for using _primary_first in search and search-template queries?

GET myindex/_search/template?preference=_primary_first

or

GET myindex/_search/template?routing=_primary_first

Is there any verbose mode that can log the list of shards that were actually accessed? thanks, Itai
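For illustration, a search that prefers primary shards (index name taken from the question above) would be sent to GET /myindex/_search?preference=_primary_first with a body such as:

```json
{
  "query": { "match_all": {} }
}
```

The preference parameter only controls shard routing; the request body is an ordinary search.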
Re: inner_hits and highlighting
I think I've heard the ES team discourage extensive use of this feature, mainly because it is highly expensive. Adding highlighting support to it would more than double its cost, and I'd personally vote against it.

On Tue, Apr 28, 2015 at 8:17 PM, Nikolas Everett nik9...@gmail.com wrote:

If it's not in the issues, it's unlikely that it's planned. If it isn't planned, I think filing an issue is a good thing - just be super clear about what you want to do, with examples in curl/gist form. If it is planned, maybe add your proposed usage to the issue. Nik

On Tue, Apr 28, 2015 at 11:26 AM, Ian Battersby ian.batter...@gmail.com wrote:

I've been playing with the new *experimental* inner_hits functionality released in 1.5.0, mainly with child/parent-related documents. It seems to work really well, but I noticed that highlighting doesn't seem to be supported on content/fields within inner_hits; a quick scan of the code base seems to confirm this. Does anyone know if this is already under consideration for a future release? Thanks, Ian.
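For reference, a sketch of the 1.5.x inner_hits usage being discussed, on a has_child query (type and field names are invented):

```json
{
  "query": {
    "has_child": {
      "type": "comment",
      "query": { "match": { "body": "elasticsearch" } },
      "inner_hits": { "size": 3 }
    }
  }
}
```

Each parent hit then carries its matching child documents under an inner_hits section in the response; as the thread notes, highlighting options are not honored inside that section as of 1.5.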
Re: Using serialized doc_value instead of _source to improve read latency
This is how _source works. doc_values don't make sense in this regard - what you are looking for is using stored fields and having the transform script write to one. Loading stored fields (even one field per hit) may be slower than loading and parsing _source, though. I'd just put this logic in the indexer instead; it will definitely help with other things as well, such as nasty huge mappings. Alternatively, find a way to avoid IO completely - how about using ES for search and something like Riak for loading the actual data, if IO costs are so noticeable?

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Lucene.NET committer and PMC member

On Mon, Apr 20, 2015 at 11:18 PM, Itai Frenkel itaifren...@live.com wrote:

Hi, We are having a performance problem in which, for each hit, Elasticsearch parses the entire _source and then generates a new JSON document with only the requested _source fields. To overcome this we would like to use a mapping transform script that serializes the requested query fields (which are known in advance) into a doc_value. Does that make sense? The actual problem with the transform script is a SecurityException that does not allow using any JSON serialization mechanism. A binary serialization would also be ok. Itai
Re: Using serialized doc_value instead of _source to improve read latency
What if all those fields are collapsed into one, like you suggest, but that one field is projected out of _source (think non-indexed JSON in a string field)? Do you see a noticeable performance gain then? What if that field is set to be stored (and loaded using fields, not via _source)? What is the performance gain then? Fielddata, and the doc_values optimization on top of it, will not help you here - those data structures aren't used for sending data out, only for aggregations and sorting. Also, using fielddata would require indexing those fields, and it is apparent you are not looking to do that.

On Tue, Apr 21, 2015 at 12:14 AM, Itai Frenkel itaifren...@live.com wrote:

Itamar,

1. The _source field includes many fields that are only being indexed, and many fields that are only needed as part of the query result; _source includes them both. The projection from _source for the query result is too CPU-intensive to do at search time for each result, especially if the size is big.

2. I agree that adding another NoSQL store could solve this problem, however it is currently out of scope, as it would require syncing data with another data store.

3. Wouldn't a big stored field bloat the Lucene index size? Even if not, aren't non_analyzed fields destined to be (or already are) doc_values?
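A sketch of the stored-field alternative discussed in this thread: collapse the display-only fields into one pre-serialized, non-indexed but stored field (the index and field names here are hypothetical), written by the indexer:

```json
{
  "mappings": {
    "item": {
      "properties": {
        "display_json": {
          "type": "string",
          "index": "no",
          "store": true
        }
      }
    }
  }
}
```

A search would then request it with "fields": ["display_json"], skipping _source parsing entirely; whether that is actually faster than _source filtering is exactly what Itamar suggests measuring first.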
Re: Evaluating Moving to Discourse - Feedback Wanted
I believe the biggest impact would be on responsiveness. As another long-timer, and someone who often responds to questions, I will probably cease to do that if the forum moves to Discourse, simply because it lacks push-style notifications on new questions. Right now, if a question title in my inbox catches my eye, I'll quickly read it and respond. I'm quite sure this pattern (and I'm sure I'm not the only one relying on it) will go away once you move to Discourse, and the forum's responsiveness with it. Just my 2 cents.

On Mon, Apr 13, 2015 at 8:13 PM, Leslie Hawthorn leslie.hawth...@elastic.co wrote:

Thanks for your feedback, Ivan. There's no plan to remove threads from the forums, so information would always be archived there as well. Does that impact your thoughts on moving to Discourse? Folks, please keep the feedback coming! Cheers, LH

On Sat, Apr 11, 2015 at 12:09 AM, Ivan Brusic i...@brusic.com wrote:

As one of the oldest and most frequent users (before my sabbatical) of the mailing list, I just wanted to say that I never had an issue with it. It works. As long as I can continue using only email, I am happy. For realtime communication, there is the IRC channel. I prefer the mailing list since everything is archived. Ivan

On Apr 2, 2015 5:36 PM, leslie.hawthorn leslie.hawth...@elastic.co wrote:

Hello everyone,

As we've begun to scale up development on three different open source projects, we've found Google Groups to be a difficult solution for dealing with all of our needs for community support. We've got multiple mailing lists going, which can be confusing for new folks trying to figure out where to go to ask a question. We've also found our lists are becoming noisy in the "good problem to have" kind of way. As we've seen more user adoption, across such a wide variety of use cases, we're getting widely different types of questions. For example, I can imagine that folks not using our Python client would rather not be distracted with emails about it. There are also a few other strikes against Groups as a tool, such as the fact that it is no longer a supported product at Google, it provides no API hooks, and it is not available to users in China.

We've evaluated several options and we're currently considering shuttering the elasticsearch-user and logstash-users Google Groups in favor of a Discourse forum. You can read more about Discourse at http://www.discourse.org. We feel Discourse will allow us to provide a better experience for all of our users for a few reasons:

* More fine-grained conversation topics = less noise and better targeted discussions, e.g. we can offer a forum for each language client, each individual Logstash plugin, or for each city to plan user group meetings, etc.
* It facilitates discussions that are not generally happening on-list now, such as best practices by use case, or tips on moving from development to production.
* It is easier for folks who are purely end users - and less used to getting peer support on a mailing list - to get help when they need it.

Obviously, Discourse does not function exactly the same way as a mailing list; however, email interaction with Discourse is supported and will continue to allow you to participate in discussions over email (though there are some small issues related to in-line replies [0]). We're working with the Discourse team now as part of evaluating this transition, and we know they're working to resolve this particular issue. We're also still determining how Discourse will handle our needs for both user and list-archive migration, and we'll know the precise details of how that would work soon. (We'll share them when we have them.)

The final goal would be to move the Google Groups to read-only archives and cut over to Discourse completely for community support discussions. We're looking at making the cutover in ~30 days from today, but obviously that's subject to the feedback we receive from all of you. We're sharing this information to set expectations about the time frame for making the switch; it's not set in stone. Our highest priority is to ensure effective migration of our list archives and subscribers, which may mean a longer time horizon for deploying Discourse as well. In the meantime, though, we wanted to communicate early and often and get your feedback.

Would this change make your life better? Worse? Meh? Please share your thoughts with us so we can evaluate your feedback. We don't take this switch lightly, and we want to understand how it will impact your overall workflow and experience. We'll make regular updates to the list responding to incoming feedback and be completely transparent about how our thought processes evolve based on it. Thanks in advance!

[0] - https
Re: Evaluating Moving to Discourse - Feedback Wanted
Fair play, will check that out - assuming you can reply to that email to respond?

On Wed, Apr 15, 2015 at 8:53 PM, Glen Smith g...@smithsrock.com wrote:

"it lacks the push style notifications on new questions"

https://lh3.googleusercontent.com/-enGiohVrmdk/VS6lNPnQMbI/A70/fc-SzjQxRlk/s1600/Screen%2BShot%2B2015-04-15%2Bat%2B1.49.18%2BPM.png

That doesn't seem correct to me. Doesn't "send me an email for every new post" cover what you want?
Re: Should I use elasticsearch as a core for faceted navigation-heavy website?
Short answer: yes. With a properly sharded and scaled-out environment, and using ES 1.4 or newer, you should be able to get those numbers.

On Tue, Mar 24, 2015 at 5:38 PM, Dmitry dmitry.bit...@gmail.com wrote:

Hello, I'm evaluating Elasticsearch for use in a new project. I can see that it has all the search features we need. The problem is that after reading the documentation and forum I still can't understand whether Elasticsearch is a suitable technology for us performance-wise. I'd be very grateful to get your opinion on that.

We're building a directory of businesses, similar to Yelp. We have 5M businesses, and the main feature of our site is faceted search on different facets: geography (tens of thousands of geo objects), business type (several thousand options), additional services offered by a business (hundreds), and so on. So for each request (combination of search parameters) we need to get the search results, but also the options available in each facet (for example, which business types are present in the selected geography) so the user can narrow down the search (example: http://take.ms/oAZan). Full-text search (by business name, for example) is used in a very small percentage of requests; the bulk of requests are exact matches on one or several facets.

Based on a similar project of ours we expect 1-5M requests per day. All requests are highly diversified: no single page (combination of search params) constitutes more than 0.1% of total requests. We expect to answer a request in 200-300ms, so I guess the request to Elasticsearch should take no more than 100ms. On our similar project we use a big lookup table in a database, with all possible combinations of params mapped to search result counts. For each request we generate all possible combinations of parameters to refine the current search, and then check the lookup table to see if they have any results.

My questions are: Is Elasticsearch suitable for our purposes? Specifically, are aggregations meant to be used in a large number of low-latency requests, or are they more of an analytical feature, where response time is not that important? I ask because in discussions of aggregation and faceting performance, here and elsewhere, response times are mentioned in the 1-10s range, which is fine for analytics and infrequent searches, but obviously not ok for us. How hard is it to get the performance we need - 50 rps and 100ms response time for search+facets - on reasonable hardware, taking into account the big number of possible facet combinations and the high diversification of requests? What kind of hardware should we expect to need for our load? I understand these are vague questions, but I just need some approximation: is it more like one server with commodity hardware and a simple configuration, or more like a cloud of 10 servers and extensive tuning? For example, our lookup-table solution works on one commodity server with 16GB of RAM and an almost default setup.

Thank you for your responses, Dmitry
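The faceted navigation described above maps naturally onto terms aggregations. A sketch (field names are invented), filtering on a selected geography while computing the remaining facet options in the same request; the filtered query form shown was current in ES 1.4:

```json
{
  "size": 20,
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "term": { "geography": "amsterdam" } }
    }
  },
  "aggs": {
    "business_types": { "terms": { "field": "business_type", "size": 50 } },
    "services": { "terms": { "field": "services", "size": 50 } }
  }
}
```

One such request returns both the page of hits and the per-facet option counts, replacing the precomputed lookup table.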
Re: Supported Operating System
I'd recommend using a Linux-based system, and not as a VM, for various reasons relating to resource management.

On Sun, Mar 22, 2015 at 1:20 AM, Gil Peleg gilpele...@gmail.com wrote:

Hey, I was wondering if there are any guidelines as to the preferred OS for ES? Are there any that are not supported? I currently run Windows Server 2008 R2 on a project I am working on and was wondering if there are any issues with using ES on it. Going to go ahead and assume Linux is the preferred option - would it be better to run it in a virtual machine on top of my current Windows OS, and would that reach better performance than ES installed directly on the Windows Server? Thanks, Gil
Re: Elasticsearch - Not require an exact match
You should use this then: http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html

On Thu, Mar 19, 2015 at 2:26 PM, James m...@employ.com wrote:

Currently I have an item in my elasticsearch index with the title *testing123*. When I search for it, I can only get it returned if I search for *testing123* exactly. However, I want to be able to search for *testing* and have it returned too. How can I have it so the search must start with that term but not be an exact match? Any help would be appreciated.
Re: Elasticsearch - Not require an exact match
This boils down to Lucene fundamentals, in particular what search tokens are created and then searched. I've explained this in depth here: https://www.youtube.com/watch?v=QI566fe9Svs -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Mar 19, 2015 at 2:33 PM, James m...@employ.com wrote: Thank you for the reply. I thought it was much about making my search query not require an exact match, rather than splitting down the words I am searching against? On 19 March 2015 at 12:30, Itamar Syn-Hershko ita...@code972.com wrote: You should use this then: http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Mar 19, 2015 at 2:26 PM, James m...@employ.com wrote: Currently I have an item in my elasticsearch index with the title: *testing123* When I search for it, I can only get it returned if I search *testing123* exactly. However, I want to be able to search *testing* and have it returned too. How can I have it so the search must start with that term but also not be an exact match? Any help would be appreciated. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/de42c52d-1566-461e-b578-594aa963a498%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/de42c52d-1566-461e-b578-594aa963a498%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. 
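The token-level view referred to above can be inspected directly with the _analyze API. For example, the standard analyzer emits testing123 as a single term, which is why a query for testing alone finds nothing:

```
GET _analyze?analyzer=standard&text=testing123
```

The response lists one token, testing123; only once a word-delimiter-style filter splits it does a separate testing token exist to match against.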
Re: Courier Fetch error, maybe due to lack of @timestamp?
Like the error suggests (No mapping found for [@timestamp] in order to sort on), Kibana expects a @timestamp field - make sure to push that in your source.

On Tue, Mar 17, 2015 at 11:19 PM, David Reagan jer...@gmail.com wrote: I keep getting an error like this in Kibana 4.0.1: Courier Fetch: 5 of 270 shards failed. After some Googling, I think it has something to do with @timestamp not existing for some of my data. But I'm not sure, because https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion was solved by not searching the kibana indexes, and I'm only searching my logstash indexes yet still getting the error. In Kibana 4 I went to Settings > Indices and made sure I only have logstash-* listed under Index Patterns. I did recently update the template to what was in the logstash git HEAD. See http://pastebin.com/w7PmHxXS for my /var/log/elasticsearch/index.log output, as well as the template I'm using (it's at the bottom of the paste). I checked with curl -XGET 'http://localhost:9200/_cat/shards?pretty=true' to see if any shards had issues; they all had STARTED as their status. Any suggestions?
Re: Courier Fetch error, maybe due to lack of @timestamp?
@timestamp is generated automatically by logstash; any documents not added by logstash will not have it.

On Wed, Mar 18, 2015 at 12:51 AM, David Reagan jer...@gmail.com wrote: @timestamp has always been applied automatically. The only time I've ever touched it is when I've adjusted the date to what the log message holds, rather than when the log message is processed by logstash. So I have no idea where it comes from, or how I could have turned it off on something. Is that in the template? --David Reagan
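One way to guarantee the field is mapped on every new daily index is to bake it into the logstash index template. A minimal sketch using the 1.x template API - PUT this as the body of /_template/logstash (template name and pattern are the usual Logstash defaults, adjust to taste):

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}
```

Note this only affects indexes created after the template is in place; existing indexes would need to be reindexed.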
Re: doc_values in index template for new generated indexes
http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html#_enabling_doc_values

On Tue, Mar 17, 2015 at 5:35 AM, chris85l...@googlemail.com wrote: Hello, we have an Elasticsearch setup where we are using the default values, so no doc_values. How can I add doc_values: true to the index template so that the newly generated daily indexes use this feature? Thank you in advance! Cheers, Chris
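A sketch of what that could look like in a 1.x index template (names here are illustrative; note that in 1.x doc_values can only be enabled on not_analyzed string fields and on numeric/date fields). PUT this as the body of /_template/logstash, or merge it into your existing template:

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings_as_doc_values": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": true
            }
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date", "doc_values": true }
      }
    }
  }
}
```

Only indexes created after the template update pick this up, which fits the daily-index rollover described in the question.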
Re: Running Kibana 4 on index
If there is a @timestamp field, Kibana will use it to treat the documents as time-based. Either way, you need to type the name of the index explicitly and override the logstash-* pattern that is suggested by default.

On Mon, Mar 16, 2015 at 6:44 PM, Moshe Recanati re.mo...@gmail.com wrote: Yes, latest ES. How do I make sure it's time-based events?
Re: metaphone3
I believe there are licensing issues involved. You have metaphone available in core and here: https://github.com/elastic/elasticsearch-analysis-phonetic Also see https://github.com/elastic/elasticsearch-analysis-phonetic/issues/16

On Mon, Mar 16, 2015 at 3:28 PM, kianmob...@gmail.com wrote: Is there any way to add Metaphone 3 in Elasticsearch as a phonetic token filter? There is a license for Metaphone 3 here: https://code.google.com/p/google-refine/source/browse/trunk/main/src/com/google/refine/clustering/binning/Metaphone3.java
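For the Metaphone encoder that is available (via the elasticsearch-analysis-phonetic plugin linked above), a settings sketch - the filter and analyzer names are illustrative:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_metaphone": {
          "type": "phonetic",
          "encoder": "metaphone",
          "replace": false
        }
      },
      "analyzer": {
        "phonetic_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_metaphone"]
        }
      }
    }
  }
}
```

With replace: false the original tokens are kept alongside their phonetic encodings; Metaphone 3 itself is not available as an encoder, per the licensing issue.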
Re: Running Kibana 4 on index
Are you using Kibana 4 with the latest Elasticsearch? Basically, in Kibana 4 you need to make sure you uncheck Index contains time-based events, then type the name of the index and click Create.

On Mon, Mar 16, 2015 at 6:31 PM, Moshe Recanati re.mo...@gmail.com wrote: Hi, I have Elasticsearch with 2 simple indexes (chatmessages and sessions). I tried to run Kibana, but although it can see the indexes I get the following message: Indices and aliases that were found, but did not match the pattern: .kibana chatmessages sessions. Let me know what needs to be done in order to solve it. Thank you, Moshe
Re: Strange exception in Elasticsearch 1.4.3
This looks like a bug in elasticsearch-analysis-combo; I'd post it as an issue there.

On Fri, Mar 13, 2015 at 1:35 PM, Angel Cross niegi...@gmail.com wrote: Hello. Recently in our test system we started to notice the following exception. Googling and investigating the setup itself didn't make it any clearer, and we still have no idea why this is happening. Maybe somebody has already faced the issue and knows the reason, or has any ideas?

java.lang.IllegalArgumentException: State contains AttributeImpl of type org.apache.lucene.analysis.tokenattributes.PayloadAttributeImpl that is not in in this AttributeSource
    at org.apache.lucene.util.AttributeSource.restoreState(AttributeSource.java:313)
    at org.apache.lucene.analysis.ComboTokenStream.incrementToken(ComboTokenStream.java:106)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:618)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:318)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:457)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1511)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1246)
    at org.elasticsearch.index.engine.internal.InternalEngine.innerIndex(InternalEngine.java:594)
    at org.elasticsearch.index.engine.internal.InternalEngine.index(InternalEngine.java:522)
    at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:425)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:439)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:150)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Elastic configuration: elasticsearch 1.4.3, with these plugins:
- elasticsearch-analysis-baseform https://github.com/jprante/elasticsearch-analysis-baseform - version 1.4.0
- elasticsearch-analysis-kuromoji https://github.com/elasticsearch/elasticsearch-analysis-kuromoji - version 2.4.2
- elasticsearch-analysis-combo https://github.com/yakaz/elasticsearch-analysis-combo/ - version 1.5.1
- elasticsearch-analysis-decompound https://github.com/jprante/elasticsearch-analysis-decompound - version for 1.0.0RC1
- elasticsearch-analysis-icu https://github.com/elasticsearch/elasticsearch-analysis-icu - version 2.4.2
- elasticsearch-analysis-smartcn https://github.com/elasticsearch/elasticsearch-analysis-smartcn - version 2.4.3
- elasticsearch-head http://mobz.github.io/elasticsearch-head/ - the latest one

The server (1 machine) runs 2 nodes: one is a data node, the other a tribe node. The nodes run on different ports and differ in configuration. The server OS is RedHat 6.5. The exception appears when we try to reindex a document containing nested documents. Indexing happens via bulks, so this is not an update but another index request for an existing document with the same id. The exception doesn't appear on another CentOS machine or another RedHat machine with a similar setup. We reinstalled Elastic on the test machine; still no difference. Thanks, Liuba
Re: Strange exception in Elasticsearch 1.4.3
Probably a version mismatch, as that analyzer seems to only support 1.3.8.
Re: Sanitize a text for indexing
See http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html

On Thu, Mar 12, 2015 at 10:52 AM, Bernhard Berger bernhardberger3...@gmail.com wrote: Hi, while indexing various comments from Facebook I sometimes get exceptions: IllegalArgumentException: Document contains at least one immense term... Is it possible to sanitize a text for indexing in Elasticsearch so it doesn't throw these exceptions? Maybe there is a filter to remove too-long Unicode terms? For details about the failing documents, see my (unanswered) Stack Overflow question: http://stackoverflow.com/questions/28941570/remove-long-unicode-terms-from-string-in-java (I fear breaking another Elasticsearch-based mailing-list crawler, so I'd better not paste the failing doc text here ;-) )
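Lucene rejects single terms longer than 32766 bytes, so a length filter capped safely below that will drop the immense terms before they reach the index. A sketch (the filter and analyzer names are illustrative; 8191 characters is a conservative cap, since a UTF-8 character can take up to 4 bytes):

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "drop_immense_terms": {
          "type": "length",
          "min": 0,
          "max": 8191
        }
      },
      "analyzer": {
        "sanitized_text": {
          "tokenizer": "standard",
          "filter": ["lowercase", "drop_immense_terms"]
        }
      }
    }
  }
}
```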
Re: How to install a plugin from a jar file
Probably a bug in the plugin script, which just looks at the folders under /plugins. Did you put an es-plugin.properties file in your jar as a resource?

On Thu, Mar 5, 2015 at 2:54 PM, Oranit Dror ora...@gmail.com wrote: Hi, I have written a plugin, but I cannot make Elasticsearch 1.4.4 use it. Specifically, I packed the plugin as a jar file and placed it under my ELASTIC_SEARCH_DIR/plugins/<plugin name> directory. However, when I start Elasticsearch, the list of installed plugins is empty: [INFO ][plugins ] [Ant-Man] loaded [], sites [] I should also note that when I run the 'plugin' command with the list option (i.e. bin\plugin.bat -l), it does list my plugin. Any advice? Thank you, Oranit.
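For reference, the marker file Elasticsearch 1.x looks for inside a plugin jar is roughly this (the class name below is a placeholder for your own Plugin subclass):

```properties
# es-plugin.properties - placed at the root of the plugin jar
plugin=org.example.MyPlugin
```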
Re: Elastic search front end
You just need to create a Lucene QueryParser (implementing QueryBuilder) and register it like so: https://github.com/elasticsearch/elasticsearch/issues/3264#issuecomment-20247436 However, Elasticsearch provides a very good and expressive query DSL - so I'd rather look at doing this on your search facade, and generate a verbose query JSON to send to Elasticsearch. Many things that you have to support in Solr via custom query parsers can be done using the provided query DSL with Elasticsearch because JSON is way better than LocalParams etc Alternatively, NLP and POS tagging could be done also on the analysis level. I'd look at doing using TeeSinkTolenFilters. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Mar 3, 2015 at 12:12 PM, Oranit Dror ora...@gmail.com wrote: Hi, I will be glad to get some more information on your suggestion to write my own QueryParser as a plugin. To be more specific, I would like that this parser will do some Natural Language processing on the full text query string, supplied by the user, in the front-end search bar. In fact, in SolR I have implemented such a parser (as a QParserPlugin subclass). The output of the parser plugin should be a new string that I would like to give to ElasticSearch. Additionally, before displaying the returned results, I would like to add my own code for selecting the text that I would like to highlight. In SolR, I have implemented a class that extends the DefaultSolrHighlighter class. thank you, Oranit. On Monday, March 2, 2015 at 10:46:32 PM UTC+2, Itamar Syn-Hershko wrote: You can write your own QueryParser as a plugin but that sounds like an overkill. 
If all you need is to display some highlighted results, it's easy enough to do in any language, and I'd say you don't really need Kibana for that.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Mar 2, 2015 at 10:27 PM, Oranit Dror ora...@gmail.com wrote: Hi, I am new to Elasticsearch and have a newbie question: I want to have a user-friendly front end to the data with a free-text search bar. In this search bar the user inputs a query string, which I would like to parse and transform into a new (application-dependent) string that will be used on Elasticsearch. I then want to highlight the matching search terms in the results. I have implemented a similar application in Solr. I thought of using Kibana's Discover page. Is there a way to hook into Kibana and/or Elasticsearch so I can transform the user's query string before it is sent to ES and highlight the results? Regards, Oranit
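Highlighting can be requested directly in the search body, so the front end only has to render the returned fragments. A minimal sketch, assuming a hypothetical `content` field:

```json
{
  "query": { "match": { "content": "search terms" } },
  "highlight": {
    "pre_tags":  ["<em>"],
    "post_tags": ["</em>"],
    "fields": { "content": {} }
  }
}
```

Each hit then carries a `highlight` section with the matching snippets already wrapped in the configured tags, ready to display.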
Re: how test plugin in eclipse in elasticsearch
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/using-elasticsearch-test-classes.html

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Tue, Feb 17, 2015 at 5:23 PM, Ali Lotfdar ali.lotfda...@gmail.com wrote: Hi All, I found this topic in earlier threads, but I need some help too. I want to know whether it is possible to test my plugin before installing it inside ES, and if so, how (feed it some sample data and see the result)? Could you please also let me know how I can debug it using a main method? Thanks, Ali
Re: elasticsearch deploy on ec2
Can you describe your deployment process? The cluster can't be _always_ red - it should be green when you first deploy. Other than that, check the obvious: that the AWS security groups are properly defined for those machines (all of them under the same named security group).

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:11 PM, Eliran Shlomo eli...@whipclip.com wrote: Hi, I'm trying to deploy a new Elasticsearch environment on AWS, and the cluster is always red. I get the following message: { error: ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];], status: 503 } I'm trying to deploy 3 master nodes, 3 data nodes, and 2 client nodes. When I check cluster health: { cluster_name: stress_new, status: red, timed_out: false, number_of_nodes: 8, number_of_data_nodes: 3, active_primary_shards: 0, active_shards: 0, relocating_shards: 0, initializing_shards: 0, unassigned_shards: 0 } Please advise. This is the configuration I used, with changes per node role (node.master / node.data):

cluster.name: stress_new
plugin.mandatory: cloud-aws
cloud.aws.access_key:
cloud.aws.secret_key: *
cloud.aws.region: us-west-2
discovery.type: ec2
discovery.ec2.groups: stress_new_elasticsearch
discovery.ec2.host_type: private_ip
discovery.ec2.ping_timeout: 30s
discovery.ec2.tag.elasticsearch: stress_new
node.name: 172.**
node.master: false
node.data: false
index.number_of_shards: 1
index.number_of_replicas: 0
path.data: /mnt/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.mlockall: true
http.enabled: true
gateway.recover_after_nodes: 8
gateway.expected_nodes: 8
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping.timeout: 10s
discovery.zen.ping.multicast.enabled: false
index.search.slowlog.threshold.query.warn: 500ms
index.search.slowlog.threshold.query.info: 200ms
index.search.slowlog.threshold.query.debug: 100ms
index.search.slowlog.threshold.query.trace: 50ms
index.search.slowlog.threshold.fetch.warn: 500ms
index.search.slowlog.threshold.fetch.info: 200ms
index.search.slowlog.threshold.fetch.debug: 100ms
index.search.slowlog.threshold.fetch.trace: 50ms
index.indexing.slowlog.threshold.index.warn: 500ms
index.indexing.slowlog.threshold.index.info: 200ms
index.indexing.slowlog.threshold.index.debug: 1000ms
index.indexing.slowlog.threshold.index.trace: 50ms
script.disable_dynamic: false
script.native.socialScoreCalc.type: *.SocialScriptFactory
script.default_lang: native
action.disable_delete_all_indices: true
action.auto_create_index: .marvel-*
indices.fielddata.cache.size: 30%
indices.fielddata.cache.expire: 15s
marvel.agent.enabled: false
allow_leading_wildcard: false
script.groovy.sandbox.enabled: false
Re: elasticsearch deploy on ec2
Wait a second - you should use gateway.expected_data_nodes: 3 and gateway.expected_master_nodes: 3 instead of what you have there now. Also, min master nodes should be 2 in your case.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:22 PM, Eliran Shlomo eli...@whipclip.com wrote: Hi, Since the first moment the cluster is in red. The servers are under the same security group, and inside the security group I allow any/any between the servers.
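For the 3-master / 3-data / 2-client layout in this thread, the advice above would look roughly like this in elasticsearch.yml on the master-eligible and data nodes. Setting names are per ES 1.x; verify against your version's docs:

```yaml
# expect the real node counts, not the total including client nodes
gateway.expected_master_nodes: 3
gateway.expected_data_nodes: 3
# quorum of 3 master-eligible nodes: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
```

With minimum_master_nodes left at 3, losing a single master-eligible node would prevent the remaining two from electing a master at all.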
Re: elasticsearch deploy on ec2
Master-eligible nodes and data nodes need to have this setting.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:31 PM, Eliran Shlomo eli...@whipclip.com wrote: "Wait a second - you should use gateway.expected_data_nodes: 3 and gateway.expected_master_nodes: 3 instead of what you have there now. Also, min master nodes should be 2 in your case." Should those settings be in the configuration of all nodes, or only in the external gateway (client)?
Re: elasticsearch deploy on ec2
Remove the number-of-nodes setting; if that doesn't help, start looking at the logs. I've seen clusters on AWS that took some time to discover and stabilize - it may also be that.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 8:56 PM, Eliran Shlomo eli...@whipclip.com wrote: Hi, I made the changes. No change in the cluster status, and the same response from the servers: { cluster_name: stress_new, status: red, timed_out: false, number_of_nodes: 8, number_of_data_nodes: 3, active_primary_shards: 0, active_shards: 0, relocating_shards: 0, initializing_shards: 0, unassigned_shards: 0 } GET _status: { error: ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];], status: 503 } On Monday, February 16, 2015 at 8:53:11 PM UTC+2, Itamar Syn-Hershko wrote: Master-eligible nodes and data nodes need to have this setting. On Mon, Feb 16, 2015 at 8:31 PM, Eliran Shlomo eli...@whipclip.com wrote: Should those settings be in the configuration of all nodes, or only in the external gateway (client)?
Re: Adding timestamp property
Kibana requires the timestamp field to be named @timestamp, so the internal _timestamp field isn't going to work - I'm pretty sure that's still the case for Kibana 4 as well.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 16, 2015 at 6:22 PM, Roy Zanbel r...@jfrog.com wrote: Also no... I'll take an hour tomorrow to start fresh - delete the index and start over again. Will keep you posted. { _index: aql, _type: item, _id: AUuR4GgLMJioTmulRq4u, _version: 1, found: true, _source: { path: Desktop/Desktop/Desktop, depth: 4, size: 477, downloads: 0, created: 2014-11-04T17:26:01.435+02:00, repo: archive-local, name: Desktop-Desktop.pom, type: file, updated: 2014-11-04T17:25:55.822+02:00 } } Thanks for the quick response. BR, Roy. On Saturday, February 14, 2015 at 3:30:28 PM UTC+2, Roy Zanbel wrote: Hi, New to Elasticsearch, and I have a simple question I had a hard time finding an answer to online. I wish to add a timestamp field and later use it in Kibana. This is how my settings/mappings look: { aql: { mappings: { item: { _timestamp: { enabled: true, store: true }, properties: {} } }, settings: { index: { item: { _timestamp: { enabled: true, store: true } }, creation_date: 1423908699031, number_of_shards: 5, number_of_replicas: 1, version: { created: 1040299 }, uuid: JqNaClL1Q5-ucG6NI1bvOA } } } } After posting new documents I would like to see a timestamp option to filter events in Kibana. Thanks in advance. BR, Roy.
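Following the advice above, one way is to map and index an explicit @timestamp field rather than relying on _timestamp. A sketch, reusing the thread's `item` type (the mapping shape is an assumption, not taken from the poster's setup):

```json
{
  "mappings": {
    "item": {
      "properties": {
        "@timestamp": { "type": "date" }
      }
    }
  }
}
```

Each indexed document then carries its own value, e.g. "@timestamp": "2014-11-04T17:26:01.435+02:00", which Kibana can pick up as the time field for its histograms and filters.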
Re: A strange behavior we've encountered on our ELK
Yes - can you try using the bulk API? Also, are you running on a cloud server?

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa iyuv...@gmail.com wrote: Hi, I wrote that program and ran it, and it did manage to keep a steady rate of about 1,000 events per minute even when Kibana's total events per minute dropped from 60,000 to 6,000. However, when Kibana's total events per minute dropped to zero, my program got a connection-refused exception. I ran netstat -s and found that every time Kibana's line hit zero, the number of RX-DRP increased. At that point I realized I had forgotten to mention that this server has a 10GbE NIC. Is it possible that the packets are being dropped because some buffer is filling up? If so, how can I test and verify that this is actually the case? And if it is, how can I solve it? Thanks, Yuval.

On Wednesday, February 11, 2015, Yuval Khalifa iyuv...@gmail.com wrote: Hi. When you say "see how the file behaves" I'm not quite sure what you mean... As I mentioned earlier, it's not that events do not appear at all; the RATE at which they come decreases. So how can I measure the event rate in a file? I thought of another way to test this: I'll write a quick-and-dirty program that sends an event to the ELK via TCP every 12ms, which should result in a rate of about 5,000 events per minute, and I'll let you know whether the event rate continues to drop. Thanks, Yuval.

On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com wrote: I'd start by using logstash with input tcp and output fs and see how the file behaves. Same for the fs inputs - see how their files behave. And take it from there.

On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa iyuv...@gmail.com wrote: Great! How can I check that?

On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com wrote: The graphic you sent suggests the issue is with logstash - since the @timestamp field is populated by logstash and is the one used to display the date histogram graphics in Kibana. I would start there, i.e. check whether SecurityOnion buffers writes etc., and then check the logstash shipper process stats.

On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com wrote: Hi. Absolutely (but since I also worked at the helpdesk dept. in the past, I certainly understand why it is important to ask those "Are you sure it's plugged in?" questions...). One of the logs is coming from SecurityOnion, which logs (via bro-conn) all the connections, so it must be sending data 24x7x365. Thanks for the quick reply, Yuval.

On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com wrote: Are you sure your logs are generated linearly, without bursts?

On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com wrote: Hi, We just installed an ELK server and configured logstash to match the data that we send to it, and until last month it seemed to be working fine. Since then we see very strange behavior in Kibana: the events-over-time histogram shows the event rate at the normal level for about half an hour, then it drops to about 20% of the normal rate, continues to drop slowly for about two hours, then stops, and after a minute or two it returns to normal for the next half hour or so - and the same behavior repeats. Needless to say, both /var/log/logstash and /var/log/elasticsearch show nothing since the service started, and by using tcpdump we can verify that events keep coming in at the same rate all the time. I attached our logstash configuration, the /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and a screenshot of our Kibana with no filter applied, so that you can see the weird behavior. Is there someone/somewhere we can turn to for help on the subject? Thanks a lot, Yuval.
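The bulk API suggested above batches many index operations into one request instead of one connection per event, which also reduces pressure on the NIC and connection handling. A sketch in Python using only the stdlib to build the NDJSON payload; the index and type names are made up:

```python
import json

def bulk_body(events, index="logs", doc_type="event"):
    """Build an Elasticsearch _bulk payload: one action line followed by
    one source line per event, newline-delimited (NDJSON)."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(event))
    # the _bulk endpoint requires a trailing newline after the last line
    return "\n".join(lines) + "\n"

# POST this body to http://<host>:9200/_bulk instead of indexing one doc at a time
body = bulk_body([{"message": "event one"}, {"message": "event two"}])
```

Batching a few hundred to a few thousand events per request is a common starting point; tune the batch size against observed latency.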
Re: Master Node vs. Data Node Architecture
Depending on why the node goes down - going mid-way with dedicated master nodes is sometimes the solution. And if this is due to massive use of aggregations, doc-values may be the answer (or a larger heap, but that's costlier).

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 11:40 PM, Mark Walkom markwal...@gmail.com wrote: Except that is overkill when you only have 3 nodes. How much data do you have in the cluster?

On 13 February 2015 at 01:15, Itamar Syn-Hershko ita...@code972.com wrote: See this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html Basically, the recommended pattern is about isolating responsibilities: a node should be either a data node, a master-eligible node, or an external gateway to the cluster (client node).

On Thu, Feb 12, 2015 at 4:08 PM, Eric eric.luel...@gmail.com wrote: Hello, Currently I have a 3-node Elasticsearch cluster. Each node is a RHEL VM with 16 GB of RAM. The basic config is: - All nodes can be master and are data nodes. - 3 shards and 1 replica. - 6 different indexes. I'm starting to run into issues with Elasticsearch bogging down on searches and completely freezing sometimes at night. I've dedicated 9 GB to heap size, and it says I'm using ~60% of the heap RAM and about 70% of the overall heap. So even though I'm using quite a bit of the heap, I'm not maxed out. I've attached a screenshot of the exact stats from ElasticHQ. I'm averaging around 10,000 events/sec coming into the cluster from 6 different Logstash instances on another server. My question is what I can do to help the stability and speed of my cluster. Currently I'm having issues with one node going down and taking everything else down with it. The HA portion isn't working very well. I'm debating between adding one more node with the exact same stats or adding two smaller VMs that will act as master-only nodes. I didn't know which one is recommended or where I would get the biggest bang for the buck. Any information would be greatly appreciated. Thanks, Eric
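Doc values, mentioned above, move the fielddata used by aggregations and sorting from the JVM heap to disk-backed structures. In ES 1.x they are opt-in per field in the mapping; a sketch with a made-up `status` field (string fields must be not_analyzed to use them there):

```json
{
  "mappings": {
    "logs": {
      "properties": {
        "status": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        }
      }
    }
  }
}
```

The trade-off is slightly slower aggregations in exchange for a much smaller heap footprint, which is usually the right call for a cluster that freezes under aggregation load.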
Re: Master Node vs. Data Node Architecture
See this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html Basically, the recommended pattern is about isolating responsibilities: a node should be either a data node, a master-eligible node, or an external gateway to the cluster (client node).

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 4:08 PM, Eric eric.luel...@gmail.com wrote: Hello, Currently I have a 3-node Elasticsearch cluster. Each node is a RHEL VM with 16 GB of RAM. The basic config is: - All nodes can be master and are data nodes. - 3 shards and 1 replica. - 6 different indexes. I'm starting to run into issues with Elasticsearch bogging down on searches and completely freezing sometimes at night. I've dedicated 9 GB to heap size, and it says I'm using ~60% of the heap RAM and about 70% of the overall heap. So even though I'm using quite a bit of the heap, I'm not maxed out. I've attached a screenshot of the exact stats from ElasticHQ. I'm averaging around 10,000 events/sec coming into the cluster from 6 different Logstash instances on another server. My question is what I can do to help the stability and speed of my cluster. Currently I'm having issues with one node going down and taking everything else down with it. The HA portion isn't working very well. I'm debating between adding one more node with the exact same stats or adding two smaller VMs that will act as master-only nodes. I didn't know which one is recommended or where I would get the biggest bang for the buck. Any information would be greatly appreciated. Thanks, Eric
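The role separation described in the linked page comes down to two boolean settings in elasticsearch.yml; a sketch of the three combinations:

```yaml
# Pick exactly one combination per node:
#   master-eligible node:   node.master: true   node.data: false
#   data node:              node.master: false  node.data: true
#   client (gateway) node:  node.master: false  node.data: false
node.master: true
node.data: false
```

Dedicated masters stay responsive during heavy indexing or searching because they never hold shards, which is exactly the failure mode described in this thread (one overloaded node taking the cluster down).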
Re: Elasticsearch + attachment plugin + Kibana + couchbase
The XDCR plugin indexes the data using an envelope document. Long story short: make sure you use the latest XDCR plugin, as older ones are missing lots of important functionality, and use templates and dynamic templates with proper field paths for this to work correctly. http://code972.com/blog/2015/02/80-elasticsearch-one-tip-a-day-managing-index-mappings-like-a-pro http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Thu, Feb 12, 2015 at 3:59 PM, Nadav Hashimshony nad...@gmail.com wrote: Hi, I'm new to the group; I hope I'll find what I need and will share my experience as I go along. I'm using ES with the attachment plugin in order to store and search files. When I set the mapping right and insert the file data Base64-encoded, I'm able to query my data via Kibana. My problem is this: if I create the index + mapping in ES, then insert the data into Couchbase and use XDCR to replicate it to ES, I can't query the data with Kibana. It looks like the mapping of the index created in ES doesn't index the data it gets from Couchbase well. Has anyone encountered such an issue? Thank you, Nadav.
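Since the XDCR transport plugin wraps each replicated document in an envelope (metadata plus the original source nested under a field such as `doc` - check your plugin version's defaults), field paths in the mapping must target the nested location. A sketch of an index template with a dynamic template; the template name, index pattern, type name, and `doc.*` path are assumptions to verify against your setup:

```json
{
  "template": "couchbase*",
  "mappings": {
    "couchbaseDocument": {
      "dynamic_templates": [
        {
          "doc_strings": {
            "path_match": "doc.*",
            "match_mapping_type": "string",
            "mapping": { "type": "string", "index": "analyzed" }
          }
        }
      ]
    }
  }
}
```

Creating the mapping on the bare field names (as you would when indexing into ES directly) won't match the enveloped paths, which would explain why the same data is queryable in one path and not the other.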
Re: A strange behavior we've encountered on our ELK
Yes, make sure the disk is local and not a shared one with unpredictable latency (e.g. a SAN). An SSD will also probably fix all your pains. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 3:28 PM, Yuval Khalifa iyuv...@gmail.com wrote: Sort of... The ELK stack is running as a VM on a dedicated ESXi. Are there special configurations I should apply in such a case? Thanks, Yuval.
On Thursday, February 12, 2015, Itamar Syn-Hershko ita...@code972.com wrote: Yes - can you try using the bulk API? Also, are you running on a cloud server?
On Thu, Feb 12, 2015 at 11:28 AM, Yuval Khalifa iyuv...@gmail.com wrote: Hi, I wrote that program and ran it, and it did manage to keep a steady rate of about 1,000 events per minute even when Kibana's total events per minute dropped from 60,000 to 6,000. However, when Kibana's total events per minute dropped to zero, my program got a connection-refused exception. I ran netstat -s and found that every time Kibana's line hit zero, the RX-DRP counter increased. At that point I realized I had forgotten to mention that this server has a 10GbE NIC. Is it possible that packets are being dropped because some buffer is filling up? If so, how can I test and verify that this is actually the case, and if it is, how can I solve it? Thanks, Yuval.
On Wednesday, February 11, 2015, Yuval Khalifa iyuv...@gmail.com wrote: Hi. When you say "see how the file behaves" I'm not quite sure what you mean by that... As I mentioned earlier, it's not that events do not appear at all; rather, the RATE at which they come decreases, so how can I measure the event rate in a file? I thought of another way to test this: I'll write a quick-and-dirty program that sends an event to the ELK stack via TCP every 12ms, which should result in an event rate of about 5,000 events per minute, and I'll let you know whether the event rate continues to drop. Thanks, Yuval.
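If RX-DRP is climbing on a 10GbE NIC, the kernel's socket buffers and backlog queue may indeed be too small for the burst rate. A sketch of settings one might experiment with (the values are illustrative starting points, not tuned recommendations; apply with sysctl -p and re-measure, and also check the NIC ring buffers with ethtool -g / ethtool -G):

```
# /etc/sysctl.conf -- illustrative socket buffer sizes for a busy 10GbE NIC
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

Watching netstat -s and the per-interface drop counters before and after each change is the only way to confirm which buffer is actually overflowing.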
Re: A strange behavior we've encountered on our ELK
There's a good writeup on the subject by Mike, btw; you should read it: http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/ -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 3:30 PM, Itamar Syn-Hershko ita...@code972.com wrote: Yes, make sure the disk is local and not a shared one with unpredictable latency (e.g. a SAN). An SSD will also probably fix all your pains.
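For reference, the bulk API suggested in this thread takes newline-delimited action/document pairs in a single request, which cuts per-event connection overhead considerably. A minimal hypothetical example (the index name and fields are made up; the request body must end with a newline):

```
POST /_bulk
{ "index" : { "_index" : "logstash-2015.02.12", "_type" : "logs" } }
{ "@timestamp" : "2015-02-12T13:00:00Z", "message" : "event one" }
{ "index" : { "_index" : "logstash-2015.02.12", "_type" : "logs" } }
{ "@timestamp" : "2015-02-12T13:00:01Z", "message" : "event two" }
```

Logstash's elasticsearch output already batches like this internally; the example is mainly relevant for custom shippers such as the test program described above.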
Re: Elasticsearch + attachment plugin + Kibana + couchbase
Like I said, you need the mapping to catch before the XDCR plugin begins the replication, so you need to put a template with this mapping that will override XDCR's. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 4:59 PM, Nadav Hashimshony nad...@gmail.com wrote: Thank you for the response. I am using a mapping; I created the following index:

PUT /storage/files/_mapping
{
  "files": {
    "properties": {
      "file": {
        "type": "attachment",
        "path": "full",
        "fields": {
          "content_type": { "type": "string", "store": true }
        }
      }
    }
  }
}

When I insert data via ES and query it, all is fine. The problem is when data is inserted through Couchbase. Nadav
On Thursday, February 12, 2015 at 4:03:01 PM UTC+2, Itamar Syn-Hershko wrote: The XDCR plugin indexes the data using an envelope document. Long story short, make sure you use the latest XDCR plugin, as older ones are missing lots of important functionality, and use templates and dynamic templates with proper field paths for this to work correctly: http://code972.com/blog/2015/02/80-elasticsearch-one-tip-a-day-managing-index-mappings-like-a-pro http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 3:59 PM, Nadav Hashimshony nad...@gmail.com wrote: Hi, I'm new to the group; I hope I'll find what I need and share my experience as I go along. I'm using ES with the attachment plugin in order to store and search files. When I set the mapping right and insert the file data Base64-encoded, I'm able to query my data via Kibana. My problem is this: if I create the index + mapping in ES, then insert the data into Couchbase and use XDCR to replicate it to ES, I can't query the data with Kibana. It looks like the mapping of the index created in ES doesn't index the data it gets from Couchbase properly. Has anyone encountered such an issue? Thank you, Nadav.
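Putting the mapping in an index template, as suggested above, means it is applied the moment the XDCR plugin creates the index, so the mapping wins the race. A rough sketch (the template name and index pattern are assumptions; adjust them to your bucket/index naming):

```
PUT /_template/storage_files
{
  "template": "storage*",
  "mappings": {
    "files": {
      "properties": {
        "file": {
          "type": "attachment",
          "path": "full",
          "fields": {
            "content_type": { "type": "string", "store": true }
          }
        }
      }
    }
  }
}
```

Templates only affect indices created after the template is put, so this has to happen before the replication first runs.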
Re: Elasticsearch + attachment plugin + Kibana + couchbase
Yes. Just make sure the template reflects the actual document structure; as I said, XDCR wraps your document in an envelope document. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 5:12 PM, Nadav Hashimshony nad...@gmail.com wrote: OK, just to be clear, the steps I took were as follows: 1. Create the index with the mapping. 2. Define the XDCR to replicate my bucket to the index in ES. 3. Insert data into Couchbase. 4. Try to query with Kibana. What you suggest is to add another step BEFORE step 1: 0. Create a template that includes my mapping. 1. Create the index in ES, and so on... Did I get it right? Thanks, Nadav.
Re: Elasticsearch + attachment plugin + Kibana + couchbase
Yes, that too :) Also, if it's time-based data, you will not be able to use Kibana's date filtering etc., because it lacks the @timestamp field. Basically, the XDCR Elasticsearch plugin was built around the XDCR / Couchbase realm and not around Elasticsearch's. Unfortunately this means many ES features are unavailable or hard to use, e.g. https://github.com/couchbaselabs/elasticsearch-transport-couchbase/issues/63 https://github.com/couchbaselabs/elasticsearch-transport-couchbase/issues/64 I can help fix this on the XDCR plugin if you'd like; ping me privately and we can work something out (or I can convince you to avoid using the XDCR replication). -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Thu, Feb 12, 2015 at 5:18 PM, Nadav Hashimshony nad...@gmail.com wrote: OK, I'll try. This envelope document, is it something I need to be concerned about when I'm querying via Kibana?
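Because the plugin wraps each replicated document in an envelope, the field paths in the template have to include the wrapper. A rough sketch, assuming the plugin's commonly used defaults of a couchbaseDocument type with the original document nested under a doc field (inspect an actual indexed document to confirm the envelope's field names before relying on this):

```
PUT /_template/couchbase_files
{
  "template": "storage*",
  "mappings": {
    "couchbaseDocument": {
      "properties": {
        "doc": {
          "properties": {
            "file": {
              "type": "attachment",
              "path": "full"
            }
          }
        }
      }
    }
  }
}
```

In Kibana you would then query and display fields under the doc.* path rather than at the top level.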
Re: Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?
Try using assembly binding redirects; if that doesn't work, it means no... -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Wed, Feb 11, 2015 at 12:38 PM, Martin Widmer swissm...@gmail.com wrote: Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST? How? Thanks for your advice. Martin
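A binding redirect in this direction (redirecting NEST's 6.0.0.0 reference down to 4.5.0.0) is unusual and will only work if NEST never calls a Json.NET 6.x-only API; treat the following app.config fragment as a sketch to try, not a guarantee:

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <!-- Newtonsoft.Json's well-known public key token -->
        <assemblyIdentity name="Newtonsoft.Json"
                          publicKeyToken="30ad4fe6b2a6aeed"
                          culture="neutral" />
        <!-- Redirect any referenced version down to the 4.5 assembly -->
        <bindingRedirect oldVersion="0.0.0.0-6.0.0.0"
                         newVersion="4.5.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

If NEST does depend on newer Json.NET APIs, you will see MissingMethodException or similar at runtime, which is the "it means no" case above.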
Re: A strange behavior we've encountered on our ELK
Are you sure your logs are generated linearly, without bursts? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com wrote: Hi, We just installed an ELK server and configured logstash to match the data that we send to it, and until last month it seemed to be working fine. Since then we see very strange behavior in Kibana: the events-over-time histogram shows the event rate at the normal level for about half an hour, then it drops to about 20% of the normal rate, continues to drop slowly for about two hours, and then stops; after a minute or two it returns to normal for the next half hour or so, and the same behavior repeats. Needless to say, both /var/log/logstash and /var/log/elasticsearch show nothing since the service started, and using tcpdump we can verify that events keep coming in at the same rate all the time. I attached our logstash configuration, the /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and a screenshot of our Kibana with no filter applied so that you can see the weird behavior. Is there someone/somewhere we can turn to for help on the subject? Thanks a lot, Yuval.
Re: Dumping raw data in custom format
Use the scan/scroll API with different queries (filter by document type etc.), from a custom tool written in Java. This will be the fastest. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member
On Tue, Feb 10, 2015 at 7:41 PM, Andrew McFague redmu...@gmail.com wrote: Forgot to mention: the data set size is around 1.6 billion documents.
On Tuesday, February 10, 2015 at 9:29:39 AM UTC-8, Andrew McFague wrote: I have a use case where I'd like to be able to dump *all* the documents in ES to a specific output format. However, using scan or any other consistent view is relatively slow. Using the scan query with a match_all, it processes items at a rate of around 80,000 a second--but that means it will still take over 5 hours to dump. It also means it can't be parallelized across machines, which effectively stops scaling. I've also looked at tools like Knapsack, Elastidump, etc., but these still don't give me the ability to parallelize the work, and they're not particularly fast. They also don't allow me to manipulate the output into the specific format I want (it's not JSON, and requires some organization of the data). So I have a few ideas, which may or may not be possible: 1. Retrieve shard-specific data from Elasticsearch (i.e., "give me all the data for shard X"). This would allow me to divide the task up into at least S tasks, where S is the number of shards, but there doesn't seem to be an API that exposes this. 2. Get snapshots of each shard from disk. This would also allow me to divide up the work, but would require a framework on top to coordinate which segments have been retrieved, etc. 3. Hadoop. However, launching an entire MR cluster just to dump data sounds like overkill.
The first option gives me the most flexibility and would require the least amount of work on my part, but there doesn't seem to be any way to dump all the data for a specific shard via the API. Is there any sort of API or flag that provides this, or otherwise provides a way to partition the data to different consumers? The second would also (presumably) give me the ability to subdivide tasks per worker, and would allow these to be done offline. I was able to write a sample program that uses Lucene to do this, but it adds the additional complexity of coordinating work across the various hosts in the cluster, as well as requiring an intermediate step where I transfer the files to another host to combine them. This isn't a terrible problem to have, but it does require additional infrastructure to organize. The third is not desirable because it's an incredible amount of operational load without a clear tradeoff, since we don't already have a map-reduce cluster on hand. Thanks for any tips or suggestions! Andrew
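Regarding option 1, Elasticsearch's search preference parameter accepts a _shards:n value, which can be combined with a scan search so that each worker scrolls a single shard. A sketch of what one worker's initial request might look like (the index name is illustrative; run one such scroll per shard id, in parallel, and verify on your ES version that preference is honored for scan searches):

```
GET /myindex/_search?search_type=scan&scroll=5m&preference=_shards:0&size=500
{
  "query": { "match_all": {} }
}
```

Each worker then keeps calling the scroll API with the returned _scroll_id until it gets an empty page, giving roughly S-way parallelism across machines for an index with S shards.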
Re: A strange behavior we've encountered on our ELK
The graphic you sent suggests the issue is with logstash - since the @timestamp field is being populated by logstash and is the one that is used to display the date histogram graphics in Kibana. I would start there. I.e. maybe SecurityOnion buffers writes etc, and then to check the logstash shipper process stats. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 10, 2015 at 7:07 PM, Yuval Khalifa iyuv...@gmail.com wrote: Hi. Absolutely (but since that in the past I also worked at the helpdesk dept. I certainly understand why it is important to ask those Are you sure it's plugged in? questions...). One of the logs is comming from SecurityOnion which logs (via bro-conn) all the connections so it must be sending data 24x7x365. Thanks for the quick reply, Yuval. On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com wrote: Are you sure your logs are generated linearly without bursts? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 10, 2015 at 6:29 PM, Yuval Khalifa iyuv...@gmail.com wrote: Hi, We just installed an ELK server and configured the logstash configuration to match the data that we send to it and until last month it seems to be working fine but since then we see very strange behavior in the Kibana, the event over time histogram shows the event rate at the normal level for about a half an hour, then drops to about 20% of the normal rate and then it continues to drop slowly for about two hours and then stops and after a minute or two it returns to normal for the next half an hour or so and the same behavior repeats. Needless to say that both the /var/log/logstash and /var/log/elasticsearch both show nothing since the service started and by using tcpdump we can verify that events keep coming in at the same rate all time. 
I attached our logstash configuration, the /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and a screenshot of our Kibana with no filter applied so that you can see the weird behavior that we see. Is there someone/somewhere that we can turn to to get some help on the subject? Thanks a lot, Yuval. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c2e5a524-1ba6-4dc9-9fc3-d206d8f82717%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/c2e5a524-1ba6-4dc9-9fc3-d206d8f82717%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to a topic in the Google Groups elasticsearch group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/cw7zEVTy09M/unsubscribe. To unsubscribe from this group and all its topics, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsRoNmJ__QdLnB6NYLhoDVaD9CR1RNkC_9_c%2Boaqccqww%40mail.gmail.com https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsRoNmJ__QdLnB6NYLhoDVaD9CR1RNkC_9_c%2Boaqccqww%40mail.gmail.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- בברכה, *יובל כליפא* CTO תחום מערכות מידע | מגדל סוכנויות. 
Re: A strange behavior we've encountered on our ELK
I'd start by using logstash with a tcp input and a file output, and see how the file behaves. Do the same for the file inputs - see how their files behave - and take it from there.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa iyuv...@gmail.com wrote: Great! How can I check that?

On Tuesday, February 10, 2015, Itamar Syn-Hershko ita...@code972.com wrote: [...]
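The tcp-in / file-out experiment suggested above can be sketched as a minimal logstash debug pipeline. This is only an illustration - the port and file path are invented for the example; use whatever your shippers actually send to:

```
# Debug pipeline sketch: receive events over TCP and write them straight
# to a local file, so you can watch (e.g. with `wc -l` every minute)
# whether input really arrives at a constant rate or in bursts.
input {
  tcp {
    port  => 5140          # hypothetical port
    codec => json
  }
}
output {
  file {
    path => "/tmp/logstash-debug.log"
  }
}
```

Comparing the growth rate of the debug file against the Kibana histogram should tell you whether the dips come from the input side or from indexing.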
Re: renaming nodes in a cluster
No; you will have to restart them, though.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 9:41 PM, Crista Shawler ecs...@gmail.com wrote: I would like to rename a couple of the nodes in my cluster. Are there any issues with doing this?
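For reference, a node's name is set in elasticsearch.yml, so renaming is an edit plus a restart of that node. A minimal sketch (the name shown is made up):

```
# /etc/elasticsearch/elasticsearch.yml
# Set an explicit, human-friendly node name; if omitted, a random one is picked at startup.
node.name: "es-data-02"
```

Then restart the node (e.g. `sudo service elasticsearch restart`). With replicas in place, restart one node at a time so shards stay available while each node rejoins.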
Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes
Please refer to www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:24 PM, Itamar Syn-Hershko ita...@code972.com wrote: yes, you are using string properties on a date mapping field

On Mon, Feb 9, 2015 at 12:23 PM, Paul Kavanagh pkavan...@shopkeep.com wrote: I think you have something there. I have come up with this:

curl -XPUT localhost:9200/_template/template_1 -d '
{
  "template": "logstash-*",
  "order": 0,
  "settings": { "number_of_shards": 15 },
  "mappings": {
    "dynamic_templates": [
      {
        "apiservice_logstash": {
          "match": "apiservice.logstash.@fields.parameters.*",
          "match_mapping_type": "dateOptionalTime",
          "mapping": { "type": "string", "analyzer": "english" }
        }
      }
    ]
  }
}'

However... when I try to post it, Elasticsearch throws:

{"error":"ElasticsearchIllegalArgumentException[Malformed mappings section for type [dynamic_templates], should include an inner object describing the mapping]","status":400}

I've tried a few things, but it doesn't seem to like my mappings block for some reason. Any idea why?

On Friday, February 6, 2015 at 11:41:49 AM UTC, Itamar Syn-Hershko wrote: You mean something like dynamic templates? http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings

On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh pkav...@shopkeep.com wrote: Hi all, We're having a MapperParsingException problem with some field values when we use the JSON filter for Logstash to explode a JSON document out to Elasticsearch fields.
In 99.9% of cases, these fields are either blank or contain dates in the format yyyy-mm-dd, which allows ES to dynamically map the field to type dateOptionalTime. However, we occasionally see non-standard date formats in these fields, which our main service can handle fine but which throw a MapperParsingException in Elasticsearch, such as here:

[2015-02-06 10:46:50,679][WARN ][cluster.action.shard] [logging-production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-QRuOZB713YAQwvA], reason [Failed to start shard, message [RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from [logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92RReiLffQjSL3I-g][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[ip-xxx-xxx-xxx-82.ec2.internal/xxx.xxx.xxx.82:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1e, aws_az=us-east-1e} into [logging-production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-east-1c}]; nested: RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[/xxx.xxx.xxx.82:9300]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[logstash-2015.02.06][2] Phase[2] Execution failed]; nested: RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[/xxx.xxx.xxx.148:9300]][internal:index/shard/recovery/translog_ops]]; nested: MapperParsingException[failed to parse [apiservice.logstash.@fields.parameters.start_time]]; nested: MapperParsingException[failed to parse date field [Feb 5 2015 12:00 AM], tried both date format [dateOptionalTime], and timestamp number with locale []]; nested: IllegalArgumentException[Invalid format: Feb 5 2015 12:00 AM]; ]]

[2015-02-06 10:46:53,685][WARN ][cluster.action.shard] [logging-production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-east-1c} marked shard as initializing, but shard is marked as failed, resend shard failure]

Our planned solution was to create a template for Logstash indices that will set these fields to string.
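For what it's worth, the "Malformed mappings section for type [dynamic_templates]" error in this thread usually means the dynamic_templates array was placed directly under mappings instead of under a type name. A hedged sketch of a corrected template, using the _default_ type so it applies to all types, and path_match (rather than match) since the pattern contains dots:

```
curl -XPUT 'localhost:9200/_template/template_1' -d '
{
  "template": "logstash-*",
  "order": 0,
  "settings": { "number_of_shards": 15 },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "apiservice_logstash": {
            "path_match": "apiservice.logstash.@fields.parameters.*",
            "mapping": { "type": "string", "analyzer": "english" }
          }
        }
      ]
    }
  }
}'
```

Note also that match_mapping_type expects a JSON type such as "string" or "date", not an Elasticsearch date format like dateOptionalTime; dropping it, as above, makes the rule apply to every field under that path.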
Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes
yes, you are using string properties on a date mapping field

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:23 PM, Paul Kavanagh pkavan...@shopkeep.com wrote: [...]
Re: Performance Limitation with ELK stack
Logstash is CPU-bound - it's a JRuby implementation - so SSD won't help. Try to see if you can run multiple logstash shippers on the same logs. Having a redis / kafka server as a middle tier is also a common practice. If that is not feasible, then yes - my advice would be to roll your own.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Mon, Feb 9, 2015 at 12:15 PM, Hagai T hagai@gmail.com wrote: Hi, We were able to identify the bottleneck, which seems to be the Logstash service. The Elasticsearch cluster is able to handle 40,000 documents per second on a 3-server ES cluster, using a Java client we wrote ourselves with bulk inserts. The client (written for load testing) generates JSON and sends it to Elasticsearch for further processing. We ran the same test with Logstash, which reads the JSON from an Apache access log on general-purpose SSD, and managed to achieve a maximum of 4,000 requests per second. With 2 Logstash servers we achieved 8,000 req per second. Getting rid of the filtering section in the logstash configuration file helped us get to this number; with filtering we achieved only 1,500-2,000 req per sec. I also tried moving the log file to ephemeral storage but didn't get any improvement. We don't have any resource problem on the Logstash server (I/O / CPU), so it seems like a limit in the file input module or in logstash itself. I tested Logstash performance by creating a huge (2GB) log file and starting Logstash to send its content. I also tried smaller files (4-5MB each), but performance didn't get any better. Does it sound reasonable to you guys that I hit a limit of 4,000 req per second with one Logstash? If you have any suggestions on how to proceed from here, I will be more than happy to hear them.
If we can't get more from one Logstash, we'll have to develop our own Java service to do that instead.

*Apache access log output example (already in JSON format):*

{
  "timestamp": "2015-02-09T10:07:48+",
  "bq_timestamp": "2015-02-09T10:07:48",
  "client_ip": "52.2.11.111",
  "client_port": 80,
  "latency_ms": 57,
  "latency_sec": 0,
  "elb_status_code": 200,
  "request": "/il.html?e=fpAdOpportunityw=wfl_dosevid=1vname=compName_PMecpm=8adid=1814157media_file_type=MEDIA_FILE_TYPEmedia_file_url=MEDIA_FILE_URLcurrent_url=%0Ahttp%3A%2F%2Fu-sd.gga.tv%2Fa%2Fh%2FJvf82UX3%2Beff48Z%2fwU20swbapQoWau_%3Fcb%3D5605126933660359000%26pet%3Dpreroll%26pageUrl%3Dhttp%253A%252F%252F3ffese.com%26eov%3Deov%0A%09current_main_vast_url=MAIN_VAST_URLerror_code=ERROR_CODEerror_message=ERROR_MESSAGEq9=dsdase.comapid=dose.comd=Convertdevice=6719csize=300X250token=14123669cb=260174713417pc=PLAYCOUNT",
  "request_path": "/il.html",
  "referer": "-",
  "user_agent": "Mozilla/5.0 (redhat-x86_64-linux-gnu) Siege/3.0.8"
}

*Logstash configuration file (for the testing I ran it as root, without any limitations):*

input {
  file {
    path => "/var/log/httpd/aaa.d.com.logstash-acc.log.[0-9]*"
    codec => json
    type => "tracking"
    discover_interval => 1
    sincedb_path => "/opt/logstash/httpd-sincedb"
    sincedb_write_interval => 1
  }
}
output {
  elasticsearch {
    workers => 1
    host => "aaa..com"
    index => "%{request_path}-logstash-%{+yyyy-MM-dd}"
    flush_size => 1000
    cluster => "video"
    codec => json
  }
}

Your help is appreciated! Thanks!

On Thursday, February 5, 2015 at 1:56:34 PM UTC+2, Itamar Syn-Hershko wrote: I'd recommend you use ephemeral SSD - 2+ factor replicas and proper use of the snapshot/restore API will provide you HA and DR guarantees. The rejections you are seeing are due to slow I/O operations, because the disk is not local. There is a way to have a bigger queue, but I'd advise against that and instead go with a local fast disk.
-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Thu, Feb 5, 2015 at 1:51 PM, Hagai T haga...@gmail.com wrote: Hi Itamar, thank you for the reply. This is 15k inserts in total, not for one host in the cluster. Yes, we have 15 shards within one index; the shards are spread across the nodes equally (automatically, by the Elasticsearch cluster). We currently use general-purpose SSD and not ephemeral storage. In addition, I see a lot of thread pool bulk rejections on the Elasticsearch side.

On Thursday, February 5, 2015 at 1:33:05 PM UTC+2, Itamar Syn-Hershko wrote: What is the question?... 15k inserts per sec per node is actually quite nice. Is your index sharded? If you write to one index only, you write to a maximum of x nodes, where x is the number of shards
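The redis middle tier mentioned in this thread typically splits logstash into lightweight shippers and heavier indexers. A rough sketch of both ends - hostnames, paths and the list key are all invented for the example:

```
# Shipper side: read the access log and push raw events onto a redis list.
# Keep this side filter-free so it stays cheap; run one per log source.
input  { file  { path => "/var/log/httpd/access.log"  codec => json } }
output { redis { host => "redis.internal"  data_type => "list"  key => "logstash" } }

# Indexer side: a separate logstash process (scale these out horizontally)
# pops events off the same list, does the CPU-heavy filtering, and indexes.
input  { redis { host => "redis.internal"  data_type => "list"  key => "logstash" } }
filter {
  # grok / mutate / date work goes here, now spread over N indexer processes
}
output { elasticsearch { host => "es.internal" } }
```

Because redis absorbs bursts, this also smooths out exactly the kind of sawtooth ingestion pattern that shows up when a single logstash falls behind.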
Re: Doc Values
You can update mappings cluster-wide (just post the mapping definition to server:9200/*), but you will need to specify the field names explicitly.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 9:30 PM, Joel Baranick jbaran...@gmail.com wrote: Is there a way to turn doc_values on cluster-wide and override any index-specific settings?
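As a sketch, a put-mapping request against the `*` index pattern might look like the following. The type and field names here are hypothetical, and note that a mapping update only affects how future documents are indexed - it cannot flip doc_values on for data that is already on disk, and conflicting changes to an already-mapped field will be rejected:

```
curl -XPUT 'localhost:9200/*/_mapping/logs' -d '
{
  "logs": {
    "properties": {
      "status": {
        "type": "string",
        "index": "not_analyzed",
        "doc_values": true
      }
    }
  }
}'
```

Reindexing (or waiting for time-based indices to roll over) is what actually moves existing data onto doc_values.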
Re: Force search on a local node?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 9:15 PM, codemasterg gtotsl...@gmail.com wrote: Hi - I am new to Elasticsearch and have what I hope is a basic question for a simple configuration. Assume I have a 3-node cluster with a single index and: - 1 primary shard - 2 replicas of the primary shard. The majority of requests will be searches, with relatively few index updates. All requests are distributed by a network load balancer across the three nodes. Since each node has a copy of the index and the requests are being spread across the cluster by the network load balancer, my intuition is that a local search (i.e. executing the search on the node that received the request) will perform best. In other words, I do not want Elasticsearch to round-robin each search request from the receiving node to another node; I want the node that received the request to search its local copy of the index. My question: is there a way to make Elasticsearch search against only the shard on the receiving node (and avoid a network hop to another node)? Thanks very much.
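Among the options on that page, the relevant one is `_local`, which prefers shard copies on the node that received the request. A sketch (the index name is invented):

```
curl 'localhost:9200/myindex/_search?preference=_local' -d '
{
  "query": { "match_all": {} }
}'
```

In the setup described - one primary shard plus two replicas on a three-node cluster - every node holds a full copy, so `_local` removes the extra network hop entirely.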
Re: Doc Values
If the indexes have already been created, you will have to be creative to find the fields that need updating - I'm not familiar with a plugin that can do that. A simple client-side tool that grabs all mappings from the /_mapping endpoint, changes them and sends them back should do. For indexes that haven't been created yet you can use index templates.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 10:04 PM, Joel Baranick jbaran...@gmail.com wrote: Got it. What I was hoping for would be a way to force doc_values to be the only way for fielddata to be stored, for all mappings in the entire cluster, without having to update each index. Could this be done with a plugin?
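Such a client-side pass can be as small as curl plus jq. A rough sketch for a single, explicitly named index, type and field - all three names are placeholders, and this assumes jq is installed and the cluster answers on localhost:9200:

```
# Pull the current mapping, enable doc_values on one field, push it back.
curl -s 'localhost:9200/myindex/_mapping' \
  | jq '.myindex.mappings.mytype
        | .properties.status += {"doc_values": true}' \
  > mytype-mapping.json

curl -XPUT 'localhost:9200/myindex/_mapping/mytype' -d @mytype-mapping.json
```

A real tool would loop over every index and type returned by /_mapping and only touch eligible fields - doc_values applies to not_analyzed strings and to numeric/date fields, and existing segments are not rewritten by the update.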
Re: Doc Values
You don't need a plugin for when an index is created - use index templates + dynamic templates for this, e.g. http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html#dynamic-templates

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 11:56 PM, Joel Baranick jbaran...@gmail.com wrote: Thanks. I will look into whether I can create a plugin which will automatically enable doc_values whenever an index is created or updated. This seems like it could be very useful for multitenant clusters.
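Following that guide, an index template that applies to every future index and turns doc_values on for all dynamically mapped string fields might look like this. The template name is made up, and the fields are mapped not_analyzed because analyzed strings cannot use doc_values in 1.x:

```
curl -XPUT 'localhost:9200/_template/docvalues_everywhere' -d '
{
  "template": "*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings_with_doc_values": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": true
            }
          }
        }
      ]
    }
  }
}'
```

Because the template is applied at index-creation time, this covers the multitenant case without any plugin: every new index picks up the rule automatically.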
Re: Paid help with ES/ELK?
I'm available for Elasticsearch consulting, feel free to ping me privately.

-- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member

On Sat, Feb 7, 2015 at 11:04 PM, Steve Johnson st...@filethis.com wrote: I'll keep you up to date, es5z, via this thread. I've gotten one response so far with no real info attached, and haven't followed up yet. I will check with sites like Elance at some point. Steve

On Feb 7, 2015, at 3:06 AM, es5z wrote: I'm wondering the same thing actually. Have you tried freelancer websites like Elance and the others?

On Friday, February 6, 2015 at 8:22:27 PM UTC+1, Steve Johnson wrote: I hope a posting like this is not taboo in this forum... We are struggling to understand how to properly configure an ELK stack for our production environment. We think we have things set up pretty much right, and then ES throws us a curve ball. We've had a couple of things happen over the last few days that are simply baffling to us, and we've decided we need the help of someone who really knows ES. Support companies all seem to want to sell only long-term contracts; we need short-term help. We are therefore thinking that we need to find an individual ES expert whom we can pay on an hourly basis to help us set up our ES cluster and learn how it works and how to maintain it. If anyone reading this fits this description, or knows of some other person or organization that does, please contact me at elastic at filethis dot c0m. If you're offering your services directly, please let me know as much as you can about your experience with ES, including the number of years you've worked with it and the sizes of the clusters you've worked with. TIA for all help! Steve
Re: Possible? Wildcard template for a collection of fields to solve some dynamic mapping woes
You mean something like dynamic templates? http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh pkavan...@shopkeep.com wrote: Hi all, We're having a MapperParsingException problem with some field values when we get when we use the JSON Filter for Logstash to explode out a JSON document to Elasticsearch fields. In 99.9% of cases, certain of these fields are either blank, or contain dates in the format of -mm-dd. This allows ES to dynamically map this field to type dateOptionalTime. However, we occasionally see non-standard date formats in these fields, which our main service can handle fine, but which throws a MapperParsingException in Elasticsearch - such are here: [2015-02-06 10:46:50,679][WARN ][cluster.action.shard ] [logging- production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[ GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj- QRuOZB713YAQwvA], reason [Failed to start shard, message [ RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from [ logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92RReiLffQjSL3I-g ][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[ip-xxx-xxx-xxx- 82.ec2.internal/xxx.xxx.xxx.82:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1e, aws_az=us-east-1e} into [logging- production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][ logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx.xxx.xxx. 
148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-east-1c}]; nested: RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx- 82][inet[/xxx.xxx.xxx.82:9300]][internal:index/shard/recovery/ start_recovery]]; nested: RecoveryEngineException[[logstash-2015.02.06][2] Phase[2] Execution failed]; nested: RemoteTransportException[[logging- production-elasticsearch-ip-xxx-xxx-xxx-148][inet[/xxx.xxx.xxx.148:9300]][ internal:index/shard/recovery/translog_ops]]; nested: MapperParsingException[failed to parse [apiservice.logstash.@fields. parameters.start_time]]; nested: MapperParsingException[failed to parse date field [Feb 5 2015 12:00 AM], tried both date format [dateOptionalTime ], and timestamp number with locale []]; nested: IllegalArgumentException[ Invalid format: Feb 5 2015 12:00 AM]; ]] 2015-02-06 10:46:53,685][WARN ][cluster.action.shard ] [logging- production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[ GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj- QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-xxx- xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip- xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/xxx.xxx.xxx.148:9300 ]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us- east-1c} marked shard as initializing, but shard is marked as failed, resend shard failure] Our planned solution was to create a template for Logstash indices that will set these fields to string. But as the field above isn't the only culprit, and more may be added overtime, it makes more sense to create a template to map all fields under apiservice.logstash.@fields.parameters.* to be string. (We never need to query on user entered data, but it's great to have logged for debugging) Is it possible to do this with a template? 
I could not find a way to do this via the template documentation on the ES site. Any guidance would be great! Thanks, -Paul -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6ca4030f-b6bb-4907-b2fc-e3166fa2a6af%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/6ca4030f-b6bb-4907-b2fc-e3166fa2a6af%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZviZWbhJC83fB-3cm5qmcsuH-ScOo4x-ghS9BZ9t28HCA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: ES crashes when parsing fails due to mapping failure
It's not crashing, it is just a log that says the document insert was rejected -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Feb 5, 2015 at 2:56 PM, as...@singular.net wrote: We have automatic mapping turned for our logstash indexes. Every now and then our system logs a record that has a wrong (out of the ordinary) field data type. For example, a field that's been automatically mapped to be a number occasionally is logged as a string. This causes ES to crash with the following stack trace: at org.elasticsearch.search.SearchService.parseSource(SearchService.java:681) at org.elasticsearch.search.SearchService.createContext(SearchService.java:537) at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:509) at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:264) at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:231) at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:228) at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:559) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NumberFormatException: For input string: 1.0 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:441) at java.lang.Long.parseLong(Long.java:483) at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseLongValue(NumberFieldMapper.java:357) at org.elasticsearch.index.mapper.core.LongFieldMapper.termQuery(LongFieldMapper.java:185) at 
org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuerySingle(MapperQueryParser.java:257) at org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuery(MapperQueryParser.java:168) at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:487) at org.apache.lucene.queryparser.classic.MapperQueryParser.getFieldQuery(MapperQueryParser.java:287) at org.apache.lucene.queryparser.classic.QueryParserBase.handleQuotedTerm(QueryParserBase.java:875) at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:464) at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:259) at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:183) at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:263) at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:183) at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:172) at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:123) at org.apache.lucene.queryparser.classic.MapperQueryParser.parse(MapperQueryParser.java:882) at org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:223) at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:277) at org.elasticsearch.index.query.FQueryFilterParser.parse(FQueryFilterParser.java:66) at org.elasticsearch.index.query.QueryParseContext.executeFilterParser(QueryParseContext.java:343) at org.elasticsearch.index.query.QueryParseContext.parseInnerFilter(QueryParseContext.java:324) at org.elasticsearch.index.query.BoolFilterParser.parse(BoolFilterParser.java:92) at org.elasticsearch.index.query.QueryParseContext.executeFilterParser(QueryParseContext.java:343) at org.elasticsearch.index.query.QueryParseContext.parseInnerFilter(QueryParseContext.java:324) at 
org.elasticsearch.index.query.FilteredQueryParser.parse(FilteredQueryParser.java:74) at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:277) at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:382) at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:281) at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:276) at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33) at org.elasticsearch.search.SearchService.parseSource(SearchService.java:665) Is there a way to tell ES not to crash when failing to parse a field? I realize I can override the mapping, and so on, but regardless I'm also interested in getting ES to run reliably without crashing on rare inputs. Thanks, Assaf -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email
Re: Terms facet changing date to long
terms panel you mean Kibana? take a look at Kibana 4, they are doing this automatically in most places -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Feb 5, 2015 at 7:29 PM, Chris Neal chris.n...@derbysoft.net wrote: Please excuse the bump of my own question. :) After almost 8 months, I still have this question! Just wanted to get it in front of people's eyes again. Is there a way to have date fields stored in ES displayed in a terms panel as nicely formatted dates instead of epoch time? Very much appreciated! Chris On Mon, Jun 30, 2014 at 3:21 PM, Chris Neal chris.n...@derbysoft.net wrote: Hello all, The issue is I have a terms panel in Kibana that I want to group events by a date field from each record (Not the @timestamp field). The terms panel is taking my nicely formatted dates (2014-07-31) and turning them into longs since UTC (140356800). I did a quick test by creating a new index, giving it a mapping, then running both a search and a facet query, and sure enough, the facet query returns the long format instead of the date format! I tried two types of dates, just to see if that made a difference. It did not. 
= #Create mapping for index PUT /test_index_jerry/test/_mapping { test: { properties: { date1: { type: date, format: dateOptionalTime }, date2: { type: date, format: date } } } } #Put some data POST /test_index_jerry/test { date1:2014-06-30, date2:2014-06-30 } #Execute a basic query GET /test_index_jerry/test/_search { query: { match_all: {} } } # It returns dates in date format { took: 0, timed_out: false, _shards: { total: 2, successful: 2, failed: 0 }, hits: { total: 1, max_score: 1, hits: [ { _index: test_index_jerry, _type: test, _id: VUOeBuiUTGeqBS2Zl8--lg, _score: 1, _source: { date1: 2014-06-30, date2: 2014-06-30 } } ] } } #Execute a terms facet GET /test_index_jerry/test/_search { facets: { terms: { terms: { field: date1, size: 10, order: count, exclude: [] } } } } #Now we have longs { took: 1, timed_out: false, _shards: { total: 2, successful: 2, failed: 0 }, hits: { total: 1, max_score: 1, hits: [ { _index: test_index_jerry, _type: test, _id: VUOeBuiUTGeqBS2Zl8--lg, _score: 1, _source: { date1: 2014-06-30, date2: 2014-06-30 } } ] }, facets: { terms: { _type: terms, missing: 0, total: 1, other: 0, terms: [ { term: 140408640, count: 1 } ] } } } === Is there some way I can get the term to stay in date formatted buckets? I also tried the date histogram facet, but it returned longs as well. Very much appreciate the help :) Thanks, Chris -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAND3DphaBX2%2BZ_mRuS4vtx39EQKs9k9EnH08nrJBN58hj6yCYA%40mail.gmail.com https://groups.google.com/d/msgid/elasticsearch/CAND3DphaBX2%2BZ_mRuS4vtx39EQKs9k9EnH08nrJBN58hj6yCYA%40mail.gmail.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. 
-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsvwQKtabn8-%3D1JE_nDrwEwEy-eFB6KsXnj%3Dg4mzzOCKw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Performance Limitation with ELK stack
What is the question?... 15k inserts per sec per node is actually quite nice. Are your index sharded? If you write to one index only, you write to maximum of x nodes where x is the number of shards of that index. Since shards of the same index can co-exist on one node, check if you are spanning writes. Use local disks - never EBS, and if you really care about writing speeds use SSDs. Other than that, Mike did an excellent write up on the subject : http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing/ -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Feb 5, 2015 at 1:26 PM, Hagai T hagai@gmail.com wrote: Hi Guys, We use ElasticSearch as our tracking system of our products in a dynamic to track performance. Searching of this data is being used by small group of users (10-12) in the company to measure performance. In our current environment, we see limit of 15,000 documents being inserted per second without the ability to scale. Some information on the current setup and the flow: *- Tracking servers* 8 x Apache servers behind Amazon ELB which serve empty html files so it tracks the parameters given and writes it to Apache access log. on each server, we also have Logstash which configured to read this access log file and send data to Elasticsearch cluster. *- Elasticsearch Cluster: * 4 x r3.2xlarge (61.0 GB RAM, 8 cores) - contains one Elasticsearch process - 30GB Heap size 1 x r3.4xlarge (122.0 GB RAM, 16 cores) - contains two Elasticsearch processes each with 30GB Heap size. 
Additional information on the Cluster: https://gist.github.com/hagait/61e3fac181ff413a8b8c#file-gistfile1-txt Cluster health: https://gist.github.com/hagait/d4f16d8f7b724f85b0ee#file-gistfile1-txt Logstash Configuration: https://gist.github.com/hagait/23f4b2bc614a4c4acbb6 Elasticsearch configuration: https://gist.github.com/hagait/ba3684048abe2f9219b8 Thank you for the support! Regards, Hagai -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9c5352ed-010d-4080-9794-b8dc0c2a5370%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9c5352ed-010d-4080-9794-b8dc0c2a5370%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZuuedagMgkbe4JoUGWg2DT1pFMkmdRKS3Rp4f4mwq6ntg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Performance Limitation with ELK stack
I'd recommend you use ephemeral SSD - 2+ factor replicas and proper use of the snapshot/restore API will provide you HA and DR guarantees. The rejections you are seeing are due to slow I/O operations, because the disk is not local. There is a way to have a bigger queue but I'd advise against that and instead go with a local fast disk. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Feb 5, 2015 at 1:51 PM, Hagai T hagai@gmail.com wrote: Hi Itamar, thank you for the reply. This is 15k inserts totally and not for one host in the cluster. Yes, we have 15 sharding whiting one index. shards are spreaded on the nodes equally (automatically by Elasticsearch cluster). We currently use general purpose SSD and not Ephemeral storage. In addition, I see a lot of thread pool bulk rejections from the Elasticsearch side. On Thursday, February 5, 2015 at 1:33:05 PM UTC+2, Itamar Syn-Hershko wrote: What is the question?... 15k inserts per sec per node is actually quite nice. Are your index sharded? If you write to one index only, you write to maximum of x nodes where x is the number of shards of that index. Since shards of the same index can co-exist on one node, check if you are spanning writes. Use local disks - never EBS, and if you really care about writing speeds use SSDs. Other than that, Mike did an excellent write up on the subject : http://www.elasticsearch.org/blog/performance- considerations-elasticsearch-indexing/ -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Feb 5, 2015 at 1:26 PM, Hagai T haga...@gmail.com wrote: Hi Guys, We use ElasticSearch as our tracking system of our products in a dynamic to track performance. Searching of this data is being used by small group of users (10-12) in the company to measure performance. 
In our current environment, we see limit of 15,000 documents being inserted per second without the ability to scale. Some information on the current setup and the flow: *- Tracking servers* 8 x Apache servers behind Amazon ELB which serve empty html files so it tracks the parameters given and writes it to Apache access log. on each server, we also have Logstash which configured to read this access log file and send data to Elasticsearch cluster. *- Elasticsearch Cluster: * 4 x r3.2xlarge (61.0 GB RAM, 8 cores) - contains one Elasticsearch process - 30GB Heap size 1 x r3.4xlarge (122.0 GB RAM, 16 cores) - contains two Elasticsearch processes each with 30GB Heap size. Additional information on the Cluster: https://gist.github.com/hagait/61e3fac181ff413a8b8c#file-gistfile1-txt Cluster health: https://gist.github.com/hagait/d4f16d8f7b724f85b0ee# file-gistfile1-txt Logstash Configuration: https://gist.github.com/ hagait/23f4b2bc614a4c4acbb6 Elasticsearch configuration: https://gist.github.com/hagait/ ba3684048abe2f9219b8 Thank you for the support! Regards, Hagai -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/9c5352ed-010d-4080-9794-b8dc0c2a5370% 40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9c5352ed-010d-4080-9794-b8dc0c2a5370%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/318b46ed-626f-4975-a417-c99ada9c30fe%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/318b46ed-626f-4975-a417-c99ada9c30fe%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zv0toyUEhc%2BynpkH%3DUxf9GVLuvnKdPSGUQcnG1grwpbCw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Is there ever a reason to store _id?
Setting fields to stored in Elasticsearch in general is not required and a bad practice, since all fields are extracted from _soruce when they are required and _source benefits from block compression and more. There are only some very few edge cases where you want to not save the _source and enable stored for a few fields (usually several small ones out of many) that this feature becomes helpful. HTH -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Wed, Feb 4, 2015 at 2:57 PM, Andrew White and...@datarank.com wrote: I wanted to give this a friendly bump and follow up with my experience. After doing some light testing I can't see a reason to ever store _id. Doing so inflated the index size and response objects but offered no improvements on scanning. So, at least for my case it doesn't seem to make sense. perhaps there is another use case I am missing. Thanks, Andrew White On Wednesday, January 21, 2015 at 7:11:38 AM UTC-6, Andrew White wrote: According to the documentation on _id http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-id-field.htmlit is possible to store _id but it never gives a reason why that would be useful. I have a use case where I am exporting all ids from ES using scan/scroll with no query. If I set the fields parameter to nothing/blank I get back the _id automatically. I assume this happens by parsing the _uid. If I store the _id I get back the _id in both the metadata section of the document and the fields property which seems redundant. I am a little unsure what ES does when a request for no fields and no query come in. I assume it's scanning something (what?) and then fetching the metadata from somewhere (where?). If what it's scanning and what it's fetching from are the same thing then storing the _id seems moot. 
So, Is there any performance advantage to storing the _id for scan/scroll requests, or in any specific case? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3ea84863-63ac-4719-b54f-6a6bc0bb1cfa%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/3ea84863-63ac-4719-b54f-6a6bc0bb1cfa%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZtD6DB3EOVsUQM3ro0FUfvkG3o%2BBv77CkqfaeUL0qdUnw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields
Here's a working gist: https://gist.github.com/synhershko/3d915a7819145f2d7a1f You need to double escape the slashes - not sure if this is by design or no but that works now -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 3, 2015 at 7:56 PM, Ali Kheyrollahi alios...@gmail.com wrote: Wildcard does not work either. {wildcard:{CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions}} And regardless, Regexp does not work so on its own right it is a bug. Can you please help open the issue on GitHub? Already have an issue which was closed: https://github.com/elasticsearch/kibana/issues/2698 On Tuesday, 3 February 2015 13:42:11 UTC, Itamar Syn-Hershko wrote: Thinking of it, I'm not sure why you are using regexp here - can you just use wildcard query instead? http://www.elasticsearch.org/guide/en/ elasticsearch/reference/current/query-dsl-wildcard-query.html -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 3, 2015 at 12:00 PM, Ali Kheyrollahi alio...@gmail.com wrote: No it doesn't which has been my experience: {regexp:{CounterName:\\Windows Azure Caching:Client\\(w3wp_.*\\)\\Failure Exceptions}} or {regexp:{CounterName:\\Windows Azure Caching\\:Client\\(w3wp_.*\\)\\Failure Exceptions}} None of them work -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/9a4eabaa-1634-46a5-aa8a-f2c47ccd5745% 40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9a4eabaa-1634-46a5-aa8a-f2c47ccd5745%40googlegroups.com?utm_medium=emailutm_source=footer . 
For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3ed729ef-697b-42e0-975b-3b3c86fd7734%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/3ed729ef-697b-42e0-975b-3b3c86fd7734%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZtYtioqUuAyGWm%3DBf3Jxs8DpUvKUjeTsALO4m38-%3DOr8A%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Persisting Aggregations
The Aggs Fw doesn't allow for persisting results, mainly because it is targeted at real-time data that can still change, but it does support caching as of 1.4. That is, if you issue the same query aggregations request again and again you will be served directly from cache, given the data hasn't changed. That is to say, if you care about performance, the caching layer should be the answer. If you need other things (point in time view of data, further processing, etc) you will need to store the results back to ES or other storage as a document. HTH -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 3, 2015 at 11:21 AM, AndrewK kenworth...@gmail.com wrote: I've not yet used the aggregations framework, but one question that has come up recently with contacts and prospective clients is how best to persist aggregations in ElasticSearch for repeated use. If I have understood the documentation correctly, the aggregation framework does a pretty good job of using shard caching to make repeated-or-similar queries as efficient as possible, but it would - presumably - be even better if static results (i.e. which will hardly ever - or never - change) could be persisted in some way (in a dedicated index, for example). Is this possible internally (i.e. to GET an aggregation result and POST it in one call) or would one simply have to extract the desired data and then post it oneself? Regards, Andrew -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/2b492a47-1fa6-40f1-a14e-54ccb7fe2a0e%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/2b492a47-1fa6-40f1-a14e-54ccb7fe2a0e%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zs6c4tbG-2vXYowbpcA45MTQty1i6Hquv%3DOYVYOWSp9%3DQ%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields
Can you try executing a simple term query in JSON using that query bar? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Mon, Feb 2, 2015 at 11:57 PM, Ali Kheyrollahi alios...@gmail.com wrote: Thanks for responding. It is *surely* not_analyzed - hence my frustration. Here is the mapping { my_index: { mappings: { my_type: { properties: { @timestamp: { type: date, format: dateOptionalTime }, CounterName: { type: string, index: not_analyzed }, CounterValue: { type: double }, DeploymentId: { type: string, index: not_analyzed }, EventTickCount: { type: long }, PartitionKey: { type: string, index: not_analyzed }, Role: { type: string, index: not_analyzed }, RoleInstance: { type: string, index: not_analyzed }, RowKey: { type: string, index: not_analyzed } } } } } } On Monday, 2 February 2015 13:20:49 UTC, Itamar Syn-Hershko wrote: It looks like your field is analyzed and you are trying to query it assuming its not_analyzed (e.g. one string). Hard to say without seeing your index mapping. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Mon, Feb 2, 2015 at 3:08 PM, Ali Kheyrollahi alio...@gmail.com wrote: Any help please?? On Saturday, 31 January 2015 09:56:38 UTC, Ali Kheyrollahi wrote: Hi, I really haven't found a consistent way to use query window in Discover or Visualize tabs. My results become hit and miss and inconsistent. So I am searching for types of my_type and I have a field called CounterName and I am looking for \Windows Azure Caching:Client(w3wp_2392)\Total Local Cache Hits Funny thing is searching for verbatim value does not work: CounterName\Windows Azure Caching:Client(w3wp_2392)\Total Local Cache Hits And I have to escape only backslashes (well I am using double quotes so it is literal, no?) 
and not brackets or colon: CounterName\\Windows Azure Caching:Client(w3wp_2392)\\Total Local Cache Hits Now, the 2392 number here is variable (pid on the box) so I am trying to look for \Windows Azure Caching:Client(w3wp_*)\Total Local Cache Hits and I have tried all these to no avail: CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache Hits CounterName:\\Windows Azure Caching:Client(w3wp_\*)\\Total Local Cache Hits CounterName:\Windows Azure Caching:Client(w3wp_*\Total Local Cache Hits (nothing comes back) And also tried regex: CounterName:/\Windows Azure Caching:Client(w3wp_*)\\Total Local Cache Hits/ CounterName:/\Windows Azure Caching:Client(w3wp_.*)\\Total Local Cache Hits/ ... With many different combinations of replacing reserved chars with ?. What am I doing wrong? Thanks -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/54e8264f-00ee-4327-b4fc-ae074152669e% 40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/54e8264f-00ee-4327-b4fc-ae074152669e%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a5aa9d83-a0cc-459d-87fe-d5da8142a4fb%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/a5aa9d83-a0cc-459d-87fe-d5da8142a4fb%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. 
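For readers following along, the distinction Itamar is driving at can be sketched as plain request bodies. This is a sketch only — field name and values are taken from the thread, not verified against that index:

```python
import json

# CounterName is not_analyzed, so it is indexed as one exact string,
# backslashes included. A term query must carry the complete value:
exact_query = {
    "query": {
        "term": {
            "CounterName": r"\Windows Azure Caching:Client(w3wp_2392)\Total Local Cache Hits"
        }
    }
}

# For the variable pid, a wildcard query works because wildcard queries
# run against the raw stored term:
wildcard_query = {
    "query": {
        "wildcard": {
            "CounterName": r"\Windows Azure Caching:Client(w3wp_*)\Total Local Cache Hits"
        }
    }
}

# json.dumps handles the backslash escaping for the wire format:
body = json.dumps(wildcard_query)
```

The double-backslash escaping the original poster wrestled with is purely a JSON/query-string artifact; the stored term itself contains single backslashes.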
Re: Kibana4 Beta3: Battling with wildcard search on not_analyzed fields
inline -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Feb 3, 2015 at 1:32 AM, Ali Kheyrollahi alios...@gmail.com wrote: This *works* (exact value) {term:{CounterName:\\Windows Azure Caching:Client(w3wp_5412)\\Failure Exceptions}} As expected But NOT this: {term:{CounterName:Caching}} Nor {term:{CounterName:\\Windows Azure Caching:Client(w3wp_.*)\\Failure Exceptions}} Or this {term:{CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions}} As expected too - term query will take the entire string and look for documents matching this exact query. .* has no meaning in this context, its just a different string than the original, hence no hits. And *not even* this {regexp:{CounterName:\\Windows Azure Caching:Client(w3wp_.*)\\Failure Exceptions}} or {regexp:{CounterName:\\Windows Azure Caching:Client(w3wp_.+)\\Failure Exceptions}} or {regexp:{CounterName:\\Windows Azure Caching:Client(w3wp_*)\\Failure Exceptions}} I believe you should escape the parenthesis, this is getting parsed as a regex grouping. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/327ba38a-9caf-41c1-8a45-f93be1532bf2%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/327ba38a-9caf-41c1-8a45-f93be1532bf2%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. 
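Itamar's escaping advice, sketched concretely: in a regexp query, parentheses are grouping operators and must be escaped, and the literal backslashes in the value must be doubled. Lucene's regexp syntax is not PCRE, but for this particular pattern Python's `re` module accepts the same escapes, so it can be sanity-checked locally (field name and value from the thread):

```python
import re

# Parens escaped as literals; backslashes doubled; .* matches any pid.
pattern = r"\\Windows Azure Caching:Client\(w3wp_.*\)\\Failure Exceptions"

regexp_query = {"query": {"regexp": {"CounterName": pattern}}}

# Local sanity check against one of the stored terms from the thread:
term = r"\Windows Azure Caching:Client(w3wp_5412)\Failure Exceptions"
matches = re.fullmatch(pattern, term) is not None
```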
Re: Implementing search as you type example
For implementing good autocomplete I recommend you look at the completion suggester - its much faster and has more capabilities. It was built especially for that. See http://www.elasticsearch.org/blog/you-complete-me/ and http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html You can then complement it with Phrase Suggester to recommend spelling corrections etc edge-grams are less than ideal for this use case given the above tools -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Sun, Feb 1, 2015 at 7:12 PM, Craig Ching craigch...@gmail.com wrote: Hi, I'm trying to implement the search as you type example from http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html Can someone see what I'm doing wrong? curl -XDELETE localhost:9200/my_index echo curl -XPUT localhost:9200/my_index -d ' { settings: { number_of_shards: 1, analysis: { filter: { autocomplete_filter: { type: edge_ngram, min_gram: 1, max_gram: 20 } }, analyzer: { autocomplete: { type: custom, tokenizer: standard, filter: [ lowercase, autocomplete_filter ] } } } } }' echo curl -XPUT localhost:9200/my_index/_mapping/my_type -d ' { my_type: { properties: { name: { type: string, analyzer: autocomplete } } } }' echo curl localhost:9200/my_index/my_type/_bulk -d ' { index: { _id: 1}} { name: Brown foxes} { index: { _id: 2}} { name: Yellow furballs } ' echo curl localhost:9200/my_index/my_type/_search -d ' { query: { match: { name: brown fo } } }' echo curl localhost:9200/my_index/my_type/_validate/query?explain -d ' { query: { match: { name: brown fo } } }' echo curl localhost:9200/my_index/my_type/_search -d ' { query: { match: { name: { query:brown fo, analyzer: standard } } } }' echo curl localhost:9200/my_index/my_type/_validate/query?explain -d ' { query: { match: { name: { query:brown fo, analyzer: standard } 
} } }' echo -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3265ddb0-eab4-4cc7-9fc0-66ae56c358e5%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/3265ddb0-eab4-4cc7-9fc0-66ae56c358e5%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZvjP1CsF9JSs1H0u6fioT_igm%3DBMxWfYs3iY2A5M6SXJw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
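A minimal sketch of the completion-suggester setup Itamar recommends, using ES 1.x mapping conventions. The `name_suggest` field name and the suggestion inputs are illustrative assumptions, not part of the original script:

```python
import json

# Mapping: a dedicated completion field alongside the normal string field.
mapping = {
    "my_type": {
        "properties": {
            "name": {"type": "string"},
            "name_suggest": {"type": "completion",
                             "index_analyzer": "simple",
                             "search_analyzer": "simple"}
        }
    }
}

# Each document supplies its suggestion inputs explicitly:
doc = {"name": "Brown foxes",
       "name_suggest": {"input": ["Brown foxes", "foxes"]}}

# Query time: POST this body to /my_index/_suggest
suggest_request = {
    "name_suggestions": {
        "text": "bro",
        "completion": {"field": "name_suggest"}
    }
}
body = json.dumps(suggest_request)
```

Unlike the edge-ngram approach, the suggester answers from an in-memory FST, which is where the speed advantage Itamar mentions comes from.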
Re: synonym dictionaries of person names
Was it raw POS tagged data or just raw data? can you share the code / process you used? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Jan 29, 2015 at 3:34 PM, Mark Harwood mark.harw...@elasticsearch.com wrote: I've built one before from raw data but you need: 1) a *lot* of data 2) a unique ID per person 3) some noise/variation in the names recorded for each person The input is of this form: personID recorded_name === = 1 Rob 1 Robert 1 Bob 2 Dave 2 David 2 Alice ... The output is a weighted graph of name-variant e.g Robert== Bob with a strong confidence rating. Using this I know not just real names but also typos e.g. that Janes is more likely to be James than Jane (a common typo due to key locations on keyboard). On Thursday, January 29, 2015 at 5:28:33 AM UTC, David Kemp wrote: I am looking for synonym dictionaries of person names that I can use with the Elasticsearch synonym analyser. e.g. dictionaries that map Ted to Edward, and Bill to William. I am curious to know what others are using. So far I have found these two possible sources: https://code.google.com/p/nickname-and-diminutive-names- lookup/downloads/list https://github.com/DallanQ/Names/wiki/Name-variant-files And perhaps http://www.behindthename.com Thanks, David -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6a473177-7fdd-49d9-95e3-538b51df57f1%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/6a473177-7fdd-49d9-95e3-538b51df57f1%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. 
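As a sketch of how such a dictionary plugs into Elasticsearch's synonym token filter (the three synonym lines below are tiny illustrative samples, not entries taken from the linked files):

```python
# Index settings wiring a name-variant synonym filter into an analyzer.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "name_synonyms": {
                    "type": "synonym",
                    # Comma-separated groups are treated as equivalent terms.
                    "synonyms": [
                        "ted, theodore, edward",
                        "bill, will, william",
                        "bob, rob, robert"
                    ]
                }
            },
            "analyzer": {
                "name_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "name_synonyms"]
                }
            }
        }
    }
}
```

A large dictionary would normally go in a `synonyms_path` file on each node rather than inline in the settings.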
Re: Kibana - IIS 7.5
You may want to give this a try: https://github.com/synhershko/KibanaDotNet/tree/owin -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Mon, Jan 26, 2015 at 3:58 PM, GWired garrettcjohn...@gmail.com wrote: I was able to get Kibana setup on my localhost and did a generic entry to allow everything into the elasticsearch.yml http.cors.allow-origin: /.*/ Now I'm trying to getting it to run on my remote server running IIS 7.5 on port 8080. The page loads but only the top bar loads and nothing else any ideas? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/31402e52-0b96-4f2a-900a-d7f09bf62774%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/31402e52-0b96-4f2a-900a-d7f09bf62774%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsZSWdjC6RwLFgm2p8Q3Y_kSJT6W1wVw8SQ7U-MeJtjqA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
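For remote setups, a narrower CORS configuration than the `/.*/` catch-all is usually preferable. A hedged elasticsearch.yml sketch — the Kibana host and port here are examples, not values from the thread:

```yaml
# elasticsearch.yml: allow only the Kibana host to make cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "http://myserver:8080"
```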
Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?
Famous last words :) -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Jan 20, 2015 at 11:11 AM, Mark Harwood mark.harw...@elasticsearch.com wrote: it doesn't seem like this would address the IDF Trust me, I wrote it. On Tuesday, January 20, 2015 at 12:16:44 AM UTC, kasper...@yahoo.com wrote: Thanks Mark. Sounds like this issue affects a lot of people. I looked at your suggestion about FLT, and the ignore_tf parameter should help, however unless I'm missing something, it doesn't seem like this would address the IDF, and results could be biased. But I will experiment. Ultimately I think what my particular use case requires is a scorer that only uses edit distance (when querying with fuzziness) and field boosts, but no TF / IDF. On Monday, January 19, 2015 at 3:15:47 PM UTC-8, Mark Harwood wrote: This issue rounds up a bunch of related issues that have been raised previously: https://github.com/elasticsearch/elasticsearch/issues/9103 For now try FuzzyLikeThis (http://www.elasticsearch.org/ guide/en/elasticsearch/reference/current/query-dsl- flt-query.html#query-dsl-flt-query ) It blends More Like This and fuzzy functionality but includes the adjustments to IDF that I think make more sense than the other implementations with their bias towards rewarding scarcity. On Monday, January 19, 2015 at 6:48:49 PM UTC, kasper...@yahoo.com wrote: I have the same problem, where some results with higher edit distance are ranked higher than other results that are closer in terms of edit distance. I suspect it does have to do with document frequency, as you think Adrien. In my case I want to ignore document frequency completely. Any suggestion to achieve this? I'm a taker of any solution as this looks like a show stopper for us, so even a workaround would help. I can try to create this other rewrite method you mentioned if you could point me in the right direction. 
Thanks On Thursday, January 15, 2015 at 7:44:57 AM UTC-8, Adrien Grand wrote: This is because the score takes two factors into account: the document frequency and the edit distance. Quite likely in your case, even though Boss is closer than Bose, Bose has a much lower document frequency which helped it eventually get a better score. I guess we should have another rewrite method that would not take freqs into account (or somehow merge them) to avoid that issue. On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner eylon@gmail.com wrote: Any ideas? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/52e09e54-90b6-4014-8454-34e3db5756e5% 40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/52e09e54-90b6-4014-8454-34e3db5756e5%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- Adrien Grand -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9523b3d5-ffea-4760-9782-69167b9807ed%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/9523b3d5-ffea-4760-9782-69167b9807ed%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
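One workaround sketch for the edit-distance-only scoring kasper describes — my own assumption, not something proposed in the thread: wrap one fuzzy clause per edit distance in `constant_score`, with descending boosts, so TF/IDF never enters the score. Field name and boost values are illustrative:

```python
# Exact matches score 4, distance-1 matches 2, distance-2 matches 1;
# document frequency of the matched terms plays no role.
def fuzzy_rank_query(field, text):
    return {
        "query": {
            "bool": {
                "should": [
                    {"constant_score": {
                        "boost": boost,
                        "query": {"fuzzy": {field: {"value": text,
                                                    "fuzziness": fuzz}}}
                    }}
                    for fuzz, boost in [(0, 4), (1, 2), (2, 1)]
                ]
            }
        }
    }

body = fuzzy_rank_query("product", "boss")
```

A document matching at several distances accumulates the boosts, which still preserves the closest-match-first ordering.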
Re: Elasticsearch at Google Cloud Engine
This requires port 9300 to be open on the cloud for you (UDP), and for your client code to set the cluster name correctly -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Tue, Jan 20, 2015 at 1:52 PM, Klausen Schaefersinho klaus.schaef...@gmail.com wrote: Hi, I have just the click to deploy features to set up a small ElasticSearch cluster. That seems to have worked fine and I can connect to the cluster of rest. For instance curl http://ip:9200 will return { status : 200, name : elasticsearch-8tqw, cluster_name : my_elasticsearch-cluster, version : {...}, tagline : You Know, for Search } So I assume the cluster name is my_elasticsearch-cluster. However if I try to connect to the cluster using the java node client the client takes really long to join the cluster and if I try to perform a healt check, just to check if I am really connected I get the following exception: org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s] at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$4.onTimeout(TransportMasterNodeOperationAction.java:164) at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239) at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:497) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Why does this happen, and how can I solve it? Is there something not correctly configured in my network or should I use the transport client? Thanks! Klaus -- You received this message because you are subscribed to the Google Groups elasticsearch group. 
Re: How can I sort results by _id?
No, an ID has to be a string -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Jan 15, 2015 at 12:12 PM, Jason Zhang moc...@gmail.com wrote: Can I specify its type as integer in _mapping? Because the _id I use is rewritten. On Thursday, January 15, 2015 at 6:07:22 PM UTC+8, Adrien Grand wrote: This is because the _id is a string field, so comparison is based on the lexicographical order, not numeric. On Thu, Jan 15, 2015 at 11:04 AM, Jason Zhang moc...@gmail.com wrote: What I'm confused is the 'sorted' results are still partly unordered. Also, if I query: { range: { _id: { gt: 1, lt: 1}}} the results contain _id: 199989. On Thursday, January 15, 2015 at 5:48:48 PM UTC+8, Adrien Grand wrote: Making it index:not_analyzed should work, what is the issue with the results? Note that loading the _id in fielddata is typically very costly since the _id field is typically unique per document. On Thu, Jan 15, 2015 at 10:35 AM, Jason Zhang moc...@gmail.com wrote: I use a query dsl like: { filter: { exists: { field: info } }, sort: { _id: desc } } And the _id here is an integer like '123'. But the result is like: { took: 50, ... hits: { ... hits: [ { ... sort: [ null ] }] } } Also, I've tried to add _id: { index: not_analyzerd } in the _mapping. This time the sort section returns values. But I find the results are still partly unordered. Can I sort results by _id? How? Thank you. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. 
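The underlying behavior is easy to reproduce locally: `_id` is a string, so ordering is lexicographic, and "199989" sorts between "1" and "2" — which would explain the surprising range result above (assuming the range was meant to be gt: 1, lt: 2). Two common workarounds, sketched; the idea of a separate numeric field is mine, not from the thread:

```python
# String ordering vs numeric ordering for numeric-looking ids:
ids = ["2", "199989", "123"]

lexicographic = sorted(ids)          # how _id sorts: "123" < "199989" < "2"
numeric = sorted(ids, key=int)       # the ordering the poster wants

# Workaround 1: index a separate integer field (e.g. a hypothetical
# "id_num") holding the same value, and sort/range on that.
# Workaround 2: zero-pad ids at index time so the two orderings agree:
padded = sorted(i.zfill(10) for i in ids)
```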
Re: tuning elasticsearch node client non-heap memory consumption
Why would you want that? Locking heap memory usage is done by Elasticsearch on data nodes to reduce GC rounds, mainly because it loads a lot of data that is best managed by ES itself. On client nodes you don't need that (and if you did, you wouldn't be using that small heap sizes) -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Jan 13, 2015 at 12:37 PM, Itai Frenkel itaifren...@live.com wrote: Hello, We are running a node client on each machine with small JVM heap -Xms384m -Xmx384m -Xss256k which is suitable for our use case. There are however another 285MB non-heap memory (676-389=285). How can this extra non-heap memory usage be configured ? What is it used for ? Below are the relevant node stats. Regards, Itai process: { open_file_descriptors: 340, mem: { resident_in_bytes: 676884480, share_in_bytes: 23248896, total_virtual_in_bytes: 1696899072 } }, jvm: { mem: { heap_used_in_bytes: 44794784, heap_used_percent: 11, heap_committed_in_bytes: 389283840, heap_max_in_bytes: 389283840, non_heap_used_in_bytes: 44208640, non_heap_committed_in_bytes: 44564480, pools: { young: { used_in_bytes: 13765016, max_in_bytes: 107479040, peak_used_in_bytes: 107479040, peak_max_in_bytes: 107479040 }, survivor: { used_in_bytes: 5086896, max_in_bytes: 13369344, peak_used_in_bytes: 13369344, peak_max_in_bytes: 13369344 }, old: { used_in_bytes: 25942872, max_in_bytes: 268435456, peak_used_in_bytes: 25942872, peak_max_in_bytes: 268435456 } } }, -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
Re: Join between two different sources using Kibana 4
You either use parent / child http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/parent-child.html Or index denormalized data in the first place Elasticsearch isn't meant to be used using the same models as relational databases -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Mon, Jan 12, 2015 at 9:36 PM, Gregory Touretsky gregory.touret...@intel.com wrote: Hi, what would be the right way to join between two data sources using Kibana 4 interface? Assume 2 data sources: 1. source=jobs, fields = {jobid, user, host, exitstatus, starttime,finishtime} Sample record: type = jobs; jobid = 1234; user = john; host = myhost; exitstatus = -3002; starttime = 01/01/2015 01:01; finishtime = 01/01/2015 01:15 2. source=license, fields = {host, user, time, feature, result} Sample records: type = license; user = john; host = myhost; time = 01/01/2015 01:05; feature = AAA; result = DENIED type = license; user = john; host = myhost; time = 01/01/2015 01:07; feature = BBB; result = APPROVED I’d like to create a dashboard in Kibana 4 which would show a joint table combining both sources. Using pseudo-SQL code, it should do something like: select jobs.jobid,jobs.user,jobs.host,license.feature,license.result,count(license.time) from jobs LEFT JOIN license WHERE jobs.exitstatus=-3002 AND license.user=jobs.user AND license.host=jobs.host AND license.time=jobs.starttime AND license.time=jobs.finishtime GROUP BY jobs.jobid,jobs.user,jobs.host Thanks in advance, Gregory -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
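Itamar's second option — index denormalized data — can be sketched as a pre-indexing join. Data shapes follow Gregory's examples; the helper function is illustrative:

```python
# Attach the matching job's fields to each license event before indexing,
# so Kibana can aggregate on a single flat document type.
jobs = [{"jobid": 1234, "user": "john", "host": "myhost",
         "exitstatus": -3002,
         "starttime": "2015-01-01T01:01", "finishtime": "2015-01-01T01:15"}]

licenses = [
    {"user": "john", "host": "myhost", "time": "2015-01-01T01:05",
     "feature": "AAA", "result": "DENIED"},
    {"user": "john", "host": "myhost", "time": "2015-01-01T01:07",
     "feature": "BBB", "result": "APPROVED"},
]

def denormalize(jobs, licenses):
    """Copy jobid/exitstatus onto each license event whose user, host,
    and timestamp fall inside the job's run window."""
    docs = []
    for lic in licenses:
        for job in jobs:
            if (lic["user"] == job["user"] and lic["host"] == job["host"]
                    and job["starttime"] <= lic["time"] <= job["finishtime"]):
                docs.append({**lic, "jobid": job["jobid"],
                             "exitstatus": job["exitstatus"]})
    return docs

docs = denormalize(jobs, licenses)
```

The pseudo-SQL's WHERE clause then becomes an ordinary Kibana filter on `exitstatus`, with a terms aggregation on `jobid`/`feature`.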
Re: Searching with Elasticsearch.Net
If all you need is querying, I will highly recommend looking at https://github.com/CenturyLinkCloud/ElasticLINQ for .NET I also have my own stab at a .NET client library for Elasticsearch here: https://github.com/synhershko/NElasticsearch / https://www.nuget.org/packages/NElasticsearch/1.0.14 (still WIP) -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Fri, Jan 9, 2015 at 7:54 PM, Garrett Johnson garrettcjohn...@gmail.com wrote: { query = this.textBox1.Text, default_field = _all }, This seems to send it. Still don't know how to get the results. It's like it should be in a foreach(hit in hits) {get fields} but I have nothing and documentation isn't helping. On Friday, January 9, 2015 at 12:43:36 PM UTC-5, Garrett Johnson wrote: Hi All, I would like to use Elasticsearch.Net (NEST requires types and I do not want strong types) to do a simple _all term search. I can do this using the plugin elasticsearch head and I retrieve the appropriate documents. Here is some simple code I wrote just to say hey give me all that match. var node = new Uri(http://myhost:9200); var config = new ConnectionConfiguration(node); var exposed = config.ExposeRawResponse(true); var client = new ElasticsearchClient(config); var search = new { size = 10, from = 1 * 10, query = new { query_string = new { query = this.textBox1.Text } }, }; var searchResponse = client.Search(jdbc,search); This returns these results: {StatusCode: 200, Method: POST, Url: http://u4vmeqlditapp01:9200/jdbc/_search, Request: {size:10,from:10,query:{query_string:{query:Garrett}}}, Response: {took:5,timed_out:false,_shards:{total:5, successful:5,failed:0},hits:{total:10,max_score: 1.1672286,hits:[]}}} But no documents. 
Here is the JSON I'm trying to replicate:

{
  "query": {
    "bool": {
      "must": [
        { "query_string": { "default_field": "_all", "query": "Garrett" } }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 25000,
  "sort": [],
  "facets": {}
}

I'm pretty sure it is because the query doesn't have the default_field set to _all... but I don't know how to set that. I've tried several string concatenations to no avail; it just searched for them. Anyone with any ideas? I want to simply search all types for a single string. Garrett
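As a hedged sketch of the request body Garrett is after: the same shape expressed as a plain dict, so the default_field placement is visible. The values ("Garrett", the index name "jdbc", from/size) come from his post; passing this dict (or its JSON string) as the body argument to the client's search call is the assumed usage, not a verified Elasticsearch.Net signature.

```python
import json

# Request body with default_field set inside query_string, mirroring the
# JSON Garrett wants to replicate. All values are taken from the thread.
body = {
    "from": 0,
    "size": 25000,
    "query": {
        "bool": {
            "must": [
                {"query_string": {"default_field": "_all", "query": "Garrett"}}
            ],
            "must_not": [],
            "should": [],
        }
    },
    "sort": [],
}

print(json.dumps(body))
```

The key point is that default_field is a sibling of query inside the query_string object, not a top-level request property.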
Re: Bucket query results | top hits performance
Can you share the query and example results please? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Jan 6, 2015 at 10:11 PM, Michael Irani irani.mich...@gmail.com wrote: Hello, I'm working on a corpus of approximately 10 million documents. The issue I'm running into right now is that the top-scoring documents that come back from my query are essentially all the same result. I'm trying to find a way to get back unique results. I've looked into modeling the data differently with nested objects or parent-child relationships, but neither layout seems to fit the bill. The nested model won't work because some of the documents have too many closely related objects. On the flip side, there are also too many unique documents for the parent-child relationship to fit. I then tried the top hits aggregation and it's exactly what I'm looking for, except the running time of the query is approximately 30x slower than the query without the aggregation. Are there known performance issues with top_hits? Any ideas on what I should use to make these queries? Here's the aggregation piece:

"aggs": {
  "top-fingerprints": {
    "terms": { "field": "fingerprint", "size": 50 },
    "aggs": {
      "top_tag_hits": {
        "top_hits": { "size": 1, "_source": { "include": [ "title" ] } }
      }
    }
  }
}

Thanks, Michael
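Michael's aggregation can be restated as a full, runnable request body. Wrapping it with "size": 0 is an addition not in the original post: it suppresses the regular hits so only the buckets and their top_hits come back, which does not fix top_hits cost by itself but isolates it for measurement.

```python
import json

# Full request body around the terms + top_hits aggregation from the thread.
# "size": 0 (outer hits suppressed) is an assumption added for benchmarking.
request = {
    "size": 0,
    "aggs": {
        "top-fingerprints": {
            "terms": {"field": "fingerprint", "size": 50},
            "aggs": {
                "top_tag_hits": {
                    "top_hits": {
                        "size": 1,
                        "_source": {"include": ["title"]},
                    }
                }
            },
        }
    },
}

print(json.dumps(request))
```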
Re: Failed stopping 'elasticsearch-service-x64' service
Does it also happen when you uninstall the JDBC river? Also, I'd highly recommend using Linux servers for Elasticsearch instances and not Windows ones -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/

On Mon, Jan 5, 2015 at 2:25 AM, Garrett Johnson garrettcjohn...@gmail.com wrote: Log entries:

[2015-01-04 18:13:56,185][INFO ][node ] [Bucky III] stopping ...
[2015-01-04 18:13:56,202][INFO ][river.jdbc.JDBCRiver ] river closed [jdbc/users]
[2015-01-04 18:13:56,203][INFO ][river.jdbc.JDBCRiver ] river closed [jdbc/product2]
[2015-01-04 18:13:56,342][INFO ][node ] [Bucky III] stopped
[2015-01-04 18:13:56,342][INFO ][node ] [Bucky III] closing ...
[2015-01-04 18:13:56,355][INFO ][node ] [Bucky III] closed

Windows Server 2008 R2, Elasticsearch 1.4.2. Plugins: ElasticSearch Head, JDBC river 1.4.0.6, Microsoft JDBC driver. Thanks, Garrett

On Saturday, January 3, 2015 10:10:42 AM UTC-6, Costin Leau wrote: Do you see anything in the logs? Can you try removing and reinstalling the service? What's your OS/configuration?

On 1/2/15 10:32 PM, Garrett Johnson wrote: By "on its own" I mean "service stop" or using services.msc and clicking restart on the service. Both attempts get the same error.

On Friday, January 2, 2015 2:31:28 PM UTC-6, Garrett Johnson wrote: I'm getting this error every time I try to start and stop the Elasticsearch Windows service. It takes a couple of minutes and then fails. I can kill the task in Task Manager and then restart, but cannot get it to stop on its own.
Re: Question about highlight query.
A bit off-topic, but what I'd really like to see is the ability to perform highlighting asynchronously - that is, first get the search results from Elasticsearch, process them, and then get the highlighted snippets on a second wave, asynchronously. The main problem with highlighting currently is that it is slow - because of hackish recursive algorithms and mandatory I/O access. I'd like to avoid doing 2-step searches (one search for the results, the other to artificially propagate the highlights to the UI on a second wave). I wonder if we can come up with a way to have ES propagate them asynchronously for us? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 31, 2014 at 5:38 PM, Nikolas Everett nik9...@gmail.com wrote: Highlighting isn't a nice pretty thing - it's kind of hacky. There are three highlighters built in http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html to Elasticsearch and they all work differently. You should try all of them and see if they do what you want. They all come at the problem from a different perspective and have their own idiosyncrasies. I maintain a highlighter plugin https://github.com/wikimedia/search-highlighter as well that you can use as a fourth option. It merges many of the implementation strategies the other ones use and attempts to give you more options; it might do what you need. Nik On Tue, Dec 23, 2014 at 12:44 PM, Yang Liu yl...@nyu.edu wrote: No one knows anything about this? I really appreciate anything you offered. On Monday, December 22, 2014 5:27:57 PM UTC-5, Yang Liu wrote: Hi, guys, I have a question about the highlight query in ES. *Below is my query,* { _source: [ .
], highlight: { fields: { FDS_ATTACHMENTS: { type: plain }, FDS_ATTACHMENTS.no_stem: { type: plain }, FDS_ATTACHMENTS.with_case: { type: plain }, headline: { type: plain }, headline.no_stem: { type: plain }, headline.with_case: { type: plain } }, fragment_size: 500, highlight_query: { bool: { must: [ { bool: { minimum_should_match: 1, should: [ { span_near: { clauses: [ { span_term: { FDS_ATTACHMENTS.no_stem: rights } }, { span_term: { FDS_ATTACHMENTS.no_stem: agreement } } ], in_order: true, slop: 0 } } ] } }, { bool: { minimum_should_match: 1, should: [ { span_near: { clauses: [ { span_term: { FDS_ATTACHMENTS.no_stem: rights } }, { span_term: { FDS_ATTACHMENTS.no_stem: agreement } }, { span_term: { FDS_ATTACHMENTS.no_stem: merger } } ], in_order: false, slop: 5 } } ] } } ] } }, number_of_fragments: 50, post_tags: [ /font ], pre_tags: [ font color=red ], require_field_match: true }, query: { filtered: { filter: { range: { story_datetime: { gte: 20141221t00, lte: 20141222t235959 } } }, query: { bool: { must: [ { bool: { minimum_should_match: 1, should: [ { span_near: { clauses: [ { span_term: { FDS_ATTACHMENTS.no_stem: rights } }, { span_term: { FDS_ATTACHMENTS.no_stem: agreement
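The (truncated) query above combines plain highlighters over several subfields with a span_near-based highlight_query. A hedged, condensed sketch of just the highlight section, as a dict: only one of the six highlighted fields is shown, and the angle brackets in the pre/post tags (which appear as "font color=red" / "/font" in the archive) are restored here as an assumption.

```python
import json

# Condensed highlight section modeled on the thread's query. Field name,
# fragment settings, and terms come from the post; tag markup is an
# assumption (the archive stripped the angle brackets).
highlight = {
    "fields": {"FDS_ATTACHMENTS.no_stem": {"type": "plain"}},
    "fragment_size": 500,
    "number_of_fragments": 50,
    "pre_tags": ['<font color="red">'],
    "post_tags": ["</font>"],
    "require_field_match": True,
    "highlight_query": {
        "bool": {
            "minimum_should_match": 1,
            "should": [{
                "span_near": {
                    "clauses": [
                        {"span_term": {"FDS_ATTACHMENTS.no_stem": "rights"}},
                        {"span_term": {"FDS_ATTACHMENTS.no_stem": "agreement"}},
                    ],
                    "in_order": True,
                    "slop": 0,
                }
            }],
        }
    },
}

print(json.dumps({"highlight": highlight}))
```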
Re: Running elasticsearch 1.4.2 and kibana 4 as service
Elasticsearch has packages which will do this for you on every Linux distribution: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html For Kibana 4 you'll need to use init.d and /sbin/service , the specifics are going to depend on the distribution and the tools you have installed -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Dec 23, 2014 at 11:02 PM, Ram Maram ram.mara...@gmail.com wrote: Hi, Right now I am running kibana 3 and elasticsearch 1.3.2 for our ELK stack, I would like to use kibana 4 and elasticsearch 1.4.2. Can someone please let me know how to install kibana 4 and elasticsearch 1.4.2 as a service on linux? I was able to run them manually but I couldn't figure how to run them as a service. Thanks, Ram -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/13d0fe92-bb67-4552-b8da-f482a4291dd1%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/13d0fe92-bb67-4552-b8da-f482a4291dd1%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZtVSuCKY8sBoPhw5yzbdMaHiXB2XsHieBvtvRNfJGL5hg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Running elasticsearch 1.4.2 and kibana 4 as service
It's basic Linux administration stuff, see http://arstechnica.com/civis/viewtopic.php?p=2147913sid=16c526bdb60201e802cf7f6b8bc598e2#p2147913 for example (and the rest of the instructions on chkconfig). Just update the script to point at your Kibana files. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Dec 23, 2014 at 11:28 PM, Ram Maram ram.mara...@gmail.com wrote: Thank you Itamar for your quick response. My distribution is Red Hat Linux 6.x and the tools that I have installed are logstash, java, and elasticsearch. Can you guide me on how to create the init file for Kibana 4, or can I host it on Apache? Thanks, Ram On Tuesday, December 23, 2014 4:06:25 PM UTC-5, Itamar Syn-Hershko wrote: Elasticsearch has packages which will do this for you on every Linux distribution: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html For Kibana 4 you'll need to use init.d and /sbin/service; the specifics are going to depend on the distribution and the tools you have installed -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Dec 23, 2014 at 11:02 PM, Ram Maram ram.m...@gmail.com wrote: Hi, Right now I am running kibana 3 and elasticsearch 1.3.2 for our ELK stack; I would like to use kibana 4 and elasticsearch 1.4.2. Can someone please let me know how to install kibana 4 and elasticsearch 1.4.2 as a service on Linux? I was able to run them manually but I couldn't figure out how to run them as a service. Thanks, Ram
Re: Running elasticsearch 1.4.2 and kibana 4 as service
I'd actually prefer to install from repositories as they take care of placing things in the right place and create a user to run ES under -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Dec 23, 2014 at 11:45 PM, joergpra...@gmail.com joergpra...@gmail.com wrote: Use https://github.com/elasticsearch/elasticsearch-servicewrapper to run ES as a service under RHEL 6. Jörg On Tue, Dec 23, 2014 at 10:02 PM, Ram Maram ram.mara...@gmail.com wrote: Hi, Right now I am running kibana 3 and elasticsearch 1.3.2 for our ELK stack, I would like to use kibana 4 and elasticsearch 1.4.2. Can someone please let me know how to install kibana 4 and elasticsearch 1.4.2 as a service on linux? I was able to run them manually but I couldn't figure how to run them as a service. Thanks, Ram -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/13d0fe92-bb67-4552-b8da-f482a4291dd1%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/13d0fe92-bb67-4552-b8da-f482a4291dd1%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. 
Re: When to use fields and when to use source filtering
Fields are used to pull data from stored fields, whereas source filtering targets _source. At the moment each falls back to the other, so the difference is in the order of precedence. I believe I've heard there are plans to deprecate fields completely; I wonder if someone from ES could confirm? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Mon, Dec 22, 2014 at 2:28 PM, Shelef shlaf...@gmail.com wrote: I read about two ways to filter the fields returned by Elasticsearch: fields http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html and source filtering http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html. When to use which?
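To make the contrast concrete, here is a hedged sketch of the two request shapes side by side: "fields" asks for stored fields, while "_source" filtering subsets the original document. The field names "title" and "date" are illustrative, not from the thread.

```python
import json

# Two ways to limit what a search returns per hit (ES 1.x request syntax).
# The stored-fields route: pulls from stored fields, falling back to _source.
fields_request = {
    "query": {"match_all": {}},
    "fields": ["title", "date"],
}

# The source-filtering route: returns a filtered view of _source.
source_request = {
    "query": {"match_all": {}},
    "_source": {"include": ["title", "date"]},
}

print(json.dumps(fields_request))
print(json.dumps(source_request))
```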
Re: Custom _source compression / compaction to reduce disk usage
I'm pretty sure you'll lose cross-document compression that way, which is highly noticeable across lots of ~3 KB documents -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Mon, Dec 15, 2014 at 10:56 PM, Eran Duchan pav...@gmail.com wrote: Thanks for the pointers. I just realized I can disable _source and store a field with the encoded data (D'oh). If I find anything semi-intelligent during my tests, I'll report back.
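A hedged sketch of the idea Eran mentions: disable _source in the mapping and keep a stored field holding the hand-encoded payload instead. The type name "doc" and field name "packed" are made up for illustration; the syntax follows ES 1.x mappings, and per the reply above, the trade-off is losing cross-document compression of _source.

```python
import json

# Mapping sketch: _source disabled, payload kept in a stored binary field.
# "doc" and "packed" are hypothetical names, not from the thread.
mapping = {
    "mappings": {
        "doc": {
            "_source": {"enabled": False},
            "properties": {
                # hand-encoded blob, stored so it can be fetched per hit
                "packed": {"type": "binary", "store": True}
            },
        }
    }
}

print(json.dumps(mapping))
```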
Re: Performance issues when flagging a document in Elasticsearch
To Lucene / Elasticsearch this is pretty much insignificant, as long as you use filters. You should prefer not_analyzed fields with string values to represent those flags over dedicated boolean fields if you will have more than a few such flags. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 10:22 AM, Dror Atariah dror...@gmail.com wrote: Assume that I want to be able to flag documents in an index according to their attributes: isFoo and isBar [1]. As far as I understand, there are two approaches: 1) Use dedicated fields for the flags: if the document is a Foo then add a field named isFoo. Similarly for isBar. 2) Use a flags field that will be an array of strings. In this case, if the document is Foo then flags will contain the string isFoo. What are the pros and cons in terms of space and runtime complexities? Bear in mind the following query examples: Consider the case where one wants to check the attributes of the documents in the index. In particular, if I want to find the documents that are either Foo *or* Bar I can either (a) In case (1): use a Boolean should filter that surrounds two exists filters, checking whether either isFoo or isBar exists. (b) In case (2): use a single exists filter that checks the existence of the field flags. A different case is if I want to find the documents that are both Foo *and* Bar: (a) In case (1): like before, replace the should with a must. (b) In case (2): surround two terms filters with a must Boolean one. Lastly, finding the documents that are Foo but *not* Bar. The bottom line: in case (1) all queries boil down to a mixture of Boolean, exists and missing filters. In case (2), one has to process the strings in the array of strings named flags. My intuition is that it is faster to use method (1). In terms of space complexity I believe there is no difference. I'm looking forward to your insights! Dror [1]: Obviously, there could be way more flags...
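The two modeling options can be sketched as filter clauses, hedged as illustration only (ES 1.x filter syntax; isFoo/isBar/flags are the poster's names, the combinations are not benchmarks):

```python
import json

# Option 1 (dedicated fields): "Foo OR Bar" via two exists filters
option1_foo_or_bar = {"bool": {"should": [
    {"exists": {"field": "isFoo"}},
    {"exists": {"field": "isBar"}},
]}}

# Option 2 (flags array): "Foo AND Bar" via two term filters on the array
option2_foo_and_bar = {"bool": {"must": [
    {"term": {"flags": "isFoo"}},
    {"term": {"flags": "isBar"}},
]}}

print(json.dumps(option1_foo_or_bar))
print(json.dumps(option2_foo_and_bar))
```

With the flags-array model, "Foo but not Bar" is simply a must term plus a must_not term on the same field, which is one reason a single not_analyzed string field scales better than one boolean field per flag.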
Re: Performance issues when flagging a document in Elasticsearch
Basically, you will have to maintain more filters. Also, Lucene supports only up to a certain number of fields; it wasn't designed to handle an unlimited number of them -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 10:35 AM, Dror Atariah dror...@gmail.com wrote: @Itamar: Can you please elaborate on the matter? Why/how is the number of fields relevant here? On Wednesday, December 10, 2014 4:26:16 PM UTC+1, Itamar Syn-Hershko wrote: Lucene / Elasticsearch is pretty much insignificant to this as long as you use filters. You should prefer not_analyzed fields with string values to represent those flags vs having dedicated boolean fields if you will have more than a few such flags. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 10:22 AM, Dror Atariah dro...@gmail.com wrote: Assume that I want to be able to flag documents in an index according to their attributes: isFoo and isBar [1]. As far as I understand, there are two approaches: 1) Use dedicated fields for the flags: If the document is a Foo then add a field named isFoo. Similarly, for isBar. 2) Use a flags field that will be an array of strings. In this case, if the document is Foo then flags will contain the string isFoo. What are the pros and cons in terms of space and runtime complexities? Bear in mind the following query examples: Consider the case where one wants to check the attributes of the documents in the index. In particular, if I want to find the documents that are either Foo *or* Bar I can either (a) In case (1): Use a Boolean should filter that surrounds two exists filters, checking whether either isFoo or isBar exists. (b) In case (2): Use a single exists filter that checks the existence of the field flags.
A different case, is if I want to find the documents that are both Foo *and* Bar: (a) In case (1): Like before, replace the should with a must. (b) In case (2): Surround two terms filters with a must Boolean one. Lastly, finding the documents that are Foo but *not* Bar. In the bottom line, In case (1) all queries boil down to mixture of Boolean, exists and missing filters. In case (2), one has to process the strings in the array of strings named flags. My intuition is that it is faster to use method (1). In terms of space complexity I believe there is no difference. I'm looking forward to your insights! Dror [1]: Obviously, there could be way more flags... -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/ef637057-4303-4c75-9bbf-ed72e0d4806b% 40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/ef637057-4303-4c75-9bbf-ed72e0d4806b%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c376b40d-1c46-43f5-952f-96ec01338788%40googlegroups.com https://groups.google.com/d/msgid/elasticsearch/c376b40d-1c46-43f5-952f-96ec01338788%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. 
Re: Performance issues when flagging a document in Elasticsearch
I imagine the types of graphs you could come up with will differ significantly, to start with -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 11:03 AM, Dror Atariah dror...@gmail.com wrote: Is there any difference or any implications if there is also need of aggregations? On Wednesday, December 10, 2014 4:57:10 PM UTC+1, Itamar Syn-Hershko wrote: Basically, you will have to maintain more filters. Also Lucene supports up to certain amount of fields, it wasn't designed to handle unlimited number of them -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 10:35 AM, Dror Atariah dro...@gmail.com wrote: @Itamar: Can you please elaborate on the matter? Why/how does the number of fields relevant here? On Wednesday, December 10, 2014 4:26:16 PM UTC+1, Itamar Syn-Hershko wrote: Lucene / Elasticsearch is pretty much insignificant to this as long as you use filters. You should prefer not_analyzed fields with string values to represent those flags vs having dedicated boolean fields if you will have more than a few such flags. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 10, 2014 at 10:22 AM, Dror Atariah dro...@gmail.com wrote: Assume that I want to be able to flag documents in an index according to their attributes: isFoo and isBar [1]. As far as I understand, there are two approaches: 1) Use dedicated fields for the flags: If the document is a Foo then add a field named isFoo. Similarly, for isBar. 2) Use a flags field that will be an array of strings. In this case, if the document is Foo then flags will contain the string isFoo. 
What are the pros and cons in terms of space and runtime complexities? Bear in mind the following queries examples: Consider the case where one wants to check the attributes of the documents in the index. In particular, if I want to find the documents that are either Foo *or* Bar I can either (a) In case (1): Use a Boolean should filter the surrounds two exists's filters checking whether either isFoo or isBar exist. (b) In case (2): Use a single exists filter that checks the existence of the field flags. A different case, is if I want to find the documents that are both Foo *and* Bar: (a) In case (1): Like before, replace the should with a must. (b) In case (2): Surround two terms filters with a must Boolean one. Lastly, finding the documents that are Foo but *not* Bar. In the bottom line, In case (1) all queries boil down to mixture of Boolean, exists and missing filters. In case (2), one has to process the strings in the array of strings named flags. My intuition is that it is faster to use method (1). In terms of space complexity I believe there is no difference. I'm looking forward to your insights! Dror [1]: Obviously, there could be way more flags... -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/ msgid/elasticsearch/ef637057-4303-4c75-9bbf-ed72e0d4806b%40goo glegroups.com https://groups.google.com/d/msgid/elasticsearch/ef637057-4303-4c75-9bbf-ed72e0d4806b%40googlegroups.com?utm_medium=emailutm_source=footer . For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. 
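The two query shapes Dror describes can be sketched as Elasticsearch 1.x filtered-query bodies. This is a sketch only; the field names come from the question, and the exact filter nesting is one common way to write it, not necessarily what Dror had in mind:

```python
# Sketch of the two approaches from the question, written as the dict
# bodies you would send to _search in Elasticsearch 1.x.

# Approach (1): dedicated per-flag fields -- "Foo OR Bar" via exists
# filters inside a bool/should.
foo_or_bar_dedicated = {
    "query": {"filtered": {"filter": {
        "bool": {"should": [
            {"exists": {"field": "isFoo"}},
            {"exists": {"field": "isBar"}},
        ]}
    }}}
}

# Approach (2): one not_analyzed string array -- "Foo AND Bar" via two
# term filters inside a bool/must.
foo_and_bar_flags = {
    "query": {"filtered": {"filter": {
        "bool": {"must": [
            {"term": {"flags": "isFoo"}},
            {"term": {"flags": "isBar"}},
        ]}
    }}}
}
```

Both shapes are cacheable filters, which is why the choice matters less for query speed than the number of distinct fields the mapping accumulates.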
Re: Migrate ES cluster to use doc_values
You will need to reindex; see: http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/ and http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/doc-values.html#_enabling_doc_values -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Sun, Dec 7, 2014 at 3:25 PM, Yoav Melamed wrote: Hello, We want to migrate our ES cluster to version 1.4.1 with doc_values. We have 20 nodes with 4 TB of data. What is the best practice? Can we just change the mapping and restart the cluster? How can we make sure the change was done? Thanks, Yoav Melamed
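Concretely, enabling doc_values in ES 1.x means creating a new index whose mapping sets "doc_values": true on the relevant fields, then reindexing into it. A sketch, with made-up index, type, and field names:

```python
# Sketch: mapping for a NEW index that enables doc_values (ES 1.x).
# The type name "event" and the two fields are hypothetical examples.
new_index_body = {
    "mappings": {
        "event": {
            "properties": {
                "timestamp": {"type": "date",
                              "doc_values": True},
                "status":    {"type": "string",
                              "index": "not_analyzed",
                              "doc_values": True},
            }
        }
    }
}
# Create the new index with this body, then copy documents over (e.g.
# scan/scroll + bulk). doc_values are written at index time, so already
# -indexed segments cannot be switched on in place -- hence the reindex.
```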
Re: downgrading from 1.4 to 1.3
Classic CORS error - maybe * is blocked by ES. I haven't had to deal with this myself (yet), so I can't help you here. All in all, just a small rough edge to smooth, not a clusterfuck. A quick solution would be to install K3 as a site plugin and use it internally (don't expose it to the web). -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Dec 4, 2014 at 3:20 AM, Jack Judge wrote: Well, you're right, there are JS errors, CORS related: XMLHttpRequest cannot load http://10.5.41.120:9200/logstash-2014.12.04/_search. Request header field Content-Type is not allowed by Access-Control-Allow-Headers. In my elasticsearch.yml I've got this on all nodes: http.cors.allow-origin: /.*/ and http.cors.enabled: true, which Google leads me to believe should open it up for anything. K3 is fronted by Apache, and a bit more googling prompted me to add this to the Directory section of httpd.conf: Header set Access-Control-Allow-Origin * Still getting the same errors :( I'm at a loss to know what else to do now. On Wednesday, 3 December 2014 15:48:28 UTC-8, Itamar Syn-Hershko wrote: I'm not aware of compat issues with K3 and ES 1.4 other than https://github.com/elasticsearch/kibana/issues/1637 . I'd check for javascript errors, and try to see what's going on under the hood, really. When you have more data about this, you can either quickly resolve it, or open a concrete bug :)
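Note that the error Jack quotes is about the Content-Type request header, which ES 1.4 controls with a separate CORS setting from the origin one. A sketch of the relevant elasticsearch.yml settings - the values here are deliberately permissive examples for an internal setup, not a recommendation:

```yaml
# elasticsearch.yml -- permissive CORS sketch for ES 1.4 (example values)
http.cors.enabled: true
http.cors.allow-origin: /.*/
http.cors.allow-headers: "X-Requested-With, Content-Type, Content-Length"
http.cors.allow-methods: "OPTIONS, HEAD, GET, POST, PUT, DELETE"
```

The Apache `Header set Access-Control-Allow-Origin *` line only affects responses served by Apache itself; the browser's preflight goes to ES on port 9200, so the fix has to live in elasticsearch.yml.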
Re: downgrading from 1.4 to 1.3
I'm pretty sure you can't, due to the different Lucene versions. I wouldn't even try - just export and re-index. I would be more than happy to hear what went wrong for you with the upgrade. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Dec 3, 2014 at 10:53 PM, Jack Judge wrote: Upgrading to 1.4 has been a clusterfuck for us. It's broken pretty much everything we rely on. I need to go back to 1.3 - can I use the snapshot feature? Will a snapshot taken on a 1.4 cluster restore to a separate 1.3 cluster? I'm really just interested in the data; I'd like to reapply my own mappings as the data is ingested into 1.3. Should I be looking at a third-party script, or will the snapshot/restore features of Elasticsearch be adequate?
Re: downgrading from 1.4 to 1.3
I'm a bit confused. Are you downgrading just because of Kibana compat issues? That seems to me like killing a fly with a bazooka. Enabling CORS and using the K3 dashboards seems like the better solution to me, for now; K4 isn't even officially released yet. As for data disappearing, I'm sure it wasn't, and a relaxed debugging session can help you find it. As for export-import - yes, knapsack is a great option, but it does make sense that Joerg hasn't updated it yet, as it's not officially maintained. Writing your own export-import tool is an easy option; I'd also look at https://github.com/elasticsearch/stream2es and https://github.com/taskrabbit/elasticsearch-dump -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Dec 4, 2014 at 12:12 AM, Jack Judge wrote: I will be more than happy to hear what went wrong for you with the upgrade? Well, Kibana 4 is unusable for us; the lack of auto refresh killed it. Most of the time K4 simply doesn't work, even for browsing small sets - I'd say 2 times in 3 we get the 30 ms timeout error; is there a solution to this yet? And when we finally do get any results, it's as slow as a pig on crutches. I need to go back to my old K3 dashboards, but after fighting through the CORS features, I find they're all blank. There are no errors, just no data, and I desperately need them. I also need my old Packetbeat dashboards. From my googling, I think the quickest way to get them back is to downgrade ES from 1.4 to 1.3 - or am I wrong? So, I need a way to export the data from the indices to a new cluster. My systems are in a secure environment; I can't use anything that needs to connect out to the internet at compile / install time, so the node.js / npm stuff is locked out for me. I tried the knapsack plugin, but it doesn't seem to work installed into an ES 1.4 cluster.
Re: downgrading from 1.4 to 1.3
I'm not aware of compat issues with K3 and ES 1.4 other than https://github.com/elasticsearch/kibana/issues/1637 . I'd check for javascript errors, and try to see what's going on under the hood, really. When you have more data about this, you can either quickly resolve it, or open a concrete bug :) -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Dec 4, 2014 at 1:44 AM, Jack Judge wrote: Well, you say just, but at the moment Kibana is our only view into the ES cluster, so yes, it's a dealbreaker for us. After enabling CORS - and what an unexpected knockabout of pure fun that was - I still can't use the K3 dashboards. They're blank: no data, no errors, just empty dashboards :( Data isn't disappearing; I can still see it via the head plugin. So what's the quickest way of being able to see my data via the K3 boards? Is it exporting out into a new cluster? Or is there a way to make them work with the ES 1.4 cluster? JJ On Wednesday, 3 December 2014 14:17:39 UTC-8, Itamar Syn-Hershko wrote: I'm a bit confused. Are you downgrading just because of Kibana compat issues? That seems to me like killing a fly with a bazooka. Enabling CORS and using the K3 dashboards seems like the better solution to me, for now; K4 isn't even officially released yet. As for data disappearing, I'm sure it wasn't, and a relaxed debugging session can help you find it. As for export-import - yes, knapsack is a great option, but it does make sense that Joerg hasn't updated it yet, as it's not officially maintained.
Re: This version of Kibana requires at least Elasticsearch 1.4.0.Beta1 but using 1.4.1
https://github.com/elasticsearch/kibana/issues/1637 -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Dec 4, 2014 at 1:49 AM, David Montgomery wrote: I am on Kibana 4.0.0-BETA2. On Wednesday, December 3, 2014 7:30:03 PM UTC+8, Mark Walkom wrote: What version of Kibana? On 3 December 2014 at 21:48, David Montgomery wrote: Hi, This version of Kibana requires at least Elasticsearch 1.4.0.Beta1: SetupError@http://monitor-development-east.test.com:5601/index.js?_b=3998:42905:51 checkEsVersion/@http://monitor-development-east.test.com:5601/index.js?_b=3998:43091:14 qFactory/defer/deferred.promise.then/wrappedCallback@http://monitor-development-east.test.com:5601/index.js?_b=3998:20764:15 qFactory/ref/.then/@http://monitor-development-east.test.com:5601/index.js?_b=3998:20850:11 $RootScopeProvider/this.$get/Scope.prototype.$eval@http://monitor-development-east.test.com:5601/index.js?_b=3998:21893:9 $RootScopeProvider/this.$get/Scope.prototype.$digest@http://monitor-development-east.test.com:5601/index.js?_b=3998:21705:15 $RootScopeProvider/this.$get/Scope.prototype.$apply@http://monitor-development-east.test.com:5601/index.js?_b=3998:21997:13 done@http://monitor-development-east.test.com:5601/index.js?_b=3998:17570:34 completeRequest@http://monitor-development-east.test.com:5601/index.js?_b=3998:17784:7 createHttpBackend//xhr.onreadystatechange@http://monitor-development-east.test.com:5601/index.js?_b=3998:17727:1 I am using 1.4.1. Clearly Kibana is not working. Why?
Here is my ES server: { "status" : 200, "name" : "Controller", "cluster_name" : "elasticsearch", "version" : { "number" : "1.4.1", "build_hash" : "89d3241d670db65f994242c8e8383b169779e2d4", "build_timestamp" : "2014-11-26T15:49:29Z", "build_snapshot" : false, "lucene_version" : "4.10.2" }, "tagline" : "You Know, for Search" } Thanks
Re: Java client - setTimeout vs actionGet(timeout)
IIRC the Java API doesn't have any default client-side timeout for search requests; it's an opt-in feature. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Sun, Nov 30, 2014 at 11:37 PM, Nikolas Everett wrote: The default for the server-side timeout is none, and I don't know the client-side default; I imagine it is a long time. On Nov 30, 2014 1:46 PM, Ron Sher wrote: Thanks for the info. Do you know what the defaults are? On Sunday, November 30, 2014 5:53:49 PM UTC+2, Nikolas Everett wrote: Timeouts are server-side and best effort. I believe actionGet(timeout) is client-side. I use the HTTP client, but I use both, and set the server-side timeout lower than the client-side timeout. The server-side timeout should return partial results if possible. On Nov 30, 2014 10:41 AM, Ron Sher wrote: Hi all, I want to make sure the search query doesn't exceed some limit. I've seen the option to use setTimeout vs actionGet(timeout). Can someone please explain the difference? Also, I've read somewhere that there's a default connection timeout. Can that be used instead? If so, how? Thanks for your help, Ron
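The distinction Nikolas draws can be sketched independently of the Java client: the server-side timeout travels inside the search body (and is best effort, possibly returning partial results with timed_out=true), while the client-side timeout is just how long the caller blocks. The values below are hypothetical examples:

```python
# Sketch of the two timeout knobs discussed above. In the Java client
# these correspond to setTimeout (server side, in the request) and
# actionGet(timeout) (client side, at the call site).

SERVER_SIDE_TIMEOUT = "5s"   # enforced (best effort) by Elasticsearch
CLIENT_SIDE_TIMEOUT = 10.0   # seconds the caller is willing to block

search_body = {
    "timeout": SERVER_SIDE_TIMEOUT,   # goes to the server with the query
    "query": {"match_all": {}},
}

# Keep the server-side limit below the client-side one, so a slow query
# comes back (possibly partial) before the client gives up entirely --
# the practice Nikolas describes.
assert float(SERVER_SIDE_TIMEOUT.rstrip("s")) < CLIENT_SIDE_TIMEOUT
```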
Re: char_filter for German
You may find the approach I give at the end of this talk helpful: https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucene -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Nov 18, 2014 at 12:30 PM, Krešimir Slugan wrote: Hi, To handle German in search, I have to be able to provide the same results whether a user searches for e.g. über, uber or ueber. I would do this at index time, where I would have über in the data. But if I use just the asciifolding filter, I lose the information that this was a word with an umlaut, and I can't get the ueber token. If I use a char_filter, it is applied before analysis, and I would not be able to get uber. Is it possible to preserve the original with a char filter, or to apply it after the analysis? Cheers, Kresimir
Re: char_filter for German
Why do you need it as ueber? What I'm usually doing is ending up with [über, uber] at the same position, possibly marking the first as being the original. Seeing Jurgen's response, I seem to be on the right path... -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Sat, Nov 29, 2014 at 9:21 PM, Krešimir Slugan wrote: Which token filter can I use to replace words like über with ueber? On Saturday, November 29, 2014 8:16:14 PM UTC+1, Itamar Syn-Hershko wrote: What I'm saying is: don't use a char_filter; use the token filter chain to achieve that. On Sat, Nov 29, 2014 at 9:02 PM, Krešimir Slugan wrote: Hi Itamar, I don't think this solves my problem. I'm aware that you can preserve the original with asciifolding, but since a char_filter is applied before asciifolding, there would not be any umlauts left to fold :) If I could apply the char_filter at the end that would be ok, or preserve the original with the char_filter. Best, Kresimir On Saturday, November 29, 2014 5:41:11 PM UTC+1, Itamar Syn-Hershko wrote: You may find the approach I give at the end of this talk helpful: https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucene On Tue, Nov 18, 2014 at 12:30 PM, Krešimir Slugan wrote: Hi, To handle German in search, I have to be able to provide the same results whether a user searches for e.g. über, uber or ueber. I would do this at index time, where I would have über in the data.
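The "[über, uber] at the same position" approach Itamar describes maps onto the asciifolding token filter's preserve_original option. A sketch of such index settings - the analyzer and filter names are made up for illustration:

```python
# Sketch: custom analyzer whose asciifolding filter keeps the original
# token alongside the folded one, so "über" indexes as both über and
# uber at the same position. Names here are hypothetical.
settings_body = {
    "settings": {
        "analysis": {
            "filter": {
                "folding_keep_original": {
                    "type": "asciifolding",
                    "preserve_original": True,   # emit both über and uber
                }
            },
            "analyzer": {
                "german_folding": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "folding_keep_original"],
                }
            },
        }
    }
}
```

Note this yields über and uber but not ueber; an ü→ue expansion would still need a separate step (e.g. a synonym filter in the same token chain), which is presumably what Jurgen's reply covered.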
Re: 3 Node Cluster With Nodes Out of Sync
If this is replicas only, you should be able to set the replica count to 0 and then, after a while, back to 2 again. If this is sharded, then no - you'll have to reindex from scratch. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant, Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Nov 26, 2014 at 10:26 AM, Yosi Haran wrote: Alright, we'll try upgrading. Thanks :) Meanwhile, any advice on how to fix an inconsistency once it is found? Is there an API to forcefully sync nodes, or at least reindex from a specific node? On Tuesday, November 25, 2014 8:44:44 PM UTC+2, Itamar Syn-Hershko wrote: I suggest you upgrade to 1.4 and try again - see http://www.elasticsearch.org/guide/en/elasticsearch/resiliency/current/index.html On Tue, Nov 25, 2014 at 7:29 PM, Yosi Haran wrote: 1.0.0 On Tuesday, November 25, 2014 6:41:36 PM UTC+2, Itamar Syn-Hershko wrote: minimum_master_nodes still doesn't protect you from all possible failure scenarios; see http://aphyr.com/posts/317-call-me-maybe-elasticsearch What version are you running? On Tue, Nov 25, 2014 at 6:37 PM, Yosi Haran wrote: Hi guys, We are running a 3-node cluster, and each node returns a different number of documents when issued a direct HTTP _count call. The cluster holds about 150K documents and the differences range from 30~50 documents, but are still troubling.
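The replica-reset trick suggested above is done with a settings update (PUT /{index}/_settings) that changes index.number_of_replicas. A minimal sketch that only builds the two request bodies; the index name and whatever HTTP client you use are left out as assumptions:

```java
public class ReplicaSettings {
    // Body for PUT /my_index/_settings (index name is hypothetical)
    static String replicaSettingsBody(int replicas) {
        return "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
    }

    public static void main(String[] args) {
        // First drop the replicas, then restore them so they are rebuilt from the primaries
        System.out.println(replicaSettingsBody(0)); // {"index":{"number_of_replicas":0}}
        System.out.println(replicaSettingsBody(2)); // {"index":{"number_of_replicas":2}}
    }
}
```

Sending the second body forces Elasticsearch to recreate the replica shards as fresh copies of the primaries, which is the point of the exercise.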
Re: 3 Node Cluster With Nodes Out of Sync
minimum_master_nodes still doesn't protect you from all possible failure scenarios, see http://aphyr.com/posts/317-call-me-maybe-elasticsearch What version are you running? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Nov 25, 2014 at 6:37 PM, Yosi Haran y...@my6sense.com wrote: Hi Guys, We are running a 3 node cluster, and each node returns a different number of documents when issued a direct HTTP _count call. The cluster holds about 150K documents and the differences range from 30 to 50 documents, but are still troubling. This shouldn't be a split-brain problem, since we have set: discovery.zen.minimum_master_nodes: 2 We also have a client node, but since client nodes are not eligible to be master, I understand that they shouldn't affect the master election process. Any ideas about why and how this is happening? Thanks!
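As background for the setting discussed in this thread: minimum_master_nodes is conventionally set to a quorum of the master-eligible nodes, which is only a partial safeguard per the reply above. The arithmetic is trivial but worth writing down:

```java
public class Quorum {
    // Quorum of master-eligible nodes: floor(n / 2) + 1
    static int minimumMasterNodes(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(minimumMasterNodes(3)); // 2, matching the setting in the question
        System.out.println(minimumMasterNodes(5)); // 3
    }
}
```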
Re: Performance issue while indexing lot of documents
It may be worth looking at 2 things: 1. Using the latest Elasticsearch version (1.4). Much work went into optimizing these types of scenarios on the server side. 2. Disabling refresh / flush - I assume this is an ETL process, and as such this could greatly help. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Thu, Nov 6, 2014 at 4:01 PM, Moshe Recanati re.mo...@gmail.com wrote: hi Thomas, I fixed the code per your suggestion and initiated a prepared bulk every 1000 documents (code below). However, the add-document time is still increasing. Please let me know what's wrong. Thank you in advance. Moshe Output: Going to add 100 processed 1000 records from -1000 until 0 at 704 processed 1000 records from 0 until 1000 at 3068 processed 1000 records from 1000 until 2000 at 1030 processed 1000 records from 2000 until 3000 at 1654 processed 1000 records from 3000 until 4000 at 1798 processed 1000 records from 4000 until 5000 at 1808 processed 1000 records from 5000 until 6000 at 580 processed 1000 records from 6000 until 7000 at 354 processed 1000 records from 7000 until 8000 at 731 processed 1000 records from 8000 until 9000 at 496 processed 1000 records from 9000 until 10000 at 822 processed 1000 records from 10000 until 11000 at 564 processed 1000 records from 11000 until 12000 at 588 processed 1000 records from 12000 until 13000 at 690 processed 1000 records from 13000 until 14000 at 774 processed 1000 records from 14000 until 15000 at 1528 processed 1000 records from 15000 until 16000 at 1028 processed 1000 records from 16000 until 17000 at 966 processed 1000 records from 17000 until 18000 at 1397 processed 1000 records from 18000 until 19000 at 962 processed 1000 records from 19000 until 20000 at 3573 processed 1000 records from 20000 until 21000 at 1332 processed 1000 records from 21000 until 22000 at 1282 processed 1000 records from 22000 until 23000 at 1746 processed
1000 records from 23000 until 24000 at 1411 processed 1000 records from 24000 until 25000 at 1742 processed 1000 records from 25000 until 26000 at 2540 processed 1000 records from 26000 until 27000 at 2217 processed 1000 records from 27000 until 28000 at 1203 processed 1000 records from 28000 until 29000 at 1714 processed 1000 records from 29000 until 30000 at 1595 processed 1000 records from 30000 until 31000 at 1809 processed 1000 records from 31000 until 32000 at 2305 processed 1000 records from 32000 until 33000 at 1604 processed 1000 records from 33000 until 34000 at 2208 processed 1000 records from 34000 until 35000 at 1989 processed 1000 records from 35000 until 36000 at 1939 processed 1000 records from 36000 until 37000 at 1826 processed 1000 records from 37000 until 38000 at 1716 processed 1000 records from 38000 until 39000 at 1957 processed 1000 records from 39000 until 40000 at 1665 processed 1000 records from 40000 until 41000 at 1743 processed 1000 records from 41000 until 42000 at 2166 processed 1000 records from 42000 until 43000 at 2450 processed 1000 records from 43000 until 44000 at 3342 processed 1000 records from 44000 until 45000 at 2632 processed 1000 records from 45000 until 46000 at 2795 processed 1000 records from 46000 until 47000 at 3129 processed 1000 records from 47000 until 48000 at 3290 processed 1000 records from 48000 until 49000 at 3973 processed 1000 records from 49000 until 50000 at 3297 processed 1000 records from 50000 until 51000 at 3500 processed 1000 records from 51000 until 52000 at 4328 processed 1000 records from 52000 until 53000 at 3913 processed 1000 records from 53000 until 54000 at 3636 processed 1000 records from 54000 until 55000 at 3971 processed 1000 records from 55000 until 56000 at 5851 processed 1000 records from 56000 until 57000 at 4150 processed 1000 records from 57000 until 58000 at 4557 processed 1000 records from 58000 until 59000 at 4534 processed 1000 records from 59000 until 60000 at 4918 processed 1000 records from 60000 until 61000 at 3839 processed 1000 records from 61000 until 62000 at 4297 processed 1000 records from 62000 until 63000 at 4516 processed 1000 records from 63000 until 64000 at 4782 processed 1000 records from 64000 until 65000 at 4581 Code: Node node = NodeBuilder.nodeBuilder().node(); Client client = node.client(); try { CreateIndexRequestBuilder createIndexRequestBuilder = client.admin().indices().prepareCreate("twitter2"); createIndexRequestBuilder.execute().actionGet(); } catch (Exception e) { e.printStackTrace(); } BulkRequestBuilder bulkRequest = client.prepareBulk(); int numOfDocs = 100; long startTime = System.currentTimeMillis(); System.out.println("Going to add " + numOfDocs); // either use client#prepare, or use Requests# to directly build index/delete requests
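The steadily rising add times in the quoted output are consistent with adding every document to a single, ever-growing BulkRequestBuilder instead of starting a fresh one after each execute. A minimal sketch of the intended batching pattern in plain Java, with a hypothetical flush() standing in for bulkRequest.execute().actionGet() (no Elasticsearch client is assumed here):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchingSketch {
    static final int BATCH_SIZE = 1000;
    int flushes = 0;
    List<String> batch = new ArrayList<>(); // stands in for a BulkRequestBuilder

    void add(String doc) {
        batch.add(doc);
        if (batch.size() >= BATCH_SIZE) {
            flush();
        }
    }

    void flush() {
        if (batch.isEmpty()) return;
        // Real code would do: bulkRequest.execute().actionGet();
        flushes++;
        batch = new ArrayList<>(); // crucial: start a fresh batch, don't keep appending to the old one
    }

    public static void main(String[] args) {
        BatchingSketch s = new BatchingSketch();
        for (int i = 0; i < 5500; i++) s.add("doc-" + i);
        s.flush(); // flush the final partial batch
        System.out.println(s.flushes); // prints 6: five full batches plus the remainder
    }
}
```

The key line is the reassignment in flush(): if the builder is reused without being recreated (or cleared), every "batch" resends all previous documents too, which makes each round slower than the last.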
Re: Enabling doc_values for _timestamp and _parent fields
Yes -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Oct 21, 2014 at 10:47 AM, Costya Regev cos...@totango.com wrote: Hi, It's not clear from the documentation. Can doc_values be set for _timestamp and _parent fields? Thanks, Costya, Totango
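For reference, doc_values are enabled in the field's mapping. A sketch of what that might look like for _timestamp (the type name is hypothetical, and the exact mapping keys should be checked against the docs for your Elasticsearch version):

```json
{
  "mappings": {
    "my_type": {
      "_timestamp": {
        "enabled": true,
        "doc_values": true
      }
    }
  }
}
```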
Re: ES-to-ES river?
I personally recommend https://github.com/elasticsearch/stream2es -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Oct 21, 2014 at 3:24 PM, joergpra...@gmail.com joergpra...@gmail.com wrote: You can also try the knapsack plugin, where you can archive index data, but also move index data around, between indices and across clusters. https://github.com/jprante/elasticsearch-knapsack Jörg On Tue, Oct 21, 2014 at 3:57 PM, raidex ralg...@gmail.com wrote: Hi all, Is there a reason why an ES-to-ES river hasn't been implemented? I need to implement a fast copy mechanism to move data between indices (same cluster and across clusters) and it seems to me like a river is the right mechanism. I am planning to write my own, but I want to check if it is a reasonable approach. -- Thanks.
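If you do roll your own copier instead of using stream2es or knapsack, the usual building blocks on 1.x are scan/scroll reads on the source and _bulk writes on the target. A sketch of the requests involved (index names, page size, and scroll timeout are placeholders):

```
GET /source_index/_search?search_type=scan&scroll=5m
{ "query": { "match_all": {} }, "size": 500 }

# Repeat until a page comes back empty, feeding each page of hits
# into the target cluster's _bulk API:
GET /_search/scroll?scroll=5m
<scroll_id from the previous response>
```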
Re: Hot backup strategy for Elasticsearch
No - you should definitely use snapshot and restore, as it's the most stable and efficient way to take backups. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Oct 15, 2014 at 1:12 AM, skm replyson...@gmail.com wrote: Hello List, Going through the current documentation I found that the snapshot/restore mechanism is one type of backup strategy that we can use for ES clusters. Any other recommendations? Using the following: 1. elasticsearch version 1.3.4 2. AWS-cloud-plugin 3. curator: curator snapshot --repository mys3_repository --all-indices (weekend) curator snapshot --repository mys3_repository --most-recent 1 (every week day) The above would be run as cron jobs from one of the nodes in the cluster. Let me know of recommendations for hot backup for an Elasticsearch cluster. Thanks, skm
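The snapshot/restore workflow referred to above boils down to two calls: registering a repository once, then creating snapshots into it. A sketch with placeholder names (a shared filesystem repository here; the curator commands in the question drive the same API against an S3 repository):

```
PUT /_snapshot/my_backup
{ "type": "fs", "settings": { "location": "/mount/backups/my_backup" } }

PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
```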
Re: NotFilter dude
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-not-filter.html You should probably switch to a bool filter with should clauses instead of an and filter -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Oct 14, 2014 at 9:26 PM, Waldemar Neto waldema...@gmail.com wrote: Hello all! I have criteria with *AND*, *OR* and *NOT* operators, but the *NOT* is a single filter. What is the best way to set multiple *NOT*s? See my query with *AND*; where I have *AND* I need *NOT* :D { highlight: { fields: { *: { fragment_size: 150, number_of_fragments: 1, pre_tags: [ b ], post_tags: [ /b ] } } }, facets: { documents: { terms: { field: primary_field, size: 5 } } }, fields: [ Document.id, Document.name, Document.updated, DocumentTag.name, Document.approval_number, Document.approval_number_us, Document.approval_number_jp, Version.status, Document.rate, Document.last_status ], sort: [ _type ], size: 10, query: { filtered: { query: { match_all: {} }, filter: { and: { filters: [ { terms: { Product.id: [ 6 ] } }, { terms: { Version.jp: [ true ] } }, { terms: { Version.jp: [ true ] } }, { terms: { Document.last_status: [ 4 ] } } ] } } } } }
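A sketch of the suggested shape: replace the and filter with a bool filter, where every exclusion goes into must_not (multiple NOTs are simply multiple entries in that array). The field names are taken from the quoted query, but which clauses belong in must versus must_not is purely illustrative, not the poster's intended logic:

```
{
  "filtered": {
    "query": { "match_all": {} },
    "filter": {
      "bool": {
        "must": [
          { "terms": { "Product.id": [6] } }
        ],
        "must_not": [
          { "terms": { "Version.jp": [true] } },
          { "terms": { "Document.last_status": [4] } }
        ]
      }
    }
  }
}
```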
Re: Hot backup strategy for Elasticsearch
Incremental. See http://www.elasticsearch.org/blog/introducing-snapshot-restore/ -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Oct 15, 2014 at 5:15 PM, skm replyson...@gmail.com wrote: Thank you for the response! Usually, for large amounts of data (TBs), how does the snapshot backup strategy work? Do full snapshots every week plus most-recent snapshots work well? The most-recent snapshot would be redundant if there is no new data in the last 24 hrs? Thanks, skm On Wednesday, October 15, 2014 12:54:13 AM UTC-7, Itamar Syn-Hershko wrote: No - you should definitely use snapshot and restore, as it's the most stable and efficient way to take backups. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Wed, Oct 15, 2014 at 1:12 AM, skm reply...@gmail.com wrote: Hello List, Going through the current documentation I found that the snapshot/restore mechanism is one type of backup strategy that we can use for ES clusters. Any other recommendations? Using the following: 1. elasticsearch version 1.3.4 2. AWS-cloud-plugin 3. curator: curator snapshot --repository mys3_repository --all-indices (weekend) curator snapshot --repository mys3_repository --most-recent 1 (every week day) The above would be run as cron jobs from one of the nodes in the cluster. Let me know of recommendations for hot backup for an Elasticsearch cluster. Thanks, skm
Re: running on EC2 S3 vs EBS
Yes, you don't want to use anything other than local storage for Elasticsearch. Not EBS and definitely not S3. You can use the snapshot/restore API to continuously back up to S3 and get all the data protection you need. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/ On Tue, Oct 14, 2014 at 12:17 AM, Matthias Johnson openno...@gmail.com wrote: We've begun deploying to AWS EC2. I've seen references in the group about the S3 gateway and it being deprecated. That seems to be confirmed by looking at the docs, which don't seem to list the S3 Gateway specifically after 0.90.x. We are also using the elasticsearch-cloud-aws plugins https://github.com/elasticsearch/elasticsearch-cloud-aws, which do a nice job of helping with auto discovery. They also show settings for using S3. After some reading, my understanding is that the plugin basically just stores snapshots in S3. Is that understanding correct? Is this much different from the original gateway? That suggests that unless we take frequent snapshots we would run a risk of data loss if the entire cluster went down (right now we are using instance storage). Is that right? Would switching to EBS give us better protection against data loss, since the data is stored on a more permanent basis, as well as improved recovery after an entire cluster going down? Are there any good guides on configuring this sort of setup with cloudformation and templates and/or tying EBS volumes for ES use to machines when a cluster is resurrected? @matthias
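With the cloud-aws plugin installed, the S3-backed repository the reply refers to is registered through the same snapshot API. A sketch with placeholder bucket and region values (see the plugin's README for the full list of supported settings):

```
PUT /_snapshot/my_s3_backup
{ "type": "s3", "settings": { "bucket": "my-backup-bucket", "region": "us-east-1" } }

PUT /_snapshot/my_s3_backup/snapshot_1?wait_for_completion=true
```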