Re: Documents not deleted when using DeleteRequest within BulkProcessor
No, as long as you have only one index behind this alias, indexing and deleting should both work. I don't see anything suspicious here. Any chance you could share your full code on GitHub? When you say that nothing happens, do you mean that you never get the debug log "Processing {} ..."? Or do you mean that the document has not been removed? How do you test all that?

-- David Pilato - Developer | Evangelist, elastic.co
@dadoonet | @elasticsearchfr | @scrutmydocs

On 30 Apr 2015, at 19:10, Diana Tuck wrote:

Thank you for the reply, David. We are using an alias to delete. Is that a problem? Indexing with the alias through the bulk processor works fine. There are no errors reported; the request just seems to disappear into oblivion. Here's our code for the BulkProcessor:

    public static BulkProcessor getBulkProcessor(Client client, int esConcurrencyLevel,
                                                 int esBulkSize, int esFlushInterval) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest bulkRequest) {
                LOG.debug("Processing {} requests in bulk process {}",
                        bulkRequest.numberOfActions(), executionId);
            }

            @Override
            public void afterBulk(long executionId, BulkRequest bulkRequest, BulkResponse response) {
                if (response.hasFailures()) {
                    for (BulkItemResponse item : response.getItems()) {
                        LOG.error("Processing to index '{}' failed for entity id {} with message {}",
                                item.getIndex(), item.getId(), item.getFailureMessage());
                    }
                }
            }

            @Override
            public void afterBulk(long executionId, BulkRequest bulkRequest, Throwable throwable) {
                LOG.error("Failed to process {} requests in bulk request {}: {}",
                        bulkRequest.numberOfActions(), executionId, throwable.getMessage());
                throwable.printStackTrace();
            }
        })
        .setBulkActions(esBulkSize)
        .setFlushInterval(TimeValue.timeValueSeconds(esFlushInterval))
        .setConcurrentRequests(esConcurrencyLevel)
        .build();
    }

Code for the delete request:

    bulkProcessor.add(new DeleteRequest(index.getIndexingAlias(), index.getType(), entityId));

where index.getIndexingAlias() is an alias (the same alias used for indexing, which works), the type is the document type "company", and entityId is the document ID. What data would be helpful? An example document, the index metadata, something else?

On Wednesday, April 29, 2015 at 9:53:41 PM UTC-7, David Pilato wrote:

Do you try to delete a doc using an alias? Any failure or error reported by the bulk processor? Hard to tell more without seeing the code / data.

David

On 30 Apr 2015, at 02:03, Diana Tuck wrote:

Trying to index/delete documents within one BulkProcessor object in the Java API. Indexing documents works great! Deleting, however, does not.

    bulkProcessor.add(new DeleteRequest(index.getIndexingAlias(), index.getType(), entityId));

Nothing happens. Any ideas?

--
You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
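When a delete seems to vanish, it can help to reproduce the bulk call outside the Java client and inspect the per-item response, since the BulkProcessor listener only reports failures it can see. Below is a minimal sketch (Python; the alias, type, and ids are hypothetical stand-ins for the values in the code above) of the NDJSON body that the _bulk endpoint expects for delete actions:

```python
import json

def bulk_delete_body(index, doc_type, ids):
    """Build the NDJSON payload for a bulk delete.
    Delete actions are one metadata line each, with no source line."""
    lines = [json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": doc_id}})
             for doc_id in ids]
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

# Hypothetical alias/type/ids mirroring the thread's DeleteRequest:
body = bulk_delete_body("company-alias", "company", ["42", "43"])
print(body)
```

POSTing this to localhost:9200/_bulk with curl and reading each item's "found" field in the response tells you whether the document actually existed under that alias; "found": false would mean the id never matched anything, which would explain deletes that silently do nothing.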
Re: Returning partial strings in Kibana visualisation
{"script": "_value.substring(0,8)"} should work for you; it needs the Groovy sandbox enabled.

On Wednesday, April 29, 2015 at 9:39:33 PM UTC+8, Stuart Kenworthy wrote:

I have a number of different load injector boxes and processes that generate load through our system under test. The tool in use produces masses of log output, but none of it is easily accessible or readable. I am therefore using ELK to process the loads with success; however, presenting some of the data is problematic. The process names have a structure of process_name_Stressnn_Thread_nn, but there are around 180 of them. Each process thread generates 1 of 11 different message types. The message types are only distinguishable by a 10-character substring within a field containing strings and semicolon-delimited text, generally in the same location (between character 60 and character 70). In Elasticsearch none of these fields are analyzed, as that makes the queries and results even messier in Kibana and poses the same problem when choosing analyzed elements of a field (only picking element 12 or 10-12).

When aggregation is done on either of these fields, message type is presented as the long string in the visualisation key with only the first 10-15 characters showing, and process name results in all 180 processes rather than the 7 process types. These processes are likely to change over time as we introduce new test scenarios and message types, so I do not want to hard-code them in case we miss something.

Is it possible to have Elasticsearch return substrings, partials, lefts, rights, etc. of a field and group them as such, rather than by the entire field content, so that all process_nameA entries are grouped together and all msg_typeA entries are grouped together? Ideally without code edits to either Elasticsearch or Kibana? Something in JSON Input such as { "field_length": 10 } or { "partial_start": 60, "partial_for": 15 } would suffice. This is akin to renaming keys, columns and rows.
Thanks
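In Kibana 4 the JSON Input box merges extra properties into the underlying terms aggregation, so the {"script": ...} suggestion above amounts to sending a scripted terms aggregation to Elasticsearch. A sketch of the request body this produces (Python; the field name is hypothetical, and dynamic Groovy scripting must be enabled on the cluster for the script to run):

```python
import json

def substring_terms_agg(field, start, end):
    """Terms aggregation bucketing on a substring of each field value.
    Inside the script, _value is the current value of `field`."""
    return {
        "size": 0,
        "aggs": {
            "msg_types": {
                "terms": {
                    "field": field,
                    "script": "_value.substring(%d,%d)" % (start, end),
                }
            }
        },
    }

print(json.dumps(substring_terms_agg("message_field", 60, 70), indent=2))
```

With this, all values sharing characters 60-70 fall into one bucket, which is the grouping asked for; the trade-off is that scripted aggregations are slower than plain terms aggregations and require scripting to be enabled.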
GeoNames, Autocomplete and boost
Hi. I'm trying to improve autocomplete search results on a GeoNames cities index. I have been using django-haystack, but have run into issues there, and may need to replace or bypass it. My question here pertains to indexing and querying with autocomplete across multiple fields.

Users expect to be able to use two-letter state abbreviations to narrow their city choices. For example, "San Francisco, CA" and "New York, NY" should have the cities you'd expect at the top of the list. However, that is not the case, and I think for different reasons. You can see the results below. It turns out that there are a lot of San Franciscos in the world! Searching for "San Francisco CA" retrieves:

    San Francisco, Caraga, 13, PH                                5.5191193
    San Francisco, Caraga, 13, PH                                5.5163627
    San Francisco, Calabarzon, 40, PH                            5.4498897
    San Francisco, Calabarzon, 40, PH                            5.281434
    San Francisco, Caraga, 13, PH                                5.281434
    San Francisco, California, CA, US                            5.2123656
    South San Francisco, California, CA, US                      4.3138
    San Francisco (El Calvito), Chiapas, 05, MX                  4.137272
    San Francisco, Baja California Sur, 03, MX                   4.137272
    San Francisco (Baños de Agua Caliente), Guanajuato, 11, MX   3.3008962

I would like to boost the state (region_code) value so that San Francisco and South San Francisco are at the top. For "New York NY" I get:

    Nyack, New York, US          3.0575132
    West Nyack, New York, US     2.670291
    South Nyack, New York, US    2.5124028
    Upper Nyack, New York, US    2.5124028

instead of what I want, which is "New York City, New York, US".

The autocomplete field is an EdgeNGram field called content_auto. It currently has the following format, which is what I want to return: CityName, RegionName, CountryCode. So I think what I want to do in both cases is boost results if there is a match on the region_code field, but *not* display the region_code field in the results. The type of the search is currently query_string, which is what haystack uses. If there is some way to make that work, then that would be good. However, I'm afraid it is limiting what I'm able to do.

I did some experiments. If I query directly with curl for SF using

    {
      "query": {
        "multi_match": {
          "query": "San Francisco CA",
          "type": "cross_fields",
          "fields": ["content_auto", "region_code^3"]
        }
      }
    }

I get a result I'm satisfied with. However, the similar query using "New York NY" puts the city as the sixth result! I also tried putting the region_code in the content_auto string and boosting the region_code field. Also, the following works for SF, but I have no way of knowing in advance what the region_code is going to be. It ranks New York City third, and I would have to pick out two-letter combinations:

    "default_field": "text",
    "default_operator": "OR",
    "query": "(content_auto:(san) AND content_auto:(francisco)) CA^1.5"

It would really help if someone could help me limit my *own* queries about how Elasticsearch works, so that I can focus on the best approach! Thanks in advance for your help :-)

    curl 'localhost:9200/cities/_mapping?pretty'
    {
      "cities": {
        "mappings": {
          "modelresult": {
            "_boost": { "name": "boost", "null_value": 1.0 },
            "properties": {
              "content_auto": { "type": "string", "analyzer": "edgengram_analyzer" },
              "django_ct": { "type": "string", "index": "not_analyzed", "include_in_all": false },
              "django_id": { "type": "string", "index": "not_analyzed", "include_in_all": false },
              "id": { "type": "string" },
              "location": { "type": "geo_point" },
              "region_code": { "type": "string", "analyzer": "snowball" },
              "text": { "type": "string", "analyzer": "snowball" }
            }
          }
        }
      }
    }

NY example:

    curl -XGET 'http://localhost:9200/cities/modelresult/_search?pretty' -d '{
      "from": 0,
      "query": {
        "filtered": {
          "filter": { "terms": { "django_ct": ["cities.city"] } },
          "query": {
            "query_string": {
              "analyze_wildcard": true,
              "auto_generate_phrase_queries": true,
              "default_field": "text",
              "default_operator": "AND",
              "query": "(content_auto:(new) AND content_auto:(york,) AND content_auto:(ny))"
            }
          }
        }
      },
      "size": 10,
      "sort": [{ "_score": { "order": "desc" } }]
    }'
    {
      "took": 2,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 4,
        "max_score": 3.0575132,
        "hits": [
          {
            "_index": "cities",
            "_type": "modelresult",
            "_id": "cities.city.5129433",
            "_score": 3.0575132,
            "_source": { "django_id": "5129433", "region_code": "NY",
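One way to boost a state match without displaying the state is to move away from query_string toward a bool query: a must clause on content_auto plus a should clause on region_code that only raises the score when it matches. This is a sketch of that request body (Python; the field names come from the mapping above, but the boost factor of 3 is an arbitrary starting point to tune), not a drop-in haystack fix:

```python
import json

def city_autocomplete_query(text, region_boost=3.0):
    """Bool query: the typed text must match the autocomplete field;
    a match on region_code is optional but boosts the score."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"content_auto": {"query": text}}},
                ],
                "should": [
                    {"match": {"region_code": {"query": text, "boost": region_boost}}},
                ],
            }
        }
    }

print(json.dumps(city_autocomplete_query("San Francisco CA"), indent=2))
```

Because region_code appears only in the should clause, "New York NY" still matches documents whose region is not NY; documents with region_code NY simply score higher, which is the ranking behaviour described as the goal, and region_code never needs to appear in the returned display string.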
Re: How to replicate this type of search
Thanks John Ivan for your insight. Very helpful. Just to see if I'm getting this: in order to have a search where my users can type "title: {query term}" to limit the search only to titles, I need to program my application to parse the query string and then add the additional filters to the ES request. Do I have this right?
Re: too many open files problems and suggestions on cluster configuration
How do I calculate the best number of shards?

On Friday, May 1, 2015 at 18:21:47 UTC+3, David Pilato wrote:

Add more nodes or reduce the number of shards per node.

-- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 1 May 2015, at 17:05, Ann Yablunovskaya wrote:

I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
Re: How to replicate this type of search
Thanks John! It's all very clear now.
Re: Marvel license file/order number baked into a container
For anyone else coming across this, an Elasticsearch engineer confirmed to me in this question https://groups.google.com/d/msg/elasticsearch/CFUZp6j5TOc/_muPZUAMz_kJ that it's not currently possible, but a feature request for it has now been opened.

On Wednesday, April 29, 2015 at 3:07:21 PM UTC-4, Joel Potischman wrote:

Hi Boaz. I know this is an old thread, but I can't find anything newer and it seems very related to this issue. Is there documentation anywhere on how to script the initial installation of the license? We have not yet run Marvel in production and we do not allow manual steps in our deployment process. We'd like to be able to deploy Marvel to a completely clean box and have the license already present at the end of the deploy. Can you point me to an example curl (dummy values, obviously)?

On Monday, October 13, 2014 at 8:48:45 AM UTC-4, Boaz Leskes wrote:

> .marvel-kibana now has a 'state-2' file inside it. And obviously for now, since I didn't restart or do anything of that nature, I am not asked for the license details. I wonder if what I see is that .marvel-kibana is only stored with one primary and one replica, and when I reload the cluster I sometimes happen to load first the nodes which don't have .marvel-kibana, so that's why I get that question about the license.

The order in which you restart the nodes shouldn't really matter. Try searching (GET .marvel-kibana/_search) and see that you get the license document back?

> Anyway, where would Marvel show me how to add the license from the command line? I'd like to see that because I haven't run into such a prompt yet.

Marvel only shows it if it fails to save it to the cluster. If you want the command, it's not a problem, but please reach out off-list with your details.

On Fri, Oct 10, 2014 at 9:13 PM, Daniel Schonfeld wrote:

Boaz, .marvel-kibana now has a 'state-2' file inside it. And obviously for now, since I didn't restart or do anything of that nature, I am not asked for the license details. I wonder if what I see is that .marvel-kibana is only stored with one primary and one replica, and when I reload the cluster I sometimes happen to load first the nodes which don't have .marvel-kibana, so that's why I get that question about the license. Anyway, where would Marvel show me how to add the license from the command line? I'd like to see that because I haven't run into such a prompt yet. Thanks!

On Thursday, October 9, 2014 2:40:58 PM UTC-4, Boaz Leskes wrote:

That's weird. You should look for the content of the .marvel-kibana index; that's where the license is stored when you enter it in the UI. Is the Marvel UI allowed to post back to ES? If that's blocked, it may explain things. Normally you will get a message from Marvel instructing you how to add the license from the command line.

On 9 Oct 2014, at 6:06 PM, Daniel Schonfeld wrote:

Hi Boaz, No, the data folder is persisted, and with it I have all my cluster and indices data... but for some reason Marvel asks for the license/order number again. Is there a file I can check for in my data folder? Thanks! Daniel

On Thursday, October 9, 2014 5:46:52 AM UTC-4, Boaz Leskes wrote:

Hi Daniel, When you restart the cluster, do you also wipe all content? The Marvel license should persist once entered, but if you clean the data folder, it will go away as well. Cheers, Boaz

On Wednesday, October 8, 2014 6:12:19 AM UTC+2, Daniel Schonfeld wrote:

Hello, We have recently purchased our Marvel license, but every time we restart our cluster it asks us for the order number again. We use Docker, so our containers are immutable. Is there a file or something we can change in the filesystem that will bake the license key into the container? Thanks! Daniel Schonfeld
too many open files problems and suggestions on cluster configuration
I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
Re: How to replicate this type of search
Yes, that's correct. Depending on your application, it might be easier to offer the filters as a dropdown, similar to how sites like Amazon let you choose which department to search within. Otherwise you'll have to rely on your users to type the correct syntax, which might be what you want if you're aiming for more flexibility.

On Friday, May 1, 2015 at 11:26:38 AM UTC-4, Peter Sorensen wrote:

Thanks John Ivan for your insight. Very helpful. Just to see if I'm getting this: in order to have a search where my users can type "title: {query term}" to limit the search only to titles, I need to program my application to parse the query string and then add the additional filters to the ES request. Do I have this right?
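The parsing step described above can stay very small. A sketch (Python; the field whitelist is hypothetical) of splitting an optional "field: terms" prefix off the user's input before building the Elasticsearch request:

```python
import re

ALLOWED_FIELDS = {"title", "author", "body"}  # hypothetical whitelist of searchable fields

def parse_user_query(raw):
    """Split an optional 'field: terms' prefix from a user query.
    Returns (field, terms); field is None when no recognised prefix is present."""
    match = re.match(r"^\s*(\w+)\s*:\s*(.+)$", raw)
    if match and match.group(1).lower() in ALLOWED_FIELDS:
        return match.group(1).lower(), match.group(2).strip()
    return None, raw.strip()

print(parse_user_query("title: moby dick"))  # field-restricted search
print(parse_user_query("moby dick"))         # ordinary search
```

The whitelist matters: without it, any text containing a colon would be misread as a field restriction. When the returned field is not None, the application adds a match or filter on that field; otherwise it searches the default fields.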
Re: too many open files problems and suggestions on cluster configuration
The number of open files does not depend on the number of documents. A shard does not come for free: each shard can take around ~150 open file descriptors (sockets, segment files), and up to 400-500 if it is actively being indexed. Take care with the number of shards: if you have 5 shards per index and 2000 indices per node, you would have to provide 10k * 150 open file descriptors. That is a challenge even on a single RHEL 7 system, which provides 131072 file descriptors by default (cat /proc/sys/fs/file-max); the default is already very high, and you would still have to raise the system limits.

I recommend using fewer shards and redesigning the application for fewer indices (or even a single index) if you are limited to 2 nodes. Shard routing and index aliasing may help:

http://www.elastic.co/guide/en/elasticsearch/guide/master/kagillion-shards.html
http://www.elastic.co/guide/en/elasticsearch/guide/master/faking-it.html

Jörg

On Fri, May 1, 2015 at 5:05 PM, Ann Yablunovskaya wrote:

I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
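The arithmetic behind that warning is easy to sanity-check. A sketch (Python; the ~150 descriptors-per-shard figure is the rough rule of thumb from the post above, not an exact number):

```python
def fds_per_node(indices, shards_per_index, replicas, nodes, fds_per_shard=150):
    """Rough estimate of file descriptors each node needs for its shards."""
    total_shards = indices * shards_per_index * (1 + replicas)  # primaries + replicas
    shards_per_node = total_shards / float(nodes)
    return int(shards_per_node * fds_per_shard)

# 2000 indices with the default 5 shards and 1 replica, spread over 2 nodes:
print(fds_per_node(indices=2000, shards_per_index=5, replicas=1, nodes=2))
```

That comes to 1.5 million descriptors per node, which is why consolidating to far fewer indices (or a single routed index, per the linked chapters) is the structural fix rather than just raising ulimits.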
Re: Perma-Unallocated primary shards after a node has left the cluster
Probably super evident, but the output above was actually from _cat/allocation?v, not /recovery; sorry about that.

On Wednesday, April 29, 2015 at 5:19:08 PM UTC-7, Alex Schokking wrote:

Hi guys, I would really appreciate some help understanding what's going on with shard allocation in this case.

Elasticsearch version: 1.4.4

We had 3 nodes with 1 shard and 1 replica per index (so 2 copies of everything). One node went down and the cluster went red. It started to reallocate shards as expected, and there were originally ~50 unallocated shards, 15 of them primaries and the rest replicas. It's been a few hours now and there are still 15 outstanding shards, all primaries, that don't seem to be getting reallocated. I thought this would be a pretty standard scenario, so I was really hoping I wouldn't need to manually walk through and reallocate the primary shards, but I'm not sure what else to try at this point to get back to green. Any pointers would be really appreciated.

Here are some of the relevant-seeming bits folks asked about on IRC. In the ES logs for the unallocated index names there are lines along the lines of:

    [2015-04-29 22:08:22,803][DEBUG][action.admin.indices.stats] [Agent Axis]
    [webaccesslogs-2015.04.24][0], node[-r2iQnH4R-mcUy4NicCB5g], [P], s[STARTED]:
    failed to execute [org.elasticsearch.action.admin.indices.stats.IndicesStatsRequest@6a564a91]
    org.elasticsearch.transport.SendRequestTransportException:
    [Jean-Paul Beaubier][inet[/10.155.165.126:9300]][indices:monitor/stats[s]]

Jean-Paul Beaubier is the node that went down.

_cat/allocation output:

    shards disk.used disk.avail disk.total disk.percent host              ip             node
    420    21.2gb    77gb       98.3gb     21           ip-10-234-164-148 10.234.164.148 Agent Axis
    420    41gb      57.2gb     98.3gb     41           ip-10-218-145-237 10.218.145.237 Ebon Seeker
    15                                                                                   UNASSIGNED

I'm trying to understand why it's stuck in this state, given that there is no other info in the logs, as far as I can tell, about why the shards can't be allocated. Shouldn't the replicas just be promoted in place to new primaries, and new replicas then created on the other node?

Thanks and regards -- Alex
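If the cluster never promotes the remaining copies on its own, Elasticsearch 1.x lets you force the allocation by hand through the cluster reroute API. Below is a sketch of one reroute command body (Python; the index, shard number, and node name are placeholders modelled on the thread). Note that allow_primary can lose data if a better copy of the shard later rejoins the cluster, so it is a last resort, not the standard recovery path:

```python
import json

def allocate_primary_cmd(index, shard, node):
    """Cluster-reroute command forcing an unassigned shard onto `node`.
    allow_primary permits promoting the copy there to primary."""
    return {
        "commands": [
            {
                "allocate": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "allow_primary": True,
                }
            }
        ]
    }

# Placeholder values based on the index and surviving node named above:
print(json.dumps(allocate_primary_cmd("webaccesslogs-2015.04.24", 0, "Ebon Seeker")))
```

The body is POSTed to localhost:9200/_cluster/reroute, with one command per stuck shard.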
Max documents 10,500?
Greetings,

I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows:

1. Both machines are current 64-bit Linux machines with at least 8 GB of RAM and more than sufficient disk space.
2. Both have current 64-bit Java 7 and Elasticsearch 1.5.1. ES is running locally on each machine.
3. Both machines run the exact same program to load up ES. Each has nearly default ES config files (just different names).
4. The program keeps a counter of the number of times documents are added to ES, and the return code of each add is checked. Both counters show 50,000+.
5. When I run the same query on each machine with curl, the good machine shows a max_score of 8.2 and the bad machine shows .499 (remember, same set of documents and same search query).

I've spent a day on this and I am running out of ideas. Any help would sure be appreciated.

Blake McBride
Re: Evaluating Moving to Discourse - Feedback Wanted
Moving away from mailing lists for anything except announcements would be awesome. Forums are a much better way to have solid discussions with multiple people involved. Email is fine, but when you add in more than a couple of people, it gets confusing fast. Forums are also far more user-friendly for people who haven't learned the various ways developers communicate.

That said, if the forum idea is scrapped, please be sure to stick with Google Groups or something similar. Don't switch to something like the Debian user lists. Every time a search result pops up from a list like that when I am looking for help, I can never figure out whether I've seen all the emails in the thread or not. The interface is just horrid. Google Groups at least has conversation view.

On a similar subject, is there any chance we could get a real-time chat app that is more user-friendly than IRC? Does something exist that could sit on top of IRC and alleviate its user-unfriendliness?

On Thursday, April 2, 2015 at 8:36:33 AM UTC-7, leslie.hawthorn wrote:

Hello everyone,

As we’ve begun to scale up development on three different open source projects, we’ve found Google Groups to be a difficult solution for dealing with all of our needs for community support. We’ve got multiple mailing lists going, which can be confusing for new folks trying to figure out where to go to ask a question. We’ve also found our lists are becoming noisy in the “good problem to have” kind of way. As we’ve seen more user adoption, and across such a wide variety of use cases, we’re getting widely different types of questions asked. For example, I can imagine that folks not using our Python client would rather not be distracted with emails about it. There are also a few other strikes against Groups as a tool, such as the fact that it is no longer a supported product by Google, it provides no API hooks, and it is not available for users in China.
We’ve evaluated several options and we’re currently considering shuttering the elasticsearch-user and logstash-users Google Groups in favor of a Discourse forum. You can read more about Discourse at http://www.discourse.org

We feel Discourse will allow us to provide a better experience for all of our users for a few reasons:

* More fine-grained conversation topics = less noise and better targeted discussions. E.g. we can offer a forum for each language client, for an individual Logstash plugin, or for each city to plan user group meetings, etc.
* Facilitates discussions that are not generally happening on list now, such as best practices by use case or tips on moving from development to production
* Easier for folks who are purely end users, and less used to getting peer support on a mailing list, to get help when they need it

Obviously, Discourse does not function exactly the same way as a mailing list; however, email interaction with Discourse is supported and will continue to allow you to participate in discussions over email (though there are some small issues related to in-line replies [0]). We’re working with the Discourse team now as part of evaluating this transition, and we know they’re working to resolve this particular issue. We’re also still determining how Discourse will handle our needs for both user and list archive migration, and we’ll know the precise details of how that would work soon. (We’ll share when we have them.)

The final goal would be to move Google Groups to read-only archives and cut over to Discourse completely for community support discussions. We’re looking at making the cutover in ~30 days from today, but obviously that’s subject to the feedback we receive from all of you. We’re sharing this information to set expectations about the time frame for making the switch. It’s not set in stone.
Our highest priority is to ensure effective migration of our list archives and subscribers, which may mean a longer time horizon for deploying Discourse as well. In the meantime, though, we wanted to communicate early and often and get your feedback. Would this change make your life better? Worse? Meh? Please share your thoughts with us so we can evaluate your feedback. We don’t take this switch lightly, and we want to understand how it will impact your overall workflow and experience. We’ll make regular updates to the list responding to incoming feedback and be completely transparent about how our thought processes evolve based on it. Thanks in advance!

[0] - https://meta.discourse.org/t/migrating-from-google-groups/24695

Cheers,
LH

Leslie Hawthorn
Director of Developer Relations
http://elastic.co

Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn
Re: Max documents 10,500?
If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blake1...@gmail.com wrote: Greetings, I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows: 1. Both machines are current 64 bit Linux machines with at least 8GB of RAM and more than sufficient disk space. 2. Both have current 64 bit Java 7 and elasticsearch 1.5.1. ES is running local to each machine. 3. Both machines are running the exact same program to load up ES. Each has nearly default ES config files (just different names). 4. The program keeps a counter of the number of times documents are added to ES, and the return codes of each add is checked. Both are 50,000+. 5. When I do the same query on each machine with curl, the good machine shows a max_score of 8.2, the bad machine shows .499 - remember, same set of documents and same search query. I've spent a day on this, and I am running out of ideas. Any help would sure be appreciated. Blake McBride
Re: Max documents 10,500?
Could you compare disk size (/data dir) for your two elasticsearch instances? Also, could you GIST the result of a simple _search?pretty on both nodes? -- David Pilato - Developer | Evangelist elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:58, Blake McBride blake1...@gmail.com wrote: No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co http://elastic.co/ @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! 
On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: Could you add a counter in your JS app to make sure you sent all docs? I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blak...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blak...@gmail.com wrote: [...]
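This chunk of the thread never pinpoints where the documents go, but with fire-and-forget `client.create` calls one plausible failure mode is that errors surface only in callbacks and nothing waits for them before checking `_count`. A minimal sketch of the idea, using a hypothetical `makeTracker` helper and a stub client (neither is part of the elasticsearch library): tally every callback and only compare against `_count` once all of them have fired.

```javascript
// Sketch: wrap the callback-style addDoc pattern from the thread so
// successes and failures are tallied, and we know when *all* callbacks
// have returned. `makeTracker` is a hypothetical helper, not an ES API.
function makeTracker(total, onDone) {
  const stats = { ok: 0, failed: 0 };
  return {
    stats,
    callback(err, response, status) {
      if (err || status !== 201) stats.failed++;
      else stats.ok++;
      if (stats.ok + stats.failed === total) onDone(stats);
    },
  };
}

// Stub client standing in for the real one: fails every 5th create,
// the way a flaky injector might.
const stubClient = {
  create(params, cb) {
    if (Number(params.id) % 5 === 0) cb(new Error('simulated failure'), undefined, 500);
    else cb(undefined, { created: true }, 201);
  },
};

const TOTAL = 20;
const tracker = makeTracker(TOTAL, (stats) => {
  // Only at this point is it meaningful to compare against _count.
  console.log(`ok=${stats.ok} failed=${stats.failed}`);
});
for (let i = 1; i <= TOTAL; i++) {
  stubClient.create({ index: 'components', type: 'component', id: String(i), body: {} },
    tracker.callback);
}
// → ok=16 failed=4
```

With the real client the same tracker would reveal whether the missing ~42,500 documents produced callback errors that the `if (err !== undefined ...)` check never logged, or whether they were acknowledged and lost elsewhere.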
Re: too many open files problems and suggestions on cluster configuration
Add more nodes or reduce the number of shards per node. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 17:05, Ann Yablunovskaya lad.sh...@gmail.com wrote: I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, about 800 million documents. If I try to insert more documents and create more indices, I will hit the error "too many open files". My node's configuration: CentOS 7 Intel(R) Xeon(R) CPU x16 RAM 62 Gb # ulimit -n 10 In the future I will have a lot of indices (about 2000) and a lot of documents (~5 billion or maybe more). How can I avoid the error "too many open files"?
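A back-of-envelope estimate shows why "more nodes or fewer shards per node" is the fix. The constants below are assumptions for illustration only: the 1.x default of 5 shards plus 1 replica per index, and a rough ~30 open files per shard (in practice this varies widely with segment counts and merge settings).

```javascript
// Rough per-node file-descriptor estimate for the cluster described
// above. All constants are illustrative assumptions, not measurements.
const indices = 544;
const shardsPerIndex = 5;   // 1.x default
const copies = 2;           // primary + 1 replica
const nodes = 2;
const filesPerShard = 30;   // rough guess; varies with segments

const shardsPerNode = (indices * shardsPerIndex * copies) / nodes;
const fdPerNode = shardsPerNode * filesPerShard;
console.log(shardsPerNode, fdPerNode); // 2720 shards → ~81600 descriptors
```

Even with these conservative guesses, ~2,720 shards per node implies tens of thousands of open files, well above common ulimit settings, so beyond raising the OS open-file limit, the shard count per node has to come down (fewer shards per index, fewer indices, or more nodes) before growing to ~2,000 indices.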
Re: Max documents 10,500?
I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: [...]
Re: Max documents 10,500?
No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co http://elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: [...]
Re: Max documents 10,500?
Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blake1...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: [...]
Re: How to take a snapshots of a specific index with the php library?
Figured it out. The key was to use $params['body'] instead of $params['custom']. Which is odd since that's what I tried in the first place. I guess I must have had some other bug in the way at the time that got fixed as I was testing things... On Thursday, April 30, 2015 at 5:10:58 PM UTC-7, David Reagan wrote: When I try to list a specific index to take a snapshot of, the library seems to ignore it, and instead takes a snapshot of my entire cluster. It's the end of the day, and I'm likely missing something obvious, so any help would be appreciated. Ultimately, I'm trying to do the equivalent of: curl -XPUT localhost:9200/_snapshot/my_backup/snapshot_1 -d '{ "indices": "logstash-2014.09.25", "ignore_unavailable": true, "include_global_state": false }' Here is my function: function take( $name, $indices = array(), $settings = array( 'wait_for_completion' => true, 'ignore_unavailable' => true, 'include_global_state' => false ) ) { $params['repository'] = $this->clusterConfig['repository']; $params['snapshot'] = $name; $params['custom'] = array( 'ignore_unavailable' => $settings['ignore_unavailable'], 'include_global_state' => $settings['include_global_state'] ); $indicesString = ''; if (!empty($indices)) { foreach ($indices as $value) { $indicesString .= $value . ','; } //remove ending comma $indicesString = rtrim($indicesString, ','); $params['custom']['indices'] = $indicesString; } $result = $this->es->snapshot()->create($params); if (!$result['acknowledged']) { $this->logger->addError("Failed to take snapshot $name in cluster $this->clusterName."); } } Called like: take("testingstuff-logstash-2014.09.25", array("logstash-2014.09.25")); $this->es is an instance of Elasticsearch\Client.
Re: SHIELD terms lookup filter : AuthorizationException BUG
I'm having the same problem with Elasticsearch 1.4.5 with Shield 1.1. On Thursday, April 23, 2015 at 2:03:23 PM UTC-5, Jay Modi wrote: Hi Bert, I don't know of a workaround to accomplish this in a single query right now. We have been discussing how to fix this issue in depth over the past few days and have ideas on how to move forward, but no timeline on it being resolved. Regarding support contracts and fixes, I'm going to defer that question to the person your company is in contact with. They'll be able to answer that much better than I can. On Wednesday, April 22, 2015 at 9:15:21 AM UTC-4, Bert Vermeiren wrote: Hi Jay, Thanks for acknowledging! Is there any way to work around this issue? We definitely need a kind of join filter for limiting the returned data based on some permissions/tokens. We are also starting discussions for a support and re-distribution license with both your and our marketing organisation. Is there any way to get a fix within a support contract? Thanks, Regards, Bert. On Wednesday, 22 April 2015 at 14:34:07 UTC+2, Jay Modi wrote: Hi Bert, Thank you for the detailed report and reproduction of this issue. This is a known limitation with Shield and certain operations in elasticsearch. We're working to resolve this in a future release. We will be documenting this limitation and all of the operations affected shortly; this was something that we had forgotten to document. -Jay On Monday, April 20, 2015 at 10:46:40 AM UTC-4, Bert Vermeiren wrote: Hi, Using: * ElasticSearch 1.5.1 * SHIELD 1.2 Whenever I use a terms lookup filter in a search query, I get an AuthorizationException for the [__es_system_user] user although the actual user has even 'admin' role privileges. This seems a bug to me, where the terms filter does not have the correct security context. This is very easy to reproduce, see gist: https://gist.github.com/bertvermeiren/c29e0d9ee54bb5b0b73a Scenario: # Add user 'admin' with default 'admin' role. 
./bin/shield/esusers useradd admin -p admin1 -r admin # create index. curl -XPUT 'admin:admin1@localhost:9200/customer' # create a document on the index curl -XPUT 'admin:admin1@localhost:9200/customer/external/1' -d ' { "name": "John Doe", "token": "token1" }' # create additional index for the terms lookup filter functionality curl -XPUT 'admin:admin1@localhost:9200/tokens' # create document in 'tokens' index curl -XPUT 'admin:admin1@localhost:9200/tokens/tokens/1' -d ' { "group": 1, "tokens": ["token1", "token2"] }' # search with a terms lookup filter on the customer index, referring to the 'tokens' index. curl -XGET 'admin:admin1@localhost:9200/customer/external/_search' -d ' { "query": { "filtered": { "query": { "match_all": {} }, "filter": { "terms": { "token": { "index": "tokens", "type": "tokens", "id": "1", "path": "tokens" } } } } } }' => org.elasticsearch.shield.authz.AuthorizationException: action [indices:data/read/get] is unauthorized for user [__es_system_user] -- CONFIDENTIAL COMMUNICATION: This email may contain confidential or legally privileged material, and is for the sole use of the intended recipient. Use or distribution by an unintended recipient is prohibited, and may be a violation of law. If you believe that you received this email in error, please do not read, forward, print or copy this email or any attachments. Please delete the email and all attachments, and inform the sender that you have deleted the email and all attachments. Thank you.
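The thread offers no single-query workaround, but one client-side fallback (a suggestion of mine, not from the thread) is to fetch the tokens document yourself under the authenticated user's credentials and then issue an ordinary terms filter, so no internal [__es_system_user] lookup is needed. A sketch with a hypothetical `buildTermsQuery` helper; the DSL mirrors the 1.x filtered-query syntax used in the reproduction above.

```javascript
// Build a plain terms filter from tokens already fetched client-side
// (e.g. via GET /tokens/tokens/1 as the 'admin' user). This replaces
// the terms *lookup* form that triggers the unauthorized internal get.
// `buildTermsQuery` is a hypothetical helper, not part of any ES API.
function buildTermsQuery(field, tokens) {
  return {
    query: {
      filtered: {
        query: { match_all: {} },
        filter: { terms: { [field]: tokens } },
      },
    },
  };
}

// Suppose the tokens document returned { "tokens": ["token1", "token2"] }:
const body = buildTermsQuery('token', ['token1', 'token2']);
console.log(JSON.stringify(body));
```

The trade-off is an extra round trip and a query body that grows with the token list, but both requests run under the real user's Shield privileges.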
Re: Max documents 10,500?
Could you add a counter in your JS app to make sure you sent all docs? I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blake1...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: [...]
Re: Evaluating Moving to Discourse - Feedback Wanted
On Fri, May 1, 2015 at 7:26 PM, David Reagan jer...@gmail.com wrote: Moving away from mailing lists for anything except announcements would be awesome. Forums are a much better way to have solid discussions with multiple people involved. Email is fine, but when you add in more than a couple of people, it gets confusing fast. Forums are also far more user friendly for people who haven't learned the various ways developers communicate. Thanks for your feedback! That said, if the forum idea is scrapped, please be sure to stick with Google Groups or something similar. Don't switch to something like the Debian user lists. Every time a search result pops up from a list like that when I am looking for help, I can never figure out if I've seen all the emails in the thread or not. The interface is just horrid. Google Groups at least has conversation view. Mailman is a little painful, but with the latest release it's become a bit friendlier. On a similar subject, is there any chance we could get a real-time chat app that is more user friendly than IRC? Does something exist that could sit on top of IRC and alleviate IRC's user unfriendliness? Why do you find IRC unfriendly? Have you tried using a web based client like irccloud.com? Cheers, LH On Thursday, April 2, 2015 at 8:36:33 AM UTC-7, leslie.hawthorn wrote: Hello everyone, As we’ve begun to scale up development on three different open source projects, we’ve found Google Groups to be a difficult solution for dealing with all of our needs for community support. We’ve got multiple mailing lists going, which can be confusing for new folks trying to figure out where to go to ask a question. We’ve also found our lists are becoming noisy in the “good problem to have” kind of way. As we’ve seen more user adoption, and across such a wide variety of use cases, we’re getting widely different types of questions asked. For example, I can imagine that folks not using our Python client would rather not be distracted with emails about it. 
There are also a few other strikes against Groups as a tool, such as the fact that it is no longer a supported product by Google, it provides no API hooks and it is not available for users in China. We’ve evaluated several options and we’re currently considering shuttering the elasticsearch-user and logstash-users Google Groups in favor of a Discourse forum. You can read more about Discourse at http://www.discourse.org We feel Discourse will allow us to provide a better experience for all of our users for a few reasons: * More fine-grained conversation topics = less noise and better targeted discussions. e.g. we can offer a forum for each language client, individual logstash plugin or for each city to plan user group meetings, etc. * Facilitates discussions that are not generally happening on list now, such as best practices by use case or tips on moving from development to production * Easier for folks who are purely end users - and less used to getting peer support on a mailing list - to get help when they need it Obviously, Discourse does not function the exact same way as a mailing list - however, email interaction with Discourse is supported and will continue to allow you to participate in discussions over email (though there are some small issues related to in-line replies. [0]) We’re working with the Discourse team now as part of evaluating this transition, and we know they’re working to resolve this particular issue. We’re also still determining how Discourse will handle our needs for both user and list archive migration, and we’ll know the precise details of how that would work soon. (We’ll share when we have them.) The final goal would be to move Google Groups to read-only archives, and cut over to Discourse completely for community support discussions. We’re looking at making the cut over in ~30 days from today, but obviously that’s subject to the feedback we receive from all of you. 
We’re sharing this information to set expectations about time frame for making the switch. It’s not set in stone. Our highest priority is to ensure effective migration of our list archives and subscribers, which may mean a longer time horizon for deploying Discourse, as well. In the meantime, though, we wanted to communicate early and often and get your feedback. Would this change make your life better? Worse? Meh? Please share your thoughts with us so we can evaluate your feedback. We don’t take this switch lightly, and we want to understand how it will impact your overall workflow and experience. We’ll make regular updates to the list responding to incoming feedback and be completely transparent about how our thought processes evolve based on it. Thanks in advance! [0] - https://meta.discourse.org/t/migrating-from-google-groups/24695 Cheers, LH Leslie Hawthorn Director of Developer Relations http://elastic.co Other Places to Find Me: Freenode: lh
Re: Shield and Proxy Users
Thanks Michael. Are you interested in Shield performing the authorization with AD/LDAP for a given proxy user (assumed as being authenticated by your application), or would/can your application also pass the authorization information so that Shield restricts access accordingly? On Wednesday, April 29, 2015 at 10:34:54 PM UTC-4, Michael Young wrote: If you would like to get more specific use case details, I'm more than willing to exchange emails or engage in phone calls. Michael On Wednesday, April 29, 2015 at 10:34:25 PM UTC-4, Michael Young wrote: I thought that might be the case. The problem with Shield for my use case is that authentication and authorization are closely tied together. Generally speaking, we want to limit access to indexes via LDAP/AD groups which are assigned to Shield roles. We want to be able to use a system/daemon account to query Elasticsearch, but pass in a proxy or impersonation user which can be used to look up their effective groups and which indexes they can get results from. Without the proxy user ability, we are forced to log the user in via their username and password. The problem is that users will not directly access Elasticsearch and we don't have access to their password. Our users will be authenticated via a separate application/user interface which will be using single sign-on tokens. The application doesn't have access to the user's password to pass to Elasticsearch. So there isn't an easy way to say I have user1234 running a query and I need you to filter index results appropriately for this authenticated user. We want to manage index permissions using LDAP/AD groups and roles using Shield. We don't want to have to do that in the application. The current workaround seems to be some sort of API overlay to Elasticsearch which will first check to see if the user exists using an admin account. 
If the user account doesn't exist (first time logging in), then create the user account using a hash of the user's group permissions from LDAP/AD. It's not ideal, but it'll probably get the job done until Shield is extended/enhanced. On Wednesday, April 29, 2015 at 5:03:51 PM UTC-4, Jay Modi wrote: Hi Michael, We don't currently have a way to do this with Shield. Can you tell us a little more about your scenario? Your users are logging into your application and then accessing data in Elasticsearch, which is protected by Shield? This type of information is helpful for us as we plan features for future releases of Shield. -Jay On Wednesday, April 29, 2015 at 3:06:57 PM UTC-4, Michael Young wrote: I have Elasticsearch 1.5.2 and Shield 1.2.0 configured and working against Active Directory. This seems to work pretty well. However, I was wondering if there was a way to pass in a proxy user from an application to get the appropriate index filtering via access controls without having to pass in the username AND password from the application. Is there a way to do this with Shield? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c7cb2cd6-3ce0-4bd4-9e21-b67fc05b2b46%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
abnormal file input behavior?
I have a file of logging records I am using to debug some filter parses. I am using the file input and have set start_position to beginning. So I start up Logstash, see what I get, then kill it, make fixes, and go again. I have seen that sometimes it reads the file and sometimes not. I ran some experiments and found that if I delete the file, rewrite it, and then start up Logstash, it reads the file. If I have previously read the file, then when Logstash starts it doesn't read the file, despite being told to start at the beginning. Am I missing something here? Is this intended behavior or a possible bug?
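This is the documented behavior of the file input rather than a bug: Logstash records how far it has read each file in a "sincedb" file, and start_position => "beginning" is only honored for files it has never seen before. A minimal sketch of a debugging config that discards the recorded offset on every restart (the path here is hypothetical):

```
input {
  file {
    path => "/tmp/records.log"        # hypothetical test file
    start_position => "beginning"     # only applies to files with no sincedb entry
    sincedb_path => "/dev/null"       # forget read offsets between runs
  }
}
```

Deleting and rewriting the file appears to "fix" it for the same reason: the new file gets a fresh sincedb entry, so it is read from the beginning again.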
Re: Evaluating Moving to Discourse - Feedback Wanted
Why do you find IRC unfriendly? Have you tried using a web based client like irccloud.com? I use webchat.freenode.net. There's a big difference between "Here's our live chat app" and "Learn how to connect to IRC in order to use live chat." I actively avoided live chat for years simply because I had no interest in learning anything about IRC. For a new user, IRC is not friendly, especially if you are not a developer or are new to it. There's also the fact that you are told off if you post any code when asking for help. I could see it if the code sample was really long, but when people get told off when they post three or four lines of code, that's not friendly at all. This has happened to me, and I've seen it happen to others. (Not sure if it was in an ELK channel or not...) Plus, figuring out how to start a private conversation, or set your status to away, or register your username so others can't impersonate you, and so on is not obvious. I still haven't taken the time to figure it out. Ideally, we'd have something that functions similarly to HipChat. Nice UI, and code snippets are automatically shortened unless you choose to show the full snippet. --David Reagan
Re: Failed to get setting group for [threadpool.] setting prefix and setting [threadpool.bulk] because of a missing '.'
It was a few months ago so I don't exactly recall the command line we used, but it must have been something like: curl -X PUT -d '{ "persistent" : { "threadpool.bulk" : { "type" : "fixed", "queue_size" : 250 } } }' localhost:9200/_cluster/settings m. On Thursday, April 30, 2015 at 11:35:55 PM UTC+2, Mark Walkom wrote: How are you setting this? On 30 April 2015 at 22:02, marc@happn.com wrote: Hello there I noticed the following warning message in my node's logs when starting: [2015-04-30 12:40:13,265][INFO ][node ] [***server***] started [2015-04-30 12:40:32,154][WARN ][node.settings] [***server***] failed to refresh settings for [org.elasticsearch.threadpool.ThreadPool$ApplySettings@1ca801ce] org.elasticsearch.common.settings.SettingsException: Failed to get setting group for [threadpool.] setting prefix and setting [threadpool.bulk] because of a missing '.' at org.elasticsearch.common.settings.ImmutableSettings.getGroups(ImmutableSettings.java:527) at org.elasticsearch.common.settings.ImmutableSettings.getGroups(ImmutableSettings.java:505) at org.elasticsearch.threadpool.ThreadPool.updateSettings(ThreadPool.java:396) at org.elasticsearch.threadpool.ThreadPool$ApplySettings.onRefreshSettings(ThreadPool.java:682) at org.elasticsearch.node.settings.NodeSettingsService.clusterChanged(NodeSettingsService.java:84) at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:428) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Here are the current cluster settings: curl http://localhost:9200/_cluster/settings?pretty { "persistent" : { "indices" : { "store" : { "throttle" : { "max_bytes_per_sec" : "40mb" } } }, "threadpool" : { "bulk" : "250" }, "cluster" : { "routing" : { "allocation" : { "enable" : "all", "balance" : { "shard" : "0.9f" }, "disable_allocation" : "false" } } } }, "transient" : { "cluster" : { "routing" : { "allocation" : { "enable" : "all" } } } } } Is there anything wrong with my settings? We tried to change the `threadpool.bulk` setting months ago, but it doesn't seem to have been taken into account (the bulk queue is still capped at the default value). We currently use version 1.2.3, and are in the process of upgrading to 1.5.x by performing rolling upgrades. Cheers, m.
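The warning matches the cluster state shown above: `threadpool.bulk` was stored as the flat value "250" rather than as a settings group, so the thread pool's settings listener fails when it tries to expand the `threadpool.` prefix into groups (hence "missing '.'"). A hedged sketch of an update that stores the value under a fully qualified key instead (key names per the 1.x dynamic thread pool settings; verify against your version before applying, and note that clearing the existing bad flat value may need extra steps on 1.x):

```
curl -X PUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "threadpool.bulk.queue_size" : 250
  }
}'
```

Checking the response of the original PUT would also have caught this: an unquoted JSON body is rejected or mangled before the setting ever takes effect.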
Re: ES upgrade
Are you running the same ES and Java versions on all your nodes and clients? On 1 May 2015 at 21:21, phani.nadimi...@goktree.com wrote: Hi All, I upgraded Elasticsearch from 1.4.2 to 1.5.2 and I am getting the following warning on the console after the upgrade. Please explain why this error is occurring: [2015-05-01 06:15:13,361][WARN ][transport.netty ] [ES_Node1] exception caught on transport layer [[id: 0xc483723b, /ip of node:3845 => /ip of node:9300]], closing connection java.io.StreamCorruptedException: invalid internal transport message format, got (47,45,54,20) at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) Thanks phani
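The four bytes in the exception message are hex values that decode to the ASCII string "GET ", which suggests an HTTP request (for example from a REST client, monitoring probe, or load balancer health check) is hitting the binary transport port 9300 instead of the HTTP port 9200. A quick sketch of the decoding (plain Node.js, no Elasticsearch required):

```javascript
// Bytes reported by StreamCorruptedException: "got (47,45,54,20)" (hex values).
const bytes = [0x47, 0x45, 0x54, 0x20];

// Decode them as ASCII characters.
const decoded = String.fromCharCode(...bytes);
console.log(decoded); // "GET " -- the start of an HTTP request line
```

So the warning is most likely not caused by the upgrade itself; a node restart simply surfaces whichever misdirected client reconnects to port 9300 with HTTP.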
Re: Max documents 10,500?
The question of relative size has, I believe, led me to the problem. I create aliases. After the load, I have the alias point to the new index. One of the indexes had a bad document that made the change-alias mechanism fail. This means I kept loading the documents into an index, but the alias was always pointing to an old index. So, on the working system the database size was relatively small, since I got rid of the old indexes. The bad machine was taking up a lot of space because the old indexes were never deleted, because the alias didn't point to them. Thanks a lot for the help!! Blake On Friday, May 1, 2015 at 3:39:05 PM UTC-5, David Pilato wrote: Could you compare disk size (/data dir) for your two elasticsearch instances? Also, could you GIST the result of a simple _search?pretty on both nodes? -- *David Pilato* - Developer | Evangelist *elastic.co http://elastic.co* @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:58, Blake McBride blak...@gmail.com wrote: No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? 
-- *David Pilato* - Developer | Evangelist *elastic.co http://elastic.co/* @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: Could you add a counter in your JS app to make sure you sent all docs? 
I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blak...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blak...@gmail.com wrote: Greetings, I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows: 1. Both machines are current 64 bit Linux machines with at least 8GB of RAM and more than sufficient disk space. 2. Both have current 64 bit Java 7 and elasticsearch 1.5.1. ES is running local to each machine. 3. Both machines are running the exact same program to load up ES. Each has nearly default ES config files (just different names). 4. The program keeps a counter of the number of times documents are added to ES, and the return codes of each add is checked. Both are 50,000+. 5. When
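The failure mode described at the top of the thread (an alias left pointing at an old index after a bad document broke the swap) is easier to detect when the swap is done as a single atomic `_aliases` call and its response is checked. A sketch, with hypothetical index names:

```
curl -XPOST 'http://localhost:9200/_aliases' -d '{
  "actions" : [
    { "remove" : { "index" : "components-OLD", "alias" : "components" } },
    { "add"    : { "index" : "components-NEW", "alias" : "components" } }
  ]
}'
```

Because the actions execute atomically, a failure leaves the alias unchanged, so surfacing the error (for example, checking for "acknowledged":true in the response) catches exactly the case where the swap never happened and counts are taken against a stale index.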
ES upgrade
Hi All, I upgraded Elasticsearch from 1.4.2 to 1.5.2 and I am getting the following warning on the console after the upgrade. Please explain why this error is occurring: [2015-05-01 06:15:13,361][WARN ][transport.netty ] [ES_Node1] exception caught on transport layer [[id: 0xc483723b, /ip of node:3845 => /ip of node:9300]], closing connection java.io.StreamCorruptedException: invalid internal transport message format, got (47,45,54,20) at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) Thanks phani
Re: How to Boost
I got it: To add an array of vars to an anonymous type, add them like this: var field1 = new { field = "Name^20" }; var field2 = new { field = "_all" }; var field3 = new { field = "Id^20" }; var myfields = new [] { field1, field2 }; This is slick; now it puts the brackets in and there is no escaping :-) On Thursday, April 30, 2015 at 9:01:27 PM UTC-4, GWired wrote: I was able to get this to work in ES using head and got back what I needed. However, translating to ElasticSearch.net has been an issue. I have an anonymous type doing a match all query and this works fine and dandy. I am trying to do the multi_match as above but it isn't returning any results. I'm not sure why. var myfields = ["Name&#94;20","Id&#94;20","_all"]; - this gets no results var myfields = ["Name^20","Id^20","_all"]; - this crashes with all kinds of errors How do you do these bracketed values? var search = new { query = new { multi_match = new { fields = myfields, query = keyword, type = "best_fields", use_dis_max = true } }, from = 0, size = limitAllTypes, aggs = new { top_types = new { terms = new { field = "_type" }, aggs = new { top_type_hits = new { top_hits = new { size = limitPerType } } } } } }; On Wednesday, April 29, 2015 at 12:06:47 PM UTC-4, Joel Potischman wrote: You could use a boosting query http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-boosting-query.html, or you could use a multi-match query http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html and add a boost directly to the name and id fields. Something like: { "multi_match": { "fields": [ "name^2", "id^2", "_all" ], "query": "keyword", "type": "best_fields" } } That tells Elasticsearch to search all fields, but to additionally search name and id and double their score if they match (that's the ^2 part). The best_fields type will use the best score from the name, id, and _all searches. You might want most_fields to rank records that contain your query terms in name *and* id *and* other fields even higher. 
Note that my snippet is using the Query DSL. The syntax for the client you're using will probably be slightly different, but that's the general idea. Also note that I've been using Elasticsearch for less than a year, so there may be better approaches, but that's where I'd start. -joel On Tuesday, April 28, 2015 at 4:42:56 PM UTC-4, GWired wrote: I am attempting to boost values for queries. I'm searching across all fields and tables and returning 25 results for each type. This is working fine; however, I need to boost if the field named Name or the field named ID has the value in it. I'm using ElasticSearchClient and sending this search: search = new { query = new { query_string = new { query = keyword, default_field = "_all" } }, from = 0, size = limitAllTypes, aggs = new { top_types = new { terms = new { field = "_type" }, aggs = new { top_type_hits = new { top_hits = new { size = limitPerType } } } } } }; ElasticsearchResponse<DynamicDictionary> searchResponse = client.Search(jdbc, search, null); How do I tell this to boost the Name and Id fields over all other fields? If I'm searching for My Searched Company and that is in the Name field, I want it at the top of the list, vs. in notes, addresses or whatever other columns
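For reference, the query this thread converges on can be expressed directly in the Query DSL. Field names and boost factors are taken from the discussion; the search term is a placeholder:

```
{
  "query": {
    "multi_match": {
      "fields": ["Name^20", "Id^20", "_all"],
      "query": "My Searched Company",
      "type": "best_fields"
    }
  }
}
```

The `^20` suffix multiplies the score contribution of a matching field by 20, so a hit in Name or Id outranks the same term found only via _all; with best_fields, the single highest-scoring field match determines the document's score.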