Re: Elasticsearch cluster fails to stabilize
On some further debugging I enabled debug logging on one of the nodes. Now when I try to get the indices stats I get the following in the log on the debugging node:

[2013-12-18 08:02:01,078][DEBUG][index.shard.service ] [NODE6] [reference][10] Can not build 'completion stats' from engine shard state [RECOVERING]
org.elasticsearch.index.shard.IllegalIndexShardStateException: [reference][10] CurrentState[RECOVERING] operations only allowed when started/relocated
    at org.elasticsearch.index.shard.service.InternalIndexShard.readAllowed(InternalIndexShard.java:765)
    at org.elasticsearch.index.shard.service.InternalIndexShard.acquireSearcher(InternalIndexShard.java:600)
    at org.elasticsearch.index.shard.service.InternalIndexShard.acquireSearcher(InternalIndexShard.java:595)
    at org.elasticsearch.index.shard.service.InternalIndexShard.completionStats(InternalIndexShard.java:536)
    at org.elasticsearch.action.admin.indices.stats.CommonStats.init(CommonStats.java:151)
    at org.elasticsearch.indices.InternalIndicesService.stats(InternalIndicesService.java:212)
    at org.elasticsearch.node.service.NodeService.stats(NodeService.java:165)
    at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:100)
    at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:43)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:273)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:264)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Looking in head I see that this node has a number of green shards, but shard 10 is yellow (recovering). This smells like a bug.

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/62fdf587-7779-4814-97a6-f1381993eba5%40googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.
Re: replace DailyRollingFileAppender for index slow log
I found some discussion about the same problem, but it doesn't say how to resolve it!

2013/12/18 joergpra...@gmail.com:

Can you give more info about log type indexSlowLog? Did you write an implementation of org.apache.log4j.IndexSlowLogAppender, and why?

Look here https://groups.google.com/forum/#!topic/elasticsearch/pPRXkI9P2hA and here http://www.elasticsearch.org/blog/logging-elasticsearch-events-with-logstash-and-elasticsearch/ for how to set a standard log4j appender class, or your favorite custom class name, in type.

Jörg

--
Regards,
Olivier Morel
tel : 06.62.25.03.77
Re: spring-data elasticsearch support?
Currently this feature is not supported, but it looks like a very handy and useful feature. We will think about it soon. Please check my latest reply at https://jira.springsource.org/browse/DATAES-41

HTH
Mohsin

On Wednesday, 18 December 2013 15:32:02 UTC, Andra Bennett wrote:

Hi David,

I am able to push a mapping in Java, but I see how I can merge that with an externally defined mapping as you suggest below. But, is the spring-elasticsearch library compatible with the elasticsearch spring-data library?

Thanks!
Andra

On Monday, December 16, 2013 1:56:18 PM UTC-5, David Pilato wrote:

I guess that you basically have to send a mapping. I think that once you have a Node you can get a client from it and push a mapping in Java, right? I have not played with the spring data project for ES yet, but I will soon. As author of https://github.com/dadoonet/spring-elasticsearch, I will probably do it like this with it. Not sure it helps.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 16 Dec 2013 at 19:23, Andra Bennett and...@gmail.com wrote:

Hello,

I am using spring-data elasticsearch (https://github.com/spring-projects/spring-data-elasticsearch) to configure my node and index my data. I'd like to know how to enable the timestamp feature for the indexing, and how to map it to a custom field. Does anyone in the elasticsearch group have experience with the spring-data elasticsearch library? Or is this question better suited for the Spring forum?

Thanks,
Andra
1.0.0.Beta2 OOM logs
I ran some heap scaling tests to find out the required heap for my workload.

My config was an ES 1.0.0.Beta2 cluster, 3 RHEL 6.3 nodes, Java 1.8.0-ea JVM 25.0-b56, 4GB heap, G1 GC (and some tuning for segment merge and bulk).

Workload: mixed, scan/scroll query over 1.6m docs plus term queries over 20m docs (unknown queries per second, but higher than 5000) with bulk indexing (5000 docs per second).

The 4GB exercise result was OOM on all nodes after an hour run, with all kinds of error messages. The cluster restarted ok afterwards so it did not matter at all. Increasing heap to 6GB and redoing the exercise succeeded after 52 minutes.

I just want to share the OOM logs with anyone who might be interested to have a look, because they are so pretty ;)

https://gist.github.com/jprante/8024139

FYI I'm considering a memory watchdog on shard level that might detect a low free heap condition in time and can return warnings to the bulk client, so the bulk client might throttle, suspend, or exit the indexing cleanly, before OOMs start to break out in the cluster with all the risk of crashing shards or node dropouts. Surely not an exact science, but with some heuristics it should work (e.g. below a threshold of 10mb free heap, no execution of the indexing engine should be allowed).

Would love to have more time for testing the exciting new 1.0.0.Beta2 features, but right now I'm just happy to run my data reconciliations successfully.

Jörg
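
The watchdog idea above can be illustrated with a small sketch. This is not Elasticsearch code and not Jörg's implementation, only a guess at the shape of the heuristic: the 10 MB hard limit comes from the message, while the 256 MB warning threshold, the backoff values, and all names (check_heap, next_delay, Verdict) are hypothetical.

```python
from enum import Enum

MB = 1024 * 1024


class Verdict(Enum):
    ACCEPT = "accept"  # enough free heap, index normally
    WARN = "warn"      # request accepted, client is asked to throttle
    REJECT = "reject"  # indexing engine must not run


def check_heap(free_heap_bytes, reject_below=10 * MB, warn_below=256 * MB):
    """Shard-side check. reject_below is the 10mb threshold from the post;
    warn_below is an invented value for illustration."""
    if free_heap_bytes < reject_below:
        return Verdict.REJECT
    if free_heap_bytes < warn_below:
        return Verdict.WARN
    return Verdict.ACCEPT


def next_delay(verdict, current_delay, base=0.1, max_delay=30.0):
    """Client-side reaction: seconds to wait before the next bulk request.
    Returns None when the client should suspend indexing altogether."""
    if verdict is Verdict.ACCEPT:
        return 0.0
    if verdict is Verdict.WARN:
        # Exponential backoff, capped at max_delay.
        return min(max(current_delay * 2, base), max_delay)
    return None


if __name__ == "__main__":
    delay = 0.0
    for free in (1024 * MB, 200 * MB, 120 * MB, 5 * MB):
        verdict = check_heap(free)
        delay = next_delay(verdict, delay or 0.0)
        print(free // MB, "MB free ->", verdict.value, "delay:", delay)
```

Whether a warning should slow the client down or stop it outright is a policy choice; the sketch only shows that the decision can be driven by a single free-heap reading.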
autocomplete mapping error
We are running into mapping errors. We have tried on both Ubuntu and OS X. We are using Java 1.7. Attached is our node configuration, and below is the error we are getting.

11:12:47,856 WARN [org.elasticsearch.indices.cluster] (elasticsearch[Nemesis][clusterService#updateTask][T#1]) [Nemesis] [need] failed to add mapping [need], source [{need:{properties:{defaultTextValue:{type:string},description:{type:string},endDate:{type:string},id:{type:string},imageURL:{type:string},location:{type:geo_point},mainCategory:{type:string,store:true,index_analyzer:keyword},postDate:{type:string},subCategory:{type:string,store:true,index_analyzer:keyword},tags:{type:string,store:true,index_analyzer:keyword},title:{type:multi_field,fields:{title:{type:string},title.autocomplete:{type:string,store:true,index_analyzer:autocomplete,search_analyzer:autocomplete_search,include_in_all:false},title.untouched:{type:string,index:not_analyzed,omit_norms:true,index_options:docs,include_in_all:false}}},user:{type:string,store:true,index_analyzer:keyword},version:{type:long]:
org.elasticsearch.index.mapper.MapperParsingException: Analyzer [autocomplete] not found for field [title.autocomplete]
    at org.elasticsearch.index.mapper.core.TypeParsers.parseField(TypeParsers.java:107) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.core.StringFieldMapper$TypeParser.parse(StringFieldMapper.java:150) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.multifield.MultiFieldMapper$TypeParser.parse(MultiFieldMapper.java:130) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:263) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parse(ObjectMapper.java:219) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:176) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:314) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:193) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.processMapping(IndicesClusterStateService.java:417) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyMappings(IndicesClusterStateService.java:381) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:179) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:414) [elasticsearch-0.90.7.jar:]
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135) [elasticsearch-0.90.7.jar:]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_45]
    at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]

Attachment: elasticsearch.yml (binary data)
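
The exception says the mapping names an analyzer that the index does not define. The attached elasticsearch.yml is not visible here, so the real cause is unknown; the following is only a sketch, assuming the two analyzers were meant to be custom analyzers declared in the index's analysis settings. The edgeNGram filter, its gram sizes, the filter name autocomplete_ngram, and the helper functions are all illustrative assumptions. The helper simply lists analyzer names that a mapping references but the settings do not declare.

```python
import json

# Assumed analysis settings. The analyzer names match the mapping in the
# log above; the filter definition is a guess at a typical autocomplete setup.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_ngram": {
                    "type": "edgeNGram",
                    "min_gram": 1,
                    "max_gram": 20,
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_ngram"],
                },
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            },
        }
    }
}

# A few analyzers that ship with Elasticsearch and need no declaration.
BUILTIN_ANALYZERS = {"standard", "simple", "whitespace", "stop", "keyword"}
ANALYZER_KEYS = ("analyzer", "index_analyzer", "search_analyzer")


def referenced_analyzers(node):
    """Recursively collect every analyzer name a mapping refers to."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ANALYZER_KEYS and isinstance(value, str):
                found.add(value)
            else:
                found |= referenced_analyzers(value)
    return found


def missing_analyzers(mapping, settings):
    """Analyzers used by the mapping but neither built in nor declared."""
    analysis = settings.get("settings", {}).get("analysis", {})
    defined = set(analysis.get("analyzer", {}))
    return referenced_analyzers(mapping) - defined - BUILTIN_ANALYZERS


# Fragment of the mapping from the log, with the JSON quoting restored.
TITLE_MAPPING = {
    "title": {
        "type": "multi_field",
        "fields": {
            "title": {"type": "string"},
            "title.autocomplete": {
                "type": "string",
                "index_analyzer": "autocomplete",
                "search_analyzer": "autocomplete_search",
            },
        },
    },
    "user": {"type": "string", "index_analyzer": "keyword"},
}

if __name__ == "__main__":
    print("without settings:", sorted(missing_analyzers(TITLE_MAPPING, {})))
    print("with settings:", sorted(missing_analyzers(TITLE_MAPPING, INDEX_SETTINGS)))
    print(json.dumps(INDEX_SETTINGS, indent=2))
```

If the analyzers are declared in elasticsearch.yml instead, the same names would need to be present there on every node that holds the index.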
Re: TransportClient failures with 0.90.3 cluster, but NodeClient works without failures
Jörg,

*Besides the cluster node JVMs you also have to take care of the client JVM. Are you accessing the cluster also with Solaris x86 and Java 6u18?*

Oooh. Don't know why this didn't occur to me. Short answer: No.

Java on the MacBook (where the client / driver runs):

$ java -version
java version 1.6.0_65
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-10M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)

Java on all 3 virtual Solaris hosts of the remote 3-node ES cluster:

$ java -version
java version 1.6.0_18
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Client VM (build 16.0-b13, mixed mode, sharing)

*Can you give more information about "noticeably quicker"? What do you test and how much load? Searching or indexing?*

The ES index and type are mapped to enable TTL with a default 10s TTL value for all documents, and the default 60s TTL interval for the index. I've disabled the indexing for all fields; documents are queried only by their _id. Ad-hoc queries weren't necessary for this particular test case.

The remote far-away 3-node ES cluster is running on 3 Solaris x86-64 VMs using Zen unicast discovery. There is also a single-node ES cluster running on the MacBook.

The driver contains a writer thread pool, a reader thread pool, and BigQueue in the middle. All of the runs below were configured with 8 writer threads and 8 reader threads. For all tests, the driver was run on the MacBook.

The writer threads obtain a unique object, serialize it to JSON, add it to ES, and then add it to the queue. The reader threads read from the queue, deserialize into the object, and query by index+type+id to verify it's there.

The timing values are shown by the driver using the super-cool TimeValue object class. Nice touch!

*1. Local single-node ES cluster and the driver, all running on the MacBook. The TransportClient is a little bit faster than the NodeClient:*

1a. Using the TransportClient:

generated-connections=1629365 elapsed=5m conn/sec=5430
[db-update: total=1629358 time=32.5m time/update=1.1ms]
[db-query: total=1629357 time=16.8m time/query=620.1micros]
[queue: current=0 max=373]

1b. Using the NodeClient:

generated-connections=1551379 elapsed=5m conn/sec=5171
[db-update: total=1551371 time=32.8m time/update=1.2ms]
[db-query: total=1551371 time=16.4m time/query=637.7micros]
[queue: current=0 max=383]

*2. Driver running locally on the MacBook connected to the far-away 3-node ES cluster running on Solaris x86-64 VMs. In this case, the NodeClient was seen to be faster, particularly in the area of updates.*

2a. The driver uses a TransportClient but only 2 of the 3 nodes are added to its list of inet addresses:

generated-connections=13427 elapsed=5m conn/sec=44
[db-update: total=13419 time=39.9m time/update=178.4ms]
[db-query: total=13419 time=18.7m time/query=83.6ms]
[queue: current=0 max=7]

2b. The driver uses a client-only NodeClient with Zen unicast discovery and all 3 nodes configured for it:

generated-connections=27592 elapsed=5m conn/sec=91
[db-update: total=27584 time=39.8m time/update=86.7ms]
[db-query: total=27584 time=38.7m time/query=84.2ms]
[queue: current=0 max=26]

Regards,
Brian
Shard relocation progress
What's the best way to monitor the shard relocation that occurs when one adds new nodes? Is there a way to control the relocation and do it manually, a few shards at a time? What are the best practices for a cluster that is constantly receiving high-volume traffic?
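
One possible way to watch relocation, offered as a sketch and not as an established best practice, is to poll the cluster health endpoint, whose response includes relocating_shards, initializing_shards and unassigned_shards counters. The base URL, the polling approach, and the function names below are assumptions for illustration.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Fetch /_cluster/health from a node. The base URL is an assumption;
    point it at any node that serves HTTP."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def relocation_summary(health):
    """Reduce a cluster health response to the shard-movement counters."""
    relocating = health.get("relocating_shards", 0)
    initializing = health.get("initializing_shards", 0)
    unassigned = health.get("unassigned_shards", 0)
    return {
        "status": health.get("status"),
        "relocating": relocating,
        "initializing": initializing,
        "unassigned": unassigned,
        # True once no shard is moving, starting up, or waiting for a node.
        "settled": relocating == 0 and initializing == 0 and unassigned == 0,
    }


if __name__ == "__main__":
    # Sample response with invented numbers, so the example runs offline.
    sample = {
        "status": "green",
        "relocating_shards": 2,
        "initializing_shards": 0,
        "unassigned_shards": 0,
    }
    print(relocation_summary(sample))
```

For throttling, the cluster.routing.allocation.cluster_concurrent_rebalance setting limits how many shards rebalance at once, and the _cluster/reroute API accepts explicit move commands for relocating individual shards by hand. Suitable values for a cluster under heavy traffic are not something this sketch can suggest.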
Re: Very slow ElasticSearch Index
Intra cluster comms are all handled over HTTP. What is the link between your DCs like; 100M, 1G, 10G?

You could try using something like logstash to replicate the indexes, that way you can have two clusters and it should reduce any latency.

Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 19 December 2013 03:10, lekkie omotayo lekkie.ay...@gmail.com wrote:

Are there any protocols other than HTTP that I can send requests over? Something faster than HTTP? Or do you mean the physical NIC? We run on a LAN.

On Wednesday, 18 December 2013 11:49:07 UTC+1, Mark Walkom wrote:

The lag over that inter-DC link is probably causing your issues.

Regards,
Mark Walkom

On 18 December 2013 18:12, lekkie omotayo lekkie...@gmail.com wrote:

Yes, they all have the same capacity. Yes, they are in different data centers (off-site).

On Tuesday, 17 December 2013 22:33:07 UTC+1, Mark Walkom wrote:

ES will only go as fast as the slowest node. With that in mind, are your DR nodes the same capacity? I also notice they are in different subnets, does that imply they are in different datacenters?

Regards,
Mark Walkom

On 18 December 2013 00:44, lekkie omotayo lekkie...@gmail.com wrote:

Thanks for the insight.

"First tip would be to drop OpenJDK and move to Oracle, you'll get a lot better performance."

So I changed to Oracle JDK and latency dropped from 4000 milliseconds to around 2500 milliseconds.

"It might also be worth removing indices.ttl.interval and just using a script to delete old indices as TTL searches can use a fair bit of resources."

We also dropped indices.ttl.interval and it further dropped to 1500 milliseconds.

"You also mentioned you have 2 nodes, but there are a lot more IPs listed in the discovery hosts, is that intentional? Same for minimum_master_nodes being 3."

Yes, the other 2 nodes are DR nodes. So we basically have 4 nodes, but 2 are for disaster recovery. And discovery.zen.minimum_master_nodes was calculated based on n/2 + 1, where n was 4.

One other thing to note: every request is an upsert. What we have now is 1500 milliseconds per upsert. This is still very high. We are looking at sub-millisecond or tens of milliseconds for bulk upload. Can this be achieved or is it a pipe dream?

On Tuesday, 17 December 2013 09:12:52 UTC+1, Mark Walkom wrote:

First tip would be to drop OpenJDK and move to Oracle, you'll get a lot better performance.

Bulk depends a lot on your setup and document size etc, but upwards of 5K is generally towards the upper limit.

It might also be worth removing indices.ttl.interval and just using a script to delete old indices as TTL searches can use a fair bit of resources.

You also mentioned you have 2 nodes, but there are a lot more IPs listed in the discovery hosts, is that intentional? Same for minimum_master_nodes being 3.

Regards,
Mark Walkom

On 17 December 2013 18:55, lekkie omotayo lekkie...@gmail.com wrote:

Hi guys,

We index a document at *** milli on ElasticSearch, which I think is too slow, especially for the amount of resources we have set up. We would like to index at the rate of 500tps. Each document weighs between 20K and 30K.

How many index requests is it advisable to send at once (assuming we can afford to send multiple HTTP index requests to the server at once)? I understand bulk indexing is the preferred approach; for a 30K document, how many can be bulked at once? How many HTTP bulk requests (supposing I am using a multi-threaded HTTP client to make requests) is it advisable to make?

I will appreciate suggestions on how to index these documents as fast as possible.

We have two nodes set up, the config below is for one out of the two:

Shards: 5
Replica: 1

nodes : {
  T5l5mvIdQsW3je7WmSPOcg : {
    name : SEARCH-01,
    version : 0.90.7,
    attributes : {
      rack_id : prod,
      max_local_storage_nodes : 1
    },
    settings : {
      node.rack_id : prod,
      action.disable_delete_all_indices : true,
      cloud.node.auto_attributes : true,
      indices.ttl.interval : 90d,
      node.max_local_storage_nodes : 1,
      bootstrap.mlockall : true,
      index.mapper.dynamic : true,
      cluster.routing.allocation.awareness.attributes : rack_id,
      discovery.zen.minimum_master_nodes : 3,
      gateway.expected_nodes : 1,
      discovery.zen.ping.unicast.hosts : 172.25.15.170,172.25.15.172,172.46.1.170,172.46.1.172,
      discovery.zen.ping.multicast.enabled : false,
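
The n/2 + 1 rule mentioned in this thread can be worked through with a short sketch. It assumes all four nodes are master-eligible, which the thread does not state, and the function names are made up for illustration.

```python
def minimum_master_nodes(master_eligible):
    """Quorum by the n/2 + 1 rule quoted in the thread (integer division)."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(visible_nodes, minimum):
    """A partition can elect a master only if it sees at least `minimum`
    master-eligible nodes."""
    return visible_nodes >= minimum


if __name__ == "__main__":
    quorum = minimum_master_nodes(4)
    print("quorum for 4 nodes:", quorum)
    # If the inter-DC link drops, each data center sees only its own 2 nodes.
    print("2-node side can elect a master:", can_elect_master(2, quorum))
```

With 2 production and 2 DR nodes, a quorum of 3 protects against split brain, but under the stated assumption it also means that neither data center can elect a master on its own while the link between them is down.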
Re: Serialization issues on 0.90.3
Just to double-check: are all ES servers and all client JVMs the same version? It is not enough to check all the cluster nodes. The client JVM is the one to check. The exception is quite clear: it points to the InetAddress encoding/decoding issue caused by different JVM versions...

Jörg