Re: optimize elasticsearch / JVM
Why not? Could you tell me how to do that, and also explain why it would be better? Thanks a lot for your help.

On Thursday, January 29, 2015 at 10:02:00 AM UTC+1, Arie wrote:

Just an idea. You could try running two ES instances as a cluster on one machine if there is no other option.

On Wednesday, January 28, 2015 at 2:09:22 PM UTC+1, Oto Iashvili wrote:

Hi,

I have a classifieds website. It runs Elasticsearch, PostgreSQL and Rails on the same dedicated Ubuntu 14.04 server, with 256 GB of RAM and 20 cores (40 threads).

I have 10 indexes in Elasticsearch, each with the default number of shards (5). Depending on the index, they hold between 1,000 and 400,000 classifieds. The site gets approximately 5,000 requests per minute, about two thirds of which make an Elasticsearch request.

According to htop, the JVM is using around 500% CPU.

I have tried different options: I reduced the number of shards per index, and I also changed JAVA_OPTS as follows:

    #JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
    #JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
    #JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    #JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"

but it doesn't seem to change anything.

So, two questions:

- When you change a setting in Elasticsearch and then restart, should the improvement (if any) be visible immediately, or can it arrive a bit later thanks to caching or anything else?
- Can anyone help me find a good configuration for the JVM / Elasticsearch so that it does not take that many resources?

--
You received this message because you are subscribed to the Google Groups elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9db68f64-e79d-4592-9085-0633eec7360f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
optimize elasticsearch / JVM
Hi,

I have a classifieds website. It runs Elasticsearch, PostgreSQL and Rails on the same dedicated Ubuntu 14.04 server, with 256 GB of RAM and 20 cores (40 threads).

I have 10 indexes in Elasticsearch, each with the default number of shards (5). Depending on the index, they hold between 1,000 and 400,000 classifieds. The site gets approximately 5,000 requests per minute, about two thirds of which make an Elasticsearch request.

According to htop, the JVM is using around 500% CPU.

I have tried different options: I reduced the number of shards per index, and I also changed JAVA_OPTS as follows:

    #JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
    #JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
    #JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    #JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"

but it doesn't seem to change anything.

So, two questions:

- When you change a setting in Elasticsearch and then restart, should the improvement (if any) be visible immediately, or can it arrive a bit later thanks to caching or anything else?
- Can anyone help me find a good configuration for the JVM / Elasticsearch so that it does not take that many resources?
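To put the figures from the post in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes that htop counts up to 100% per hardware thread and that the load is spread evenly over each minute. The method names are made up for illustration.

```ruby
# Elasticsearch requests per second, given the total site requests per minute
# and the fraction of them that trigger an Elasticsearch query.
def es_requests_per_second(requests_per_minute, es_fraction)
  (requests_per_minute * es_fraction / 60).to_f
end

# Share of the machine's total CPU capacity, given an htop-style percentage
# where each hardware thread contributes up to 100% (assumption).
def cpu_share_percent(htop_percent, hardware_threads)
  100.0 * htop_percent / (hardware_threads * 100)
end

# Figures quoted in the post: 5000 requests/min, 2/3 hitting Elasticsearch,
# 40 hardware threads, JVM at around 500% in htop.
puts format("Elasticsearch requests/s: %.1f", es_requests_per_second(5000, Rational(2, 3)))
puts format("CPU share at 500%%: %.1f%%", cpu_share_percent(500, 40))
```

Under those assumptions the node serves roughly 56 Elasticsearch requests per second while using about one eighth of the available CPU capacity.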
Re: optimize elasticsearch / JVM
Hi, thanks a lot for the answer.

I've tried several values for the heap, between 26 and 32 GB, but I didn't see any difference. I removed G1 and put back the default parameters, but the problem is still the same.

I said around 500%, but that is just the average; it sometimes goes up to 2000%.

I was also thinking of taking several servers, but right now that is not possible. Just before, I was using a smaller server with 96 GB of RAM, and it was working better. I tried to use the same parameters as before, but it is not much slower.

On Wednesday, January 28, 2015 at 2:36:29 PM UTC+1, Jilles van Gurp wrote:

How much heap are you giving to ES? With this many requests, if your setup is not falling over, it is probably not garbage-collection related, because that would result in very noticeable delays or unavailability of ES. 32 GB should be a good value given how much memory you have. Also, you probably want to use doc_values in your mapping so that you can utilize the OS file cache and move some of the memory pressure off the heap. You seem to have plenty of RAM, so your entire dataset should easily fit in RAM.

Also, don't use G1 for Elasticsearch. There are known issues with that particular garbage collector in combination with Lucene. CMS is the best option for ES.

500% on 20 cores (40 threads) doesn't sound that bad; you'd max them out at 4000%. Still, it would be nice to know what it is doing. In any case, you might want to try out Marvel to find out where your setup is bottlenecked.

Also, you might want to consider scaling horizontally instead of vertically. Many smaller servers can be nicer than one big one.
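One way to see what the JVM is busy with, as suggested above, is the nodes hot threads API. The following is a minimal sketch rather than a tested setup: the host and port (a local node on the default HTTP port) and the method names are assumptions, and the fetch only works against a running node.

```ruby
require "net/http"
require "uri"

# Build the URL for the nodes hot threads API.
# host and port are assumptions: a local node on the default HTTP port.
def hot_threads_uri(host: "localhost", port: 9200, threads: 5, interval: "500ms")
  query = URI.encode_www_form(threads: threads, interval: interval)
  URI::HTTP.build(host: host, port: port, path: "/_nodes/hot_threads", query: query)
end

# Fetch the plain-text report listing the busiest threads on each node.
# Requires a running Elasticsearch node at the given URI.
def fetch_hot_threads(uri = hot_threads_uri)
  Net::HTTP.get(uri)
end

# Usage against a running node:
#   puts fetch_hot_threads
```

Sampling the report a few times while CPU usage is high should show whether the time goes to search threads, merges, or something else.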
Re: optimize elasticsearch / JVM
          , "free_in_bytes": 264467443712,
          "available_in_bytes": 249068793856,
          "disk_reads": 351417,
          "disk_writes": 205904,
          "disk_io_op": 557321,
          "disk_read_size_in_bytes": 3433067520,
          "disk_write_size_in_bytes": 3127025664,
          "disk_io_size_in_bytes": 6560093184,
          "disk_queue": "0",
          "disk_service_time": "0.1"
        },
        "data": [
          {
            "path": "/var/lib/elasticsearch/elasticsearch/nodes/0",
            "mount": "/",
            "dev": "/dev/sda2",
            "total_in_bytes": 302674501632,
            "free_in_bytes": 264467443712,
            "available_in_bytes": 249068793856,
            "disk_reads": 351417,
            "disk_writes": 205904,
            "disk_io_op": 557321,
            "disk_read_size_in_bytes": 3433067520,
            "disk_write_size_in_bytes": 3127025664,
            "disk_io_size_in_bytes": 6560093184,
            "disk_queue": "0",
            "disk_service_time": "0.1"
          }
        ]
      },
      "transport": {
        "server_open": 13,
        "rx_count": 6,
        "rx_size_in_bytes": 1380,
        "tx_count": 6,
        "tx_size_in_bytes": 1380
      },
      "http": {
        "current_open": 11,
        "total_opened": 2311818
      },
      "breakers": {
        "request": {
          "limit_size_in_bytes": 12357704089,
          "limit_size": "11.5gb",
          "estimated_size_in_bytes": 16440,
          "estimated_size": "16kb",
          "overhead": 1.0,
          "tripped": 0
        },
        "fielddata": {
          "limit_size_in_bytes": 18536556134,
          "limit_size": "17.2gb",
          "estimated_size_in_bytes": 6131132,
          "estimated_size": "5.8mb",
          "overhead": 1.03,
          "tripped": 0
        },
        "parent": {
          "limit_size_in_bytes": 21625982156,
          "limit_size": "20.1gb",
          "estimated_size_in_bytes": 6147572,
          "estimated_size": "5.8mb",
          "overhead": 1.0,
          "tripped": 0
        }
      }
    }
  }
}

On Wednesday, January 28, 2015 at 11:41:16 PM UTC+1, Oto Iashvili wrote:

Hi, thanks a lot for the answer.

I've tried several values for the heap, between 26 and 32 GB, but I didn't see any difference. I removed G1 and put back the default parameters, but the problem is still the same.

I said around 500%, but that is just the average; it sometimes goes up to 2000%.

I was also thinking of taking several servers, but right now that is not possible. Just before, I was using a smaller server with 96 GB of RAM, and it was working better. I tried to use the same parameters as before, but it is not much slower.
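The circuit breaker limits in the node stats above can be used to estimate the heap the node was running with. This is a rough sketch under one assumption that the thread does not confirm: that the breaker limits were left at what I understand to be the Elasticsearch 1.x defaults (40% of the heap for request, 60% for fielddata, 70% for the parent breaker). The constant and method names are made up for illustration.

```ruby
# Circuit breaker limits reported in the node stats above, in bytes.
BREAKER_LIMITS = {
  request:   12_357_704_089,
  fielddata: 18_536_556_134,
  parent:    21_625_982_156
}.freeze

# Assumption: default breaker limits as a fraction of the JVM heap.
ASSUMED_FRACTIONS = { request: 0.40, fielddata: 0.60, parent: 0.70 }.freeze

GIB = 1024.0**3

# Heap size (in GiB) implied by a breaker limit and its assumed heap fraction.
def implied_heap_gib(limit_bytes, fraction)
  limit_bytes / fraction / GIB
end

BREAKER_LIMITS.each do |name, limit|
  heap = implied_heap_gib(limit, ASSUMED_FRACTIONS[name])
  puts format("%-9s limit implies a heap of about %.1f GiB", name, heap)
end

# Fielddata actually in use, taken from estimated_size_in_bytes above.
fielddata_used = 6_131_132
puts format("fielddata in use: %.3f%% of its limit",
            100.0 * fielddata_used / BREAKER_LIMITS[:fielddata])
```

If the assumption holds, all three limits point to the same heap of roughly 28.8 GiB, and fielddata occupies only a tiny fraction of its limit, which would suggest that heap pressure from fielddata is not the bottleneck here.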
snowball and elision
Hello,

At first I was using the language analyzer, and everything seemed to work very well, until I realized that "a" is not part of the French stopword list.

So I decided to test with snowball. It also seemed to work well, but in this case it does not remove short elided words such as l', d', ...

Hence my question: how can I use snowball, keep the default filters, and add a list of stopwords and elision?

Otherwise, how can I change the stopword list of the language analyzer?

And one last question: is there really any benefit to using snowball rather than the language analyzer? Is it faster? More relevant?

Thank you
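One way to combine snowball stemming with elision and a custom stopword list is a custom analyzer that chains the individual token filters. The following is a minimal sketch, not a tested configuration: the filter and analyzer names (french_elision, extra_stop, french_custom) and the extra stopword list are made up for illustration, and the resulting hash would still have to be passed to the index settings (for example through Tire).

```ruby
require "json"

# Illustrative extra stopwords (assumption); replace with the words you need.
EXTRA_STOPWORDS = ["a"].freeze

# Build index analysis settings for a French analyzer that lowercases,
# strips elided articles (l', d', ...), removes stopwords, then stems.
def french_analysis_settings(extra_stopwords = EXTRA_STOPWORDS)
  {
    analysis: {
      filter: {
        french_elision: {
          type: "elision",
          articles: %w[l m t qu n s j d c jusqu quoiqu lorsqu puisqu]
        },
        # Predefined French stopword list.
        french_stop: { type: "stop", stopwords: "_french_" },
        # Additional stopwords missing from the predefined list.
        extra_stop: { type: "stop", stopwords: extra_stopwords },
        french_snowball: { type: "snowball", language: "French" }
      },
      analyzer: {
        french_custom: {
          type: "custom",
          tokenizer: "standard",
          # Lowercase first so that "L'" and "l'" are treated alike.
          filter: %w[lowercase french_elision french_stop extra_stop french_snowball]
        }
      }
    }
  }
end

puts JSON.pretty_generate(french_analysis_settings)
```

If only the stopword list needs to change, the built-in language analyzers also accept a stopwords parameter, so an analyzer defined as { type: "french", stopwords: [...] } may be enough.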
custom stemmer with elasticsearch / tire / rails
Hi,

I'm trying to add a new stemmer to Elasticsearch to use with Tire / Rails.

I found a Java file (https://github.com/emilis/PolicyFeed/blob/master/src/search/java/org/tartarus/snowball/ext/LithuanianStemmer.java), created a jar from it, and put the jar in Elasticsearch's lib folder.

Here is my Rails file:

    tire.settings :analysis => {
      :filter => {
        "lt_stemmer" => {
          "type" => "stemmer",
          "name" => "lithuanian",
          "rules_path" => "lt_stemmer.jar"
        }
      },
      :analyzer => {
        "lithuanian" => {
          "type" => "snowball",
          "tokenizer" => "keyword",
          "filter" => ["lowercase", "lt_stemmer"]
        }
      }
    } do
      mapping do
        indexes :titre_lt, :analyzer => "lithuanian"
      end
    end

I then succeeded in creating the index and indexing data, but when I test, it does not seem to use the rules in my jar file.

    curl -XGET 'localhost:9200/lituanieindex/_analyze?analyzer=lithuanian' -d 'smulkių, dalinių, pilnų krovinių pervežimas nuosavais arba partnerių vilkikais su standartinėmis 92 m3 puspriekabėmis ir 120 m3 autotraukiniais;'

    {"tokens":[
      {"token":"smulkių","start_offset":0,"end_offset":7,"type":"ALPHANUM","position":1},
      {"token":"dalinių","start_offset":9,"end_offset":16,"type":"ALPHANUM","position":2},
      {"token":"pilnų","start_offset":18,"end_offset":23,"type":"ALPHANUM","position":3},
      {"token":"krovinių","start_offset":24,"end_offset":32,"type":"ALPHANUM","position":4},
      {"token":"pervežima","start_offset":33,"end_offset":43,"type":"ALPHANUM","position":5},
      {"token":"nuosavai","start_offset":44,"end_offset":53,"type":"ALPHANUM","position":6},
      {"token":"arba","start_offset":54,"end_offset":58,"type":"ALPHANUM","position":7},
      {"token":"partnerių","start_offset":59,"end_offset":68,"type":"ALPHANUM","position":8},
      {"token":"vilkikai","start_offset":69,"end_offset":78,"type":"ALPHANUM","position":9},
      {"token":"su","start_offset":79,"end_offset":81,"type":"ALPHANUM","position":10},
      {"token":"standartinėmi","start_offset":82,"end_offset":96,"type":"ALPHANUM","position":11},
      {"token":"92","start_offset":97,"end_offset":99,"type":"NUM","position":12},
      {"token":"m3","start_offset":100,"end_offset":102,"type":"ALPHANUM","position":13},
      {"token":"puspriekabėmi","start_offset":103,"end_offset":117,"type":"ALPHANUM","position":14},
      {"token":"ir","start_offset":118,"end_offset":120,"type":"ALPHANUM","position":15},
      {"token":"120","start_offset":121,"end_offset":124,"type":"NUM","position":16},
      {"token":"m3","start_offset":125,"end_offset":127,"type":"ALPHANUM","position":17},
      {"token":"autotraukiniai","start_offset":128,"end_offset":143,"type":"ALPHANUM","position":18}
    ]}

What am I doing wrong? Thanks for your help.
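The analyze output above looks consistent with the stock snowball analyzer running with its default English stemmer: the text is split into word tokens even though the keyword tokenizer was requested, and only a trailing "s" is stripped (pervežimas becomes pervežima). An analyzer of type snowball does not, as far as I know, read the tokenizer and filter entries; a custom analyzer does. The sketch below shows that shape. It rests on a loud assumption: that the Elasticsearch version in use actually provides a stemmer named "lithuanian", either built in or registered by a plugin. A jar placed in the lib folder is not picked up through rules_path, which to my knowledge belongs to the stemmer_override filter and expects a text file of rules. The method name is made up for illustration.

```ruby
require "json"

# Analysis settings for a custom Lithuanian analyzer.
# Assumption: a stemmer called "lithuanian" is available on the cluster
# (built in or provided by a plugin); this sketch does not load any jar.
def lithuanian_analysis_settings
  {
    analysis: {
      filter: {
        lt_stemmer: { type: "stemmer", name: "lithuanian" }
      },
      analyzer: {
        lithuanian: {
          # "custom" is the analyzer type that honors tokenizer and filter.
          type: "custom",
          tokenizer: "standard",
          filter: %w[lowercase lt_stemmer]
        }
      }
    }
  }
end

puts JSON.pretty_generate(lithuanian_analysis_settings)
```

The hash returned by lithuanian_analysis_settings[:analysis] could then be passed as the :analysis value in the tire.settings call, and the _analyze request repeated to check whether the tokens are now stemmed with Lithuanian rules.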