Fastest way to import 100m rows
Hi all, I need to import 100 million rows. At the moment I use the bulk API with 100 entries per request (is that a good or a bad value?), but I only get ~500 rows/second imported (is that a lot or a little?). Is there a way to import the data faster? I set number_of_replicas = 0 and refresh_interval = -1, but it made no big difference compared to the default values. Thank you for your help, Andreas -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7b1fdb0d-067f-490b-879f-805593a0d4cc%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
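[Editor's note: a minimal sketch of what a bulk request body looks like, since the thread discusses bulk sizing. The index name "imports", type "row", and the sample rows are placeholders, not from the thread. Elasticsearch's _bulk endpoint expects newline-delimited JSON: one action/metadata line per document, followed by the document itself, with a trailing newline at the end.]

```python
import json

def bulk_payload(index, doc_type, rows):
    """Serialize rows into the newline-delimited body expected by _bulk."""
    lines = []
    for row in rows:
        # Each document is preceded by an action/metadata line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(row))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"

# Example: a 2-row bulk body ready to POST to http://localhost:9200/_bulk
body = bulk_payload("imports", "row", [{"id": 1}, {"id": 2}])
```

Sending 100 entries per request means paying HTTP and coordination overhead once per 100 rows; larger bodies amortize that overhead, which is why bulk size is worth tuning.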
Re: Fastest way to import 100m rows
That's a very low rate. Are you importing locally or via remote connection? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/
Re: Fastest way to import 100m rows
Hi, I am importing locally. OK, it's a test server with only 2 CPUs at 2.40 GHz and 4 GB RAM, but for testing I only import 100k rows. Greetings, Andreas
Re: Fastest way to import 100m rows
That doesn't seem right; try larger bulk sizes. Also, what size are your docs? -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/
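[Editor's note: to experiment with larger bulk sizes as suggested here, one approach (a sketch, not code from the thread) is to chunk the source rows and send each chunk as a single _bulk request. The sender call is left out; `chunked` only shows the batching logic.]

```python
from itertools import islice

def chunked(rows, bulk_size):
    """Yield successive lists of at most bulk_size rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, bulk_size))
        if not batch:
            return
        yield batch

# Usage sketch (send_bulk would be your client's bulk call):
#   for batch in chunked(rows, 2000):
#       send_bulk(batch)
```

Because `chunked` consumes a plain iterator, it works on streamed sources (database cursors, file readers) without loading all 100m rows into memory.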
Re: Fastest way to import 100m rows
Hi, I have tried several bulk sizes:

Bulk size = rows per second
5000 = 500
4000 = 625
2000 = 625
1000 = 550
500 = 500

So I think 4000 is the best value. My docs have about 270 columns. Is that too much? It is a denormalized view. During the imports, the CPU load is around 90%. Could this be the bottleneck? Thanks and regards, Andreas
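[Editor's note: a hedged timing harness for producing a bulk-size comparison like the one above; none of this code is from the thread. `send_bulk` stands in for whatever client call actually posts the batch, and the rate it reports includes both client and server time.]

```python
import time

def measure_rate(rows, bulk_size, send_bulk):
    """Send all rows in batches of bulk_size and return observed rows/second."""
    start = time.perf_counter()
    batch = []
    total = 0
    for row in rows:
        batch.append(row)
        if len(batch) >= bulk_size:
            send_bulk(batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        send_bulk(batch)
        total += len(batch)
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")

# Usage sketch: run with a fixed sample (e.g. 100k rows) per bulk size
# and compare the returned rates, as in the table above.
```

With a 90% CPU load on a 2-core box and wide 270-column documents, the numbers above are consistent with the server's JSON parsing and analysis being the bottleneck rather than the bulk size itself.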