Fastest way to import 100m rows

2014-05-19 Thread Andreas Hembach
Hi all,

I need to import 100 million rows. At the moment I use the bulk API with 100 
entries per request (is that a good or a bad value?), but I only get 
~500 rows/second (is that a lot or a little?). Is there a way to import the 
data faster?
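
(For context, a minimal sketch of such a bulk import, assuming the official
Python client, elasticsearch-py; the index name, type, and fields below are
placeholders, not my actual schema:)

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()  # local node on localhost:9200

def rows():
    # Placeholder documents; the real rows are much wider.
    for i in range(100000):
        yield {"_index": "myindex", "_type": "row", "_id": i,
               "_source": {"field1": "value", "field2": i}}

# chunk_size is the number of documents per bulk request (100 at the moment).
helpers.bulk(es, rows(), chunk_size=100)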

I set number_of_replicas = 0 and refresh_interval = -1, but that made no big 
difference compared to the default values.
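
(The settings are applied roughly like this; again a sketch with the Python
client. Restoring the defaults and refreshing after the import is the usual
pattern:)

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Disable replicas and the automatic refresh for the duration of the import.
es.indices.put_settings(index="myindex", body={
    "index": {"number_of_replicas": 0, "refresh_interval": "-1"}})

# ... run the bulk import ...

# Restore the defaults and make the new documents searchable.
es.indices.put_settings(index="myindex", body={
    "index": {"number_of_replicas": 1, "refresh_interval": "1s"}})
es.indices.refresh(index="myindex")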

Thank you for your help,
Andreas



Re: Fastest way to import 100m rows

2014-05-19 Thread Itamar Syn-Hershko
That's a very low rate. Are you importing locally or via a remote connection?

--

Itamar Syn-Hershko
http://code972.com | @synhershko (https://twitter.com/synhershko)
Freelance Developer & Consultant
Author of RavenDB in Action (http://manning.com/synhershko/)






Re: Fastest way to import 100m rows

2014-05-19 Thread Andreas Hembach
Hi,

I am importing locally. It's a test server with only 2 CPUs at 2.40 GHz and 
4 GB RAM, but for testing I only import 100k rows.

Greetings,
Andreas







Re: Fastest way to import 100m rows

2014-05-19 Thread Itamar Syn-Hershko
That doesn't seem right; try larger bulk sizes. Also, how big are
your docs?
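
One way to find a good value is to sweep the chunk size and measure
throughput; a sketch with the Python client (index name and documents are
placeholders):

import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()

def docs(n):
    # Dummy documents just for the benchmark.
    for i in range(n):
        yield {"_index": "myindex", "_type": "row", "_source": {"n": i}}

N = 100000  # test sample, like your 100k rows
for chunk in (100, 500, 1000, 2000, 4000, 8000):
    start = time.time()
    helpers.bulk(es, docs(N), chunk_size=chunk)
    print("chunk_size=%5d: %6.0f rows/s" % (chunk, N / (time.time() - start)))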

--

Itamar Syn-Hershko
http://code972.com | @synhershko (https://twitter.com/synhershko)
Freelance Developer & Consultant
Author of RavenDB in Action (http://manning.com/synhershko/)






Re: Fastest way to import 100m rows

2014-05-19 Thread Andreas Hembach
Hi,

I have tried several bulk sizes:

Bulk size | Rows/second
----------+------------
     5000 |        500
     4000 |        625
     2000 |        625
     1000 |        550
      500 |        500

So I think 4000 is a good value (2000 gives the same throughput). My docs have 
about 270 columns each; is that too much? It is a denormalized view.

While importing, the CPU load is around 90%. Could that be the bottleneck?
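
(One way to check is the hot threads API, which shows what the node is busy
with; a sketch, again assuming the Python client. If the busy threads are all
in Lucene indexing/merging code, indexing itself is the bottleneck rather than
the import script:)

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Returns the busiest threads per node as plain text.
print(es.nodes.hot_threads())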

Thanks and regards,
Andreas



