Re: Bulk Processor
It does it automatically. You just have to properly call .close() when you stop your application. It will process the pending requests before actually exiting.

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

On 14 March 2014 at 15:50:48, ZenMaster80 (sabdall...@gmail.com) wrote:
> David, Sorry, I didn't quite follow, does it do the flushing automatically or am I supposed to tell it?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/etPan.532319bc.1190cde7.1ccf%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/d/optout.
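To illustrate the point about .close(), here is a minimal sketch of a buffer that sends whatever is still pending when it is closed. It is a toy model, not the Elasticsearch BulkProcessor: the class ClosingBuffer and the helper indexAll are hypothetical names made up for this example, and a "flush" only records the batch size.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy buffer whose close() flushes the remainder; NOT the Elasticsearch class. */
class ClosingBuffer implements AutoCloseable {
    private final int maxActions;
    private int pending = 0;
    /** Size of every batch flushed so far, in order. */
    final List<Integer> flushedBatchSizes = new ArrayList<>();

    ClosingBuffer(int maxActions) {
        this.maxActions = maxActions;
    }

    /** Buffers one document and flushes once the action threshold is reached. */
    void add() {
        pending++;
        if (pending >= maxActions) {
            flush();
        }
    }

    /** Records the pending batch, if there is one, and resets the counter. */
    private void flush() {
        if (pending > 0) {
            flushedBatchSizes.add(pending);
            pending = 0;
        }
    }

    /** Flushes the partial batch that never reached the threshold. */
    @Override
    public void close() {
        flush();
    }

    /** Feeds docCount documents, always closes the buffer, returns the batch sizes. */
    static List<Integer> indexAll(int docCount, int maxActions) {
        ClosingBuffer buffer = new ClosingBuffer(maxActions);
        try {
            for (int i = 0; i < docCount; i++) {
                buffer.add();
            }
        } finally {
            buffer.close();
        }
        return buffer.flushedBatchSizes;
    }
}
```

With 2,500 documents and a threshold of 1,000, the last 500 documents only go out because close() is called; without it they would stay in the buffer.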
Re: Bulk Processor
David, sorry, I didn't quite follow. Does it do the flushing automatically, or am I supposed to tell it to?

On Wednesday, March 12, 2014 4:05:49 PM UTC-4, David Pilato wrote:
> It also flushes docs after a given time, let's say every 5 seconds.
> BTW, there is a small issue which basically flushes the bulk every n-1 docs instead of n.
>
> A fix is on the way.
Re: Bulk Processor question
If you want to bulk by action count only, set the bulk size threshold to -1. Here is an example (check the bulkIndexByActions method): https://github.com/bly2k/es-java-examples/blob/master/index/BulkProcessorExample.java
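The effect of a -1 size threshold can be sketched with a small stand-alone flush rule. This is an assumption-laden toy, not code from the linked example or from Elasticsearch itself: FlushRule, shouldFlush and docsUntilFirstFlush are hypothetical names, and the only behaviour modelled is "a negative threshold is ignored".

```java
/** Toy flush rule where a negative threshold means "disabled"; names are hypothetical. */
class FlushRule {
    /** True once any enabled threshold has been reached. */
    static boolean shouldFlush(int actions, long bytes, int maxActions, long maxBytes) {
        boolean actionLimitHit = maxActions >= 0 && actions >= maxActions;
        boolean sizeLimitHit = maxBytes >= 0 && bytes >= maxBytes;
        return actionLimitHit || sizeLimitHit;
    }

    /** Number of equally sized documents buffered before the first flush. */
    static int docsUntilFirstFlush(long docBytes, int maxActions, long maxBytes) {
        // Guard against a configuration that could never trigger a flush.
        if (maxActions < 0 && (maxBytes < 0 || docBytes <= 0)) {
            throw new IllegalArgumentException("no threshold can ever be reached");
        }
        int actions = 0;
        long bytes = 0;
        while (!shouldFlush(actions, bytes, maxActions, maxBytes)) {
            actions++;
            bytes += docBytes;
        }
        return actions;
    }
}
```

For example, with 200,000-byte documents, `FlushRule.docsUntilFirstFlush(200_000L, 1000, 5L * 1024 * 1024)` stops at 27 documents, while passing -1 as the size threshold lets the batch grow to the full 1000 actions. Judging by the log in this thread (8 to 89 documents per roughly 5 MB), a 1000-action bulk of such documents could be tens to hundreds of megabytes, so that is worth keeping in mind before disabling the size limit.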
Re: Bulk Processor question
My docs vary in size. Some are very small; some are PDFs, like the ones showing in the log. How do you suggest I do this, since I don't know when the docs will be small or large?

On Wednesday, March 12, 2014 4:01:53 PM UTC-4, Jörg Prante wrote:
> BulkProcessor has two thresholds, the number of actions (as you use by setting it to 1000) or a bulk request byte volume (default 5M). What you see is the 5M limit kicking in, your docs are quite large.
>
> Jörg
Re: Bulk Processor
It also flushes docs after a given time, let's say every 5 seconds.
BTW, there is a small issue which basically flushes the bulk every n-1 docs instead of n.

A fix is on the way.

--
David ;-)
Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs

On 12 March 2014 at 20:51, ZenMaster80 wrote:
> I specify the number of actions 1000. [...] I expected the bulk to queue things until it reaches 1000 docs, then process the bulk?
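The time-based flush mentioned above can be pictured with a deterministic toy model. This is only a sketch under stated assumptions: IntervalBuffer is a hypothetical class, the clock is passed in by hand through tick() instead of using a real scheduler, and the real BulkProcessor may time its flushes differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy time-based flush with a hand-driven clock; NOT the Elasticsearch implementation. */
class IntervalBuffer {
    private final long flushIntervalMillis;
    private long lastFlushMillis;
    private int pending = 0;
    /** Size of every batch flushed so far, in order. */
    final List<Integer> flushedBatchSizes = new ArrayList<>();

    IntervalBuffer(long flushIntervalMillis, long startMillis) {
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = startMillis;
    }

    /** Buffers one document; no count or size threshold in this model. */
    void add() {
        pending++;
    }

    /** Stands in for a timer callback: flushes once the interval has elapsed. */
    void tick(long nowMillis) {
        if (nowMillis - lastFlushMillis >= flushIntervalMillis) {
            if (pending > 0) {
                flushedBatchSizes.add(pending);
                pending = 0;
            }
            lastFlushMillis = nowMillis;
        }
    }
}
```

With a 5,000 ms interval, three documents added at time 0 are still buffered after `tick(4000)` and go out as one batch of three on `tick(5000)`, even though no action or size threshold was ever reached.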
Re: Bulk Processor question
BulkProcessor has two thresholds: the number of actions (which you set to 1000) and the bulk request byte volume (default 5M). A bulk is sent as soon as either one is reached. What you see is the 5M limit kicking in; your docs are quite large.

Jörg

On Wed, Mar 12, 2014 at 8:54 PM, ZenMaster80 wrote:
> I specify the number of actions 1000. [...] I expected the bulk to queue things until it reaches 1000 docs? Then process the bulk?
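To make the two-threshold behaviour concrete, here is a small simulation of a buffer that flushes when either limit is reached. It is a hedged toy model rather than the BulkProcessor source: TwoThresholdBuffer and simulate are hypothetical names, all documents are assumed to have the same size, and 5M is taken to mean 5 * 1024 * 1024 bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a two-threshold bulk buffer; NOT the Elasticsearch implementation. */
class TwoThresholdBuffer {
    private final int maxActions;
    private final long maxBytes;
    private int actions = 0;
    private long bytes = 0;
    /** Action count of every batch flushed so far, in order. */
    final List<Integer> flushedActionCounts = new ArrayList<>();

    TwoThresholdBuffer(int maxActions, long maxBytes) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
    }

    /** Adds one document and flushes as soon as either threshold is reached. */
    void add(long docBytes) {
        actions++;
        bytes += docBytes;
        if (actions >= maxActions || bytes >= maxBytes) {
            flushedActionCounts.add(actions);
            actions = 0;
            bytes = 0;
        }
    }

    /** Feeds docCount documents of docBytes each and returns the flushed batch sizes. */
    static List<Integer> simulate(int docCount, long docBytes, int maxActions, long maxBytes) {
        TwoThresholdBuffer buffer = new TwoThresholdBuffer(maxActions, maxBytes);
        for (int i = 0; i < docCount; i++) {
            buffer.add(docBytes);
        }
        return buffer.flushedActionCounts;
    }
}
```

Feeding 100 documents of 200,000 bytes each with limits of 1000 actions and 5 MB gives three batches of 27 actions (19 documents remain buffered), which is the same pattern as the log: the byte limit fires long before 1000 actions. With 1,000-byte documents the action limit wins instead and batches are exactly 1000 actions.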
Bulk Processor question
I don't quite understand what the bulk processor is doing. I would like someone to explain how it is supposed to work, to make sure I designed this correctly.

I specify the number of actions as 1000. My feeder keeps pushing documents to it (it's more like a loop iterating over document folders), where I push each document to the bulk. I expected the bulk to queue things until it reaches 1000 docs, and then process the bulk?

Yet this is what it logs. This comes from the callback functions of the bulk processor.

Bulk Called: ID= 1, Actions=33, MB=5.46250
Bulk Called: ID= 2, Actions=29, MB=5.51660
Bulk Succeeded: ID= 1, took= 921 ms
Bulk Called: ID= 3, Actions=12, MB=5.691812
Bulk Succeeded: ID= 2, took= 1526 ms
.
Bulk Called: ID= 23, Actions=8, MB=5.45294
Bulk Succeeded: ID= 23, took= 751 ms
Bulk Called: ID= 24, Actions=19, MB=5.383918
Bulk Succeeded: ID= 24, took= 331 ms
Bulk Called: ID= 25, Actions=22, MB=5.347542
Bulk Succeeded: ID= 25, took= 694 ms
Bulk Called: ID= 26, Actions=58, MB=5.249195
Bulk Succeeded: ID= 26, took= 583 ms
Bulk Called: ID= 27, Actions=89, MB=5.244396
Bulk Succeeded: ID= 27, took= 588 ms.
Bulk Called: ID= 47, Actions=17, MB=5.245771 ...
Bulk Succeeded: ID= 47, took= 431 ms

Finished Processing the whole thing
Bulk Processor
I don't quite understand what the bulk processor is doing. I would like someone to explain how it is supposed to work, to make sure I designed this correctly.

I specify the number of actions as 1000. My feeder keeps pushing documents to it (it's more like a loop iterating over document folders), and I push each document to the bulk. I expected the bulk to queue things until it reaches 1000 docs, then process the bulk?

Yet this is what it logs. This comes from the callback functions of the bulk processor.

Bulk Called: ID= 1, Actions=33, MB=5.46250
Bulk Called: ID= 2, Actions=29, MB=5.51660
Bulk Succeeded: ID= 1, took= 921 ms
Bulk Called: ID= 3, Actions=12, MB=5.691812
Bulk Succeeded: ID= 2, took= 1526 ms
.
Bulk Called: ID= 23, Actions=8, MB=5.45294
Bulk Succeeded: ID= 23, took= 751 ms
Bulk Called: ID= 24, Actions=19, MB=5.383918
Bulk Succeeded: ID= 24, took= 331 ms
Bulk Called: ID= 25, Actions=22, MB=5.347542
Bulk Succeeded: ID= 25, took= 694 ms
Bulk Called: ID= 26, Actions=58, MB=5.249195
Bulk Succeeded: ID= 26, took= 583 ms
Bulk Called: ID= 27, Actions=89, MB=5.244396
Bulk Succeeded: ID= 27, took= 588 ms.
Bulk Called: ID= 47, Actions=17, MB=5.245771 ...
Bulk Succeeded: ID= 47, took= 431 ms

Finished Processing the whole thing