Re: Bulk Processor

2014-03-14 Thread David Pilato
It does it automatically.

You just have to properly call .close() when you stop your application.
It will process the pending requests before actually exiting.
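
To picture what .close() does with the pending requests, here is a minimal, self-contained sketch. It is not the Elasticsearch BulkProcessor: ToyBulkBuffer, add, demo and sentBatches are hypothetical names, and the only behaviour it models is "send when full, and send the remainder on close()".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a bulk processor; NOT the Elasticsearch class.
final class ToyBulkBuffer implements AutoCloseable {
    private final int maxActions;
    private final List<String> pending = new ArrayList<>();
    // Sizes of the batches that were "sent", in order.
    final List<Integer> sentBatches = new ArrayList<>();

    ToyBulkBuffer(int maxActions) {
        this.maxActions = maxActions;
    }

    // Queue one document and send the batch once the action limit is reached.
    void add(String doc) {
        pending.add(doc);
        if (pending.size() >= maxActions) {
            flush();
        }
    }

    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentBatches.add(pending.size());
        pending.clear();
    }

    // Called when the application stops: sends whatever is still queued.
    @Override
    public void close() {
        flush();
    }

    // Feeds `docs` documents and returns the batch sizes that were sent.
    static List<Integer> demo(int docs, int maxActions) {
        ToyBulkBuffer buffer = new ToyBulkBuffer(maxActions);
        try (ToyBulkBuffer b = buffer) {
            for (int i = 0; i < docs; i++) {
                b.add("doc-" + i);
            }
        } // close() runs here and flushes the remainder
        return buffer.sentBatches;
    }

    public static void main(String[] args) {
        System.out.println(demo(2500, 1000)); // prints [1000, 1000, 500]
    }
}
```

Without the close() call, the last 500 documents in this toy run would stay queued and never be sent.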



-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 14 March 2014 at 15:50:48, ZenMaster80 (sabdall...@gmail.com) wrote:

David,

Sorry, I didn't quite follow. Does it do the flushing automatically, or am I 
supposed to tell it to?


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/etPan.532319bc.1190cde7.1ccf%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/d/optout.


Re: Bulk Processor

2014-03-14 Thread ZenMaster80
David,

Sorry, I didn't quite follow. Does it do the flushing automatically, or am I 
supposed to tell it to?

On Wednesday, March 12, 2014 4:05:49 PM UTC-4, David Pilato wrote:
>
> It also flushes docs after a given time, let's say every 5 seconds.
> BTW there is a small issue which basically flushes the bulk every n-1 docs 
> instead of n.
>
> Fix is on the way.



Re: Bulk Processor question

2014-03-12 Thread Binh Ly
If you want to bulk by action count only, set the bulk size threshold to 
-1. Here is an example (see the bulkIndexByActions method):

https://github.com/bly2k/es-java-examples/blob/master/index/BulkProcessorExample.java
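
As a rough illustration of what switching off the size threshold means, the sketch below models the flush decision with two limits, where -1 disables a limit. FlushDecision and shouldFlush are hypothetical names and this is not the BulkProcessor source; in the Java API the two limits are set with setBulkActions and setBulkSize on the builder.

```java
// Hypothetical model of a two-threshold flush decision; a limit of -1 means "disabled".
final class FlushDecision {
    static final long MB = 1024L * 1024L;

    // True when either enabled limit has been reached.
    static boolean shouldFlush(int actions, long bytes, int maxActions, long maxBytes) {
        boolean byActions = maxActions != -1 && actions >= maxActions;
        boolean byBytes = maxBytes != -1 && bytes >= maxBytes;
        return byActions || byBytes;
    }

    public static void main(String[] args) {
        long queuedBytes = (long) (5.4 * MB); // roughly the request size seen in the log

        // Both limits active: 33 queued docs already exceed 5 MB, so the bulk is sent.
        System.out.println(shouldFlush(33, queuedBytes, 1000, 5 * MB)); // true

        // Size limit disabled: only the action count matters.
        System.out.println(shouldFlush(33, queuedBytes, 1000, -1));     // false
        System.out.println(shouldFlush(1000, queuedBytes, 1000, -1));   // true
    }
}
```

Keep in mind that with documents as large as the ones in the log, a count-only limit of 1000 would let a single request grow far beyond 5 MB, so a size cap of some kind may still be worth keeping.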



Re: Bulk Processor question

2014-03-12 Thread ZenMaster80
My docs vary in size. Some are very small, some are PDFs like the ones showing 
in the log there. How do you suggest I do this, since I don't know when the 
docs will be small or large?

On Wednesday, March 12, 2014 4:01:53 PM UTC-4, Jörg Prante wrote:
>
> BulkProcessor has two thresholds: the number of actions (which you set to 
> 1000) and the bulk request byte volume (default 5 MB). What you see is the 
> 5 MB limit kicking in, because your docs are quite large.
>
> Jörg



Re: Bulk Processor

2014-03-12 Thread David Pilato
It also flushes docs after a given time, let's say every 5 seconds.
BTW there is a small issue which basically flushes the bulk every n-1 docs 
instead of n.

Fix is on the way.
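
The time-based flush can be pictured with the small model below. It is a deterministic sketch that takes explicit timestamps instead of running a real timer thread, and TimedBuffer, add, tick and pendingCount are hypothetical names, not Elasticsearch API. The 5 second interval is just the figure mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: send when maxActions is reached OR when the flush interval has elapsed.
final class TimedBuffer {
    private final int maxActions;
    private final long flushIntervalMs;
    private int pending = 0;
    private long lastTimedFlushMs = 0;
    // Sizes of the batches that were "sent", in order.
    final List<Integer> sentBatches = new ArrayList<>();

    TimedBuffer(int maxActions, long flushIntervalMs) {
        this.maxActions = maxActions;
        this.flushIntervalMs = flushIntervalMs;
    }

    // Queue one document; send if the action limit is reached.
    void add() {
        pending++;
        if (pending >= maxActions) {
            send();
        }
    }

    // Stands in for the periodic timer: send if the interval has elapsed.
    void tick(long nowMs) {
        if (nowMs - lastTimedFlushMs >= flushIntervalMs) {
            lastTimedFlushMs = nowMs;
            send();
        }
    }

    int pendingCount() {
        return pending;
    }

    private void send() {
        if (pending == 0) {
            return;
        }
        sentBatches.add(pending);
        pending = 0;
    }

    // A slow feeder: one document per second for 12 seconds, limit 1000, interval 5 s.
    static TimedBuffer demo() {
        TimedBuffer buffer = new TimedBuffer(1000, 5000);
        for (long t = 1000; t <= 12000; t += 1000) {
            buffer.add();
            buffer.tick(t);
        }
        return buffer;
    }

    public static void main(String[] args) {
        TimedBuffer buffer = demo();
        System.out.println(buffer.sentBatches);    // prints [5, 5]
        System.out.println(buffer.pendingCount()); // prints 2
    }
}
```

The action limit of 1000 is never reached in this run; both batches are sent by the timer alone, and two documents are still queued at the end.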

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs




Re: Bulk Processor question

2014-03-12 Thread joergpra...@gmail.com
BulkProcessor has two thresholds: the number of actions (which you set to
1000) and the bulk request byte volume (default 5 MB). What you see is the
5 MB limit kicking in, because your docs are quite large.
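
A quick back-of-the-envelope check on the posted log supports this. The sketch below parses "Bulk Called" lines and computes the average size per action in each request; BulkLogStats and avgMbPerAction are names made up for the example, and the log format is assumed to be exactly as posted in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the "Bulk Called" log lines from this thread.
final class BulkLogStats {
    private static final Pattern CALLED =
        Pattern.compile("Bulk Called: ID=\\s*(\\d+), Actions=(\\d+), MB=([0-9.]+)");

    // Average MB per action for one "Bulk Called" line, or -1 if the line does not match.
    static double avgMbPerAction(String line) {
        Matcher m = CALLED.matcher(line);
        if (!m.find()) {
            return -1;
        }
        int actions = Integer.parseInt(m.group(2));
        double mb = Double.parseDouble(m.group(3));
        return mb / actions;
    }

    public static void main(String[] args) {
        String[] lines = {
            "Bulk Called: ID= 1, Actions=33, MB=5.46250",
            "Bulk Called: ID= 23, Actions=8, MB=5.45294",
            "Bulk Called: ID= 27, Actions=89, MB=5.244396",
        };
        for (String line : lines) {
            double avg = avgMbPerAction(line);
            // At this average size, 1000 actions would need about avg * 1000 MB.
            System.out.printf("%s -> %.3f MB/doc, 1000 docs ~ %.0f MB%n", line, avg, avg * 1000);
        }
    }
}
```

For the first request this works out to roughly 0.17 MB per document, so 1000 such documents would be well over 100 MB and the 5 MB limit is always reached first.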

Jörg




Bulk Processor question

2014-03-12 Thread ZenMaster80

I don't quite understand what the bulk processor is doing. I would like 
someone to explain how it is supposed to work, to make sure I designed this 
correctly.
I set the number of actions to 1000.
My feeder keeps pushing documents to it (it is more like a loop iterating over 
document folders), and I push each document to the bulk. I expected the 
bulk to queue things until it reaches 1000 docs and then process the bulk.

Yet this is how it logs; this output comes from the callback functions of the 
bulk processor.


Bulk Called: ID= 1, Actions=33, MB=5.46250
Bulk Called: ID= 2, Actions=29, MB=5.51660
Bulk Succeeded: ID= 1, took= 921 ms
Bulk Called: ID= 3, Actions=12, MB=5.691812
Bulk Succeeded: ID= 2, took= 1526 ms

.



Bulk Called: ID= 23, Actions=8, MB=5.45294
Bulk Succeeded: ID= 23, took= 751 ms
Bulk Called: ID= 24, Actions=19, MB=5.383918
Bulk Succeeded: ID= 24, took= 331 ms
Bulk Called: ID= 25, Actions=22, MB=5.347542
Bulk Succeeded: ID= 25, took= 694 ms
Bulk Called: ID= 26, Actions=58, MB=5.249195
Bulk Succeeded: ID= 26, took= 583 ms
Bulk Called: ID= 27, Actions=89, MB=5.244396
Bulk Succeeded: ID= 27, took= 588 ms.


Bulk Called: ID= 47, Actions=17, MB=5.245771 ...


Bulk Succeeded: ID= 47, took= 431 ms

Finished Processing the whole thing




Bulk Processor

2014-03-12 Thread ZenMaster80

I don't quite understand what the bulk processor is doing. I would 
like someone to explain how it is supposed to work, to make sure I designed 
this correctly.
I set the number of actions to 1000.
My feeder keeps pushing documents to it (it is more like a loop iterating over 
document folders), and I push each document to the bulk. I expected the 
bulk to queue things until it reaches 1000 docs, then process the bulk.

Yet this is how it logs; this output comes from the callback functions of the 
bulk processor.


Bulk Called: ID= 1, Actions=33, MB=5.46250
Bulk Called: ID= 2, Actions=29, MB=5.51660
Bulk Succeeded: ID= 1, took= 921 ms
Bulk Called: ID= 3, Actions=12, MB=5.691812
Bulk Succeeded: ID= 2, took= 1526 ms

.



Bulk Called: ID= 23, Actions=8, MB=5.45294
Bulk Succeeded: ID= 23, took= 751 ms
Bulk Called: ID= 24, Actions=19, MB=5.383918
Bulk Succeeded: ID= 24, took= 331 ms
Bulk Called: ID= 25, Actions=22, MB=5.347542
Bulk Succeeded: ID= 25, took= 694 ms
Bulk Called: ID= 26, Actions=58, MB=5.249195
Bulk Succeeded: ID= 26, took= 583 ms
Bulk Called: ID= 27, Actions=89, MB=5.244396
Bulk Succeeded: ID= 27, took= 588 ms.


Bulk Called: ID= 47, Actions=17, MB=5.245771 ...


Bulk Succeeded: ID= 47, took= 431 ms

Finished Processing the whole thing



