Chandni: Good point on ordering.
We need to handle that case. Ordering is not even guaranteed when the
upstream operator has multiple instances.
Regards,
Sandeep
On Tue, Dec 29, 2015 at 8:39 PM, Chandni Singh
wrote:
I agree with Sandeep that at-least-once with databases is not the right
approach. All the output adaptors we have in the library are written so
that they do not write duplicate entries, for example FileOutputOperator.
This has nothing to do with the processing mode feature offered by the
platform.
One more option:
We can keep track of windowId & batchId, i.e. we save batchId with windowId
when we commit a batch within a window. E.g. if we are in window 10 and
have written the second batch in window 10, we commit windowId=10 and
batchId=2 to the DB. During recovery we won't process batches within last
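
The windowId/batchId idea above could be sketched roughly as follows. This
is only an illustration: the class name BatchCheckpointSketch and its
methods are hypothetical, the "database" is an in-memory list, and it
assumes that a replayed window re-forms the same batches in the same order
(which, as noted elsewhere in this thread, may not hold when the upstream
operator has multiple instances).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a store that commits rows and progress together. */
class BatchCheckpointSketch {
    private final List<String> rows = new ArrayList<>();
    private long committedWindowId = -1; // last windowId recorded in the "DB"
    private int committedBatchId = 0;    // last batchId recorded for that window

    /** True if this (windowId, batchId) pair was already committed before a failure. */
    boolean alreadyCommitted(long windowId, int batchId) {
        return windowId < committedWindowId
            || (windowId == committedWindowId && batchId <= committedBatchId);
    }

    /**
     * Writes the batch and records windowId/batchId in one step, as a single
     * database transaction would. Returns false when the batch is a replay.
     */
    boolean commitBatch(long windowId, int batchId, List<String> batch) {
        if (alreadyCommitted(windowId, batchId)) {
            return false; // replayed batch: skip it to avoid duplicate rows
        }
        rows.addAll(batch);
        committedWindowId = windowId;
        committedBatchId = batchId;
        return true;
    }

    List<String> rows() {
        return rows;
    }
}
```

After a restart, the operator would replay window 10 from the start; batches
1 and 2 would be skipped and only batch 3 onwards would reach the store.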
Not sure if "At least once" is the right behavior for databases. We may not
always have a primary key to update or insert.
Regards,
Sandeep
On Tue, Dec 29, 2015 at 2:23 PM, Priyanka Gugale wrote:
Hi,
Thanks for your inputs, Chandni. I guess what you are suggesting is similar
to AbstractJdbcNonTransactionableBatchOutputOperator, which is a batch
non-transactional operation. That is one of the good options.
I am also thinking of the possibility of having "At least once" behavior with
Transactional
Yeah, I understand the problem that the app window size here is time-based,
not based on the number of events. However, I don't think having a max batch
size in this class will help, because that causes problems with saving the
tuples exactly once and with idempotency.
I was just trying to let you know why the
Hi Chandni,
I totally agree with you that the transactions should be idempotent. And
that needs to be taken care of if the batch size is configurable.
Though, I have a question related to the second part, where batch size is
controlled by controlling the app window size.
I agree with you that aggre
Hey Chinmay/Priyanka,
We need to write tuples exactly once to the store. Please address the
failure scenarios and how to achieve exactly-once and idempotency. I
mentioned in my previous mail why multiple batches in a window are a
problem for exactly-once.
Control via app window would mean, tuning
Hi,
Just a thought on how it can possibly be done.
The pseudo code might look like this:
processTuple()
{
  if (batchSize < configuredBatchSize) {
    // add to the batch
  }
  else {
    // process the batch as a transaction
    // empty the data structure of batch.
  }
}
endWindow()
{
  // process the batch as
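
One possible reading of the pseudo code above, as a runnable sketch. The
class name SizeBoundedBatcher and the in-memory list of committed batches
are assumptions made for illustration, not Malhar code. Unlike the pseudo
code, this version adds the incoming tuple before checking the size, so the
tuple that arrives when the batch is full is not dropped. It deliberately
does not address the exactly-once concern raised in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical batcher: flushes when the batch is full and again at end of window. */
class SizeBoundedBatcher {
    private final int maxBatchSize;
    private final List<String> batch = new ArrayList<>();
    // Stand-in for the database: each entry is one batch "transaction".
    private final List<List<String>> committed = new ArrayList<>();

    SizeBoundedBatcher(int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
    }

    void processTuple(String tuple) {
        batch.add(tuple);
        if (batch.size() >= maxBatchSize) {
            flush(); // batch is full: process it as one transaction
        }
    }

    void endWindow() {
        flush(); // process whatever is left, even if smaller than maxBatchSize
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        committed.add(new ArrayList<>(batch));
        batch.clear();
    }

    List<List<String>> committedBatches() {
        return committed;
    }
}
```

With maxBatchSize set to 2 and three tuples in a window, this yields one
full batch during the window and one partial batch at endWindow, i.e. two
batches in the same window.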
But you will not allow multiple batches in the same window?
Can you please elaborate on the failure scenarios and how they affect
idempotency?
Chandni
On Mon, Dec 28, 2015 at 2:32 AM, Priyanka Gugale
wrote:
Hi,
Sorry if I was not clear, but I am trying to propose a MAX_SIZE per
window that the operator could process. The size could be less than
MAX_SIZE; there is no restriction on that.
-Priyanka
On Mon, Dec 28, 2015 at 3:22 PM, Chandni Singh
wrote:
How do you propose to restrict the number of tuples processed in an
application window to less than the batch size?
I don't see a way to enforce that the batch size can never be less than the
number of tuples processed in an application window.
On Mon, Dec 28, 2015 at 1:25 AM, Priyanka Gugale wrote:
Hi Chandni,
How about restricting the number of tuples that can be processed per window?
If someone wants to process small and frequent batches, they can set the
batch size to some small value and also reduce the window size. This would
build up some back pressure, of course. But that could be acceptable if one really w
Priyanka,
AbstractBatchTransactionableStore treats all tuples in one application
window as a batch because it needs to store the tuples in the store
exactly-once.
If there is more than one batch in an application window, then to store the
tuples exactly once the window Id needs to be written with every
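
To illustrate the one-batch-per-window approach described above, here is a
minimal sketch. The class WindowBatchStoreSketch and its in-memory store are
hypothetical and are not the Malhar implementation; the sketch only assumes
that the rows and the committed window id are written in the same
transaction, and that a replayed window carries the same window id.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one batch per application window, skipped on replay. */
class WindowBatchStoreSketch {
    private final List<String> storedRows = new ArrayList<>(); // stand-in for the DB table
    private long committedWindowId = -1;                        // stored alongside the rows
    private final List<String> windowBatch = new ArrayList<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
        windowBatch.clear();
    }

    void processTuple(String tuple) {
        // Tuples of an already committed window are replays and can be ignored.
        if (currentWindowId > committedWindowId) {
            windowBatch.add(tuple);
        }
    }

    void endWindow() {
        if (currentWindowId <= committedWindowId) {
            return; // whole window was committed before the failure
        }
        // In a real store these two writes would be one transaction.
        storedRows.addAll(windowBatch);
        committedWindowId = currentWindowId;
        windowBatch.clear();
    }

    List<String> storedRows() {
        return storedRows;
    }
}
```

Because the whole window is the unit of commit, a single stored window id is
enough to decide what to skip after recovery. With several batches per
window, that single id no longer identifies which batches were written.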
+1 for this.
~ Chinmay.
On Mon, Dec 28, 2015 at 2:27 PM, Priyanka Gugale wrote:
Hi,
In Malhar we have an operator,
AbstractBatchTransactionableStoreOutputOperator, which creates
batches based on the tuples received in a window. At the end of the window
these batches are sent to the database for processing.
There is no way to configure MAX_SIZE on these batches. Based on input rate
the