Re: Writing batches to database using Transactionable Store Output operator

2015-12-29 Thread Sandeep Deshmukh
Chandni: Good point on ordering. We need to handle that case. Ordering is not even guaranteed when the upstream operator has multiple instances. Regards, Sandeep On Tue, Dec 29, 2015 at 8:39 PM, Chandni Singh wrote: > I agree with Sandeep that at-least once with databases is not the right > app

Re: Writing batches to database using Transactionable Store Output operator

2015-12-29 Thread Chandni Singh
I agree with Sandeep that at-least-once with databases is not the right approach. All the output adaptors we have in the library are written so that they do not write duplicate entries, for example FileOutputOperator. This has nothing to do with the processing mode feature offered by the platform.

Re: Writing batches to database using Transactionable Store Output operator

2015-12-29 Thread Priyanka Gugale
One more option: We can keep track of windowId & batchId, i.e. we save batchId with windowId when we commit a batch within a window. E.g. if we are in window 10 and we have written the second batch in window 10, we commit windowId=10 and batchId=2 to the DB. During recovery we won't process batches within last
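
A minimal sketch of the windowId/batchId idea above, assuming an in-memory stand-in for the database. `BatchCheckpointSketch`, `FakeStore`, and all method names are hypothetical and are not Malhar classes. The sketch also assumes tuples are replayed in the same order after recovery, so that batch boundaries line up; as noted elsewhere in this thread, that ordering is not guaranteed in general.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Malhar.
class BatchCheckpointSketch {
    // Stand-in for the database: data rows plus the last committed marker.
    static class FakeStore {
        final List<String> rows = new ArrayList<>();
        long committedWindowId = -1;
        int committedBatchId = 0;

        // In a real database the rows and the marker would have to be
        // written in one transaction so they cannot diverge.
        void commit(List<String> batch, long windowId, int batchId) {
            rows.addAll(batch);
            committedWindowId = windowId;
            committedBatchId = batchId;
        }
    }

    private final FakeStore store;
    private final int batchSize;
    private final List<String> batch = new ArrayList<>();
    private long windowId;
    private int batchId;

    BatchCheckpointSketch(FakeStore store, int batchSize) {
        this.store = store;
        this.batchSize = batchSize;
    }

    void beginWindow(long windowId) {
        this.windowId = windowId;
        this.batchId = 0;      // batch ids restart in every window
        batch.clear();
    }

    void processTuple(String tuple) {
        batch.add(tuple);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    void endWindow() {
        flush();               // write whatever is left in the window
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        batchId++;
        // Skip batches that the store already holds from before the failure.
        if (!alreadyCommitted()) {
            store.commit(new ArrayList<>(batch), windowId, batchId);
        }
        batch.clear();
    }

    private boolean alreadyCommitted() {
        return windowId < store.committedWindowId
            || (windowId == store.committedWindowId && batchId <= store.committedBatchId);
    }
}
```

On replay of window 10 after two batches were committed, the first two batches are rebuilt but not written, and writing resumes with the third.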

Re: Writing batches to database using Transactionable Store Output operator

2015-12-29 Thread Sandeep Deshmukh
Not sure if "At least once" is the right behavior for databases. We may not always have a primary key to update or insert. Regards, Sandeep On Tue, Dec 29, 2015 at 2:23 PM, Priyanka Gugale wrote: > Hi, > > Thanks for your inputs Chandni. I guess what you are suggesting is similar > to AbstractJdbcNo

Re: Writing batches to database using Transactionable Store Output operator

2015-12-29 Thread Priyanka Gugale
Hi, Thanks for your inputs Chandni. I guess what you are suggesting is similar to AbstractJdbcNonTransactionableBatchOutputOperator, which is a batch, non-transactional operation. That is one good option. I am also thinking of a possibility of having "At least once" behavior with Transactional

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chandni Singh
Yeah, I understand there is a problem in that the app window size is time-based here, not based on the number of events. However, I don't think having a max batch size in this class will help, because that causes problems with saving the tuples exactly once and idempotency. I was just trying to let you know why the

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chinmay Kolhatkar
Hi Chandni, I totally agree with you that the transactions should be idempotent. And that needs to be taken care of if the batch size is configurable. Though, I have a question related to the second part where batch size is controlled by controlling app window size. I agree with you that aggre

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chandni Singh
Hey Chinmay/Priyanka, We need to write tuples exactly once in the store. Please address the failure scenarios on how to achieve exactly once and idempotency. I mentioned in my previous mail why multiple batches in a window is a problem with exactly once. Control via app window would mean, tuning

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chinmay Kolhatkar
Hi, Just a thought on how it can possibly be done. The pseudo code might look like this:

    processTuple() {
      if (batchSize < configuredBatchSize) {
        // add to the batch
      } else {
        // process the batch as a transaction
        // empty the data structure of batch.
      }
    }
    endWindow() {
      // process the batch as
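
A runnable sketch of the size-bounded batching described in the pseudo code above, under stated assumptions: `SizeBoundedBatcher` and its members are hypothetical names, and the "database" is an in-memory list where each entry stands for one committed transaction. Unlike the pseudo code, this version adds the incoming tuple before checking the size, so the tuple that fills the batch is not dropped. It does not address the exactly-once concerns raised in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Malhar.
class SizeBoundedBatcher {
    private final int maxBatchSize;
    private final List<String> batch = new ArrayList<>();
    // Stand-in for the database: each inner list is one committed transaction.
    final List<List<String>> committed = new ArrayList<>();

    SizeBoundedBatcher(int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        this.maxBatchSize = maxBatchSize;
    }

    void processTuple(String tuple) {
        batch.add(tuple);
        // Flush as soon as the configured size is reached.
        if (batch.size() >= maxBatchSize) {
            flush();
        }
    }

    void endWindow() {
        // Flush the remainder so no tuple crosses a window boundary.
        flush();
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        committed.add(new ArrayList<>(batch)); // "process the batch as a transaction"
        batch.clear();                         // "empty the data structure of batch"
    }
}
```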

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chandni Singh
But you will not allow multiple batches in the same window? Can you please elaborate on failure scenarios and how it affects idempotency. Chandni On Mon, Dec 28, 2015 at 2:32 AM, Priyanka Gugale wrote: > Hi, > > Sorry if I was not clear, but I am trying to propose the MAX_SIZE per > window whic

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Priyanka Gugale
Hi, Sorry if I was not clear, but I am trying to propose the MAX_SIZE per window which the operator could process. The size could be less than the MAX_SIZE, no restriction about that. -Priyanka On Mon, Dec 28, 2015 at 3:22 PM, Chandni Singh wrote: > How do you propose to restrict the no. of

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chandni Singh
How do you propose to restrict the no. of tuples processed in an application window < batch size? I don't see a way to enforce that batch size can never be less than the tuples processed in an application window. On Mon, Dec 28, 2015 at 1:25 AM, Priyanka Gugale wrote: > Hi Chandni, > > How about rest

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Priyanka Gugale
Hi Chandni, How about restricting the number of tuples which can be processed per window? If someone wants to process small and frequent batches, they can set batch size to some small value and also reduce the window size. This would build some back pressure of course. But that could be acceptable if one really w

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chandni Singh
Priyanka, AbstractBatchTransactionableStore assumes all tuples in one application window as a batch because it needs to store the tuples in the store exactly-once. If there is more than one batch in an application window, then to store the tuples exactly once the window Id needs to be written with every
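
To illustrate why one batch per application window makes exactly-once straightforward, here is a minimal sketch, assuming the window's tuples and the window Id are committed together. `WindowAtomicWriter` and its fields are hypothetical and use in-memory state in place of a real store; this is not the Malhar implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Malhar.
class WindowAtomicWriter {
    // Stand-ins for durable state that would live in the database.
    final List<String> rows = new ArrayList<>();
    long committedWindowId = -1;

    // Transient operator state, rebuilt on every window.
    private final List<String> batch = new ArrayList<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
        batch.clear();
    }

    void processTuple(String tuple) {
        // Tuples of an already committed window are replays; ignore them.
        if (currentWindowId > committedWindowId) {
            batch.add(tuple);
        }
    }

    void endWindow() {
        if (currentWindowId <= committedWindowId) {
            return; // the whole window was stored before the failure
        }
        // In a real store these two writes would share one transaction.
        rows.addAll(batch);
        committedWindowId = currentWindowId;
        batch.clear();
    }
}
```

Because the whole window is the unit of commit, a replayed window is either entirely present or entirely absent, and a single window Id comparison is enough to decide whether to write it.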

Re: Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Chinmay Kolhatkar
+1 for this. ~ Chinmay. On Mon, Dec 28, 2015 at 2:27 PM, Priyanka Gugale wrote: > Hi, > > In Malhar we have an > operator AbstractBatchTransactionableStoreOutputOperator which creates > batches based on tuples received in a window. At the end of the window > these batches are sent to database f

Writing batches to database using Transactionable Store Output operator

2015-12-28 Thread Priyanka Gugale
Hi, In Malhar we have an operator, AbstractBatchTransactionableStoreOutputOperator, which creates batches based on tuples received in a window. At the end of the window these batches are sent to the database for processing. There is no way to configure MAX_SIZE on these batches. Based on input rate the