Hi Yi,
I got it now, thanks for your help!
————————
Qi Shu
> 在 2017年5月18日,05:22,Yi Pan <[email protected]> 写道:
>
> Hi, Qi,
>
> This would depend on the following two factors:
> # whether the send() is async or sync
> # how do you handle the send failure
>
> If the send() is sync, you will always receive an exception in your
> process() method when MessageCollector.send() is invoked. Hence, if your
> code does not handle the exception, it would be thrown out to the RunLoop
> and the whole container will fail. If your code captures the exception, it
> is then up to your application logic to deal with the send failure (i.e.
> user will need to choose either ignore the send failure and proceed, or
> fail and stop). If you choose to not ignore the send failures, then in this
> case, the checkpoint will not proceed beyond the input that caused the send
> failures, and the container will restart with the previous checkpoint,
> which does not cause data loss.
>
> If the send() is async, the commit procedure in RunLoop will make sure to
> flush all pending sends before checkpointing. If the flush fails, the
> exception will be thrown out and the container will fail. Hence, when
> restarted, the container will repeat from the previous checkpoint (i.e. at
> least once delivery still holds and no data loss).
>
> Hope the above answers your question.
>
> Thanks!
>
> -Yi
>
> On Thu, May 11, 2017 at 12:43 AM, 舒琦 <[email protected]> wrote:
>
>> Hi Jagadish,
>>
>> I may not express my questions clearly.
>>
>> Here is what I want to know. When MessageCollector.send is called
>> in process method, if sending fail and fail again, under this situation is
>> it possible to cause data loss ( continue to fetch and process messages,
>> but can’t send them out, at the same time offset is still forwarding and
>> checkpointing ).
>>
>> Thanks very much.
>>
>> ————————
>> Qi Shu
>>
>>> 在 2017年5月11日,15:35,Jagadish Venkatraman <[email protected]> 写道:
>>>
>>> Hi Qi,
>>>
>>>>> If one record can’t be sent out all the time, then the consumer
>>> will still fetch messages or not, and what about the offset
>> checkpointing?
>>>
>>> Polling / fetching messages from the consumer (in case of Kafka) happens
>> in
>>> a separate thread.
>>>
>>> Samza offers an at-least once processing guarantee with zero data loss.
>>>
>>> I'm not sure I understand your specific question about checkpointing?
>>>
>>>
>>> On Thu, May 11, 2017 at 12:28 AM, 舒琦 <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Below is the description about checkpointing.
>>>>
>>>> 『Checkpointing is guaranteed to only cover events that are fully
>>>> processed. It happens only when there are no pending
>>>> process()/processAsync() or WindowableTask.window() invocations. All
>>>> preceding invocations happen-before checkpointing and checkpointing
>>>> happens-before all subsequent invocations.』
>>>>
>>>> If one record can’t be sent out all the time, then the consumer
>>>> will still fetch messages or not, and what about the offset
>> checkpointing?
>>>>
>>>> Thanks!
>>>>
>>>> ————————
>>>> Qi Shu
>>>
>>>
>>>
>>>
>>> --
>>> Jagadish V,
>>> Graduate Student,
>>> Department of Computer Science,
>>> Stanford University
>>
>>