[ 
https://issues.apache.org/jira/browse/KAFKA-6471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337075#comment-16337075
 ] 

Coen Damen edited comment on KAFKA-6471 at 1/24/18 7:40 AM:
------------------------------------------------------------

Hi Jason, thanks for your reply.

The Use Case is the following.

We retrieve log files from a machine and transform the log records into Kafka 
messages. Writing a log file into Kafka is atomic: in case of a read failure 
(of a file) or a write failure (to Kafka), the transaction that writes the 
messages to Kafka should be aborted and retried.

When retrying, or after being idle for a longer time, on a restart or at the 
start of the "job" we want to read where processing halted, i.e. the last 
successfully processed file. For this I expected to use seekToEnd with a 
Consumer configured with read_committed. But it moved to the end of the Topic, 
even though the Topic contained many aborted messages at the end.

Note: the filename and the index within the file are part of the message, so we 
want to retrieve the last successfully committed message and extract the 
filename from it.
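The backward-scan workaround discussed in the issue (seek back and poll until 
records appear) can be sketched abstractly. This is a hypothetical sketch, not 
part of the Kafka API: the `poll` lambda stands in for a read_committed 
seek-and-poll over an offset range, and `lastCommittedOffset` and all names are 
assumptions for illustration.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongFunction;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class LastCommitted {
    // Hypothetical sketch: step back from the end offset in fixed-size
    // chunks, "polling" committed records until some are found. `poll`
    // stands in for a read_committed consumer doing seek(from) + poll()
    // over [from, endOffset); aborted records are never returned.
    static OptionalLong lastCommittedOffset(long endOffset, long step,
                                            LongFunction<List<Long>> poll) {
        long from = Math.max(0, endOffset - step);
        while (true) {
            List<Long> records = poll.apply(from);
            if (!records.isEmpty()) {
                // Last element is the highest committed offset seen.
                return OptionalLong.of(records.get(records.size() - 1));
            }
            if (from == 0) {
                return OptionalLong.empty(); // nothing committed at all
            }
            from = Math.max(0, from - step);
        }
    }

    public static void main(String[] args) {
        // Simulated topic: end offset 120, but offsets 100..119 belong to
        // aborted transactions, so a read_committed poll filters them out.
        LongFunction<List<Long>> poll = from ->
            LongStream.range(from, 120)
                      .filter(o -> o < 100)
                      .boxed()
                      .collect(Collectors.toList());
        System.out.println(lastCommittedOffset(120, 10, poll)); // OptionalLong[99]
    }
}
```

With a real consumer, each `poll.apply(from)` would be a seek to `from` 
followed by polling up to the end offset; the step size trades extra polls 
against re-reading committed records.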

Thank you,

Coen

 



> seekToEnd and seek give unclear results for Consumer with read_committed 
> isolation level
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6471
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6471
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Coen Damen
>            Priority: Major
>
> I am using the transactional KafkaProducer to send messages to a topic. This 
> works fine. I use a KafkaConsumer with read_committed isolation level and I 
> have an issue with the seek and seekToEnd methods. According to the 
> documentation, seek and seekToEnd give me the LSO (Last Stable Offset). But 
> this is confusing, as they always give me the same value: the end of the 
> topic, no matter whether the last entry was committed (by the Producer) or 
> part of an aborted transaction. For example, after I abort the last 5 
> attempts to insert 20_000 messages each, the last 100_000 records should not 
> be read by the Consumer. Yet seekToEnd moves to the end of the Topic (past 
> the 100_000 messages), while poll() does not return them.
> I am looking for a way to retrieve the last committed offset (the offset of 
> the last message successfully committed by the Producer). There seems to be 
> no proper API method for this, so do I need to roll my own?
> One option would be to seek backwards and poll until no more records are 
> returned, which would yield the last committed message. But I would assume 
> that Kafka provides such a method.
> We use Kafka 1.0.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
