This used to be working but it appears @FinalizeBundle (which KafkaIO requires) was simply ignored for portable (Python) pipelines. It looks relatively easy to fix.

-Max

On 07.07.20 03:37, Luke Cwik wrote:
The KafkaIO implementation relies on checkpointing to be able to update the last committed offset. This is currently unsupported in the portable Flink runner. BEAM-6868[1] is the associated JIRA. Please vote on it and/or offer to provide an implementation.

1: https://issues.apache.org/jira/browse/BEAM-6868

On Mon, Jul 6, 2020 at 1:42 PM Piotr Filipiuk <piotr.filip...@gmail.com <mailto:piotr.filip...@gmail.com>> wrote:

    Hi,

    I am trying to run a simple example that uses Python API to
    ReadFromKafka, however I am getting the following error when using
    Flink Runner:

    java.lang.UnsupportedOperationException: The ActiveBundle does not
    have a registered bundle checkpoint handler.

    See full log in read_from_kafka_flink.log

    I am using:
    Kafka 2.5.0
    Beam 2.22.0
    Flink 1.10

    When using Direct runner, the pipeline does not fail but does not
    seem to be consuming any data (see read_from_kafka.log) even though
    the updated offsets are being logged:

    [2020-07-06 13:36:01,342] {worker_handlers.py:398} INFO - severity: INFO
    timestamp {
       seconds: 1594067761
       nanos: 340000000
    }
    message: "Reader-0: reading from test-topic-0 starting at offset 165"
    log_location: "org.apache.beam.sdk.io.kafka.KafkaUnboundedSource"
    thread: "23"

    I am running both Kafka and Flink locally. I would appreciate your
    help understanding and fixing the issue.

-- Best regards,
    Piotr

Reply via email to