Re: Specific use-case question - Kafka-to-GCS-avro-Python

2024-03-13 Thread XQ Hu via user
Can you explain more about "that current sinks for Avro and Parquet with the destination of GCS are not supported"? We do have AvroIO and ParquetIO (https://beam.apache.org/documentation/io/connectors/) in Python. On Wed, Mar 13, 2024 at 5:35 PM Ondřej Pánek wrote: > Hello Beam team!
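
For reference, a minimal sketch of the Python SDK's Avro sink writing to a GCS path; the bucket, schema, and element shape below are illustrative placeholders, not taken from the thread, and ParquetIO's WriteToParquet is used in the same way:

import apache_beam as beam
from apache_beam.io.avroio import WriteToAvro

# Placeholder Avro schema; a real pipeline would use the source's schema.
SCHEMA = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "payload", "type": "string"},
    ],
}

with beam.Pipeline() as p:
    (
        p
        | "Create" >> beam.Create([{"id": "1", "payload": "hello"}])
        | "WriteAvro" >> WriteToAvro(
            "gs://example-bucket/output/events",  # hypothetical GCS prefix
            schema=SCHEMA,
            file_name_suffix=".avro",
        )
    )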

Specific use-case question - Kafka-to-GCS-avro-Python

2024-03-13 Thread Ondřej Pánek
Hello Beam team! We’re currently onboarding the customer’s infrastructure to the Google Cloud Platform. The decision was made that one of the technologies they will use is Dataflow. Let me briefly describe the use-case specification: they have a Kafka cluster where data from a CDC data source is stored. The data
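
A minimal sketch of what the Python side of such a Kafka-to-GCS-Avro pipeline could look like, assuming the cross-language Kafka connector; the broker address, topic, payload decoding, and GCS prefix are placeholders, and max_num_records bounds the read for the sketch only (a real streaming job would drop it and window the stream before the file write):

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.io.avroio import WriteToAvro

# Hypothetical schema for the decoded CDC records.
SCHEMA = {"type": "record", "name": "CdcRecord",
          "fields": [{"name": "raw", "type": "string"}]}

def to_record(kv):
    # Kafka elements arrive as (key, value) byte tuples; real CDC payloads
    # would be deserialized according to their actual encoding.
    _, value = kv
    return {"raw": value.decode("utf-8")}

with beam.Pipeline() as p:
    (
        p
        | "ReadKafka" >> ReadFromKafka(
            consumer_config={"bootstrap.servers": "broker:9092"},  # placeholder
            topics=["cdc-topic"],  # placeholder
            max_num_records=100,   # bounds the read for this sketch only
        )
        | "Decode" >> beam.Map(to_record)
        | "WriteAvro" >> WriteToAvro("gs://example-bucket/cdc/output", schema=SCHEMA)
    )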

Re: Expansion service for SqlTranform fails with a local flink cluster using Python SDK

2024-03-13 Thread Chamikara Jayalath via user
> When I check the expansion service docker container, normally it downloads a JAR file and starts SDK Fn Harness
To clarify the terminology here, I think you meant the Java SDK harness container, not the expansion service. The expansion service is only needed during job submission, and your failure is
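
For context, a minimal sketch of running SqlTransform from the Python SDK against a local Flink cluster; the runner options (flink_master address, LOOPBACK environment) and the element schema are assumptions about a typical local setup, and the Java expansion service is started automatically during job submission:

import typing
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.sql import SqlTransform

# Illustrative schema-aware element type.
class Item(typing.NamedTuple):
    name: str
    price: float

beam.coders.registry.register_coder(Item, beam.coders.RowCoder)

options = PipelineOptions(
    runner="FlinkRunner",
    flink_master="localhost:8081",  # assumed local Flink REST address
    environment_type="LOOPBACK",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | beam.Create([Item("apple", 1.0), Item("pear", 2.0)])
        | SqlTransform("SELECT name, price FROM PCOLLECTION WHERE price > 1.5")
        | beam.Map(print)
    )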

Modelling fire-and-forget actions with dependencies

2024-03-13 Thread Florent Biville
Hello everyone, I am working on the Dataflow Template for (imports to) Neo4j and we are currently in the middle of revisiting the whole Beam pipeline logic. The main concepts are: - data sources: typically data from a SQL query or a text file - import targets: a target is linked to a
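
One common Python-SDK pattern for this kind of ordering dependency (roughly the equivalent of Java's Wait.on) is to feed the result of the upstream action in as a side input, so the downstream action only runs once the upstream one has completed; the names below are hypothetical and not the template's actual design:

import apache_beam as beam

class ImportTarget(beam.DoFn):
    """Runs an import action only after the upstream side input is ready."""
    def process(self, element, upstream_done):
        # For bounded data, a side input is fully computed before this DoFn
        # processes its main input, which enforces the dependency.
        _ = list(upstream_done)
        yield f"imported {element}"

with beam.Pipeline() as p:
    nodes = p | "Nodes" >> beam.Create(["node-a", "node-b"])
    node_results = nodes | "ImportNodes" >> beam.Map(lambda n: f"imported {n}")

    edges = p | "Edges" >> beam.Create(["edge-1"])
    # Relationship imports depend on node imports having finished.
    _ = edges | "ImportEdges" >> beam.ParDo(
        ImportTarget(), upstream_done=beam.pvalue.AsIter(node_results)
    )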