Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-12 Thread Jan Lukavský

Can you try to exclude only the "kafka-avro-serializer"?
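
A minimal sketch of that exclusion in Gradle, assuming the `io.confluent` group ID that appears in the repository paths of the original post and the `compile` configuration used in the build file quoted below (newer Gradle versions use `implementation` instead). It is untested against the Dataflow runner:

```groovy
dependencies {
    compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion") {
        // Keep beam-sdks-java-io-kafka on the classpath (DataflowRunner
        // references KafkaIO$Read) and drop only the Confluent artifact.
        exclude group: 'io.confluent', module: 'kafka-avro-serializer'
    }
}
```

If other `io.confluent` artifacts are pulled in transitively, they may need the same treatment; `exclude group: 'io.confluent'` without a module would drop them all.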

On 7/12/21 6:28 PM, Alex Van Boxel wrote:
That's not the problem. The example Gradle build excludes it, and the
Dataflow runner fails with the error stated in the original post.


 _/
_/ Alex Van Boxel


On Mon, Jul 12, 2021 at 6:27 PM Alexey Romanenko wrote:


I agree that we should make this optional. What would be the best
way to do it with Gradle?


On 11 Jul 2021, at 16:40, Jan Lukavský wrote:

I'd be +1 to making it optional as well. Looks really like an
overhead for users not using avro.

On 7/11/21 10:36 AM, Alex Van Boxel wrote:

It worked before 2.30. It's fine when you're using Confluent
Kafka, but it feels like a hard dependency for non-Kafka users,
certainly given the requirement to include an extra repo. Certain
companies have to go through a lengthy process to add extra
repos. It feels like a strange requirement, for nothing. Isn't
it a bug in the DataflowRunner?

dependencies {
compile("org.apache.beam:beam-sdks-java-core:$beamVersion")
compile("org.apache.beam:beam-runners-direct-java:$beamVersion")

compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
// waiting for response on mailing list (2.30 onwards), dataflow runner fails
//    {
//        exclude module: 'beam-sdks-java-io-kafka'
//    }

compile("org.apache.beam:beam-sdks-java-io-elasticsearch:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-jdbc:$beamVersion")

compile("org.apache.beam:beam-sdks-java-extensions-protobuf:$beamVersion")

compile("org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-sql:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-zetasketch:$beamVersion")

compile("org.apache.beam:beam-sdks-java-extensions-json-jackson:$beamVersion")


compile("org.apache.beam:beam-sdks-java-io-google-cloud-platform:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-common:$beamVersion")

// force to the JRE, as the android version is auto resolved
compile("com.google.guava:guava:30.1.1-jre")

compile("org.slf4j:slf4j-log4j12:1.7.30")
compile("commons-io:commons-io:2.8.0")

compile("io.opentelemetry:opentelemetry-proto:1.3.0-alpha")

compile("com.microsoft.sqlserver:mssql-jdbc:9.1.0.jre8-preview")

compile("io.swagger.parser.v3:swagger-parser:2.0.24")

testCompile("org.hamcrest:hamcrest-all:1.3")
testCompile("org.assertj:assertj-core:3.4.1")
testCompile("junit:junit:4.12")
}


 _/
_/ Alex Van Boxel


On Fri, Jul 9, 2021 at 1:55 PM Alexey Romanenko wrote:

Hi Alex,

Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka"
requires an additional dependency "kafka-avro-serializer"
from an external repository
(https://packages.confluent.io/maven/).

This is reflected in published POM file:

https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar



Did it work for you before version 2.30.0?
Could you share your build.gradle file?

—
Alexey




On 9 Jul 2021, at 11:23, Alex Van Boxel wrote:

Hi all,

I've been building for years via gradle. The dependency
management is probably a bit different from that of maven,
but it seems that dataflow now requires Confluent Kafka
dependencies. They are not available in Maven Central. This
feels wrong for an Apache project.

 - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom

Excluding the dependency with "exclude module:
'beam-sdks-java-io-kafka'" doesn't work; it fails with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
    at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
at


Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-12 Thread Alex Van Boxel
That's not the problem. The example Gradle build excludes it, and the Dataflow
runner fails with the error stated in the original post.

 _/
_/ Alex Van Boxel


On Mon, Jul 12, 2021 at 6:27 PM Alexey Romanenko 
wrote:

> I agree that we should make this optional. What would be the best way to
> do it with Gradle?
>
> On 11 Jul 2021, at 16:40, Jan Lukavský  wrote:
>
> I'd be +1 to making it optional as well. Looks really like an overhead for
> users not using avro.
> On 7/11/21 10:36 AM, Alex Van Boxel wrote:
>
> It worked before 2.30. It's fine when you're using Confluent Kafka, but it
> feels like a hard dependency for non-Kafka users, certainly given the
> requirement to include an extra repo. Certain companies have to go through
> a lengthy process to add extra repos. It feels like a strange requirement,
> for nothing. Isn't it a bug in the DataflowRunner?
>
> dependencies {
> compile("org.apache.beam:beam-sdks-java-core:$beamVersion")
> compile("org.apache.beam:beam-runners-direct-java:$beamVersion")
>
> compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
> // waiting for response on mailing list (2.30 onwards), dataflow runner fails
> //{
> //exclude module: 'beam-sdks-java-io-kafka'
> //}
>
> compile("org.apache.beam:beam-sdks-java-io-elasticsearch:$beamVersion")
> compile("org.apache.beam:beam-sdks-java-io-jdbc:$beamVersion")
>
> compile("org.apache.beam:beam-sdks-java-extensions-protobuf:$beamVersion")
>
> compile("org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:$beamVersion")
> compile("org.apache.beam:beam-sdks-java-extensions-sql:$beamVersion")
>
> compile("org.apache.beam:beam-sdks-java-extensions-zetasketch:$beamVersion")
>
> compile("org.apache.beam:beam-sdks-java-extensions-json-jackson:$beamVersion")
>
>
> compile("org.apache.beam:beam-sdks-java-io-google-cloud-platform:$beamVersion")
> compile("org.apache.beam:beam-sdks-java-io-common:$beamVersion")
>
> // force to the JRE, as the android version is auto resolved
> compile("com.google.guava:guava:30.1.1-jre")
>
> compile("org.slf4j:slf4j-log4j12:1.7.30")
> compile("commons-io:commons-io:2.8.0")
>
> compile("io.opentelemetry:opentelemetry-proto:1.3.0-alpha")
>
> compile("com.microsoft.sqlserver:mssql-jdbc:9.1.0.jre8-preview")
>
> compile("io.swagger.parser.v3:swagger-parser:2.0.24")
>
> testCompile("org.hamcrest:hamcrest-all:1.3")
> testCompile("org.assertj:assertj-core:3.4.1")
> testCompile("junit:junit:4.12")
> }
>
>
>  _/
> _/ Alex Van Boxel
>
>
> On Fri, Jul 9, 2021 at 1:55 PM Alexey Romanenko 
> wrote:
>
>> Hi Alex,
>>
>> Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka" requires an
>> additional dependency "kafka-avro-serializer" from an external repository
>> (https://packages.confluent.io/maven/).
>>
>> This is reflected in published POM file:
>>
>> https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar
>>
>> Did it work for you before version 2.30.0?
>> Could you share your build.gradle file?
>>
>> —
>> Alexey
>>
>>
>>
>> On 9 Jul 2021, at 11:23, Alex Van Boxel  wrote:
>>
>> Hi all,
>>
>> I've been building for years via gradle. The dependency management is
>> probably a bit different from that of maven, but it seems that dataflow now
>> requires Confluent Kafka dependencies. They are not available in Maven
>> Central. This feels wrong for an Apache project.
>>
>>  - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>  - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>  - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>
>> Excluding the dependency with "exclude module: 'beam-sdks-java-io-kafka'"
>> doesn't work; it fails with:
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
>>     at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
>>     at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
>>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
>>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)
>>
>> This happens from version 2.30 onwards. Is this intended?!
>>
>>  _/
>> _/ Alex Van Boxel
>>
>>
>>
>


Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-12 Thread Alexey Romanenko
I agree that we should make this optional. What would be the best way to do it
with Gradle?
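
One possible approach, sketched here as an assumption rather than as how Beam's own build works: in a Gradle `java-library` build, a `compileOnly` dependency is available at compile time but is not listed among the dependencies in the published POM, which is roughly the effect of Maven's optional or provided scope. The coordinates are taken from the paths in the original post.

```groovy
plugins {
    id 'java-library'
}

repositories {
    mavenCentral()
    // Needed only to build the library itself, not by its consumers.
    maven { url 'https://packages.confluent.io/maven/' }
}

dependencies {
    // Compiled against, but not exported to consumers as a dependency;
    // users who need the Avro serializer add it to their own build.
    compileOnly 'io.confluent:kafka-avro-serializer:5.3.2'
    // Tests still need the classes at runtime.
    testImplementation 'io.confluent:kafka-avro-serializer:5.3.2'
}
```

With this layout, any code path that touches the Confluent classes has to tolerate their absence at runtime, so it only helps if those classes are loaded lazily.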

> On 11 Jul 2021, at 16:40, Jan Lukavský  wrote:
> 
> I'd be +1 to making it optional as well. Looks really like an overhead for 
> users not using avro.
> 
> On 7/11/21 10:36 AM, Alex Van Boxel wrote:
>> It worked before 2.30. It's fine when you're using Confluent Kafka, but it
>> feels like a hard dependency for non-Kafka users, certainly given the
>> requirement to include an extra repo. Certain companies have to go through
>> a lengthy process to add extra repos. It feels like a strange requirement,
>> for nothing. Isn't it a bug in the DataflowRunner?
>> 
>> dependencies {
>> compile("org.apache.beam:beam-sdks-java-core:$beamVersion")
>> compile("org.apache.beam:beam-runners-direct-java:$beamVersion")
>> compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
>> // waiting for response on mailinglist (2.30 onwards), dataflow runner fails
>> //{
>> //exclude module: 'beam-sdks-java-io-kafka'
>> //}
>> 
>> compile("org.apache.beam:beam-sdks-java-io-elasticsearch:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-io-jdbc:$beamVersion")
>> 
>> compile("org.apache.beam:beam-sdks-java-extensions-protobuf:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-extensions-sql:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-extensions-zetasketch:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-extensions-json-jackson:$beamVersion")
>> 
>> compile("org.apache.beam:beam-sdks-java-io-google-cloud-platform:$beamVersion")
>> compile("org.apache.beam:beam-sdks-java-io-common:$beamVersion")
>> 
>> // force to the JRE, as the android version is auto resolved
>> compile("com.google.guava:guava:30.1.1-jre")
>> 
>> compile("org.slf4j:slf4j-log4j12:1.7.30")
>> compile("commons-io:commons-io:2.8.0")
>> 
>> compile("io.opentelemetry:opentelemetry-proto:1.3.0-alpha")
>> 
>> compile("com.microsoft.sqlserver:mssql-jdbc:9.1.0.jre8-preview")
>> 
>> compile("io.swagger.parser.v3:swagger-parser:2.0.24")
>> 
>> testCompile("org.hamcrest:hamcrest-all:1.3")
>> testCompile("org.assertj:assertj-core:3.4.1")
>> testCompile("junit:junit:4.12")
>> }
>> 
>> 
>>  _/
>> _/ Alex Van Boxel
>> 
>> 
>> On Fri, Jul 9, 2021 at 1:55 PM Alexey Romanenko wrote:
>> Hi Alex,
>> 
>> Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka" requires an
>> additional dependency "kafka-avro-serializer" from an external repository
>> (https://packages.confluent.io/maven/).
>> 
>> This is reflected in published POM file: 
>> https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar
>>  
>> 
>> 
>> Did it work for you before version 2.30.0? 
>> Could you share your build.gradle file? 
>> 
>> —
>> Alexey
>> 
>> 
>> 
>>> On 9 Jul 2021, at 11:23, Alex Van Boxel wrote:
>>> 
>>> Hi all,
>>> 
>>> I've been building for years via gradle. The dependency management is 
>>> probably a bit different from that of maven, but it seems that dataflow now 
>>> requires Confluent Kafka dependencies. They are not available in Maven 
>>> Central. This feels wrong for an Apache project.
>>> 
>>>  - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>>  - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>>  - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>>>
>>> Excluding the dependency with "exclude module: 'beam-sdks-java-io-kafka'"
>>> doesn't work; it fails with:
>>>
>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
>>>     at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
>>>     at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
>>>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
>>>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)
>>> 
>>> This happens from version 2.30 onwards. Is this intended?!
>>>  
>>>  _/
>>> _/ Alex Van Boxel
>> 



Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-11 Thread Jan Lukavský
I'd be +1 to making it optional as well. Looks really like an overhead 
for users not using avro.


On 7/11/21 10:36 AM, Alex Van Boxel wrote:
It worked before 2.30. It's fine when you're using Confluent
Kafka, but it feels like a hard dependency for non-Kafka users,
certainly given the requirement to include an extra repo. Certain
companies have to go through a lengthy process to add extra repos.
It feels like a strange requirement, for nothing. Isn't it a bug in
the DataflowRunner?


dependencies {
compile("org.apache.beam:beam-sdks-java-core:$beamVersion")
compile("org.apache.beam:beam-runners-direct-java:$beamVersion")
compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
// waiting for response on mailing list (2.30 onwards), dataflow runner fails
//    {
//        exclude module: 'beam-sdks-java-io-kafka'
//    }

compile("org.apache.beam:beam-sdks-java-io-elasticsearch:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-jdbc:$beamVersion")

compile("org.apache.beam:beam-sdks-java-extensions-protobuf:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-sql:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-zetasketch:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-json-jackson:$beamVersion")

compile("org.apache.beam:beam-sdks-java-io-google-cloud-platform:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-common:$beamVersion")

// force to the JRE, as the android version is auto resolved
compile("com.google.guava:guava:30.1.1-jre")

compile("org.slf4j:slf4j-log4j12:1.7.30")
compile("commons-io:commons-io:2.8.0")

compile("io.opentelemetry:opentelemetry-proto:1.3.0-alpha")

compile("com.microsoft.sqlserver:mssql-jdbc:9.1.0.jre8-preview")

compile("io.swagger.parser.v3:swagger-parser:2.0.24")

testCompile("org.hamcrest:hamcrest-all:1.3")
testCompile("org.assertj:assertj-core:3.4.1")
testCompile("junit:junit:4.12")
}


 _/
_/ Alex Van Boxel


On Fri, Jul 9, 2021 at 1:55 PM Alexey Romanenko wrote:


Hi Alex,

Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka"
requires an additional dependency "kafka-avro-serializer" from an
external repository (https://packages.confluent.io/maven/).

This is reflected in published POM file:

https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar



Did it work for you before version 2.30.0?
Could you share your build.gradle file?

—
Alexey




On 9 Jul 2021, at 11:23, Alex Van Boxel wrote:

Hi all,

I've been building for years via gradle. The dependency
management is probably a bit different from that of maven, but it
seems that dataflow now requires Confluent Kafka dependencies.
They are not available in Maven Central. This feels wrong for an
Apache project.

 - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom

Excluding the dependency with "exclude module:
'beam-sdks-java-io-kafka'" doesn't work; it fails with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
    at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
    at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)

This happens from version 2.30 onwards. Is this intended?!
 _/
_/ Alex Van Boxel




Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-11 Thread Alex Van Boxel
It worked before 2.30. It's fine when you're using Confluent Kafka, but it
feels like a hard dependency for non-Kafka users, certainly given the
requirement to include an extra repo. Certain companies have to go through
a lengthy process to add extra repos. It feels like a strange requirement,
for nothing. Isn't it a bug in the DataflowRunner?

dependencies {
compile("org.apache.beam:beam-sdks-java-core:$beamVersion")
compile("org.apache.beam:beam-runners-direct-java:$beamVersion")
compile("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
// waiting for response on mailinglist (2.30 onwards), dataflow runner fails
//{
//exclude module: 'beam-sdks-java-io-kafka'
//}

compile("org.apache.beam:beam-sdks-java-io-elasticsearch:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-jdbc:$beamVersion")

compile("org.apache.beam:beam-sdks-java-extensions-protobuf:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-sql:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-zetasketch:$beamVersion")
compile("org.apache.beam:beam-sdks-java-extensions-json-jackson:$beamVersion")

compile("org.apache.beam:beam-sdks-java-io-google-cloud-platform:$beamVersion")
compile("org.apache.beam:beam-sdks-java-io-common:$beamVersion")

// force to the JRE, as the android version is auto resolved
compile("com.google.guava:guava:30.1.1-jre")

compile("org.slf4j:slf4j-log4j12:1.7.30")
compile("commons-io:commons-io:2.8.0")

compile("io.opentelemetry:opentelemetry-proto:1.3.0-alpha")

compile("com.microsoft.sqlserver:mssql-jdbc:9.1.0.jre8-preview")

compile("io.swagger.parser.v3:swagger-parser:2.0.24")

testCompile("org.hamcrest:hamcrest-all:1.3")
testCompile("org.assertj:assertj-core:3.4.1")
testCompile("junit:junit:4.12")
}


 _/
_/ Alex Van Boxel


On Fri, Jul 9, 2021 at 1:55 PM Alexey Romanenko 
wrote:

> Hi Alex,
>
> Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka" requires an
> additional dependency "kafka-avro-serializer" from an external repository
> (https://packages.confluent.io/maven/).
>
> This is reflected in published POM file:
>
> https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar
>
> Did it work for you before version 2.30.0?
> Could you share your build.gradle file?
>
> —
> Alexey
>
>
>
> On 9 Jul 2021, at 11:23, Alex Van Boxel  wrote:
>
> Hi all,
>
> I've been building for years via gradle. The dependency management is
> probably a bit different from that of maven, but it seems that dataflow now
> requires Confluent Kafka dependencies. They are not available in Maven
> Central. This feels wrong for an Apache project.
>
>  - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>  - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>  - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>
> Excluding the dependency with "exclude module: 'beam-sdks-java-io-kafka'"
> doesn't work; it fails with:
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
>     at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
>     at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)
>
> This happens from version 2.30 onwards. Is this intended?!
>
>  _/
> _/ Alex Van Boxel
>
>
>


Re: Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-09 Thread Alexey Romanenko
Hi Alex,

Yes, starting from Beam 2.20.0, "beam-sdks-java-io-kafka" requires an
additional dependency "kafka-avro-serializer" from an external repository
(https://packages.confluent.io/maven/).
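
For Gradle users who do want the Confluent artifact resolved, a sketch of declaring that repository next to Maven Central, using the standard Gradle `repositories` syntax and assuming the rest of the build stays unchanged:

```groovy
repositories {
    mavenCentral()
    // Confluent's public Maven repository, which hosts kafka-avro-serializer.
    maven { url "https://packages.confluent.io/maven/" }
}
```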

This is reflected in published POM file: 
https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-kafka/2.31.0/jar

Did it work for you before version 2.30.0? 
Could you share your build.gradle file? 

—
Alexey



> On 9 Jul 2021, at 11:23, Alex Van Boxel  wrote:
> 
> Hi all,
> 
> I've been building for years via gradle. The dependency management is 
> probably a bit different from that of maven, but it seems that dataflow now 
> requires Confluent Kafka dependencies. They are not available in Maven 
> Central. This feels wrong for an Apache project.
> 
>  - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>  - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>  - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
>
> Excluding the dependency with "exclude module: 'beam-sdks-java-io-kafka'"
> doesn't work; it fails with:
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
>     at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
>     at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
>     at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)
> 
> This happens from version 2.30 onwards. Is this intended?!
>  
>  _/
> _/ Alex Van Boxel



Dataflow dependencies require non-maven central dependencies (confluent kafka)

2021-07-09 Thread Alex Van Boxel
Hi all,

I've been building for years via gradle. The dependency management is
probably a bit different from that of maven, but it seems that dataflow now
requires Confluent Kafka dependencies. They are not available in Maven
Central. This feels wrong for an Apache project.

 - file:/Users/alex.vanboxel/.m2/repository/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
 - https://repository.apache.org/content/repositories/releases/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom

Excluding the dependency with "exclude module: 'beam-sdks-java-io-kafka'"
doesn't work; it fails with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/io/kafka/KafkaIO$Read
    at org.apache.beam.runners.dataflow.DataflowRunner.getOverrides(DataflowRunner.java:522)
    at org.apache.beam.runners.dataflow.DataflowRunner.replaceV1Transforms(DataflowRunner.java:1337)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:967)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:196)

This happens from version 2.30 onwards. Is this intended?!

 _/
_/ Alex Van Boxel