Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
One final note of clarification: the pom file needs to be in the same directory as the jar.

On Fri, Jul 22, 2022 at 11:01 Evan Galpin wrote:
> It's working! [...]
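A minimal sketch of the "pom next to the jar" rule, in case it helps with debugging a local repo that Gradle refuses to resolve. The class and method names are hypothetical (not part of Beam or Gradle), and the sketch assumes the standard Maven repository convention that the pom shares the jar's base name, which the thread itself does not spell out.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: derives the pom path expected beside a jar in a Maven-style repo. */
final class SiblingPom {

    /** Returns the .pom path in the same directory as the jar, with the same base name. */
    static Path pomFor(Path jar) {
        String name = jar.getFileName().toString();
        if (!name.endsWith(".jar")) {
            throw new IllegalArgumentException("Not a jar: " + jar);
        }
        String pomName = name.substring(0, name.length() - ".jar".length()) + ".pom";
        return jar.resolveSibling(pomName);
    }

    public static void main(String[] args) {
        // Default path mirrors the layout described in the thread; pass your own jar path instead.
        Path jar = Paths.get(args.length > 0
                ? args[0]
                : "libs/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.40.0-SNAPSHOT/"
                        + "beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT.jar");
        Path pom = pomFor(jar);
        System.out.println(pom + (Files.exists(pom) ? " exists" : " is missing"));
    }
}
```

Running it against the copied shadowJar reports whether the matching pom is present in the same directory.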
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
It's working! Huge thank you to Steve Niemitz, who pointed out the need for "--experiments=enable_custom_pubsub_sink" to prevent Dataflow from overriding the module for which I wanted to use custom source.

Here is my full process in case it's helpful to anyone in the future (note that one might need to change the version identifiers):

1. Modify files in sdks/java/io/google-cloud-platform

2. Add id 'com.github.johnrengelman.shadow' to plugins in sdks/java/io/google-cloud-platform/build.gradle

3. Build a shadowJar via "./gradlew :sdks:java:io:google-cloud-platform:shadowJar"

4. Copy the shadowJar from

   my/path/to/beam/sdks/java/io/google-cloud-platform/build/libs/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT-all.jar

   to

   my/path/to/user/pipeline/top-level/libs/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.40.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT.jar

5. Add a pom file for the shadowJar (to emulate local maven repo):

   <project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
            xmlns="http://maven.apache.org/POM/4.0.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <modelVersion>4.0.0</modelVersion>
     <groupId>org.apache.beam</groupId>
     <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
     <version>2.40.0-SNAPSHOT</version>
   </project>

6. In user code pipeline "build.gradle", add a local maven repo (note "./libs" is from "my/path/to/user/pipeline/top-level/libs")

   repositories {
     maven {
       url = uri('./libs')
     }
     ... other repos ...
   }

7. In user code pipeline "build.gradle", implement dependency replacement of the SDK version of beam-sdks-java-io-google-cloud-platform

   configurations {
     all {
       resolutionStrategy.dependencySubstitution {
         substitute module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0") using module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0-SNAPSHOT")
       }
     }
   }

8. Deploy the user code pipeline including the flag: --experiments=enable_custom_pubsub_sink

On Thu, Jul 21, 2022 at 4:42 PM Evan Galpin wrote:
> Thanks Tomo, I'll check that out too as a good safeguard! [...]
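The destination path in step 4 and the pom in step 5 both follow the Maven repository layout (group id split into directories, then artifact id, then version). As a rough sketch of that mapping, the hypothetical helper below (its names are illustrative and not part of Beam, Gradle, or Maven) computes where a file is expected inside the local "./libs" repo and renders a minimal pom like the one in step 5.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: maps Maven coordinates onto a file-based repository layout. */
final class LocalMavenLayout {

    /** Returns repoRoot/group/as/dirs/artifactId/version/artifactId-version.extension. */
    static Path artifactPath(Path repoRoot, String groupId, String artifactId,
                             String version, String extension) {
        Path dir = repoRoot;
        for (String part : groupId.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(artifactId)
                .resolve(version)
                .resolve(artifactId + "-" + version + "." + extension);
    }

    /** Renders a minimal pom that declares only the coordinates (no dependencies). */
    static String minimalPom(String groupId, String artifactId, String version) {
        return String.join("\n",
                "<project xmlns=\"http://maven.apache.org/POM/4.0.0\"",
                "         xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"",
                "         xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0"
                        + " http://maven.apache.org/xsd/maven-4.0.0.xsd\">",
                "  <modelVersion>4.0.0</modelVersion>",
                "  <groupId>" + groupId + "</groupId>",
                "  <artifactId>" + artifactId + "</artifactId>",
                "  <version>" + version + "</version>",
                "</project>",
                "");
    }

    public static void main(String[] args) {
        String group = "org.apache.beam";
        String artifact = "beam-sdks-java-io-google-cloud-platform";
        String version = "2.40.0-SNAPSHOT";
        Path repo = Paths.get("libs");
        // Print where the jar and pom are expected, then the pom body itself.
        System.out.println(artifactPath(repo, group, artifact, version, "jar"));
        System.out.println(artifactPath(repo, group, artifact, version, "pom"));
        System.out.print(minimalPom(group, artifact, version));
    }
}
```

Because the pom declares no dependencies, this arrangement relies on the shadowJar already bundling what the module needs.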
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
Thanks Tomo, I'll check that out too as a good safeguard! Are you familiar with any process to build pre-release artifacts? I suppose what I'm really after is building a pre-release version of PubsubIO to validate in Dataflow.

- Evan

On Thu, Jul 21, 2022 at 4:21 PM Tomo Suzuki via dev wrote:
> [...] I often use "getProtectionDomain()" [...]
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
I can't come up with a solution (I'm not familiar with the method you're using). However, I often use "getProtectionDomain()" https://stackoverflow.com/a/56000383/975074 to find the JAR file that a class was loaded from. This lets you check that the class you modified is actually used.

On Thu, Jul 21, 2022 at 3:35 PM Evan Galpin wrote:
> Spoke too soon... [...]

--
Regards,
Tomo
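A minimal sketch of the "getProtectionDomain()" check mentioned above. The class name JarLocator is hypothetical; the calls it uses (Class.getProtectionDomain, ProtectionDomain.getCodeSource, CodeSource.getLocation) are standard JDK APIs. Classes loaded by the JVM's bootstrap loader report no code source, so the result is wrapped in an Optional.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

/** Hypothetical helper: reports where a class was loaded from (jar or directory). */
final class JarLocator {

    /** Returns the code source location of the class, or empty if the JVM does not expose one. */
    static Optional<URL> locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to this class.
        Class<?> type = args.length > 0 ? Class.forName(args[0]) : JarLocator.class;
        String where = locate(type).map(URL::toString).orElse("<unknown>");
        System.out.println(type.getName() + " loaded from " + where);
    }
}
```

With the pipeline's classpath, passing a class from the modified module (for example org.apache.beam.sdk.io.gcp.pubsub.PubsubIO) should print the path of the substituted jar. Note that this reports the classpath of the JVM where it runs, so checking the Dataflow workers would presumably require logging the same value from code that executes there.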
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
Spoke too soon... I still can't seem to get the new behaviour to appear in Dataflow. Possibly something is being overridden?

On Thu, Jul 21, 2022 at 3:15 PM Evan Galpin wrote:
> Making a shadowJar from "beam-sdks-java-io-google-cloud-platform" looks to be working. [...]
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
Making a shadowJar from "beam-sdks-java-io-google-cloud-platform" looks to be working. Added `id 'com.github.johnrengelman.shadow'` to `build.gradle` for "beam-sdks-java-io-google-cloud-platform" in the Beam source and used the resulting jar as a dependency replacement when deploying the job to Dataflow. Looks ok.

On Thu, Jul 21, 2022 at 3:02 PM Evan Galpin wrote:
> I believe I have the dependencySubstitution working, [...]
Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
I believe I have the dependencySubstitution working, but it seems as though the substitution is removing transitive deps of "beam-sdks-java-io-google-cloud-platform", hmm...

On Thu, Jul 21, 2022 at 1:15 PM Evan Galpin wrote:
> Hi all, [...]
[Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar
Hi all,

I've made a change locally that works on my machine, and I now want to validate it on Dataflow. I've tried a few different attempts at module substitution in the build.gradle config file for the pipeline I'm trying to deploy, but I haven't had any success yet.

How might I be able to replace the beam-sdks-java-io-google-cloud-platform module usually installed from maven with a local jar generated from running:

"./gradlew :sdks:java:io:google-cloud-platform:jar"

Thanks,
Evan