It's working! Huge thank you to Steve Niemitz, who pointed out the need for "--experiments=enable_custom_pubsub_sink" to prevent Dataflow from overriding the module for which I wanted to use custom source.
Here is my full process in case it's helpful to anyone in the future (note that one might need to change the version identifiers):

1. Modify files in sdks/java/io/google-cloud-platform

2. Add id 'com.github.johnrengelman.shadow' to plugins in
   sdks/java/io/google-cloud-platform/build.gradle

3. Build a shadowJar via

       ./gradlew :sdks:java:io:google-cloud-platform:shadowJar

4. Copy the shadowJar from

       my/path/to/beam/sdks/java/io/google-cloud-platform/build/libs/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT-all.jar

   to

       my/path/to/user/pipeline/top-level/libs/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.40.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT.jar

5. Add a pom file for the shadowJar (to emulate a local maven repo):

       <?xml version="1.0" encoding="utf-8"?>
       <project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
                xmlns="http://maven.apache.org/POM/4.0.0"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
         <modelVersion>4.0.0</modelVersion>
         <groupId>org.apache.beam</groupId>
         <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
         <version>2.40.0-SNAPSHOT</version>
       </project>

6. In the user code pipeline "build.gradle", add a local maven repo (note
   "./libs" is from "my/path/to/user/pipeline/top-level/libs"):

       repositories {
         maven {
           url = uri('./libs')
         }
         ... other repos ...
       }

7. In the user code pipeline "build.gradle", implement dependency replacement
   of the SDK version of beam-sdks-java-io-google-cloud-platform:

       configurations {
         all {
           resolutionStrategy.dependencySubstitution {
             substitute module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0") using module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0-SNAPSHOT")
           }
         }
       }

8. Deploy the user code pipeline including the flag:

       --experiments=enable_custom_pubsub_sink

On Thu, Jul 21, 2022 at 4:42 PM Evan Galpin wrote:

> Thanks Tomo, I'll check that out too as a good safeguard!
> Are you familiar with any process to build pre-release artifacts? I suppose
> that's really what I'm after is building a pre-release version of pubsubIO
> to validate in Dataflow.
>
> - Evan
>
> On Thu, Jul 21, 2022 at 4:21 PM Tomo Suzuki via dev wrote:
>
>> I don't come up with a solution (I'm not familiar with the method
>> you're using). However I often use "getProtectionDomain()"
>> https://stackoverflow.com/a/56000383/975074 to find the JAR file from a
>> class. This ensures the class you modified is actually used.
>>
>> On Thu, Jul 21, 2022 at 3:35 PM Evan Galpin wrote:
>>
>>> Spoke too soon... still can't seem to get the new behaviour to appear in
>>> dataflow, possibly something is being overridden?
>>>
>>> On Thu, Jul 21, 2022 at 3:15 PM Evan Galpin wrote:
>>>
>>>> Making a shadowJar from "beam-sdks-java-io-google-cloud-platform" looks
>>>> to be working. Added ` id 'com.github.johnrengelman.shadow'` to
>>>> `build.gradle` for "beam-sdks-java-io-google-cloud-platform" in the beam
>>>> source and used the resulting jar as a dependency replacement when
>>>> deploying the job to dataflow. Looks ok.
>>>>
>>>> On Thu, Jul 21, 2022 at 3:02 PM Evan Galpin wrote:
>>>>
>>>>> I believe I have the dependencySubstitution working, but it seems as
>>>>> though the substitution is removing transitive deps of
>>>>> "beam-sdks-java-io-google-cloud-platform", hmm...
>>>>>
>>>>> On Thu, Jul 21, 2022 at 1:15 PM Evan Galpin wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm trying to test a change I've made locally, but by validating it
>>>>>> on Dataflow. It works locally, but I want to validate on Dataflow. I've
>>>>>> tried a few different attempts at module substitution in the build.gradle
>>>>>> config file for the pipeline I'm trying to deploy, but I haven't had any
>>>>>> success yet.
>>>>>>
>>>>>> How might I be able to replace the
>>>>>> beam-sdks-java-io-google-cloud-platform module usually installed from
>>>>>> maven with a local jar generated from running:
>>>>>>
>>>>>> "./gradlew :sdk:java:io:google-cloud-platform:jar"
>>>>>>
>>>>>> Thanks,
>>>>>> Evan
>>>>>>
>>
>> --
>> Regards,
>> Tomo
>>
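
For anyone who wants to try the "getProtectionDomain()" check Tomo mentions in the quoted thread, here is a minimal sketch of how it might look. The class name JarLocator and the helper locationOf are made up for illustration and are not part of Beam. The idea is to print where a class was loaded from, so you can confirm the substituted jar is the one actually in use.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Beam): reports where a class was loaded from.
final class JarLocator {

    // Returns the jar or directory URL a class was loaded from, or a
    // placeholder when the JVM records no code source for it
    // (bootstrap-loaded classes such as java.lang.String have none).
    static String locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<no code source>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name as the first argument;
        // with no argument, inspect this helper class itself.
        String name = args.length > 0 ? args[0] : JarLocator.class.getName();
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

In a pipeline, one could presumably log the result of locationOf for the modified class (for example the PubsubIO class, if that is what was changed) at startup and check that the path points at the SNAPSHOT jar rather than the released one.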
