Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-23 Thread Evan Galpin
One final note of clarification: the pom file needs to be in the same
directory as the jar (the 2.40.0-SNAPSHOT version directory from step 4).



Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-22 Thread Evan Galpin
It's working! Huge thank you to Steve Niemitz, who pointed out the need for
"--experiments=enable_custom_pubsub_sink" to prevent Dataflow from
overriding the module that I wanted to build from custom source.

Here is my full process in case it's helpful to anyone in the future (note
that you might need to change the version identifiers):


   1. Modify files in sdks/java/io/google-cloud-platform
   2. Add id 'com.github.johnrengelman.shadow' to plugins in
   sdks/java/io/google-cloud-platform/build.gradle
   3. Build a shadowJar via
   "./gradlew :sdks:java:io:google-cloud-platform:shadowJar"
   4. Copy the shadowJar from
   my/path/to/beam/sdks/java/io/google-cloud-platform/build/libs/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT-all.jar
   to
   my/path/to/user/pipeline/top-level/libs/org/apache/beam/beam-sdks-java-io-google-cloud-platform/2.40.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.40.0-SNAPSHOT.jar
   5. Add a pom file for the shadowJar (to emulate local maven repo):

   <project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"
            xmlns="http://maven.apache.org/POM/4.0.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <modelVersion>4.0.0</modelVersion>
     <groupId>org.apache.beam</groupId>
     <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
     <version>2.40.0-SNAPSHOT</version>
   </project>

   6. In user code pipeline "build.gradle", add a local maven repo (note
   "./libs" is from "my/path/to/user/pipeline/top-level/libs")

   repositories {
       maven {
           url = uri('./libs')
       }
       ... other repos ...
   }

   7. In user code pipeline "build.gradle", implement dependency
   replacement of the SDK version of beam-sdks-java-io-google-cloud-platform

   configurations {
       all {
           resolutionStrategy.dependencySubstitution {
               substitute module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0") using module("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.40.0-SNAPSHOT")
           }
       }
   }

   8. Deploy the user code pipeline including the flag:
   --experiments=enable_custom_pubsub_sink
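
Steps 4 and 5 above could also be scripted. The following is only a minimal sketch, not something from the Beam repo: the class and method names (LocalRepoInstaller, versionDir, minimalPom, install) are hypothetical, and the pom file name "<artifactId>-<version>.pom" is an assumption based on the standard Maven repository layout rather than something stated in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies a shadow jar into a Maven-layout directory
// under a local repo root and writes a minimal pom next to it.
public class LocalRepoInstaller {

    // Maven layout: <root>/<groupId with dots as slashes>/<artifactId>/<version>
    static Path versionDir(Path repoRoot, String groupId, String artifactId, String version) {
        return repoRoot.resolve(groupId.replace('.', '/')).resolve(artifactId).resolve(version);
    }

    // Minimal pom carrying only the coordinates, mirroring the pom in step 5.
    static String minimalPom(String groupId, String artifactId, String version) {
        return String.join("\n",
            "<project xmlns=\"http://maven.apache.org/POM/4.0.0\"",
            "         xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"",
            "         xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd\">",
            "  <modelVersion>4.0.0</modelVersion>",
            "  <groupId>" + groupId + "</groupId>",
            "  <artifactId>" + artifactId + "</artifactId>",
            "  <version>" + version + "</version>",
            "</project>",
            "");
    }

    // Copies the jar (dropping the "-all" classifier from its name) and writes
    // the pom into the same directory. Returns the path of the installed jar.
    static Path install(Path shadowJar, Path repoRoot, String groupId, String artifactId,
                        String version) throws IOException {
        Path dir = versionDir(repoRoot, groupId, artifactId, version);
        Files.createDirectories(dir);
        String baseName = artifactId + "-" + version;
        Path jar = dir.resolve(baseName + ".jar");
        Files.copy(shadowJar, jar, StandardCopyOption.REPLACE_EXISTING);
        Files.write(dir.resolve(baseName + ".pom"),
            minimalPom(groupId, artifactId, version).getBytes(StandardCharsets.UTF_8));
        return jar;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: LocalRepoInstaller <shadow-jar> <repo-root>");
            return;
        }
        Path jar = install(Paths.get(args[0]), Paths.get(args[1]),
            "org.apache.beam", "beam-sdks-java-io-google-cloud-platform", "2.40.0-SNAPSHOT");
        System.out.println("Installed " + jar);
    }
}
```

Run it with the "-all" shadow jar from step 4 as the first argument and the pipeline's "./libs" directory as the second; the version string would need to change along with the version identifiers in the steps above.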





Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Evan Galpin
Thanks Tomo, I'll check that out too as a good safeguard!  Are you familiar
with any process to build pre-release artifacts?  I suppose what I'm really
after is building a pre-release version of PubsubIO to validate in
Dataflow.

- Evan




Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Tomo Suzuki via dev
I can't come up with a solution (I'm not familiar with the method
you're using). However, I often use "getProtectionDomain()"
(https://stackoverflow.com/a/56000383/975074) to find the JAR file that a
class was loaded from. This lets you confirm that the class you modified is
actually the one being used.
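
A minimal sketch of the getProtectionDomain() approach, using only JDK calls. The class name JarLocator and the helper locationOf are hypothetical; in a pipeline one would pass the modified class (for example PubsubIO.class) and log the result.

```java
import java.security.CodeSource;

// Hypothetical helper that reports where a class was loaded from.
public class JarLocator {

    // Returns the jar or directory URL a class was loaded from, or "unknown"
    // when no code source is available (e.g. JDK classes loaded by the
    // bootstrap class loader).
    static String locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "unknown";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Location of this class itself (a directory or jar on the classpath).
        System.out.println(locationOf(JarLocator.class));
        // JDK core classes have no code source.
        System.out.println(locationOf(String.class));
    }
}
```

If the printed location is the locally built jar rather than the released artifact, the substitution has taken effect on the classpath.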


-- 
Regards,
Tomo


Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Evan Galpin
Spoke too soon... I still can't seem to get the new behaviour to appear in
Dataflow. Possibly something is being overridden?



Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Evan Galpin
Making a shadowJar from "beam-sdks-java-io-google-cloud-platform" looks to
be working. I added `id 'com.github.johnrengelman.shadow'` to
`build.gradle` for "beam-sdks-java-io-google-cloud-platform" in the Beam
source and used the resulting jar as a dependency replacement when
deploying the job to Dataflow.  Looks ok.



Re: [Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Evan Galpin
I believe I have the dependencySubstitution working, but it seems as though
the substitution is removing transitive deps of
"beam-sdks-java-io-google-cloud-platform", hmm...



[Dataflow][Guidance] Replacing beam-sdks-java-io-google-cloud-platform with local jar

2022-07-21 Thread Evan Galpin
Hi all,

I'm trying to validate on Dataflow a change that I've made locally.  It
works locally, but I've tried a few different approaches to module
substitution in the build.gradle config file for the pipeline I'm trying to
deploy, and I haven't had any success yet.

How might I be able to replace the beam-sdks-java-io-google-cloud-platform
module, usually installed from Maven, with a local jar generated from
running:

"./gradlew :sdks:java:io:google-cloud-platform:jar"

Thanks,
Evan