Re: [I] [Feature Request]: Set quota project in `beam.io.ReadFromBigQuery` [beam]

2024-04-29 Thread via GitHub
shahar1 commented on issue #31126: URL: https://github.com/apache/beam/issues/31126#issuecomment-2084510959 > @shahar1 this level of customization might make sense. > > Let's explore your specific concern for a moment, I might have others, but imagine worth understanding your needs/us

Re: [I] [Feature Request]: Set quota project in `beam.io.ReadFromBigQuery` [beam]

2024-04-29 Thread via GitHub
brucearctor commented on issue #31126: URL: https://github.com/apache/beam/issues/31126#issuecomment-2083695808 Also, I wonder whether implementation of this issue would help with https://github.com/apache/beam/issues/30747 -- This is an automated message from the Apache Git Service. To

Re: [I] [Feature Request]: Set quota project in `beam.io.ReadFromBigQuery` [beam]

2024-04-29 Thread via GitHub
brucearctor commented on issue #31126: URL: https://github.com/apache/beam/issues/31126#issuecomment-2083672957 @shahar1 this level of customization might make sense. Let's explore your specific concern for a moment, I might have others, but imagine worth understanding your needs/usec

Re: [I] [Feature Request]: Build beam.MLTransform [beam]

2024-04-29 Thread via GitHub
AnandInguva closed issue #26640: [Feature Request]: Build beam.MLTransform URL: https://github.com/apache/beam/issues/26640

Re: [I] [Bug]: Schema inference is non-deterministic [beam]

2024-04-29 Thread via GitHub
reuvenlax commented on issue #30276: URL: https://github.com/apache/beam/issues/30276#issuecomment-2083435325 Sorting doesn't help here as a common use case is to add a new field on update. Dataflow at least should be able to handle schema update, as long as the schema is on a PCollection.

Re: [I] [Bug]: Embedded Video (and other content) in iframes are not working [beam]

2024-04-29 Thread via GitHub
tvalentyn closed issue #30981: [Bug]: Embedded Video (and other content) in iframes are not working URL: https://github.com/apache/beam/issues/30981

Re: [PR] Also allow links to Drive materials. [beam]

2024-04-29 Thread via GitHub
tvalentyn merged PR #31131: URL: https://github.com/apache/beam/pull/31131

Re: [PR] Also allow links to Drive materials. [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #31131: URL: https://github.com/apache/beam/pull/31131#issuecomment-2083398434 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @Abacn added as fallback since no labels match configuration

Re: [I] The StressTests Java KafkaIO job is flaky [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on issue #31074: URL: https://github.com/apache/beam/issues/31074#issuecomment-2083396619 Reopening since the workflow is still flaky

Re: [PR] Allow users to configure wait options for new Neo4j databases [beam]

2024-04-29 Thread via GitHub
Abacn commented on code in PR #31129: URL: https://github.com/apache/beam/pull/31129#discussion_r1583547450 ## it/neo4j/src/main/java/org/apache/beam/it/neo4j/DatabaseWaitOptions.java: ## @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [#30789] Add support for Flink 1.18 [beam]

2024-04-29 Thread via GitHub
thebozzcl commented on code in PR #31062: URL: https://github.com/apache/beam/pull/31062#discussion_r1583521120 ## sdks/typescript/package-lock.json: ## @@ -6,7 +6,7 @@ "packages": { "": { "name": "apache-beam", - "version": "2.54.0-SNAPSHOT", + "version

Re: [PR] [#30789] Add support for Flink 1.18 [beam]

2024-04-29 Thread via GitHub
thebozzcl commented on code in PR #31062: URL: https://github.com/apache/beam/pull/31062#discussion_r1583520189 ## .github/workflows/beam_PostCommit_Java_Tpcds_Flink.yml: ## @@ -101,5 +101,5 @@ jobs: with: gradle-command: :sdks:java:testing:tpcds:run

Re: [PR] Also allow links to Drive materials. [beam]

2024-04-29 Thread via GitHub
tvalentyn commented on PR #31131: URL: https://github.com/apache/beam/pull/31131#issuecomment-2083350425 see: https://github.com/apache/beam/issues/30981

Re: [PR] Suppress BigQuery read stream splitAtFraction when API busy call or timeout [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #31125: URL: https://github.com/apache/beam/pull/31125#issuecomment-2083346916 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @damondouglas for label java. R: @chamikaramj for label io.

Re: [PR] add finalizers to kafka loadbalancer services [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #31130: URL: https://github.com/apache/beam/pull/31130#issuecomment-2083346760 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @chamikaramj added as fallback since no labels match configuration

[PR] Also allow links to Drive materials. [beam]

2024-04-29 Thread via GitHub
tvalentyn opened a new pull request, #31131: URL: https://github.com/apache/beam/pull/31131 fixes: #31058 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] Mention the appr

[PR] add finalizers to kafka loadbalancer services [beam]

2024-04-29 Thread via GitHub
volatilemolotov opened a new pull request, #31130: URL: https://github.com/apache/beam/pull/31130 Need to add finalizers, as Strimzi removes them for some reason, which prevents LB deletion. https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#garbage-c

Re: [PR] Add ReadChangeStream IO param to adjust backlog estimates for replication delay [beam]

2024-04-29 Thread via GitHub
Abacn merged PR #30995: URL: https://github.com/apache/beam/pull/30995

Re: [PR] Implementing lull reporting at bundle level processing [beam]

2024-04-29 Thread via GitHub
arvindram03 commented on PR #30693: URL: https://github.com/apache/beam/pull/30693#issuecomment-2083243573 Thanks Sam.

Re: [PR] [Prism] Enable Java validatesRunner tests on Prism [beam]

2024-04-29 Thread via GitHub
lostluck commented on code in PR #31075: URL: https://github.com/apache/beam/pull/31075#discussion_r1583393345 ## runners/prism/build.gradle: ## @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See t

Re: [PR] Allow users to configure wait options for new Neo4j databases [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #31129: URL: https://github.com/apache/beam/pull/31129#issuecomment-2083183918 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @tvalentyn added as fallback since no labels match configuration

Re: [PR] Optimise View.asList() side inputs for iterating rather than for indexing. [beam]

2024-04-29 Thread via GitHub
robertwb commented on code in PR #31087: URL: https://github.com/apache/beam/pull/31087#discussion_r1583387136 ## sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java: ## @@ -960,6 +995,179 @@ public void verifyDeterministic() throws NonDeterministicExc

Re: [PR] [Draft] Return methods in a lexicographical order from ReflectUtils.getMethods() [beam]

2024-04-29 Thread via GitHub
reuvenlax commented on PR #30279: URL: https://github.com/apache/beam/pull/30279#issuecomment-2083171737 Dataflow keeps track of the old and new schema, and will match up fields to ensure that the encoding position remains the same even if the field orders are different. This is the rea

Re: [PR] Allow users to configure wait options for new Neo4j databases [beam]

2024-04-29 Thread via GitHub
fbiville commented on PR #31129: URL: https://github.com/apache/beam/pull/31129#issuecomment-2083120068 Note: the code snippet in https://cwiki.apache.org/confluence/display/BEAM/Java+Tips#JavaTips-Howtoformatcodeautomaticallyandavoidspotlesserrors? is missing.

Re: [I] [Bug]: Reshuffle.viaRandomKey timeout since version 2.54.0 [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #31095: URL: https://github.com/apache/beam/issues/31095#issuecomment-2083110425 Through Google's Cloud Support channels you can get someone who can really dig in to the details of the logs and the metrics. But from an outside perspective, doing some trial and

[PR] Allow users to configure wait options for new Neo4j databases [beam]

2024-04-29 Thread via GitHub
fbiville opened a new pull request, #31129: URL: https://github.com/apache/beam/pull/31129 Since v5 of Neo4j, databases are created in an asynchronous manner. Before this commit, `Neo4jResourceManager` did not allow users to specify whether to wait or not for the database creation (via the

Re: [I] [Bug]: Reshuffle.viaRandomKey timeout since version 2.54.0 [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #31095: URL: https://github.com/apache/beam/issues/31095#issuecomment-2083102561 Easy to test: you can comment out the BQ write and see if you still reproduce the error.

[PR] Create option to specify temp query project, and wire into source tab… [beam]

2024-04-29 Thread via GitHub
johnjcasey opened a new pull request, #31128: URL: https://github.com/apache/beam/pull/31128

Re: [I] [Bug]: Reshuffle.viaRandomKey timeout since version 2.54.0 [beam]

2024-04-29 Thread via GitHub
djaneluz commented on issue #31095: URL: https://github.com/apache/beam/issues/31095#issuecomment-2083049191 @kennknowles so maybe the problem might be on BigQueryIO write? This step currently is: ``` return response .apply("WriteToBigQuery", BigQueryIO.write()

Re: [I] [Bug]: Reshuffle.viaRandomKey timeout since version 2.54.0 [beam]

2024-04-29 Thread via GitHub
djaneluz commented on issue #31095: URL: https://github.com/apache/beam/issues/31095#issuecomment-2083041082 > Can you share the size of elements and the overall size of data shuffled? (I don't think any other factors could impact this transform) The input of the `Reshuffle.viaRandomK

Re: [PR] Optimise View.asList() side inputs for iterating rather than for indexing. [beam]

2024-04-29 Thread via GitHub
kennknowles commented on code in PR #31087: URL: https://github.com/apache/beam/pull/31087#discussion_r1583238947 ## sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java: ## @@ -960,6 +995,179 @@ public void verifyDeterministic() throws NonDeterministic

Re: [PR] Optimise View.asList() side inputs for iterating rather than for indexing. [beam]

2024-04-29 Thread via GitHub
kennknowles commented on PR #31087: URL: https://github.com/apache/beam/pull/31087#issuecomment-2082977457 Totally agree. I do know that this was actually an explicit decision. The history as I understand it: - We already had `View.asIterable` that was a simple iterator, but windowed

Re: [I] [Bug]: Schema inference is non-deterministic [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #30276: URL: https://github.com/apache/beam/issues/30276#issuecomment-2082959451 I think @reuvenlax did some work to deal with this before.

Re: [PR] [Draft] Return methods in a lexicographical order from ReflectUtils.getMethods() [beam]

2024-04-29 Thread via GitHub
kennknowles commented on PR #30279: URL: https://github.com/apache/beam/pull/30279#issuecomment-2082958359 How does it work for update compatibility if I add a new field to my POJO? I added Reuven because he knows the most about this.

Re: [I] The PostCommit Python job is flaky [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #30513: URL: https://github.com/apache/beam/issues/30513#issuecomment-2082920470 Could make sense to find a way to get separate top-level signal for Python versions, assuming we can use software engineering to share everything necessary so they don't get out o

Re: [PR] Add "all runs" link to flake issues [beam]

2024-04-29 Thread via GitHub
kennknowles closed pull request #31127: Add "all runs" link to flake issues URL: https://github.com/apache/beam/pull/31127

Re: [PR] Add "all runs" link to flake issues [beam]

2024-04-29 Thread via GitHub
kennknowles commented on PR #31127: URL: https://github.com/apache/beam/pull/31127#issuecomment-2082861198 Ah, gotcha. Makes sense. Perhaps I'll revise to that.

Re: [PR] rename cluster used [beam]

2024-04-29 Thread via GitHub
damccorm commented on PR #31113: URL: https://github.com/apache/beam/pull/31113#issuecomment-2082853471 Running - https://github.com/apache/beam/actions/runs/8879975636

Re: [PR] rename cluster used [beam]

2024-04-29 Thread via GitHub
damccorm merged PR #31113: URL: https://github.com/apache/beam/pull/31113

[PR] Add "all runs" link to flake issues [beam]

2024-04-29 Thread via GitHub
kennknowles opened a new pull request, #31127: URL: https://github.com/apache/beam/pull/31127 This will make it quick to pop open the history and eyeball if the flakiness is currently really bad without having to do adjustments in the UI once landing on the history. R: @damccorm

Re: [I] The PostCommit Python Arm job is flaky [beam]

2024-04-29 Thread via GitHub
damccorm commented on issue #30760: URL: https://github.com/apache/beam/issues/30760#issuecomment-2082757026 I think that's wrong - https://github.com/apache/beam/actions/workflows/beam_PostCommit_Python_Arm.yml?query=branch%3Amaster+event%3Aschedule

Re: [I] The PostCommit Python job is flaky [beam]

2024-04-29 Thread via GitHub
damccorm commented on issue #30513: URL: https://github.com/apache/beam/issues/30513#issuecomment-2082755326 Only sorta - each component job is actually not permared - e.g. there are 2 successes here, https://github.com/apache/beam/actions/runs/8873798546 The whole workflow is permare

Re: [I] The PostCommit Python ValidatesContainer Dataflow With RC job is flaky [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #30525: URL: https://github.com/apache/beam/issues/30525#issuecomment-2082731122 Should this be run at all?

Re: [PR] Implementing lull reporting at bundle level processing [beam]

2024-04-29 Thread via GitHub
scwhittle merged PR #30693: URL: https://github.com/apache/beam/pull/30693

Re: [PR] Implementing lull reporting at bundle level processing [beam]

2024-04-29 Thread via GitHub
scwhittle commented on PR #30693: URL: https://github.com/apache/beam/pull/30693#issuecomment-2082700812 Failure looks unrelated, merging

Re: [I] The PostCommit Java Sickbay job is flaky [beam]

2024-04-29 Thread via GitHub
kennknowles commented on issue #30529: URL: https://github.com/apache/beam/issues/30529#issuecomment-2082671303 Another good candidate for deletion

Re: [PR] Add Redistribute transform to model, Java SDK, and most active runners [beam]

2024-04-29 Thread via GitHub
kennknowles merged PR #30545: URL: https://github.com/apache/beam/pull/30545

Re: [PR] Add Redistribute transform to model, Java SDK, and most active runners [beam]

2024-04-29 Thread via GitHub
kennknowles commented on code in PR #30545: URL: https://github.com/apache/beam/pull/30545#discussion_r1583040940 ## runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java: ## @@ -301,6 +305,24 @@ private void translateReshuffle(

Re: [PR] Add Redistribute transform to model, Java SDK, and most active runners [beam]

2024-04-29 Thread via GitHub
kennknowles commented on code in PR #30545: URL: https://github.com/apache/beam/pull/30545#discussion_r1583039482 ## runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java: ## @@ -917,6 +919,41 @@ private void groupByKe

Re: [PR] Add Redistribute transform to model, Java SDK, and most active runners [beam]

2024-04-29 Thread via GitHub
kennknowles commented on PR #30545: URL: https://github.com/apache/beam/pull/30545#issuecomment-2082643119 Yea, I actually do feel strongly about the one remaining issue. I get what you are saying about directly re-using the reshuffle translator as a way of expressing "the reshuffle transla

Re: [PR] added custom watermark for kinesis reader [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #28763: URL: https://github.com/apache/beam/pull/28763#issuecomment-2082570187 Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment `assign to next reviewer`: R: @Abacn for

Re: [PR] Call `DateTime(long)` instead of `DateTime(Object)`. [beam]

2024-04-29 Thread via GitHub
github-actions[bot] commented on PR #30758: URL: https://github.com/apache/beam/pull/30758#issuecomment-2082570099 Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment `assign to next reviewer`: R: @shunping a

Re: [PR] Implementing lull reporting at bundle level processing [beam]

2024-04-29 Thread via GitHub
scwhittle commented on PR #30693: URL: https://github.com/apache/beam/pull/30693#issuecomment-2082494509 Some tests were cancelled for some reason, so closing/reopening to run them

Re: [PR] Implementing lull reporting at bundle level processing [beam]

2024-04-29 Thread via GitHub
scwhittle closed pull request #30693: Implementing lull reporting at bundle level processing URL: https://github.com/apache/beam/pull/30693

Re: [PR] rename cluster used [beam]

2024-04-29 Thread via GitHub
volatilemolotov commented on PR #31113: URL: https://github.com/apache/beam/pull/31113#issuecomment-2082243386 @damccorm Also when you merge can you trigger the https://github.com/apache/beam/actions/workflows/beam_StressTests_Java_KafkaIO.yml so we can verify