Re: [PR] [YAML] Add and cleanup documentation for several builtin transforms. [beam]

2023-12-08 Thread via GitHub
robertwb merged PR #29673: URL: https://github.com/apache/beam/pull/29673 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421182705 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/Work.java: ## @@ -111,15 +128,58 @@ public Collection getLa

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421180847 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/ComputationState.java: ## @@ -120,8 +121,9 @@ private void f

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421179427 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/streaming/Work.java: ## @@ -101,7 +117,8 @@ private void recordGetWork

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421177641 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowExecutionStateSampler.java: ## @@ -0,0 +1,136 @@ +/* + * Licen

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421175977 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowExecutionStateSampler.java: ## @@ -0,0 +1,136 @@ +/* + * Licen

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421172835 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowExecutionStateSampler.java: ## @@ -0,0 +1,136 @@ +/* + * Licen

Re: [PR] Per DoFn latency instrumentation [beam]

2023-12-08 Thread via GitHub
m-trieu commented on code in PR #29592: URL: https://github.com/apache/beam/pull/29592#discussion_r1421166191 ## runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/ActiveMessageMetadata.java: ## @@ -0,0 +1,32 @@ +/* + * Licensed to th

Re: [I] [Bug]: DaskRunner GBK failures related to partitioning prevent use of string keys and break `assert_that` [beam]

2023-12-08 Thread via GitHub
TheNeuralBit commented on issue #29365: URL: https://github.com/apache/beam/issues/29365#issuecomment-1848006497 It's really surprising that not many people have run into this! I noticed the [dask bag docs](https://docs.dask.org/en/stable/bag.html#shuffle) do nudge users away from groupby:

Re: [I] [Bug]: DaskRunner GBK failures related to partitioning prevent use of string keys and break `assert_that` [beam]

2023-12-08 Thread via GitHub
cisaacstern commented on issue #29365: URL: https://github.com/apache/beam/issues/29365#issuecomment-1847982978 As Brian discovered during our sync on this today, the Dask issues already exist: - https://github.com/dask/distributed/issues/4141 - https://github.com/dask/dask/issues/

Re: [PR] Improve varint encoding throughput with unrolled loop [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29689: URL: https://github.com/apache/beam/pull/29689#issuecomment-1847953996 Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment `assign set of reviewers`

Re: [PR] fix: optimize segemant reader [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29694: URL: https://github.com/apache/beam/pull/29694#issuecomment-1847953908 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @damondouglas for label java. R: @damondouglas for label io.

Re: [PR] Add per test timeout in recently changed dataflow legacy worker tests [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29696: URL: https://github.com/apache/beam/pull/29696#issuecomment-1847953869 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @riteshghorse added as fallback since no labels match configuratio

Re: [PR] Enable keys values multimap protocol based on runner capabilities. [beam]

2023-12-08 Thread via GitHub
lostluck commented on code in PR #29695: URL: https://github.com/apache/beam/pull/29695#discussion_r1421083185 ## model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto: ## @@ -1675,6 +1675,11 @@ message StandardRunnerProtocols { // https://s.

[PR] Add per test timeout in recently changed dataflow legacy worker tests [beam]

2023-12-08 Thread via GitHub
Abacn opened a new pull request, #29696: URL: https://github.com/apache/beam/pull/29696 Mitigate #28957 (happened again) We are experiencing stuck Java PreCommit caused by :runner:google-cloud-dataflow-java:worker:test. Some recent streaming worker changes may have caused this (). Al

Re: [PR] Enable keys values multimap protocol based on runner capabilities. [beam]

2023-12-08 Thread via GitHub
lostluck commented on PR #29695: URL: https://github.com/apache/beam/pull/29695#issuecomment-1847929531 cc: @rohdesamuel

Re: [PR] Enable keys values multimap protocol based on runner capabilities. [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29695: URL: https://github.com/apache/beam/pull/29695#issuecomment-1847928989 Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Re: [PR] Enable keys values multimap protocol based on runner capabilities. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29695: URL: https://github.com/apache/beam/pull/29695#issuecomment-1847928085 R: @priyansndesai

[PR] Enable keys values multimap protocol based on runner capabilities. [beam]

2023-12-08 Thread via GitHub
robertwb opened a new pull request, #29695: URL: https://github.com/apache/beam/pull/29695 This should finish the SDK-side changes needed for https://github.com/apache/beam/issues/29691 Thank you for your contribution! Follow this checklist to help us

Re: [PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
robertwb merged PR #29690: URL: https://github.com/apache/beam/pull/29690

Re: [PR] [WIP]Save Job IDs of dataflow load tests [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on PR #29693: URL: https://github.com/apache/beam/pull/29693#issuecomment-1847911039 ![image](https://github.com/apache/beam/assets/34158215/65812288-8d38-49eb-9289-97b08f629048) The Job IDs are being saved in the BQ table.

[PR] fix: optimize segemant reader [beam]

2023-12-08 Thread via GitHub
mutianf opened a new pull request, #29694: URL: https://github.com/apache/beam/pull/29694 Optimize the segment reader so we're actually fetching more rows when the buffer size is low. `future` in segment reader is only reset to null in the `waitReadRowsFuture()` method. This means that

Re: [PR] Bump cloud.google.com/go/bigtable from 1.20.0 to 1.21.0 in /sdks [beam]

2023-12-08 Thread via GitHub
lostluck merged PR #29677: URL: https://github.com/apache/beam/pull/29677

Re: [PR] Mock packages instead of installing in tox pydocs task [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29692: URL: https://github.com/apache/beam/pull/29692#issuecomment-1847869795 Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`: R: @damccorm for label python. Available commands: - `stop

[PR] [WIP]Save Job IDs of dataflow load tests [beam]

2023-12-08 Thread via GitHub
AnandInguva opened a new pull request, #29693: URL: https://github.com/apache/beam/pull/29693 **Please** add a meaningful description for your change here

Re: [PR] Publish job id for dataflow load tests [beam]

2023-12-08 Thread via GitHub
AnandInguva closed pull request #29686: Publish job id for dataflow load tests URL: https://github.com/apache/beam/pull/29686

[PR] Mock packages instead of installing in tox pydocs task [beam]

2023-12-08 Thread via GitHub
AnandInguva opened a new pull request, #29692: URL: https://github.com/apache/beam/pull/29692

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1421006250 ## sdks/python/apache_beam/ml/transforms/embeddings/vertex_ai_test.py: ## @@ -44,7 +46,9 @@ VertexAITextEmbeddings is None, 'Vertex AI Python SDK is not install

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420909045 ## sdks/python/apache_beam/ml/transforms/tft.py: ## @@ -95,6 +96,24 @@ def __init__(self, columns: List[str]) -> None: "Columns are not specified. Please

Re: [I] [Failing Test]: Building a wheel for integration tests sometimes times out [beam]

2023-12-08 Thread via GitHub
tvalentyn commented on issue #28703: URL: https://github.com/apache/beam/issues/28703#issuecomment-1847797413 Added retry logic. Let's reopen if we see it again.

Re: [I] [Failing Test]: Building a wheel for integration tests sometimes times out [beam]

2023-12-08 Thread via GitHub
tvalentyn closed issue #28703: [Failing Test]: Building a wheel for integration tests sometimes times out URL: https://github.com/apache/beam/issues/28703

Re: [PR] Retry building a wheel up to 3 times. [beam]

2023-12-08 Thread via GitHub
tvalentyn merged PR #29676: URL: https://github.com/apache/beam/pull/29676

Re: [PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29690: URL: https://github.com/apache/beam/pull/29690#issuecomment-1847794418 Followup to https://github.com/apache/beam/pull/29587 for https://github.com/apache/beam/issues/29691

Re: [PR] More efficient map side inputs for small maps. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29587: URL: https://github.com/apache/beam/pull/29587#issuecomment-1847794013 This is for https://github.com/apache/beam/issues/29691

Re: [PR] Partially revert changes from #29587 [beam]

2023-12-08 Thread via GitHub
damccorm closed pull request #29688: Partially revert changes from #29587 URL: https://github.com/apache/beam/pull/29688

Re: [PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
damccorm commented on code in PR #29690: URL: https://github.com/apache/beam/pull/29690#discussion_r1420983387 ## sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/MultimapSideInput.java: ## @@ -62,6 +63,18 @@ public MultimapSideInput( StateKey stateKey,

Re: [PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29690: URL: https://github.com/apache/beam/pull/29690#issuecomment-1847780552 Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Re: [PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29690: URL: https://github.com/apache/beam/pull/29690#issuecomment-1847778906 R: @damccorm

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
damccorm commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420981406 ## sdks/python/apache_beam/ml/transforms/embeddings/vertex_ai_test.py: ## @@ -44,7 +46,9 @@ VertexAITextEmbeddings is None, 'Vertex AI Python SDK is not installed.

[PR] Guard keys-values multi map input with a flag. [beam]

2023-12-08 Thread via GitHub
robertwb opened a new pull request, #29690: URL: https://github.com/apache/beam/pull/29690 This is needed as some runners (e.g. Dataflow) do not gracefully return errors on unknown state read types. This flag will be set via runner capabilities in a future PR. ---

[PR] Improve varint encoding throughput with unrolled loop [beam]

2023-12-08 Thread via GitHub
sjvanrossum opened a new pull request, #29689: URL: https://github.com/apache/beam/pull/29689 Second try at this, I've added all the benchmark code and selected the top performing encoding loop on a variety of platforms. The test produces a distribution of integers that favors small and neg

Re: [PR] Partially revert changes from #29587 [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29688: URL: https://github.com/apache/beam/pull/29688#issuecomment-1847693042 Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Re: [PR] Partially revert changes from #29587 [beam]

2023-12-08 Thread via GitHub
damccorm commented on PR #29688: URL: https://github.com/apache/beam/pull/29688#issuecomment-1847691739 R: @robertwb

Re: [I] [Task]: Support embeddings using ML models in `MLTransform` [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on issue #29356: URL: https://github.com/apache/beam/issues/29356#issuecomment-1847684344 Tasks that need to be done after PR https://github.com/apache/beam/pull/29564 is merged: * Add set_model_handler method * Support Dead letter queue for RunInference and assess its

[PR] Partially revert changes from #29587 [beam]

2023-12-08 Thread via GitHub
damccorm opened a new pull request, #29688: URL: https://github.com/apache/beam/pull/29688 #29587 introduced changes to allow the usage of the multimap_keys_values_side_input state key. Since some runners don't support this key yet, usage was wrapped in a try/except block so that when talki

Re: [I] [Failing Test]: Python PostCommits are failing due to a possible dependency update [beam]

2023-12-08 Thread via GitHub
riteshghorse commented on issue #29684: URL: https://github.com/apache/beam/issues/29684#issuecomment-1847650479 Trying out a local run with 1.4.49

Re: [I] [Bug]: DaskRunner GBK failures related to partitioning prevent use of string keys and break `assert_that` [beam]

2023-12-08 Thread via GitHub
cisaacstern commented on issue #29365: URL: https://github.com/apache/beam/issues/29365#issuecomment-1847650474 Wow @TheNeuralBit this is right on the money, perfect reproducer! I can translate this into a Dask issue, if you like?

Re: [PR] [#29605] Go SDK: Eagerly create timer writers to ensure is_last sent, minmize lock contention. [beam]

2023-12-08 Thread via GitHub
lostluck merged PR #29607: URL: https://github.com/apache/beam/pull/29607

Re: [I] [Bug][Go SDK]: is_last never set for unused Timers [beam]

2023-12-08 Thread via GitHub
lostluck closed issue #29605: [Bug][Go SDK]: is_last never set for unused Timers URL: https://github.com/apache/beam/issues/29605

Re: [I] [Failing Test]: Python PostCommits are failing due to a possible dependency update [beam]

2023-12-08 Thread via GitHub
riteshghorse commented on issue #29684: URL: https://github.com/apache/beam/issues/29684#issuecomment-1847650243 SQLAlchemy was upgraded recently to 1.4.50 from 1.4.49.

Re: [I] [Bug]: DaskRunner GBK failures related to partitioning prevent use of string keys and break `assert_that` [beam]

2023-12-08 Thread via GitHub
TheNeuralBit commented on issue #29365: URL: https://github.com/apache/beam/issues/29365#issuecomment-1847646822 I spent a little time looking at this. I found it's possible to minimally reproduce this with dask only, configured to use a `LocalCluster`. With this script: ```py f

Re: [PR] Add the ability to generate markdown documentation from a set of providers. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29639: URL: https://github.com/apache/beam/pull/29639#issuecomment-1847643187 Yes. These were resolved in https://github.com/apache/beam/pull/29675 On Thu, Dec 7, 2023 at 2:56 PM tvalentyn ***@***.***> wrote: > Did this PR add more lint errors? >

Re: [PR] Add Data Sampling support for periodic sampling [beam]

2023-12-08 Thread via GitHub
lostluck commented on PR #29590: URL: https://github.com/apache/beam/pull/29590#issuecomment-1847635253 Merging in. Thanks!

Re: [PR] Add Data Sampling support for periodic sampling [beam]

2023-12-08 Thread via GitHub
lostluck merged PR #29590: URL: https://github.com/apache/beam/pull/29590

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420845901 ## sdks/python/apache_beam/ml/transforms/utils.py: ## @@ -28,8 +30,13 @@ class ArtifactsFetcher(): to the TFTProcessHandlers in MLTransform. """ def __init_

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420844035 ## sdks/python/apache_beam/ml/transforms/utils.py: ## @@ -28,8 +30,13 @@ class ArtifactsFetcher(): to the TFTProcessHandlers in MLTransform. """ def __init_

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420836285 ## sdks/python/apache_beam/ml/transforms/utils.py: ## @@ -28,8 +30,13 @@ class ArtifactsFetcher(): to the TFTProcessHandlers in MLTransform. """ def __init_

Re: [PR] [YAML] Better IO documentation. [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29687: URL: https://github.com/apache/beam/pull/29687#issuecomment-1847588808 Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Re: [PR] [YAML] Better IO documentation. [beam]

2023-12-08 Thread via GitHub
robertwb commented on PR #29687: URL: https://github.com/apache/beam/pull/29687#issuecomment-1847585927 R: @Polber

[PR] [YAML] Better IO documentation. [beam]

2023-12-08 Thread via GitHub
robertwb opened a new pull request, #29687: URL: https://github.com/apache/beam/pull/29687

Re: [PR] Support Embeddings in mltransform [beam]

2023-12-08 Thread via GitHub
damccorm commented on code in PR #29564: URL: https://github.com/apache/beam/pull/29564#discussion_r1420760671 ## sdks/python/apache_beam/ml/transforms/base.py: ## @@ -42,12 +64,62 @@ OperationOutputT = TypeVar('OperationOutputT') +def _convert_list_of_dicts_to_dict_of_list

Re: [PR] Shrink Java PreCommit timeout [beam]

2023-12-08 Thread via GitHub
Abacn merged PR #29671: URL: https://github.com/apache/beam/pull/29671

Re: [PR] Publish job id for dataflow load tests [beam]

2023-12-08 Thread via GitHub
AnandInguva commented on PR #29686: URL: https://github.com/apache/beam/pull/29686#issuecomment-1847529499 main PR: https://github.com/apache/beam/pull/29567. I opened this PR to manually run GHA which are not triggered with phrases

[PR] Publish job id for dataflow load tests [beam]

2023-12-08 Thread via GitHub
AnandInguva opened a new pull request, #29686: URL: https://github.com/apache/beam/pull/29686 * Update schema if the default schema is different than the table schema * Add job id label to the BQ publisher for dataflow jobs

Re: [PR] Publish job [beam]

2023-12-08 Thread via GitHub
AnandInguva merged PR #29685: URL: https://github.com/apache/beam/pull/29685

[PR] Publish job [beam]

2023-12-08 Thread via GitHub
AnandInguva opened a new pull request, #29685: URL: https://github.com/apache/beam/pull/29685

[I] [Failing Test]: Python PostCommits are failing due to a possible dependency update [beam]

2023-12-08 Thread via GitHub
riteshghorse opened a new issue, #29684: URL: https://github.com/apache/beam/issues/29684 ### What happened? `apache_beam/io/external/xlang_jdbcio_it_test.py::CrossLanguageJdbcIOTest::test_xlang_jdbc_write_read` is failing for the last 3 runs for all Python postcommits ([3.9](https://ci-

Re: [PR] Bump github.com/aws/aws-sdk-go-v2/credentials from 1.16.9 to 1.16.11 in /sdks [beam]

2023-12-08 Thread via GitHub
dependabot[bot] closed pull request #29679: Bump github.com/aws/aws-sdk-go-v2/credentials from 1.16.9 to 1.16.11 in /sdks URL: https://github.com/apache/beam/pull/29679

Re: [PR] Bump github.com/aws/aws-sdk-go-v2/credentials from 1.16.9 to 1.16.11 in /sdks [beam]

2023-12-08 Thread via GitHub
dependabot[bot] commented on PR #29679: URL: https://github.com/apache/beam/pull/29679#issuecomment-1847459805 Looks like github.com/aws/aws-sdk-go-v2/credentials is up-to-date now, so this is no longer needed.

Re: [PR] [Python][RRIO] Call PTransform with setup teardown [beam]

2023-12-08 Thread via GitHub
riteshghorse merged PR #29585: URL: https://github.com/apache/beam/pull/29585

Re: [PR] Bump github.com/aws/aws-sdk-go-v2/config from 1.25.8 to 1.26.0 in /sdks [beam]

2023-12-08 Thread via GitHub
riteshghorse merged PR #29678: URL: https://github.com/apache/beam/pull/29678

Re: [I] [Bug]: [Python] Respect BigQuery insert byte size limit when writing batched rows [beam]

2023-12-08 Thread via GitHub
johnjcasey commented on issue #27363: URL: https://github.com/apache/beam/issues/27363#issuecomment-1847369449 Upgrading to P2 as we now have a known failing use case

Re: [PR] Deal with trailing slash in tempRoot [beam]

2023-12-08 Thread via GitHub
shunping commented on code in PR #29478: URL: https://github.com/apache/beam/pull/29478#discussion_r1420613218 ## sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java: ## @@ -316,6 +317,25 @@ public void testInvalidSchemaMatchNewResource() { assertEquals

Re: [PR] Deal with trailing slash in tempRoot [beam]

2023-12-08 Thread via GitHub
shunping commented on code in PR #29478: URL: https://github.com/apache/beam/pull/29478#discussion_r1420606573 ## runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java: ## @@ -74,8 +74,13 @@ public class TestDataflowRunner exte

Re: [PR] Add Maven archetype for examples with Dataflow Runner only [beam]

2023-12-08 Thread via GitHub
bvolpato commented on PR #29447: URL: https://github.com/apache/beam/pull/29447#issuecomment-1847136317 Not proceeding for now

Re: [PR] Add Maven archetype for examples with Dataflow Runner only [beam]

2023-12-08 Thread via GitHub
bvolpato closed pull request #29447: Add Maven archetype for examples with Dataflow Runner only URL: https://github.com/apache/beam/pull/29447

[I] [Bug]: SparkRunner log plumbing for Python SDK not working properly [beam]

2023-12-08 Thread via GitHub
phoerious opened a new issue, #29683: URL: https://github.com/apache/beam/issues/29683 ### What happened? I'm trying to get the SparkRunner working properly with pre-compiled JARs from my Python job (created with `--output_executable=job.jar`) that can be run with `spark-submit` on K

Re: [PR] Add Maven archetype for examples with Dataflow Runner only [beam]

2023-12-08 Thread via GitHub
github-actions[bot] commented on PR #29447: URL: https://github.com/apache/beam/pull/29447#issuecomment-1847072277 Assigning new set of reviewers because PR has gone too long without review. If you would like to opt out of this review, comment `assign to next reviewer`: R: @Abacn for