[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625671592







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] mwalenia removed a comment on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia removed a comment on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625671644


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625671876


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


mwalenia commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-625673344


   Run Java PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625678085


   Run JavaPortabilityApiJava11 PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625678328


   Run JavaPortabilityApi PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625678500


   Run JavaPortabilityApi PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679351


   retest this please







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679405


   run seed job







[GitHub] [beam] mwalenia removed a comment on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia removed a comment on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679611


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679517


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679748


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625679611


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625683788


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-08 Thread GitBox


mwalenia commented on pull request #11566:
URL: https://github.com/apache/beam/pull/11566#issuecomment-625686624


   @santhh Thanks for the feedback! I need to think a little about table 
support, but as for the batch size, it's configurable through the builder. The 
upper bound for the batch is hardcoded and checked at runtime.
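
A rough sketch of the pattern described in this comment (a configurable batch size with a hardcoded upper bound that is checked at runtime). The class name, the setter, and the limit value are all hypothetical and written in Python for illustration only; they are not the actual DLP transform API.

```python
# Hypothetical sketch: names and the limit are assumptions, not the real API.
MAX_BATCH_SIZE_BYTES = 500_000  # assumed hardcoded upper bound


class TransformConfigBuilder:
    """Builder with a user-configurable batch size, validated at build time."""

    def __init__(self):
        # Default to the largest allowed batch.
        self._batch_size_bytes = MAX_BATCH_SIZE_BYTES

    def set_batch_size_bytes(self, value):
        self._batch_size_bytes = value
        return self  # allow chained calls

    def build(self):
        # The bound is enforced here, at runtime, not by the type system.
        if not 0 < self._batch_size_bytes <= MAX_BATCH_SIZE_BYTES:
            raise ValueError(
                'batch size must be in (0, %d], got %d' %
                (MAX_BATCH_SIZE_BYTES, self._batch_size_bytes))
        return {'batch_size_bytes': self._batch_size_bytes}


config = TransformConfigBuilder().set_batch_size_bytes(1000).build()
```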







[GitHub] [beam] mwalenia commented on pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-08 Thread GitBox


mwalenia commented on pull request #11566:
URL: https://github.com/apache/beam/pull/11566#issuecomment-625687808


   Run JavaPortabilityApiJava11 PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625687880











[GitHub] [beam] kamilwu commented on pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


kamilwu commented on pull request #11554:
URL: https://github.com/apache/beam/pull/11554#issuecomment-625695376


   Run Java PreCommit







[GitHub] [beam] kamilwu commented on pull request #11629: [BEAM-6710] Add landing page with links to relevant dashboards

2020-05-08 Thread GitBox


kamilwu commented on pull request #11629:
URL: https://github.com/apache/beam/pull/11629#issuecomment-625697491


   cc: @aaltay 







[GitHub] [beam] mxm opened a new pull request #11640: [BEAM-9930] Beam Summit Digital 2020 announcement on blog

2020-05-08 Thread GitBox


mxm opened a new pull request #11640:
URL: https://github.com/apache/beam/pull/11640


   Not the best timing in light of #11608 but let's get this out now and 
migrate it later on. 
   

[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625714163


   retest this please







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625714904


   retest this please







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715055


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715547


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715466


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715386


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715813


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715931


   run seed job







[GitHub] [beam] mwalenia removed a comment on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia removed a comment on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625715466











[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625719414


   run seed job







[GitHub] [beam] mwalenia commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


mwalenia commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625725372


   LGTM, thanks for the contribution! :) Resolve the conflicts and feel free to 
merge







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625727950


   Run JavaPortabilityApi PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625727900


   Run JavaPortabilityApiJava11 PreCommit







[GitHub] [beam] mwalenia commented on pull request #11619: Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


mwalenia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625727805











[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625747576


   Run Seed Job







[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625753366


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625753189


   Run Java HadoopFormatIO Performance Test







[GitHub] [beam] kamilwu commented on pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on pull request #11567:
URL: https://github.com/apache/beam/pull/11567#issuecomment-625760052


   Run PythonDocker PreCommit







[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625761300


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625770546


   Run Python Load Tests ParDo Flink Streaming







[GitHub] [beam] kamilwu commented on pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on pull request #11567:
URL: https://github.com/apache/beam/pull/11567#issuecomment-625770721


   Run Python PreCommit







[GitHub] [beam] mwalenia removed a comment on pull request #11566: [BEAM-9723] Add DLP integration transforms

2020-05-08 Thread GitBox


mwalenia removed a comment on pull request #11566:
URL: https://github.com/apache/beam/pull/11566#issuecomment-625687808


   Run JavaPortabilityApiJava11 PreCommit







[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625791717


   Run seed job







[GitHub] [beam] mszb commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


mszb commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422104136



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio_test.py
##
@@ -499,6 +499,7 @@ def test_batch_byte_size(
   # and each bach should contains 25 mutations.
   res = (
   p | beam.Create(mutation_group)
+  | 'combine to list' >> beam.combiners.ToList()

Review comment:
   Yes, `_BatchFn` requires a single iterable collection and loops through 
it to make the batches. I'm just replicating the same pipeline that is used for 
batching in the `_WriteGroup` transform.
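
As a rough illustration of batching over a single iterable, here is a minimal pure-Python sketch. The function name, the `byte_size` key layout, and the limits are assumptions made for the example; this is not the actual `_BatchFn` implementation.

```python
def batch_by_limits(groups, max_bytes, max_count):
    """Yield lists of groups, closing a batch before it exceeds either limit.

    Hypothetical helper: each group is assumed to be a dict that carries
    its own size under the 'byte_size' key.
    """
    batch, size = [], 0
    for group in groups:
        too_big = size + group['byte_size'] > max_bytes
        too_many = len(batch) >= max_count
        if batch and (too_big or too_many):
            yield batch
            batch, size = [], 0
        batch.append(group)
        size += group['byte_size']
    if batch:
        # Flush whatever is left once the whole iterable has been consumed.
        yield batch


groups = [{'id': i, 'byte_size': 40} for i in range(5)]
batches = list(batch_by_limits(groups, max_bytes=100, max_count=10))
batch_sizes = [len(b) for b in batches]
```

Because the function sees the complete iterable, only the final batch can be partial.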

##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:

Review comment:
   There was no issue in processing the mutation groups; the issue was with the 
batch size. According to the Beam execution model, ‘**The division of the 
collection into bundles is arbitrary and selected by the runner.**’ As a 
result, finish_bundle is called multiple times rather than once on the complete 
collection, which produces the wrong number of batches on the Dataflow 
runner. That's why I've added the ToList transform: it makes a single 
collection so the batches are generated properly.
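
The effect described above can be reproduced without a runner. The following is a pure-Python simulation, not Beam code: `BatchFn` and `run_bundles` are hypothetical stand-ins that only mimic the `start_bundle` / `process` / `finish_bundle` call order of a DoFn.

```python
class BatchFn:
    """Collects elements into batches and flushes leftovers at bundle end."""

    def __init__(self, max_batch_size):
        self.max_batch_size = max_batch_size
        self._batch = []

    def start_bundle(self):
        self._batch = []

    def process(self, element):
        self._batch.append(element)
        if len(self._batch) >= self.max_batch_size:
            batch, self._batch = self._batch, []
            yield batch

    def finish_bundle(self):
        # A partial batch is emitted here, once per bundle.
        if self._batch:
            batch, self._batch = self._batch, []
            yield batch


def run_bundles(fn, bundles):
    """Mimic a runner that hands the DoFn one bundle at a time."""
    out = []
    for bundle in bundles:
        fn.start_bundle()
        for element in bundle:
            out.extend(fn.process(element))
        out.extend(fn.finish_bundle())
    return out


elements = list(range(100))
# The whole collection arrives as one bundle: batches are full.
single = run_bundles(BatchFn(25), [elements])
# The "runner" splits the input into bundles of 10: every bundle ends
# with a partial flush, so the batch count depends on the split.
split = run_bundles(
    BatchFn(25), [elements[i:i + 10] for i in range(0, 100, 10)])
```

With one bundle the simulation yields four batches of 25; with ten bundles it yields ten batches of 10, which is the kind of mismatch that collecting everything into a single list first avoids.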

##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:
+  mg_info = elem.info
+  if mg_info['byte_size'] + self._size_in_bytes > \

Review comment:
   Sure. Should I (1) create a new Jira ticket and add the ticket number to 
this PR for reference, or (2) create a new PR for this change and, once it gets 
merged, rebase this PR and request a review? 
   
   I think the first approach requires less time to close the tickets. What do 
you suggest?









[GitHub] [beam] kamilwu commented on pull request #11577: [BEAM-8132, BEAM-8133] Apply InfluxDB pipeline options in Load Tests and Performance Tests

2020-05-08 Thread GitBox


kamilwu commented on pull request #11577:
URL: https://github.com/apache/beam/pull/11577#issuecomment-625805262


   Run Java HadoopFormatIO Performance Test







[GitHub] [beam] kkucharc commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kkucharc commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422124124



##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -45,6 +47,19 @@ def _add_argparse_args(cls, parser):
 '--metrics_table',
 help='A BigQuery table where metrics should be '
 'written.')
+parser.add_argument(
+'--influx_measurement',
+help='An InfluxDB measurement where metrics should be published to. If '

Review comment:
   I am not sure I correctly understand what "measurement" means. Is it the
name of a metric, such as `runtime`, or the name of the place where metrics
will be stored, like a "table" or "column"?

##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -67,22 +82,30 @@ def _str_to_boolean(value):
 
 
 class LoadTest(object):
+  """Base class for all integration and performance tests which export
+  metrics to external databases: BigQuery or/and InfluxDB.
+
+  Refer to :class:`~apache_beam.testing.load_tests.LoadTestOptions` for more
+  information on the required pipeline options.
+
+  If using InfluxDB with Basic HTTP authentication enabled, provide the
+  following environment options: `INFLUXDB_USER` and `INFLUXDB_USER_PASSWORD`.

Review comment:
   Could we also allow these to be provided via PipelineOptions?

##
File path: sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py
##
@@ -167,14 +175,15 @@ class MetricsReader(object):
   A :class:`MetricsReader` retrieves metrics from pipeline result,
   prepares it for publishers and setup publishers.
   """
-  publishers = []  # type: List[ConsoleMetricsPublisher]
+  publishers = []  # type: List[Any]
 
   def __init__(
   self,
   project_name=None,
   bq_table=None,
   bq_dataset=None,
   publish_to_bq=False,

Review comment:
   Do you think it would be good to have consistent parameter naming for
influx and bq? Or do we plan to abandon bq in the future?

##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -67,22 +82,30 @@ def _str_to_boolean(value):
 
 
 class LoadTest(object):
+  """Base class for all integration and performance tests which export
+  metrics to external databases: BigQuery or/and InfluxDB.
+
+  Refer to :class:`~apache_beam.testing.load_tests.LoadTestOptions` for more
+  information on the required pipeline options.
+
+  If using InfluxDB with Basic HTTP authentication enabled, provide the
+  following environment options: `INFLUXDB_USER` and `INFLUXDB_USER_PASSWORD`.
+  """
   def __init__(self):
 # Be sure to set blocking to false for timeout_ms to work properly
 self.pipeline = TestPipeline(is_integration_test=True, blocking=False)
 assert not self.pipeline.blocking
 
-load_test_options = self.pipeline.get_pipeline_options().view_as(
-LoadTestOptions)
-self.timeout_ms = load_test_options.timeout_ms
-self.input_options = load_test_options.input_options
-self.metrics_namespace = load_test_options.metrics_table or 'default'
-publish_to_bq = load_test_options.publish_to_big_query
+options = self.pipeline.get_pipeline_options().view_as(LoadTestOptions)
+self.timeout_ms = options.timeout_ms
+self.input_options = options.input_options
+self.metrics_namespace = options.metrics_table or 'default'
+publish_to_bq = options.publish_to_big_query
 if publish_to_bq is None:

Review comment:
   Maybe we should remove this `if`, since we now have two targets where we
publish metrics?

##
File path: sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py
##
@@ -404,6 +419,77 @@ def save(self, results):
 return self._client.insert_rows(self._bq_table, results)
 
 
+class InfluxDBMetricsPublisherOptions(object):
+  def __init__(
+  self,
+  measurement,  # type: str
+  db_name,  # type: str
+  hostname='http://localhost:8086',  # type: str

Review comment:
   Why do we need this default value here? Isn't it already provided by the
pipeline option's default value?









[GitHub] [beam] mwalenia commented on pull request #11331: [BEAM-9646] Add Google Cloud vision integration transform

2020-05-08 Thread GitBox


mwalenia commented on pull request #11331:
URL: https://github.com/apache/beam/pull/11331#issuecomment-625818489


   @tysonjh Pinging for review







[GitHub] [beam] kamilwu commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422159594



##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -45,6 +47,19 @@ def _add_argparse_args(cls, parser):
 '--metrics_table',
 help='A BigQuery table where metrics should be '
 'written.')
+parser.add_argument(
+'--influx_measurement',
+help='An InfluxDB measurement where metrics should be published to. If 
'

Review comment:
   It's the name of the place where metrics will be stored. It's like a
"table" in other databases.
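
   To make the "table" analogy concrete, here is a minimal sketch of how a
single point could be serialized in InfluxDB line protocol, where the
measurement name comes first. The function name, tag, field, and values are
hypothetical and are not taken from this PR; escaping of special characters is
left out, and the timestamp unit depends on the precision chosen when writing.

```python
def to_line_protocol(measurement, tags, fields, timestamp):
    """Format one data point in InfluxDB line protocol.

    Layout: <measurement>,<tag_key>=<tag_value> <field_key>=<field_value> <timestamp>
    Note: no escaping is done here; this is an illustration only.
    """
    tag_part = ''.join(',%s=%s' % (k, v) for k, v in sorted(tags.items()))
    field_part = ','.join('%s=%s' % (k, v) for k, v in sorted(fields.items()))
    return '%s%s %s %d' % (measurement, tag_part, field_part, timestamp)


# Hypothetical values, for illustration only.
point = to_line_protocol(
    measurement='python_load_test',
    tags={'test_id': 'abc123'},
    fields={'runtime': 42.5},
    timestamp=1588896000)
print(point)
```

   In this sketch `python_load_test` plays the role that a table name would
play in a relational database, while `runtime` is one of the values stored in
it.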









[GitHub] [beam] kamilwu commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422163533



##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -67,22 +82,30 @@ def _str_to_boolean(value):
 
 
 class LoadTest(object):
+  """Base class for all integration and performance tests which export
+  metrics to external databases: BigQuery or/and InfluxDB.
+
+  Refer to :class:`~apache_beam.testing.load_tests.LoadTestOptions` for more
+  information on the required pipeline options.
+
+  If using InfluxDB with Basic HTTP authentication enabled, provide the
+  following environment options: `INFLUXDB_USER` and `INFLUXDB_USER_PASSWORD`.

Review comment:
   If we did, we could put sensitive data (like the password) at risk by
exposing it in logs. Apart from that, I think the only way to use Jenkins
credentials is through environment variables.
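
   As a rough sketch of the environment-variable approach (not the code in
this PR), credentials could be read and turned into an HTTP Basic
`Authorization` header as below. Only the two variable names come from the
docstring above; the function name and the fallback behaviour are assumptions.

```python
import base64
import os


def basic_auth_header(environ=os.environ):
    """Build an HTTP Basic auth header from environment variables.

    Returns None when either variable is missing, so that a caller can fall
    back to unauthenticated requests.
    """
    user = environ.get('INFLUXDB_USER')
    password = environ.get('INFLUXDB_USER_PASSWORD')
    if user is None or password is None:
        return None
    token = base64.b64encode(('%s:%s' % (user, password)).encode('utf-8'))
    return {'Authorization': 'Basic ' + token.decode('ascii')}
```

   Because the values never pass through pipeline options, they would not
show up wherever the options are printed or logged.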










[GitHub] [beam] kamilwu commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422167855



##
File path: sdks/python/apache_beam/testing/load_tests/load_test.py
##
@@ -67,22 +82,30 @@ def _str_to_boolean(value):
 
 
 class LoadTest(object):
+  """Base class for all integration and performance tests which export
+  metrics to external databases: BigQuery or/and InfluxDB.
+
+  Refer to :class:`~apache_beam.testing.load_tests.LoadTestOptions` for more
+  information on the required pipeline options.
+
+  If using InfluxDB with Basic HTTP authentication enabled, provide the
+  following environment options: `INFLUXDB_USER` and `INFLUXDB_USER_PASSWORD`.
+  """
   def __init__(self):
 # Be sure to set blocking to false for timeout_ms to work properly
 self.pipeline = TestPipeline(is_integration_test=True, blocking=False)
 assert not self.pipeline.blocking
 
-load_test_options = self.pipeline.get_pipeline_options().view_as(
-LoadTestOptions)
-self.timeout_ms = load_test_options.timeout_ms
-self.input_options = load_test_options.input_options
-self.metrics_namespace = load_test_options.metrics_table or 'default'
-publish_to_bq = load_test_options.publish_to_big_query
+options = self.pipeline.get_pipeline_options().view_as(LoadTestOptions)
+self.timeout_ms = options.timeout_ms
+self.input_options = options.input_options
+self.metrics_namespace = options.metrics_table or 'default'
+publish_to_bq = options.publish_to_big_query
 if publish_to_bq is None:

Review comment:
   Our goal is to keep sending metrics to BigQuery for some time. I think
it'd be good to keep the interface intact until we eventually abandon bq.









[GitHub] [beam] kamilwu commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422174187



##
File path: sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py
##
@@ -167,14 +175,15 @@ class MetricsReader(object):
   A :class:`MetricsReader` retrieves metrics from pipeline result,
   prepares it for publishers and setup publishers.
   """
-  publishers = []  # type: List[ConsoleMetricsPublisher]
+  publishers = []  # type: List[Any]
 
   def __init__(
   self,
   project_name=None,
   bq_table=None,
   bq_dataset=None,
   publish_to_bq=False,

Review comment:
   It's likely, but there's no decision yet. What influx parameters do you 
think can be improved? 









[GitHub] [beam] kamilwu commented on a change in pull request #11567: [BEAM-8132] Report Python metrics to InfluxDB

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11567:
URL: https://github.com/apache/beam/pull/11567#discussion_r422172322



##
File path: sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py
##
@@ -404,6 +419,77 @@ def save(self, results):
 return self._client.insert_rows(self._bq_table, results)
 
 
+class InfluxDBMetricsPublisherOptions(object):
+  def __init__(
+  self,
+  measurement,  # type: str
+  db_name,  # type: str
+  hostname='http://localhost:8086',  # type: str

Review comment:
   There's a small chance that someone would use InfluxDBPublisher in their
code without casting pipeline options to LoadTestOptions (view_as(...)).
Besides that, there's no particular reason.









[GitHub] [beam] chamikaramj commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


chamikaramj commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422180693



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:

Review comment:
   By "Which causes finish_bundle to be called multiple times", do you mean
that finish_bundle will be called once per bundle?
   This is the expected behavior, and users will observe it as well. The
implementation should work for arbitrary bundle sizes without users having to
group PCollection elements together.
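
   A plain-Python sketch of the batching behaviour described here, assuming
a size threshold in bytes. It deliberately does not use the Beam API: the
class, `run_bundles`, and the `(element, byte_size)` pairs are hypothetical
stand-ins for the DoFn lifecycle, and a real `finish_bundle` would have to
emit windowed values.

```python
class MutationBatcher(object):
    """Plain-Python stand-in for a batching DoFn (not the Beam API).

    process() is called once per element; finish_bundle() is called once per
    bundle and flushes whatever is still buffered.
    """

    def __init__(self, max_batch_bytes):
        self._max_batch_bytes = max_batch_bytes
        self._batch = []
        self._size_in_bytes = 0

    def _flush(self):
        batch, self._batch, self._size_in_bytes = self._batch, [], 0
        return batch

    def process(self, element, byte_size):
        # Emit the current batch first if adding this element would overflow.
        overflow = self._size_in_bytes + byte_size > self._max_batch_bytes
        if self._batch and overflow:
            yield self._flush()
        self._batch.append(element)
        self._size_in_bytes += byte_size

    def finish_bundle(self):
        # Whatever is left at the end of a bundle is emitted as a final batch.
        if self._batch:
            yield self._flush()


def run_bundles(batcher, bundles):
    """Feed bundles of (element, byte_size) pairs and collect emitted batches."""
    out = []
    for bundle in bundles:
        for element, byte_size in bundle:
            out.extend(batcher.process(element, byte_size))
        out.extend(batcher.finish_bundle())
    return out
```

   With this shape the result is correct for any bundling, but a batch can
never span two bundles: single-element bundles produce single-element
batches, which is the behaviour a test has to tolerate.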









[GitHub] [beam] chamikaramj commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


chamikaramj commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422181467



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio_test.py
##
@@ -499,6 +499,7 @@ def test_batch_byte_size(
   # and each bach should contains 25 mutations.
   res = (
   p | beam.Create(mutation_group)
+  | 'combine to list' >> beam.combiners.ToList()

Review comment:
   Do users have to do this as well? It seems like we are missing something
in the implementation. How does the Java implementation operate?









[GitHub] [beam] chamikaramj commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


chamikaramj commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422182633



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:
+  mg_info = elem.info
+  if mg_info['byte_size'] + self._size_in_bytes > \

Review comment:
   I think (2) is better, but we should fix the Spanner connector
implementation to work for arbitrary bundle sizes rather than reducing the
bundle to a single element for the test.









[GitHub] [beam] kamilwu commented on a change in pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on a change in pull request #11274:
URL: https://github.com/apache/beam/pull/11274#discussion_r422183052



##
File path: .test-infra/jenkins/job_PerformanceTests_PubsubIO_Python.groovy
##
@@ -41,7 +42,7 @@ def psio_test = [
 metrics_dataset  : 'beam_performance',
 metrics_table: 'psio_io_2GB_msg_results',
 input_options: '\'{' +
-'"num_records": 2097152,' +
+'"num_records": 2097152' +

Review comment:
   Why did you remove that comma?
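
   For context, a small sketch of why the comma matters, assuming that more
keys follow `num_records` in the concatenated string (the hunk ends before the
rest is visible). The second key and its value are hypothetical.

```python
import json

# Hypothetical second key; the real option list continues past the diff hunk.
with_comma = '{' + '"num_records": 2097152,' + '"value_size": 1024' + '}'
without_comma = '{' + '"num_records": 2097152' + '"value_size": 1024' + '}'


def parses(text):
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(parses(with_comma), parses(without_comma))
```

   Under that assumption, the version without the comma is no longer valid
JSON, so the test would fail when parsing `input_options`.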









[GitHub] [beam] iemejia commented on pull request #11619: [BEAM-2530] Compile and run tests on java 11 for PreCommit portability api

2020-05-08 Thread GitBox


iemejia commented on pull request #11619:
URL: https://github.com/apache/beam/pull/11619#issuecomment-625859443


   Merged manually to add the missing ticket prefix `[BEAM-2530]`. Thanks again!







[GitHub] [beam] steveniemitz opened a new pull request #11641: [BEAM-9931] Allow users of AvroIO to specify a custom DatumReader implementation

2020-05-08 Thread GitBox


steveniemitz opened a new pull request #11641:
URL: https://github.com/apache/beam/pull/11641


   Similar to PR #11479, it would be useful to be able to explicitly pass a 
DatumReader factory to AvroIO, and have it use that instead of 
GenericDatumReader or SpecificDatumReader.
   
   This PR adds `withDatumReaderFactory` to AvroIO and plumbs it through into 
AvroSource.
   
   R: @iemejia 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[GitHub] [beam] epicfaace opened a new pull request #11642: Replace call to .checkpoint() in SDF direct runner to .try_claim(0)

2020-05-08 Thread GitBox


epicfaace opened a new pull request #11642:
URL: https://github.com/apache/beam/pull/11642


   All calls to .checkpoint() need to be replaced by .try_claim(), and the 
.checkpoint() function was removed from OffsetRestrictionTracker in this commit 
(https://github.com/apache/beam/commit/b94dca2b1df4a3aea66d822299a58c97accc0541#diff-e642b3261abdcbd9862e7e3978359e76L170).
   
   Because the DirectRunner was still calling .checkpoint(), it ended up 
crashing at times when trying to run using the OffsetRestrictionTracker. This 
PR fixes it.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[GitHub] [beam] piotr-szuberski commented on a change in pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


piotr-szuberski commented on a change in pull request #11274:
URL: https://github.com/apache/beam/pull/11274#discussion_r422202409



##
File path: .test-infra/jenkins/job_PerformanceTests_PubsubIO_Python.groovy
##
@@ -41,7 +42,7 @@ def psio_test = [
 metrics_dataset  : 'beam_performance',
 metrics_table: 'psio_io_2GB_msg_results',
 input_options: '\'{' +
-'"num_records": 2097152,' +
+'"num_records": 2097152' +

Review comment:
   It sounds stupid, but my cat walked across my keyboard, and I deleted too
many characters and overlooked it.









[GitHub] [beam] lukecwik commented on pull request #11642: Replace call to .checkpoint() in SDF direct runner to .try_claim(0)

2020-05-08 Thread GitBox


lukecwik commented on pull request #11642:
URL: https://github.com/apache/beam/pull/11642#issuecomment-625872963


   R: @boyuanzz When the PR is ready.







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625874085


   Retest this please







[GitHub] [beam] TheNeuralBit commented on pull request #11638: [BEAM-9449] Pass PipelineOptions through expansion service

2020-05-08 Thread GitBox


TheNeuralBit commented on pull request #11638:
URL: https://github.com/apache/beam/pull/11638#issuecomment-625874811


   Run RAT PreCommit







[GitHub] [beam] matthiasa4 commented on pull request #11640: [BEAM-9930] Beam Summit Digital 2020 announcement on blog

2020-05-08 Thread GitBox


matthiasa4 commented on pull request #11640:
URL: https://github.com/apache/beam/pull/11640#issuecomment-625875861


   Added a few minor changes - LGTM for the rest! Thanks Max!







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625878298


   Run seed job







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625878134


   Retest this please







[GitHub] [beam] kamilwu removed a comment on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu removed a comment on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625874085


   Retest this please







[GitHub] [beam] chamikaramj commented on pull request #11628: [BEAM-9911]Replace SpannerIO.write latency counter to distribution

2020-05-08 Thread GitBox


chamikaramj commented on pull request #11628:
URL: https://github.com/apache/beam/pull/11628#issuecomment-625883964


   Run Java PreCommit







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625886138


   Retest this please







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625887140


   Retest this please







[GitHub] [beam] kamilwu commented on pull request #11274: [BEAM-9633] Add PubsubIO performance test

2020-05-08 Thread GitBox


kamilwu commented on pull request #11274:
URL: https://github.com/apache/beam/pull/11274#issuecomment-625887287


   Run PubsubIO Performance Test Python







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r48268



##
File path: website/www/site/data/meetings.yml
##
@@ -9,31 +9,30 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# Welcome to Jekyll!
 
 events:
-- date: 2016/04/01
-  time: "9:30 - 16:00 Pacific"
-  location: PayPalSan Jose, CA, USA
-  type: Dev/PPMC Meeting
-  materials:
-- title: Presentation - PPMC Deep Dive
-  link: 
"https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit?usp=sharing";
+  - date: 2016/04/01
+time: "9:30 - 16:00 Pacific"
+location: PayPalSan Jose, CA, USA
+type: Dev/PPMC Meeting
+materials:
+  - title: Presentation - PPMC Deep Dive
+link: 
"https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit?usp=sharing";
 
-- title: Notes - PPMC Deep Dive
-  link: 
"https://docs.google.com/document/d/1SXSLj7FMIgKqj43nTcczFpJzqASeUMUCpbyklk2fBkg/edit?usp=sharing";
-  notes:
+  - title: Notes - PPMC Deep Dive
+link: 
"https://docs.google.com/document/d/1SXSLj7FMIgKqj43nTcczFpJzqASeUMUCpbyklk2fBkg/edit?usp=sharing";
+notes:
 
-- date: 2016/05/04
-  time: "8:00 - 11:00 Pacific"
-  location: Virtual
-  type: Technical Deep Dive
-  materials:
-- title: Presentation - Beam Community Meeting
-  link: 
"https://drive.google.com/open?id=17i7SHViboWtLEZw27iabdMisPl987WWxvapJaXg_dEE";
+  - date: 2016/05/04
+time: "8:00 - 11:00 Pacific"
+location: Virtual
+type: Technical Deep Dive
+materials:
+  - title: Presentation - Beam Community Meeting
+link: 
"https://drive.google.com/open?id=17i7SHViboWtLEZw27iabdMisPl987WWxvapJaXg_dEE";
 
-- title: Notes - Beam Community Meeting
-  link: 
"https://drive.google.com/open?id=1szhEE_pfhEtrQye61jXAidUcMW7oebZCRc2InUe3ou0";
-  notes:
+  - title: Notes - Beam Community Meeting
+link: 
"https://drive.google.com/open?id=1szhEE_pfhEtrQye61jXAidUcMW7oebZCRc2InUe3ou0";
+notes:

Review comment:
   Are the whitespace changes in these yaml files necessary?









[GitHub] [beam] robertwb commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

2020-05-08 Thread GitBox


robertwb commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r422233278



##
File path: sdks/python/apache_beam/dataframe/convert.py
##
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
   Lint complains about unguarded imports, so I put them back. We'll 
just do a massive sweep to fix these when we change to use type annotations. 









[GitHub] [beam] lukecwik commented on a change in pull request #11596: [BEAM-9856] Optimization/hl7v2 io list messages

2020-05-08 Thread GitBox


lukecwik commented on a change in pull request #11596:
URL: https://github.com/apache/beam/pull/11596#discussion_r422233303



##
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java
##
@@ -472,24 +548,120 @@ public void initClient() throws IOException {
   this.client = new HttpHealthcareApiClient();
 }
 
+@GetInitialRestriction
+public OrderedTimeRange getEarliestToLatestRestriction(@Element String 
hl7v2Store)
+throws IOException {
+  from = this.client.getEarliestHL7v2SendTime(hl7v2Store, this.filter);
+  // filters are [from, to) to match logic of OffsetRangeTracker but need 
latest element to be
+  // included in results set to add an extra ms to the upper bound.
+  to = this.client.getLatestHL7v2SendTime(hl7v2Store, this.filter).plus(1);
+  return new OrderedTimeRange(from, to);
+}
+
+@NewTracker
+public OrderedTimeRangeTracker newTracker(@Restriction OrderedTimeRange 
timeRange) {
+  return timeRange.newTracker();
+}
+
+@SplitRestriction
+public void split(
+@Restriction OrderedTimeRange timeRange, 
OutputReceiver out) {
+  // TODO(jaketf) How to pick optimal values for desiredNumOffsetsPerSplit 
?

Review comment:
   That seems like a lot.
   
   Dataflow has an API limit of 20 MB for split descriptions when they are 
returned, which usually tops out around 10k splits for sources, but even 10k is 
too much. Typically 20-50 splits is enough, since dynamic splitting will ramp 
that up to 1000s if necessary.
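
As a rough illustration of the suggestion above, the following is a minimal sketch of cutting a [from, to) time range into a small, fixed number of initial splits. The helper name `split_time_range` and the default of 20 splits are hypothetical and not taken from the PR; the actual HL7v2IO code may split differently.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, desired_splits=20):
    """Split the half-open range [start, end) into at most desired_splits
    contiguous half-open sub-ranges. Hypothetical helper, illustration only."""
    if desired_splits < 1:
        raise ValueError("desired_splits must be at least 1")
    if end <= start:
        return []
    step = (end - start) / desired_splits
    ranges = []
    lo = start
    for i in range(1, desired_splits):
        # Never step past the upper bound, even if rounding inflates the step.
        hi = min(start + step * i, end)
        # Skip empty sub-ranges, which can occur when the range is very short.
        if hi > lo:
            ranges.append((lo, hi))
            lo = hi
    # The final sub-range absorbs any rounding error and ends exactly at `end`.
    if lo < end:
        ranges.append((lo, end))
    return ranges
```

Keeping the initial count small leaves the finer-grained work to the runner's dynamic splitting, which is the point made in the review comment.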









[GitHub] [beam] robertwb commented on pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

2020-05-08 Thread GitBox


robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-625892367


   PTAL







[GitHub] [beam] lukecwik commented on a change in pull request #11607: [BEAM-9430] Fixes the bounds of initial watermark set to estimators instead of raising an error

2020-05-08 Thread GitBox


lukecwik commented on a change in pull request #11607:
URL: https://github.com/apache/beam/pull/11607#discussion_r422236167



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimators.java
##
@@ -37,14 +37,16 @@
 private Instant lastReportedWatermark;
 
 public Manual(Instant watermark) {
-  this.watermark = checkNotNull(watermark, "watermark must not be null.");
-  if (watermark.isBefore(GlobalWindow.TIMESTAMP_MIN_VALUE)
-  || watermark.isAfter(GlobalWindow.TIMESTAMP_MAX_VALUE)) {
-throw new IllegalArgumentException(
-String.format(
-"Provided watermark %s must be within bounds [%s, %s].",
-watermark, GlobalWindow.TIMESTAMP_MIN_VALUE, 
GlobalWindow.TIMESTAMP_MAX_VALUE));
+  checkNotNull(watermark, "watermark must not be null.");
+
+  // Making sure that the watermark is within bounds.

Review comment:
   You're right; it would be good to migrate to using BoundedWindow as the 
import for the static, though.
   
   I think it makes sense to make the constructor validate the bounds and have 
setWatermark ensure that the value is within the range as expected. We can fix 
the UnboundedSource SDF wrapper to clamp the watermark value that is being 
reported from UnboundedReader instead.
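
The division of responsibility proposed above (the estimator validates, the wrapper clamps) could be sketched as follows. This is an illustration in Python rather than the Java under review; the bounds, `clamp_watermark`, and `ManualWatermarkEstimator` are hypothetical stand-ins, not Beam's actual constants or classes.

```python
# Illustrative bounds only; the real limits come from Beam's window constants.
MIN_TIMESTAMP_MILLIS = -1000
MAX_TIMESTAMP_MILLIS = 1000


def clamp_watermark(watermark_millis):
    """Clamp a reported watermark into the allowed range (wrapper's job)."""
    return max(MIN_TIMESTAMP_MILLIS, min(watermark_millis, MAX_TIMESTAMP_MILLIS))


class ManualWatermarkEstimator(object):
    """Hypothetical estimator that rejects out-of-range values rather than
    silently adjusting them."""

    def __init__(self, watermark_millis):
        self._watermark = self._validate(watermark_millis)

    def set_watermark(self, watermark_millis):
        self._watermark = self._validate(watermark_millis)

    def current_watermark(self):
        return self._watermark

    @staticmethod
    def _validate(watermark_millis):
        if not MIN_TIMESTAMP_MILLIS <= watermark_millis <= MAX_TIMESTAMP_MILLIS:
            raise ValueError(
                "Provided watermark %s must be within bounds [%s, %s]." %
                (watermark_millis, MIN_TIMESTAMP_MILLIS, MAX_TIMESTAMP_MILLIS))
        return watermark_millis
```

With this split, a caller that may see out-of-range values, such as a wrapper around a legacy reader, would pass `clamp_watermark(value)` to the estimator, while other callers still get an error for bad input.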









[GitHub] [beam] bntnam commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


bntnam commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r422238541



##
File path: website/www/site/data/meetings.yml
##
@@ -9,31 +9,30 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# Welcome to Jekyll!
 
 events:
-- date: 2016/04/01
-  time: "9:30 - 16:00 Pacific"
-  location: PayPalSan Jose, CA, USA
-  type: Dev/PPMC Meeting
-  materials:
-- title: Presentation - PPMC Deep Dive
-  link: 
"https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit?usp=sharing";
+  - date: 2016/04/01
+time: "9:30 - 16:00 Pacific"
+location: PayPalSan Jose, CA, USA
+type: Dev/PPMC Meeting
+materials:
+  - title: Presentation - PPMC Deep Dive
+link: 
"https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit?usp=sharing";
 
-- title: Notes - PPMC Deep Dive
-  link: 
"https://docs.google.com/document/d/1SXSLj7FMIgKqj43nTcczFpJzqASeUMUCpbyklk2fBkg/edit?usp=sharing";
-  notes:
+  - title: Notes - PPMC Deep Dive
+link: 
"https://docs.google.com/document/d/1SXSLj7FMIgKqj43nTcczFpJzqASeUMUCpbyklk2fBkg/edit?usp=sharing";
+notes:
 
-- date: 2016/05/04
-  time: "8:00 - 11:00 Pacific"
-  location: Virtual
-  type: Technical Deep Dive
-  materials:
-- title: Presentation - Beam Community Meeting
-  link: 
"https://drive.google.com/open?id=17i7SHViboWtLEZw27iabdMisPl987WWxvapJaXg_dEE";
+  - date: 2016/05/04
+time: "8:00 - 11:00 Pacific"
+location: Virtual
+type: Technical Deep Dive
+materials:
+  - title: Presentation - Beam Community Meeting
+link: 
"https://drive.google.com/open?id=17i7SHViboWtLEZw27iabdMisPl987WWxvapJaXg_dEE";
 
-- title: Notes - Beam Community Meeting
-  link: 
"https://drive.google.com/open?id=1szhEE_pfhEtrQye61jXAidUcMW7oebZCRc2InUe3ou0";
-  notes:
+  - title: Notes - Beam Community Meeting
+link: 
"https://drive.google.com/open?id=1szhEE_pfhEtrQye61jXAidUcMW7oebZCRc2InUe3ou0";
+notes:

Review comment:
   @TheNeuralBit: The file is formatted correctly according to the 
indentation rule [1].
   
   [1] 
https://docs.saltstack.com/en/master/topics/troubleshooting/yaml_idiosyncrasies.html









[GitHub] [beam] tysonjh commented on a change in pull request #11331: [BEAM-9646] Add Google Cloud vision integration transform

2020-05-08 Thread GitBox


tysonjh commented on a change in pull request #11331:
URL: https://github.com/apache/beam/pull/11331#discussion_r421689993



##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/AnnotateImages.java
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.cloud.vision.v1.AnnotateImageRequest;
+import com.google.cloud.vision.v1.AnnotateImageResponse;
+import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
+import com.google.cloud.vision.v1.Feature;
+import com.google.cloud.vision.v1.ImageAnnotatorClient;
+import com.google.cloud.vision.v1.ImageContext;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.GroupIntoBatches;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+
+/**
+ * Parent class for transform utilizing Cloud Vision API.
+ *
+ * @param  Type of input PCollection.
+ */
+public abstract class AnnotateImages
+extends PTransform, 
PCollection>> {
+
+  private static final Long MIN_BATCH_SIZE = 1L;
+  private static final Long MAX_BATCH_SIZE = 5L;
+
+  protected final PCollectionView> contextSideInput;
+  protected final List featureList;
+  private long batchSize;
+
+  public AnnotateImages(

Review comment:
   Would you please add comments to this as well? It would be useful for 
those implementing subclasses.

##
File path: 
sdks/java/extensions/ml/src/main/java/org/apache/beam/sdk/extensions/ml/AnnotateImages.java
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.ml;
+
+import com.google.cloud.vision.v1.AnnotateImageRequest;
+import com.google.cloud.vision.v1.AnnotateImageResponse;
+import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
+import com.google.cloud.vision.v1.Feature;
+import com.google.cloud.vision.v1.ImageAnnotatorClient;
+import com.google.cloud.vision.v1.ImageContext;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.GroupIntoBatches;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+
+/**
+ * Parent class for transform utilizing Cloud Vision API.
+ *
+ * @param  Type of input PCollection.
+ */
+public abstract class AnnotateImages
+extends PTransform, 
PCollection>> {
+
+  private static final Long MIN_BATCH_SIZE = 1L;
+  private static final Long MAX_BATCH_SIZE = 5L;
+
+  protected final PCollectionView> contextSideInput;
+  protected final List featureList;
+  private long batchSize;
+
+  public AnnotateImages(
+  PCollectionView> contextSideInput,
+  List featureList,
+  long batchSize) {
+this.contextSideInput = contextSideInput;
+this.featureList = featureList;
+checkBatchSizeCorrectness(batchSize);
+this.batch

[GitHub] [beam] chadrik commented on a change in pull request #11632: [BEAM-7746] Fix type errors and enable checks for apache_beam.dataframe.*

2020-05-08 Thread GitBox


chadrik commented on a change in pull request #11632:
URL: https://github.com/apache/beam/pull/11632#discussion_r422273845



##
File path: sdks/python/apache_beam/dataframe/convert.py
##
@@ -16,13 +16,23 @@
 
 from __future__ import absolute_import
 
+import typing
+
 import inspect
 
 from apache_beam import pvalue
 from apache_beam.dataframe import expressions
 from apache_beam.dataframe import frame_base
 from apache_beam.dataframe import transforms
 
+if typing.TYPE_CHECKING:
+  # pylint: disable=ungrouped-imports
+  from typing import Any
+  from typing import Dict
+  from typing import Tuple
+  from typing import Union

Review comment:
   What's the lint error?   Is it because of the unused `typing` import?  
   
   I'm confused because unguarded typing imports are used all over the beam 
codebase without any lint errors.  Check `pipeline`, `pipeline_context`, 
`pipeline_options`, for starters. 
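
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a guarded typing import combined with a type comment. The function `count_words` is a made-up example, and the sketch does not attempt to reproduce the lint error mentioned in this thread.

```python
import typing

if typing.TYPE_CHECKING:
    # Evaluated only by static type checkers such as mypy; skipped at runtime.
    from typing import Dict
    from typing import List


def count_words(words):
    # type: (List[str]) -> Dict[str, int]
    # Type comments are never evaluated at runtime, so the guarded names
    # above do not need to exist when this function actually runs.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

An unguarded `from typing import Dict` at module level would behave the same at runtime; the guard only avoids importing names that are used solely in type comments.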
   









[GitHub] [beam] ibzib opened a new pull request #11643: [BEAM-4782] Remove workaround in Python multimap tests.

2020-05-08 Thread GitBox


ibzib opened a new pull request #11643:
URL: https://github.com/apache/beam/pull/11643


   Just some minor code cleanup. R: @udim 
   
   
   

[GitHub] [beam] pabloem commented on pull request #11637: Waiting for BQ Query and Export jobs for more than 5 minutes.

2020-05-08 Thread GitBox


pabloem commented on pull request #11637:
URL: https://github.com/apache/beam/pull/11637#issuecomment-625932099


   r: @chamikaramj @kamilwu PTAL
   Currently, ReadFromBigQuery only waits up to 5 minutes and fails the 
job if the query/export doesn't finish in time.
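
The behavior described here can be pictured as a polling loop whose deadline is optional rather than fixed at five minutes. The sketch below is a generic illustration; `wait_for_job` and its parameters are hypothetical and do not correspond to the Beam or BigQuery client API.

```python
import time


def wait_for_job(is_done, poll_interval_secs=1.0, max_wait_secs=None,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True.

    Returns True when the job finished, or False when max_wait_secs elapsed
    first. With max_wait_secs=None the loop waits indefinitely.
    """
    deadline = None if max_wait_secs is None else clock() + max_wait_secs
    while not is_done():
        if deadline is not None and clock() >= deadline:
            return False
        sleep(poll_interval_secs)
    return True
```

Passing `sleep` and `clock` as parameters is only a convenience so the loop can be exercised in tests without real waiting.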







[GitHub] [beam] chamikaramj commented on pull request #11637: Waiting for BQ Query and Export jobs for more than 5 minutes.

2020-05-08 Thread GitBox


chamikaramj commented on pull request #11637:
URL: https://github.com/apache/beam/pull/11637#issuecomment-625936002


   Added a comment above :)
   
   LGTM other than that







[GitHub] [beam] mszb commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


mszb commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422280614



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio_test.py
##
@@ -499,6 +499,7 @@ def test_batch_byte_size(
   # and each bach should contains 25 mutations.
   res = (
   p | beam.Create(mutation_group)
+  | 'combine to list' >> beam.combiners.ToList()

Review comment:
   The user does not have to add the ToList transform in the production 
pipeline. I only added it to test the batching process.
   The previous implementation of batching (without the ToList transform) 
followed the Java implementation but without sorting the transactions by 
table and primary key (this is also documented as a feature to be added later). 

##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:

Review comment:
   Makes sense. In that case, we don't need to alter the connector code 
anymore; it was working as expected. Thanks, @chamikaramj, for the feedback; 
it is always helpful.
   I'll remove the changes from the Spanner IO connector and update the IT test 
code for the assertion.









[GitHub] [beam] chamikaramj commented on a change in pull request #11210: [BEAM-8949] SpannerIO integration tests

2020-05-08 Thread GitBox


chamikaramj commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r422282706



##
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##
@@ -1008,31 +1007,30 @@ def _reset_count(self):
 self._cells = 0
 
   def process(self, element):
-mg_info = element.info
+for elem in element:

Review comment:
   Thanks. Lemme know when this is ready for another look. Also, let's 
trigger the IT with the new changes to make sure it passes.









[GitHub] [beam] chamikaramj commented on pull request #8457: [BEAM-3342] Create a Cloud Bigtable IO connector for Python

2020-05-08 Thread GitBox


chamikaramj commented on pull request #8457:
URL: https://github.com/apache/beam/pull/8457#issuecomment-625940125


   +1 for starting a new PR. It's surprising to hear that the Jenkins IT trigger 
does not capture your updates. Hopefully you won't run into this in the new 
PR. If you do, it's probably worth an email to the dev list to check whether 
someone else has run into it.







[GitHub] [beam] lostluck commented on pull request #11605: [BEAM-9883] Refactor SDF test restrictions.

2020-05-08 Thread GitBox


lostluck commented on pull request #11605:
URL: https://github.com/apache/beam/pull/11605#issuecomment-625946118


   LGTM thanks!







[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r422302302



##
File path: website/www/site/content/en/blog/capability-matrix.md
##
@@ -0,0 +1,604 @@
+---
+title:  "Clarifying & Formalizing Runner Capabilities"
+date:   2016-03-17 11:00:00 -0700
+categories:
+  - beam
+  - capability
+aliases:
+  - /beam/capability/2016/03/17/capability-matrix.html
+authors:
+  - fjp
+  - takidau
+
+capability-matrix-snapshot:

Review comment:
   Would it be possible to keep this in its own separate YAML file and just 
reference it here rather than inlining it? That would reduce the diff some.
   
   Not a blocker, just a nice-to-have
   









[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r422305429



##
File path: build.gradle
##
@@ -86,14 +86,18 @@ rat {
 // JDBC package config files
 "**/META-INF/services/java.sql.Driver",
 
-// Ruby build files
+// Website build files
 "**/Gemfile.lock",
 "**/Rakefile",
 "**/.htaccess",
-"website/src/_sass/_bootstrap.scss",
-"website/src/_sass/bootstrap/**/*",
-"website/src/js/bootstrap*.js",
-"website/src/js/bootstrap/**/*",
+"website/www/site/assets/scss/_bootstrap.scss",
+"website/www/site/assets/scss/bootstrap/**/*",
+"website/www/site/static/js/bootstrap*.js",
+"website/www/site/static/js/bootstrap/**/*",
+"website/www/site/static/.htaccess",

Review comment:
   nit: I think this is redundant because of the "**/.htaccess" above
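
The redundancy can be checked roughly with Python's `fnmatch`, as sketched below. Note the assumption: `fnmatch` wildcards are only an approximation of the Ant-style globs used in the `rat` exclude list, so this demonstrates the idea rather than the exact matcher, and `matching_patterns` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matching_patterns(path, patterns):
    """Return the patterns that match path, using shell-style wildcards as an
    approximation of Ant-style globs."""
    return [p for p in patterns if fnmatchcase(path, p)]


excludes = ["**/.htaccess", "website/www/site/static/.htaccess"]
matched = matching_patterns("website/www/site/static/.htaccess", excludes)
```

Here `matched` contains both patterns, meaning the explicit path is already covered by the wildcard entry.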









[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r422323757



##
File path: website/Dockerfile
##
@@ -1,33 +1,65 @@
-###
-#  Licensed to the Apache Software Foundation (ASF) under one
-#  or more contributor license agreements.  See the NOTICE file
-#  distributed with this work for additional information
-#  regarding copyright ownership.  The ASF licenses this file
-#  to you under the Apache License, Version 2.0 (the
-#  "License"); you may not use this file except in compliance
-#  with the License.  You may obtain a copy of the License at
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
 #
-#  http://www.apache.org/licenses/LICENSE-2.0
+#   http://www.apache.org/licenses/LICENSE-2.0
 #
-#  Unless required by applicable law or agreed to in writing, software
-#  distributed under the License is distributed on an "AS IS" BASIS,
-#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#  See the License for the specific language governing permissions and
-# limitations under the License.
-###
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
 
-# This image contains Ruby and dependencies required to build and test the Beam
-# website. It is used by tasks in build.gradle.

Review comment:
   nit: could you add a comment like this at the start of the new Dockerfile?
   
   It would also help to add a comment above each of the "RUN" statements below saying what it does. They're a little inscrutable by themselves, but short comments like "Install misc deps", "Install node", "Install yarn" would make the file easy to inspect.









[GitHub] [beam] TheNeuralBit commented on a change in pull request #11554: [BEAM-9876] Migrate the Beam website from Jekyll to Hugo to enable localization of the site content

2020-05-08 Thread GitBox


TheNeuralBit commented on a change in pull request #11554:
URL: https://github.com/apache/beam/pull/11554#discussion_r422324477



##
File path: website/Dockerfile
##
@@ -1,33 +1,65 @@
-###
-#  Licensed to the Apache Software Foundation (ASF) under one
-#  or more contributor license agreements.  See the NOTICE file
-#  distributed with this work for additional information
-#  regarding copyright ownership.  The ASF licenses this file
-#  to you under the Apache License, Version 2.0 (the
-#  "License"); you may not use this file except in compliance
-#  with the License.  You may obtain a copy of the License at
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
 #
-#  http://www.apache.org/licenses/LICENSE-2.0
+#   http://www.apache.org/licenses/LICENSE-2.0
 #
-#  Unless required by applicable law or agreed to in writing, software
-#  distributed under the License is distributed on an "AS IS" BASIS,
-#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-#  See the License for the specific language governing permissions and
-# limitations under the License.
-###
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
 
-# This image contains Ruby and dependencies required to build and test the Beam
-# website. It is used by tasks in build.gradle.
+FROM debian:stretch-slim
 
-FROM ruby:2.5
+SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
 
-WORKDIR /ruby
-RUN gem install bundler
-# Update buildDockerImage's inputs.files if you change this list.
-ADD Gemfile Gemfile.lock /ruby/
-RUN bundle install --deployment --path $GEM_HOME
+ENV DEBIAN_FRONTEND=noninteractive \
+LANGUAGE=C.UTF-8 \
+LANG=C.UTF-8 \
+LC_ALL=C.UTF-8 \
+LC_CTYPE=C.UTF-8 \
+LC_MESSAGES=C.UTF-8
 
-# Required for website testing using HTMLProofer.
-ENV LC_ALL C.UTF-8
+RUN apt-get update \
+&& apt-get install -y --no-install-recommends \
+ca-certificates \
+curl \
+git \
+gnupg2 \
+gosu \
+lynx \
+&& apt-get autoremove -yqq --purge \
+&& apt-get clean \
+&& rm -rf /var/lib/apt/lists/*
 
-CMD sleep 3600
+RUN curl -sL https://deb.nodesource.com/setup_10.x | bash - \
+&& apt-get update \
+&& apt-get install -y --no-install-recommends \
+nodejs \
+&& apt-get autoremove -yqq --purge \
+&& apt-get clean \
+&& rm -rf /var/lib/apt/lists/*
+
+RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
+&& echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list \
+&& apt-get update \
+&& apt-get install -y --no-install-recommends yarn \
+&& apt-get autoremove -yqq --purge \
+&& apt-get clean \
+&& rm -rf /var/lib/apt/lists/*
+
+RUN HUGOHOME="$(mktemp -d)" \
+&& export HUGOHOME \
+&& curl -sL https://github.com/gohugoio/hugo/releases/download/v0.68.3/hugo_extended_0.68.3_Linux-64bit.tar.gz > "${HUGOHOME}/hugo.tar.gz" \
+&& tar -xzvf "${HUGOHOME}/hugo.tar.gz" hugo \
+&& mv hugo /usr/local/bin/hugo \
+&& chmod +x /usr/local/bin/hugo \
+&& rm -r "${HUGOHOME}"

Review comment:
   Why not install from the Debian repo with apt-get? https://gohugo.io/getting-started/installing/#debian-and-ubuntu
   
   If we keep it this way, it would be nice to pull the version number out into a variable so it's easy to upgrade.
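
   To illustrate the variable suggestion: the version string appears twice in the release URL, so deriving the URL from a single value would make an upgrade a one-line change. In the Dockerfile itself this would more naturally be an `ARG` or `ENV`; the Python sketch below only demonstrates the idea, and `HUGO_VERSION` and `hugo_release_url` are hypothetical names that do not appear in the PR.

```python
# Hypothetical single source of truth for the Hugo version used by the image.
HUGO_VERSION = "0.68.3"


def hugo_release_url(version: str = HUGO_VERSION) -> str:
    """Build the download URL for the extended Linux 64-bit Hugo tarball.

    The URL layout is copied from the curl command in the diff; only the
    version is parameterized.
    """
    return (
        "https://github.com/gohugoio/hugo/releases/download/"
        f"v{version}/hugo_extended_{version}_Linux-64bit.tar.gz"
    )


print(hugo_release_url())
```

   With the version factored out like this, both occurrences in the URL always stay in sync.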
   
   









[GitHub] [beam] pulasthi commented on pull request #10888: [BEAM-7304] Twister2 Beam runner

2020-05-08 Thread GitBox


pulasthi commented on pull request #10888:
URL: https://github.com/apache/beam/pull/10888#issuecomment-625977538


   @iemejia Hope you are doing well. I just wanted to follow up and ask whether you have had time to work on the pull request.







[GitHub] [beam] robertwb commented on pull request #11521: [BEAM-9577] Update Java Runners to handle dependency-based artifact staging.

2020-05-08 Thread GitBox


robertwb commented on pull request #11521:
URL: https://github.com/apache/beam/pull/11521#issuecomment-625979601


   Run XVR_Flink PostCommit







[GitHub] [beam] tysonjh commented on a change in pull request #11236: [BEAM-7505] Create SideInput Python Jenkins jobs

2020-05-08 Thread GitBox


tysonjh commented on a change in pull request #11236:
URL: https://github.com/apache/beam/pull/11236#discussion_r422315222



##
File path: .test-infra/jenkins/SideInputTestSuite.groovy
##
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import static LoadTestConfig.extendTemplate
+
+
+class SideInputTestSuite {
+static def configurations = { LoadTestConfig template -> [
+extendTemplate(template) {
+title 'SideInput 2MB 100 byte records: global window'

Review comment:
   Setting the number of main-input records to 1 is fine for some of the tests but not all of them. For tests that are expected to access the entire side input, a single main-input record is enough when the side input is a list or iterable, because we can assume that one record iterates through the whole side input.
   
   If the side input is a dict, however, simulating access to every side-input record requires the main input and the side input to have the same number of records.
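
   The difference can be sketched without Beam at all. The plain-Python model below is an assumption-laden illustration, not the load test itself: the two helper functions are hypothetical, and they only count how many side-input entries get touched under each access pattern.

```python
def touched_by_iterable_access(main_records, side_list):
    """Each main record scans the whole side input (list/iterable style)."""
    touched = set()
    for _record in main_records:
        for index, _value in enumerate(side_list):
            touched.add(index)
    return len(touched)


def touched_by_dict_access(main_records, side_dict):
    """Each main record looks up only its own key (dict style)."""
    touched = set()
    for key in main_records:
        if key in side_dict:
            touched.add(key)
    return len(touched)


side_list = list(range(1000))
side_dict = {i: i for i in range(1000)}

# One main record is enough to touch every element of an iterable side input.
print(touched_by_iterable_access([0], side_list))      # 1000
# One main record touches a single key of a dict side input ...
print(touched_by_dict_access([0], side_dict))          # 1
# ... so covering the whole dict needs as many main records as keys.
print(touched_by_dict_access(range(1000), side_dict))  # 1000
```

   Under this model, a single main-input record only exercises the full side input for the iterable-style tests, which is why the dict-style tests need matching record counts.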









[GitHub] [beam] ibzib opened a new pull request #11644: [BEAM-9835] [Portable Spark] Broadcast a PCollection at most once.

2020-05-08 Thread GitBox


ibzib opened a new pull request #11644:
URL: https://github.com/apache/beam/pull/11644



[GitHub] [beam] ibzib commented on a change in pull request #11270: [BEAM-9639][BEAM-9608] Improvements for FnApiRunner

2020-05-08 Thread GitBox


ibzib commented on a change in pull request #11270:
URL: https://github.com/apache/beam/pull/11270#discussion_r422338193



##
File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner_test.py
##
@@ -240,6 +240,30 @@ def test_multimap_side_input(self):
   lambda k, d: (k, sorted(d[k])), beam.pvalue.AsMultiMap(side)),
   equal_to([('a', [1, 3]), ('b', [2])]))
 
+  def test_multimap_multiside_input(self):

Review comment:
   Thanks for reporting, Boyuan. This was a flaw in the Spark runner. Fix: #11644









[GitHub] [beam] ibzib commented on pull request #11644: [BEAM-9835] [Portable Spark] Broadcast a PCollection at most once.

2020-05-08 Thread GitBox


ibzib commented on pull request #11644:
URL: https://github.com/apache/beam/pull/11644#issuecomment-625987422


   Run Python Spark ValidatesRunner







[GitHub] [beam] ibzib commented on pull request #11644: [BEAM-9835] [Portable Spark] Broadcast a PCollection at most once.

2020-05-08 Thread GitBox


ibzib commented on pull request #11644:
URL: https://github.com/apache/beam/pull/11644#issuecomment-625987535


   Run Java Spark PortableValidatesRunner Batch







[GitHub] [beam] udim commented on pull request #11396: [BEAM-9742] Add Configurable FluentBackoff to JdbcIO Write

2020-05-08 Thread GitBox


udim commented on pull request #11396:
URL: https://github.com/apache/beam/pull/11396#issuecomment-625987559


   Hi, Cham asked me to take a look, but I'm overloaded. I'll try to look next week.







[GitHub] [beam] amaliujia commented on a change in pull request #10946: [BEAM-9363] TUMBLE as TVF

2020-05-08 Thread GitBox


amaliujia commented on a change in pull request #10946:
URL: https://github.com/apache/beam/pull/10946#discussion_r422345657



##
File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamTableFunctionScanRel.java
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.impl.rel;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.com.google.common.base.Preconditions.checkArgument;
+
+import java.lang.reflect.Type;
+import java.util.List;
+import java.util.Set;
+import org.apache.beam.sdk.extensions.sql.impl.planner.BeamCostModel;
+import org.apache.beam.sdk.extensions.sql.impl.planner.NodeStats;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.windowing.FixedWindows;
+import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptCluster;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptPlanner;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelTraitSet;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.TableFunctionScan;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.metadata.RelColumnMapping;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.joda.time.Duration;
+
+public class BeamTableFunctionScanRel extends TableFunctionScan implements BeamRelNode {

Review comment:
   Done









[GitHub] [beam] amaliujia commented on a change in pull request #10946: [BEAM-9363] TUMBLE as TVF

2020-05-08 Thread GitBox


amaliujia commented on a change in pull request #10946:
URL: https://github.com/apache/beam/pull/10946#discussion_r422346049



##
File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamTableFunctionScanRel.java
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.impl.rel;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.com.google.common.base.Preconditions.checkArgument;
+
+import java.lang.reflect.Type;
+import java.util.List;
+import java.util.Set;
+import org.apache.beam.sdk.extensions.sql.impl.planner.BeamCostModel;
+import org.apache.beam.sdk.extensions.sql.impl.planner.NodeStats;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.windowing.FixedWindows;
+import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptCluster;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptPlanner;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelTraitSet;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.TableFunctionScan;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.metadata.RelColumnMapping;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.joda.time.Duration;
+
+public class BeamTableFunctionScanRel extends TableFunctionScan implements BeamRelNode {
+  public BeamTableFunctionScanRel(
+  RelOptCluster cluster,
+  RelTraitSet traitSet,
+  List<RelNode> inputs,
+  RexNode rexCall,
+  Type elementType,
+  RelDataType rowType,
+  Set<RelColumnMapping> columnMappings) {
+super(cluster, traitSet, inputs, rexCall, elementType, rowType, columnMappings);
+  }
+
+  @Override
+  public TableFunctionScan copy(
+  RelTraitSet traitSet,
+  List<RelNode> list,
+  RexNode rexNode,
+  Type type,
+  RelDataType relDataType,
+  Set<RelColumnMapping> set) {
+return new BeamTableFunctionScanRel(
+getCluster(), traitSet, list, rexNode, type, relDataType, columnMappings);
+  }
+
+  @Override
+  public PTransform<PCollectionList<Row>, PCollection<Row>> buildPTransform() {
+return new Transform();
+  }
+
+  private class Transform extends PTransform<PCollectionList<Row>, PCollection<Row>> {
+
+@Override
+public PCollection<Row> expand(PCollectionList<Row> input) {
+  checkArgument(
+  input.size() == 1,
+  "Wrong number of inputs for %s: %s",

Review comment:
   Nice idea! Done!








