[jira] [Updated] (BEAM-7069) Emit a warning when a pipeline option with a hyphen is encountered by BeamArgumentParser

2019-04-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7069:
--
Description: CC: [~altay] [~udim]]  (was: CC: [~altay] [~ehudm])

> Emit a warning when a pipeline option with a hyphen is encountered by 
> BeamArgumentParser
> 
>
> Key: BEAM-7069
> URL: https://issues.apache.org/jira/browse/BEAM-7069
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Minor
>
> CC: [~altay] [~udim]]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=227100=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227100
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 13/Apr/19 02:38
Start Date: 13/Apr/19 02:38
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7949: [BEAM-3713] Add pytest 
testing infrastructure
URL: https://github.com/apache/beam/pull/7949#issuecomment-482769612
 
 
   run seed job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227100)
Time Spent: 2.5h  (was: 2h 20m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Per 
> [https://nose.readthedocs.io/en/latest/|https://nose.readthedocs.io/en/latest/,]
>  , nose is in maintenance mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7067) Flink job server: can't disable cleanArtifactsPerJob

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7067?focusedWorklogId=227096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227096
 ]

ASF GitHub Bot logged work on BEAM-7067:


Author: ASF GitHub Bot
Created on: 13/Apr/19 02:01
Start Date: 13/Apr/19 02:01
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #8293: [BEAM-7067] make 
cleanArtifactsPerJob configurable for Flink job serv…
URL: https://github.com/apache/beam/pull/8293#issuecomment-482767137
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227096)
Time Spent: 0.5h  (was: 20m)

> Flink job server: can't disable cleanArtifactsPerJob
> 
>
> Key: BEAM-7067
> URL: https://issues.apache.org/jira/browse/BEAM-7067
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> After [https://github.com/apache/beam/pull/8210], cleanArtifactsPerJob is now 
> an explicit boolean defaulting to true, so 
> [flink_job_server.gradle|https://github.com/apache/beam/blob/531cdf57c4f308482b2137f35ca8c77c96c64ffe/runners/flink/job-server/flink_job_server.gradle#L98-L99]
>  needs to be adjusted so it can be configured.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7052) portableWordCountTask raises IOError

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7052?focusedWorklogId=227095=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227095
 ]

ASF GitHub Bot logged work on BEAM-7052:


Author: ASF GitHub Bot
Created on: 13/Apr/19 02:00
Start Date: 13/Apr/19 02:00
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #8282: [BEAM-7052] 
add optional environmentType property to python wordcount task
URL: https://github.com/apache/beam/pull/8282#discussion_r275100950
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -182,6 +182,9 @@ def portableWordCountTask(name, streaming) {
 options += ["--environment_cache_millis=1"]
   if (project.hasProperty("jobEndpoint"))
 options += ["--job_endpoint=${project.property('jobEndpoint')}"]
+  if (project.hasProperty("environmentType")) {
 
 Review comment:
   Lets add environmentConfig as well without it, environmentType will not 
always be useful.
   
   ```
 if (project.hasProperty("environmentConfig")) {
   options += 
["--environment_config=${project.property('environmentConfig')}"]
 }
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227095)
Time Spent: 1h 10m  (was: 1h)

> portableWordCountTask raises IOError
> 
>
> Key: BEAM-7052
> URL: https://issues.apache.org/jira/browse/BEAM-7052
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Got this error when trying to run (Python) portableWordCountTask on the Spark 
> portable runner:
> {{RuntimeError: IOError: [Errno 2] No such file or directory: 
> '/tmp/beam-temp-py-wordcount-direct-a9e7beba888211e8843d0251/44ba0386-4221-4abc-ae8d-5a2aa4610c4f.py-wordcount-direct'
>  [while running 'write/Write/WriteImpl/WriteBundles']}}
> This does not seem to be an issue with Flink (maybe because of 
> environment_cache_millis), but in any case the problem is fixed when I set 
> --environment_type=LOOPBACK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7039) Spark portable runner: run validatesRunner tests

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7039?focusedWorklogId=227093=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227093
 ]

ASF GitHub Bot logged work on BEAM-7039:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:57
Start Date: 13/Apr/19 01:57
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #8285: [BEAM-7039] 
set up validatesPortableRunner tests for Spark
URL: https://github.com/apache/beam/pull/8285#discussion_r275100838
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1559,6 +1561,7 @@ class BeamModulePlugin implements Plugin {
   project.tasks.create(name: name, type: Test) {
 group = "Verification"
 description = "Validates the PortableRunner with JobServer 
${config.jobServerDriver}"
+systemProperties config.systemProperties
 systemProperty "beamTestPipelineOptions", 
JsonOutput.toJson(beamTestPipelineOptions)
 
 Review comment:
   Shall we also merge this system property to systemProperties set above,
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227093)
Time Spent: 1h 50m  (was: 1h 40m)

> Spark portable runner: run validatesRunner tests
> 
>
> Key: BEAM-7039
> URL: https://issues.apache.org/jira/browse/BEAM-7039
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7067) Flink job server: can't disable cleanArtifactsPerJob

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7067?focusedWorklogId=227091=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227091
 ]

ASF GitHub Bot logged work on BEAM-7067:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:55
Start Date: 13/Apr/19 01:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #8293: [BEAM-7067] make 
cleanArtifactsPerJob configurable for Flink job serv…
URL: https://github.com/apache/beam/pull/8293#issuecomment-482766707
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227091)
Time Spent: 20m  (was: 10m)

> Flink job server: can't disable cleanArtifactsPerJob
> 
>
> Key: BEAM-7067
> URL: https://issues.apache.org/jira/browse/BEAM-7067
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After [https://github.com/apache/beam/pull/8210], cleanArtifactsPerJob is now 
> an explicit boolean defaulting to true, so 
> [flink_job_server.gradle|https://github.com/apache/beam/blob/531cdf57c4f308482b2137f35ca8c77c96c64ffe/runners/flink/job-server/flink_job_server.gradle#L98-L99]
>  needs to be adjusted so it can be configured.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6975) Merge portability status into capability matrix

2019-04-12 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816790#comment-16816790
 ] 

Melissa Pashniak commented on BEAM-6975:


Unassigning as I'm not actively working on this

> Merge portability status into capability matrix
> ---
>
> Key: BEAM-6975
> URL: https://issues.apache.org/jira/browse/BEAM-6975
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Ahmet Altay
>Priority: Major
>
> Should the portability status: 
> https://s.apache.org/apache-beam-portability-support-table
>  be merged into capability matrix 
> https://beam.apache.org/documentation/runners/capability-matrix/ ?
> (That is add portable runners to the list of runners as columns in the 
> capability matrix.)
> cc: [~kenn] [~lcwik]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6975) Merge portability status into capability matrix

2019-04-12 Thread Melissa Pashniak (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melissa Pashniak reassigned BEAM-6975:
--

Assignee: (was: Melissa Pashniak)

> Merge portability status into capability matrix
> ---
>
> Key: BEAM-6975
> URL: https://issues.apache.org/jira/browse/BEAM-6975
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Ahmet Altay
>Priority: Major
>
> Should the portability status: 
> https://s.apache.org/apache-beam-portability-support-table
>  be merged into capability matrix 
> https://beam.apache.org/documentation/runners/capability-matrix/ ?
> (That is add portable runners to the list of runners as columns in the 
> capability matrix.)
> cc: [~kenn] [~lcwik]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6853) Make Java and python portable options same

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6853?focusedWorklogId=227080=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227080
 ]

ASF GitHub Bot logged work on BEAM-6853:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:30
Start Date: 13/Apr/19 01:30
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #8286: [BEAM-6853] Make 
sdkWorkerParallelism option consistent
URL: https://github.com/apache/beam/pull/8286#issuecomment-482764982
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227080)
Time Spent: 1h 40m  (was: 1.5h)

> Make Java and python portable options same
> --
>
> Key: BEAM-6853
> URL: https://issues.apache.org/jira/browse/BEAM-6853
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Java 
> [PortableRunnerOptions|https://github.com/apache/beam/blob/f21cfaefd54afb798103dc90ab57290739e81e81/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L80]
>  and [Python Portable 
> options|https://github.com/apache/beam/blob/f21cfaefd54afb798103dc90ab57290739e81e81/sdks/python/apache_beam/options/pipeline_options.py#L719]
>  don't have the same values limiting the use of sdk-worker-parallelism and 
> environment-cache-millis in python sdk.
>  
> Add these options to the python sdk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6853) Make Java and python portable options same

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6853?focusedWorklogId=227077=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227077
 ]

ASF GitHub Bot logged work on BEAM-6853:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:20
Start Date: 13/Apr/19 01:20
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #8286: [BEAM-6853] Make 
sdkWorkerParallelism option consistent
URL: https://github.com/apache/beam/pull/8286#issuecomment-482764219
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227077)
Time Spent: 1.5h  (was: 1h 20m)

> Make Java and python portable options same
> --
>
> Key: BEAM-6853
> URL: https://issues.apache.org/jira/browse/BEAM-6853
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Java 
> [PortableRunnerOptions|https://github.com/apache/beam/blob/f21cfaefd54afb798103dc90ab57290739e81e81/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java#L80]
>  and [Python Portable 
> options|https://github.com/apache/beam/blob/f21cfaefd54afb798103dc90ab57290739e81e81/sdks/python/apache_beam/options/pipeline_options.py#L719]
>  don't have the same values limiting the use of sdk-worker-parallelism and 
> environment-cache-millis in python sdk.
>  
> Add these options to the python sdk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6668) use add experiment methods (Java and Python)

2019-04-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816779#comment-16816779
 ] 

Valentyn Tymofieiev commented on BEAM-6668:
---

Python portion is completed by https://github.com/apache/beam/pull/8225.

> use add experiment methods (Java and Python)
> 
>
> Key: BEAM-6668
> URL: https://issues.apache.org/jira/browse/BEAM-6668
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Priority: Minor
>  Labels: beginner, easyfix, newbie
>
> Python:
> Convert instances of experiments.append(...)
> to debug_options.add_experiment(...)
> Java:
> Use ExperimentalOptions.addExperiment(...)
> instead of getExperiments(), modify, setExperiments() pattern.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7069) Emit a warning when a pipeline option with a hyphen is encountered by BeamArgumentParser

2019-04-12 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-7069:
-

 Summary: Emit a warning when a pipeline option with a hyphen is 
encountered by BeamArgumentParser
 Key: BEAM-7069
 URL: https://issues.apache.org/jira/browse/BEAM-7069
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Valentyn Tymofieiev


CC: [~altay] [~ehudm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=227072=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227072
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:10
Start Date: 13/Apr/19 01:10
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227072)
Time Spent: 11h 40m  (was: 11.5h)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=227071=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227071
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:10
Start Date: 13/Apr/19 01:10
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275098905
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   Awesome, sounds like we are safe.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227071)
Time Spent: 11.5h  (was: 11h 20m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7035) Clear() method of OutputTimer is inconsistent

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7035?focusedWorklogId=227068=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227068
 ]

ASF GitHub Bot logged work on BEAM-7035:


Author: ASF GitHub Bot
Created on: 13/Apr/19 01:02
Start Date: 13/Apr/19 01:02
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #8300: [BEAM-7035] 
Compatible wire representation for timers in Python SDK
URL: https://github.com/apache/beam/pull/8300
 
 
   This change is just to fix the timer encoding. Changes to support 
cancellation of timers on the Flink runner side and test coverage to follow.
   
   Adding @lukecwik who will be familiar with the timestamp encoding 
workarounds :)
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=227062=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227062
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 13/Apr/19 00:39
Start Date: 13/Apr/19 00:39
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #8299: [DO NOT MERGE] 
[BEAM-6138] Add the Sampled Byte Count counters to the Java SDK
URL: https://github.com/apache/beam/pull/8299
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   

[jira] [Work logged] (BEAM-7068) Improve error messages when binding functions

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7068?focusedWorklogId=227032=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227032
 ]

ASF GitHub Bot logged work on BEAM-7068:


Author: ASF GitHub Bot
Created on: 13/Apr/19 00:15
Start Date: 13/Apr/19 00:15
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8298: [BEAM-7068] 
Improving error messages when binding functions in Go SDK
URL: https://github.com/apache/beam/pull/8298#issuecomment-482758365
 
 
   R: @lostluck @ibzib 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227032)
Time Spent: 20m  (was: 10m)

> Improve error messages when binding functions
> -
>
> Key: BEAM-7068
> URL: https://issues.apache.org/jira/browse/BEAM-7068
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This bug covers error messages around the file core/graph/bind.go which binds 
> the inputs and outputs of functions to make sure all the types match up 
> properly. When DoFns are created with the wrong number of inputs/outputs or 
> wrong input/output types, this is where errors will usually originate from.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7068) Improve error messages when binding functions

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7068?focusedWorklogId=227031=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227031
 ]

ASF GitHub Bot logged work on BEAM-7068:


Author: ASF GitHub Bot
Created on: 13/Apr/19 00:13
Start Date: 13/Apr/19 00:13
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8298: [BEAM-7068] 
Improving error messages when binding functions in Go SDK
URL: https://github.com/apache/beam/pull/8298
 
 
   I improve the error messages by adding useful context wherever it comes
   in handy, plus making the error messages more human-readable, for
   example by printing out a function name instead of address.
   
   To show an example, this is the current error message when a DoFn has the 
wrong type as input:
   
   `failed to bind {Fn:0xc000aa4aa0 Param:[{Kind:Value T:int} {Kind:Emit 
T:func(string)}] Ret:[]} to input [string]: string is not assignable to int`
   
   New error message:
   
   `creating new DoFn in scope root/CountWords: binding fn main.extractWordsFn: 
binding params [{Value int}] to input string: string is not assignable to int`
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 

[jira] [Work logged] (BEAM-6747) Adding ExternalTransform in JavaSDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6747?focusedWorklogId=227029=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227029
 ]

ASF GitHub Bot logged work on BEAM-6747:


Author: ASF GitHub Bot
Created on: 13/Apr/19 00:08
Start Date: 13/Apr/19 00:08
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #7954: [BEAM-6747] 
Adding ExternalTransform in JavaSDK
URL: https://github.com/apache/beam/pull/7954#discussion_r275094777
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java
 ##
 @@ -43,7 +43,12 @@
 import 
org.apache.beam.vendor.guava.v20_0.com.google.common.collect.ImmutableMap;
 import org.apache.beam.vendor.guava.v20_0.com.google.common.collect.Iterables;
 
-/** Cross-language external transform. */
+/**
+ * Cross-language external transform.
+ *
+ * {@link External} provides a cross-language transform via expansion 
services in non-Java SDKs.
+ * This is a low-level API and mainly for internal use.
+ */
 
 Review comment:
   Thanks. Added more comments on requirements.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227029)
Time Spent: 6h 10m  (was: 6h)

> Adding ExternalTransform in JavaSDK
> ---
>
> Key: BEAM-6747
> URL: https://issues.apache.org/jira/browse/BEAM-6747
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Adding Java counterpart of Python ExternalTransform for testing Python 
> transforms from pipelines in Java SDK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6747) Adding ExternalTransform in JavaSDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6747?focusedWorklogId=227028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227028
 ]

ASF GitHub Bot logged work on BEAM-6747:


Author: ASF GitHub Bot
Created on: 13/Apr/19 00:04
Start Date: 13/Apr/19 00:04
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #7954: [BEAM-6747] 
Adding ExternalTransform in JavaSDK
URL: https://github.com/apache/beam/pull/7954#discussion_r275094474
 
 

 ##
 File path: 
runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/ExternalTest.java
 ##
 @@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.charset.StandardCharsets;
+import java.util.Map;
+import org.apache.beam.runners.core.construction.expansion.ExpansionService;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.UsesCrossLanguageTransforms;
+import org.apache.beam.sdk.testing.ValidatesRunner;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.Filter;
+import org.apache.beam.sdk.transforms.MapElements;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TupleTagList;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.TypeDescriptors;
+import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ConnectivityState;
+import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ManagedChannel;
+import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ManagedChannelBuilder;
+import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Server;
+import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ServerBuilder;
+import 
org.apache.beam.vendor.guava.v20_0.com.google.common.collect.ImmutableMap;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Test External transforms. */
+@RunWith(JUnit4.class)
+public class ExternalTest implements Serializable {
+  @Rule public transient TestPipeline testPipeline = TestPipeline.create();
+
+  private static final String TEST_URN_SIMPLE = "simple";
+  private static final String TEST_URN_LE = "le";
+  private static final String TEST_URN_MULTI = "multi";
+
+  private static String pythonServerCommand;
+  private static Integer expansionPort;
+  private static String localExpansionAddr;
+  private static Server localExpansionServer;
+
+  @BeforeClass
+  public static void setUp() throws IOException {
+pythonServerCommand = System.getProperty("pythonTestExpansionCommand");
+expansionPort = Integer.valueOf(System.getProperty("expansionPort"));
+int localExpansionPort = expansionPort + 100;
+localExpansionAddr = String.format("localhost:%s", localExpansionPort);
+
+localExpansionServer =
+ServerBuilder.forPort(localExpansionPort).addService(new 
ExpansionService()).build();
+localExpansionServer.start();
+  }
+
+  @AfterClass
+  public static void tearDown() {
+localExpansionServer.shutdownNow();
+  }
+
+  @Test
+  @Category({ValidatesRunner.class, UsesCrossLanguageTransforms.class})
+  public void expandSingleTest() {
+PCollection col =
+testPipeline
+.apply(Create.of(1, 2, 3))
+.apply(External.of(TEST_URN_SIMPLE, new byte[] {}, 
localExpansionAddr));
+PAssert.that(col).containsInAnyOrder(2, 3, 4);
+testPipeline.run();
+  }
+
+  @Test
+  @Category({ValidatesRunner.class, UsesCrossLanguageTransforms.class})
+  public void expandMultipleTest() {
+PCollection pcol =
+testPipeline
+.apply(Create.of(1, 

[jira] [Updated] (BEAM-7068) Improve error messages when binding functions

2019-04-12 Thread Daniel Oliveira (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-7068:
--
Summary: Improve error messages when binding functions  (was: Improve error 
messages for errors binding functions)

> Improve error messages when binding functions
> -
>
> Key: BEAM-7068
> URL: https://issues.apache.org/jira/browse/BEAM-7068
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>
> This bug covers error messages around the file core/graph/bind.go which binds 
> the inputs and outputs of functions to make sure all the types match up 
> properly. When DoFns are created with the wrong number of inputs/outputs or 
> wrong input/output types, this is where errors will usually originate from.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7068) Improve error messages for errors binding functions

2019-04-12 Thread Daniel Oliveira (JIRA)
Daniel Oliveira created BEAM-7068:
-

 Summary: Improve error messages for errors binding functions
 Key: BEAM-7068
 URL: https://issues.apache.org/jira/browse/BEAM-7068
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


This bug covers error messages around the file core/graph/bind.go which binds 
the inputs and outputs of functions to make sure all the types match up 
properly. When DoFns are created with the wrong number of inputs/outputs or 
wrong input/output types, this is where errors will usually originate from.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=227020=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227020
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:36
Start Date: 12/Apr/19 23:36
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#issuecomment-482753276
 
 
   Not user friendly at all! An attempt to limit data loss. The real solution 
will be nano precision, a logical type at the Beam schema level. We will have 
to reimplement date time functions.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227020)
Time Spent: 2h 20m  (was: 2h 10m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=227014=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227014
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:15
Start Date: 12/Apr/19 23:15
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8262: [BEAM-4543] Python 
Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#issuecomment-482750289
 
 
   run python postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227014)
Time Spent: 20m  (was: 10m)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=227012=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227012
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:10
Start Date: 12/Apr/19 23:10
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8296: [BEAM-6669] Set Dataflow 
KMS key name
URL: https://github.com/apache/beam/pull/8296#issuecomment-482749302
 
 
   R: @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227012)
Time Spent: 50m  (was: 40m)

> revert service_default_cmek_config experiment flag 
> ---
>
> Key: BEAM-6669
> URL: https://issues.apache.org/jira/browse/BEAM-6669
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Do this when --dataflowKmsKey is supported on Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=227013=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227013
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:10
Start Date: 12/Apr/19 23:10
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8297: [BEAM-6669] Set Dataflow 
KMS key name (Python)
URL: https://github.com/apache/beam/pull/8297#issuecomment-482749309
 
 
   R: @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227013)
Time Spent: 1h  (was: 50m)

> revert service_default_cmek_config experiment flag 
> ---
>
> Key: BEAM-6669
> URL: https://issues.apache.org/jira/browse/BEAM-6669
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Do this when --dataflowKmsKey is supported on Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=227010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227010
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:06
Start Date: 12/Apr/19 23:06
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8297: [BEAM-6669] Set Dataflow 
KMS key name (Python)
URL: https://github.com/apache/beam/pull/8297#issuecomment-482748612
 
 
   run python postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227010)
Time Spent: 40m  (was: 0.5h)

> revert service_default_cmek_config experiment flag 
> ---
>
> Key: BEAM-6669
> URL: https://issues.apache.org/jira/browse/BEAM-6669
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Do this when --dataflowKmsKey is supported on Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=227008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227008
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:05
Start Date: 12/Apr/19 23:05
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8297: [BEAM-6669] Set 
Dataflow KMS key name (Python)
URL: https://github.com/apache/beam/pull/8297
 
 
   Use the official Environment.service_kms_key_name proto field to pass
   --dataflow_kms_key to the runner.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | 

[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=226997=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226997
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r274686707
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/CoGroup.java
 ##
 @@ -244,7 +244,7 @@ public static By 
fieldAccessDescriptor(FieldAccessDescriptor fieldAccessDescript
  *
  * This only affects the results of expandCrossProduct.
  */
-public By withOuterJoinParticipation() {
+public By withOptionalParticipation() {
   return toBuilder().setOuterJoinParticipation(true).build();
 
 Review comment:
   I think this should be changed to get/setOptionalParticipation()
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226997)
Time Spent: 25h 10m  (was: 25h)

> Create a library of useful transforms that use schemas
> --
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 25h 10m
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=226998=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226998
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r274585649
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/CoGroup.java
 ##
 @@ -159,7 +159,7 @@
  *
  * {@code
  * PCollection joined = PCollectionTuple.of("input1", input1, "input2", 
input2)
- *   .apply(CoGroup.join("input1", 
By.fieldNames("user").withOuterJoinParticipation())
+ *   .apply(CoGroup.join("input1", 
By.fieldNames("user").withOptionalParticipation())
 
 Review comment:
   Nit: update doc at L152 as well?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226998)
Time Spent: 25h 20m  (was: 25h 10m)

> Create a library of useful transforms that use schemas
> --
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 25h 20m
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=227000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227000
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r275006903
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
 ##
 @@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.transforms;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.apache.beam.sdk.schemas.transforms.JoinTestUtils.innerJoin;
+
+import java.util.List;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.transforms.Join.FieldsEqual;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.guava.v20_0.com.google.common.collect.Lists;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+/** Tests for {@link org.apache.beam.sdk.schemas.transforms.Join}. */
+public class JoinTest {
+  @Rule public final transient TestPipeline pipeline = TestPipeline.create();
+
+  private static final Schema CG_SCHEMA_1 =
+  Schema.builder()
+  .addStringField("user")
+  .addInt32Field("count")
+  .addStringField("country")
+  .build();
+  private static final Schema CG_SCHEMA_2 =
+  Schema.builder()
+  .addStringField("user2")
+  .addInt32Field("count2")
+  .addStringField("country2")
+  .build();
+
+  @Test
+  @Category(NeedsRunner.class)
+  public void testInnerJoinSameKeys() {
+List pc1Rows =
+Lists.newArrayList(
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 1, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 2, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 3, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 4, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 5, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 6, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 7, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 8, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user3", 8, "ar").build());
+List pc2Rows =
+Lists.newArrayList(
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 9, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 10, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 11, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 12, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 13, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 14, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 15, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 16, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user4", 8, "ar").build());
+
+PCollection pc1 = pipeline.apply("Create1", 
Create.of(pc1Rows)).setRowSchema(CG_SCHEMA_1);
+PCollection pc2 = pipeline.apply("Create2", 
Create.of(pc2Rows)).setRowSchema(CG_SCHEMA_1);
+
+Schema expectedSchema =
+Schema.builder()
+.addRowField(Join.LHS_TAG, CG_SCHEMA_1)
+.addRowField(Join.RHS_TAG, CG_SCHEMA_1)
+.build();
+
+PCollection joined = pc1.apply(Join.innerJoin(pc2).using("user", "country"));
+
+

[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=227001=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227001
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r275068137
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Join.java
 ##
 @@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.transforms;
+
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.Row;
+
+/**
+ * A transform that performs equijoins across two schema {@link PCollection}s.
+ *
+ * This transform allows joins between two input PCollections simply by 
specifying the fields to
+ * join on. The resulting {@code PCollection} will have two fields named 
"lhs" and "rhs"
+ * respectively, each with the schema of the corresponding input PCollection.
+ *
+ * For example, the following demonstrates joining two PCollections using a 
natural join on the
+ * "user" and "country" fields, where both the left-hand and the right-hand 
PCollections have fields
+ * with these names.
+ *
+ * 
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
+ * }
+ *
+ * If the right-hand PCollection contains fields with different names to 
join against, you can
+ * specify them as follows:
+ *
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2)
+ *   .on(FieldsEqual.left("user", "country").right("otherUser", 
"otherCountry")));
+ * }
+ *
+ * Full outer joins, left outer joins, and right outer joins are also 
supported.
+ */
+public class Join {
+  public static final String LHS_TAG = "lhs";
+  public static final String RHS_TAG = "rhs";
+
+  /** Predicate object to specify fields to compare when doing an equi-join. */
+  public static class FieldsEqual {
+public static Inner left(String... fieldNames) {
+  return new Inner(
+  FieldAccessDescriptor.withFieldNames(fieldNames), 
FieldAccessDescriptor.create());
+}
+
+public static Inner left(Integer... fieldIds) {
+  return new Inner(
+  FieldAccessDescriptor.withFieldIds(fieldIds), 
FieldAccessDescriptor.create());
+}
+
+public static Inner left(FieldAccessDescriptor fieldAccessDescriptor) {
+  return new Inner(fieldAccessDescriptor, FieldAccessDescriptor.create());
+}
+
+public Inner right(String... fieldNames) {
+  return new Inner(
+  FieldAccessDescriptor.create(), 
FieldAccessDescriptor.withFieldNames(fieldNames));
+}
+
+public Inner right(Integer... fieldIds) {
+  return new Inner(
+  FieldAccessDescriptor.create(), 
FieldAccessDescriptor.withFieldIds(fieldIds));
+}
+
+public Inner right(FieldAccessDescriptor fieldAccessDescriptor) {
+  return new Inner(FieldAccessDescriptor.create(), fieldAccessDescriptor);
+}
+
+/** Implementation class for FieldsEqual. */
+public static class Inner {
+  private FieldAccessDescriptor lhs;
+  private FieldAccessDescriptor rhs;
+
+  private Inner(FieldAccessDescriptor lhs, FieldAccessDescriptor rhs) {
+this.lhs = lhs;
+this.rhs = rhs;
+  }
+
+  public Inner left(String... fieldNames) {
+return new Inner(FieldAccessDescriptor.withFieldNames(fieldNames), 
rhs);
+  }
+
+  public Inner left(Integer... fieldIds) {
+return new Inner(FieldAccessDescriptor.withFieldIds(fieldIds), rhs);
+  }
+
+  public Inner left(FieldAccessDescriptor fieldAccessDescriptor) {
+   

[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=226999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226999
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r275007051
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
 ##
 @@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.transforms;
+
+import static junit.framework.TestCase.assertEquals;
+import static org.apache.beam.sdk.schemas.transforms.JoinTestUtils.innerJoin;
+
+import java.util.List;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.transforms.Join.FieldsEqual;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.guava.v20_0.com.google.common.collect.Lists;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+/** Tests for {@link org.apache.beam.sdk.schemas.transforms.Join}. */
+public class JoinTest {
+  @Rule public final transient TestPipeline pipeline = TestPipeline.create();
+
+  private static final Schema CG_SCHEMA_1 =
+  Schema.builder()
+  .addStringField("user")
+  .addInt32Field("count")
+  .addStringField("country")
+  .build();
+  private static final Schema CG_SCHEMA_2 =
+  Schema.builder()
+  .addStringField("user2")
+  .addInt32Field("count2")
+  .addStringField("country2")
+  .build();
+
+  @Test
+  @Category(NeedsRunner.class)
+  public void testInnerJoinSameKeys() {
+List pc1Rows =
+Lists.newArrayList(
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 1, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 2, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 3, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 4, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 5, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 6, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 7, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 8, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user3", 8, "ar").build());
+List pc2Rows =
+Lists.newArrayList(
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 9, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 10, "us").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 11, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user1", 12, "il").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 13, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 14, "fr").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 15, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user2", 16, "ar").build(),
+Row.withSchema(CG_SCHEMA_1).addValues("user4", 8, "ar").build());
+
+PCollection pc1 = pipeline.apply("Create1", 
Create.of(pc1Rows)).setRowSchema(CG_SCHEMA_1);
+PCollection pc2 = pipeline.apply("Create2", 
Create.of(pc2Rows)).setRowSchema(CG_SCHEMA_1);
+
+Schema expectedSchema =
+Schema.builder()
+.addRowField(Join.LHS_TAG, CG_SCHEMA_1)
+.addRowField(Join.RHS_TAG, CG_SCHEMA_1)
+.build();
+
+PCollection joined = pc1.apply(Join.innerJoin(pc2).using("user", "country"));
+
+

[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=227002=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227002
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r274714548
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Join.java
 ##
 @@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.transforms;
+
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.Row;
+
+/**
+ * A transform that performs equijoins across two schema {@link PCollection}s.
+ *
+ * This transform allows joins between two input PCollections simply by 
specifying the fields to
+ * join on. The resulting {@code PCollection} will have two fields named 
"lhs" and "rhs"
+ * respectively, each with the schema of the corresponding input PCollection.
+ *
+ * For example, the following demonstrates joining two PCollections using a 
natural join on the
+ * "user" and "country" fields, where both the left-hand and the right-hand 
PCollections have fields
+ * with these names.
+ *
+ * 
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
+ * }
+ *
+ * If the right-hand PCollection contains fields with different names to 
join against, you can
+ * specify them as follows:
+ *
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2)
+ *   .on(FieldsEqual.left("user", "country").right("otherUser", 
"otherCountry")));
+ * }
+ *
+ * Full outer joins, left outer joins, and right outer joins are also 
supported.
+ */
+public class Join {
 
 Review comment:
   Shouldn't we mark this class and CoGroup with `@Experimental(Kind.SCHEMAS)`? 
I think we may be able to make some improvement on the API design later.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 227002)

> Create a library of useful transforms that use schemas
> --
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 25.5h
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=227003=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227003
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:02
Start Date: 12/Apr/19 23:02
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #8273: [BEAM-4461] 
A transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#discussion_r275078312
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Join.java
 ##
 @@ -0,0 +1,237 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.transforms;
+
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.Row;
+
+/**
+ * A transform that performs equijoins across two schema {@link PCollection}s.
+ *
+ * This transform allows joins between two input PCollections simply by 
specifying the fields to
+ * join on. The resulting {@code PCollection} will have two fields named 
"lhs" and "rhs"
+ * respectively, each with the schema of the corresponding input PCollection.
+ *
+ * For example, the following demonstrates joining two PCollections using a 
natural join on the
+ * "user" and "country" fields, where both the left-hand and the right-hand 
PCollections have fields
+ * with these names.
+ *
+ * 
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
+ * }
+ *
+ * If the right-hand PCollection contains fields with different names to 
join against, you can
+ * specify them as follows:
+ *
+ * {@code PCollection joined = 
pCollection1.apply(Join.innerJoin(pCollection2)
+ *   .on(FieldsEqual.left("user", "country").right("otherUser", 
"otherCountry")));
+ * }
+ *
+ * Full outer joins, left outer joins, and right outer joins are also 
supported.
+ */
+public class Join {
+  public static final String LHS_TAG = "lhs";
+  public static final String RHS_TAG = "rhs";
+
+  /** Predicate object to specify fields to compare when doing an equi-join. */
+  public static class FieldsEqual {
+public static Inner left(String... fieldNames) {
+  return new Inner(
+  FieldAccessDescriptor.withFieldNames(fieldNames), 
FieldAccessDescriptor.create());
+}
+
+public static Inner left(Integer... fieldIds) {
+  return new Inner(
+  FieldAccessDescriptor.withFieldIds(fieldIds), 
FieldAccessDescriptor.create());
+}
+
+public static Inner left(FieldAccessDescriptor fieldAccessDescriptor) {
+  return new Inner(fieldAccessDescriptor, FieldAccessDescriptor.create());
+}
+
+public Inner right(String... fieldNames) {
+  return new Inner(
+  FieldAccessDescriptor.create(), 
FieldAccessDescriptor.withFieldNames(fieldNames));
+}
+
+public Inner right(Integer... fieldIds) {
+  return new Inner(
+  FieldAccessDescriptor.create(), 
FieldAccessDescriptor.withFieldIds(fieldIds));
+}
+
+public Inner right(FieldAccessDescriptor fieldAccessDescriptor) {
+  return new Inner(FieldAccessDescriptor.create(), fieldAccessDescriptor);
+}
+
+/** Implementation class for FieldsEqual. */
+public static class Inner {
+  private FieldAccessDescriptor lhs;
+  private FieldAccessDescriptor rhs;
+
+  private Inner(FieldAccessDescriptor lhs, FieldAccessDescriptor rhs) {
+this.lhs = lhs;
+this.rhs = rhs;
+  }
+
+  public Inner left(String... fieldNames) {
+return new Inner(FieldAccessDescriptor.withFieldNames(fieldNames), 
rhs);
+  }
+
+  public Inner left(Integer... fieldIds) {
+return new Inner(FieldAccessDescriptor.withFieldIds(fieldIds), rhs);
+  }
+
+  public Inner left(FieldAccessDescriptor fieldAccessDescriptor) {
+   

[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226995
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:00
Start Date: 12/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482747782
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226995)
Time Spent: 2h  (was: 1h 50m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226992=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226992
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:00
Start Date: 12/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482747618
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226992)
Time Spent: 1.5h  (was: 1h 20m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226993
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:00
Start Date: 12/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482747669
 
 
   Run Apex ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226993)
Time Spent: 1h 40m  (was: 1.5h)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226994
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 23:00
Start Date: 12/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482747741
 
 
   Run Dataflow ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226994)
Time Spent: 1h 50m  (was: 1h 40m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=226982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226982
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:46
Start Date: 12/Apr/19 22:46
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8296: [BEAM-6669] Set Dataflow 
KMS key name
URL: https://github.com/apache/beam/pull/8296#issuecomment-482745450
 
 
   run java postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226982)
Time Spent: 20m  (was: 10m)

> revert service_default_cmek_config experiment flag 
> ---
>
> Key: BEAM-6669
> URL: https://issues.apache.org/jira/browse/BEAM-6669
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Do this when --dataflowKmsKey is supported on Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6669?focusedWorklogId=226981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226981
 ]

ASF GitHub Bot logged work on BEAM-6669:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:45
Start Date: 12/Apr/19 22:45
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8296: [BEAM-6669] Set 
Dataflow KMS key name
URL: https://github.com/apache/beam/pull/8296
 
 
   Use the official Environment.service_kms_key_name proto field to pass
   --dataflowKmsKey to the runner.
   
   Stop setting `service_default_cmek_config` experimental flag. It doesn't
   work anyway. The `service_kms_key_name` experimental flag replaced it a
   while back.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=226979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226979
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:44
Start Date: 12/Apr/19 22:44
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r274980022
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricUrns.java
 ##
 @@ -31,6 +31,9 @@ public static MetricName parseUrn(String urn) {
 if (urn.startsWith(MonitoringInfoConstants.Urns.USER_COUNTER_PREFIX)) {
   urn = 
urn.substring(MonitoringInfoConstants.Urns.USER_COUNTER_PREFIX.length());
 }
+if (urn.startsWith(MonitoringInfoConstants.Urns.USER_DISTRIBUTION_PREFIX)) 
{
 
 Review comment:
   you can invert order and use else if
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226979)
Time Spent: 8h 40m  (was: 8.5h)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=226978=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226978
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:44
Start Date: 12/Apr/19 22:44
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r275084090
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MonitoringInfoConstants.java
 ##
 @@ -41,7 +41,7 @@
 public static final String TOTAL_MSECS = 
extractUrn(MonitoringInfoSpecs.Enum.TOTAL_MSECS);
 public static final String USER_COUNTER_PREFIX =
 extractUrn(MonitoringInfoSpecs.Enum.USER_COUNTER);
-public static final String USER_DISTRIBUTION_COUNTER_PREFIX =
+public static final String USER_DISTRIBUTION_PREFIX =
 
 Review comment:
   Better keep original name. Otherwise it is inconsistent with 
USER_COUNTER_PREFIX.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226978)
Time Spent: 8.5h  (was: 8h 20m)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=226980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226980
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:44
Start Date: 12/Apr/19 22:44
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r275084353
 
 

 ##
 File path: 
runners/core-java/src/test/java/org/apache/beam/runners/core/metrics/MetricsContainerImplTest.java
 ##
 @@ -178,6 +178,34 @@ public void 
testMonitoringInfosArePopulatedForUserCounters() {
 assertThat(actualMonitoringInfos, containsInAnyOrder(builder1.build(), 
builder2.build()));
   }
 
+  @Test
+  public void testMonitoringInfosArePopulatedForUserDistributions() {
+MetricsContainerImpl testObject = new MetricsContainerImpl("step1");
+DistributionCell c1 = testObject.getDistribution(MetricName.named("ns", 
"name1"));
+DistributionCell c2 = testObject.getDistribution(MetricName.named("ns", 
"name2"));
+c1.update(5L);
+c2.update(4L);
+
+SimpleMonitoringInfoBuilder builder1 = new SimpleMonitoringInfoBuilder();
+builder1.setUrnForUserDistribution("ns", "name1");
+builder1.setInt64DistributionValue(DistributionData.create(5, 1, 5, 5));
+builder1.setPTransformLabel("step1");
+builder1.build();
 
 Review comment:
   No need to call build() here and next initialization. You do it at L206.
   Also it is not required since you're ignoring return value.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226980)
Time Spent: 8h 50m  (was: 8h 40m)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=226974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226974
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 12/Apr/19 22:38
Start Date: 12/Apr/19 22:38
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #8062: [BEAM-4374] 
Emit SampledByteCount distribution tuple system metric from Python SDK 
(@Ardagan helped contribute)
URL: https://github.com/apache/beam/pull/8062#discussion_r275083279
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
 ##
 @@ -611,35 +691,99 @@ def process(self, element):
 
 result_metrics = res.monitoring_metrics()
 
-def assert_contains_metric(src, urn, pcollection, value):
-  for item in src:
-if item.urn == urn:
-  if item.labels['PCOLLECTION'] == pcollection:
-self.assertEqual(item.metric.counter_data.int64_value, value,
- str(("Metric has incorrect value", value, item)))
-return
-  self.fail(str(("Metric not found", urn, pcollection, src)))
-
 counters = result_metrics.monitoring_infos()
+# All element count and byte count metrics must have a PCOLLECTION_LABEL.
 self.assertFalse([x for x in counters if
-  x.urn == monitoring_infos.ELEMENT_COUNT_URN
+  x.urn in [monitoring_infos.ELEMENT_COUNT_URN,
+monitoring_infos.SAMPLED_BYTE_SIZE_URN]
   and
   monitoring_infos.PCOLLECTION_LABEL not in x.labels])
+try:
+  labels = {'PCOLLECTION' : 'Impulse'}
+  self.assert_has_counter(
+  counters, monitoring_infos.ELEMENT_COUNT_URN, labels, 1)
 
-assert_contains_metric(counters, monitoring_infos.ELEMENT_COUNT_URN,
-   'Impulse', 1)
-assert_contains_metric(counters, monitoring_infos.ELEMENT_COUNT_URN,
-   'ref_PCollection_PCollection_1', 2)
-
-# Skipping other pcollections due to non-deterministic naming for multiple
-# outputs.
-
-assert_contains_metric(counters, monitoring_infos.ELEMENT_COUNT_URN,
-   'ref_PCollection_PCollection_5', 8)
-assert_contains_metric(counters, monitoring_infos.ELEMENT_COUNT_URN,
-   'ref_PCollection_PCollection_6', 8)
-assert_contains_metric(counters, monitoring_infos.ELEMENT_COUNT_URN,
-   'ref_PCollection_PCollection_7', 2)
+  # Create/Read, "out" output.
+  labels = {'PCOLLECTION' : 'ref_PCollection_PCollection_1'}
 
 Review comment:
   use constant for monitoring_infos.PCOLLECTION_LABEL
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226974)
Time Spent: 14h 50m  (was: 14h 40m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226957=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226957
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:58
Start Date: 12/Apr/19 21:58
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #8289: [BEAM-7064] Reject 
BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#issuecomment-482736066
 
 
   LGTM
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226957)
Time Spent: 2h 10m  (was: 2h)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226953=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226953
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:45
Start Date: 12/Apr/19 21:45
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8225: [BEAM-6942]  Make 
modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#issuecomment-482732938
 
 
   Thanks for detailed review, PTAL, @aaltay .
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226953)
Time Spent: 11h 20m  (was: 11h 10m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226952=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226952
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:44
Start Date: 12/Apr/19 21:44
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275073307
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   This may help clarify argparse behavior:
   ```
   In [1]: import argparse
   
   In [2]: parser=argparse.ArgumentParser()
   
   In [3]: parser.add_argument('--flag-with-dash')
   Out[3]: _StoreAction(option_strings=['--flag-with-dash'], 
dest='flag_with_dash', nargs=None, const=None, default=None, type=None, 
choices=None, help=None, metavar=None)  
   
   In [5]: parser.parse_args(['--flag-with-dash=abc'])
   Out[5]: Namespace(flag_with_dash='abc')
   
   In [6]: parser.parse_args(['--flag_with_dash=I am using underscores instead 
of dashes in flag name!'])
   usage: ipython [-h] [--flag-with-dash FLAG_WITH_DASH]
   ipython: error: unrecognized arguments: --flag_with_dash=I am using 
underscores instead of dashes in flag name!
   An exception has occurred, use %tb to see the full traceback.
   
   In [7]: parser.add_argument('--flag_with_dash')
   Out[7]: _StoreAction(option_strings=['--flag_with_dash'], 
dest='flag_with_dash', nargs=None, const=None, default=None, type=None, 
choices=None, help=None, metavar=None)  
   
   In [8]: parser.parse_args(['--flag_with_dash=Now I can use both dashes and 
underscores'])
   Out[8]: Namespace(flag_with_dash='Now I can use both dashes and underscores')
   
   In [9]: parser.parse_args(['--flag-with-dash=Same dest is used for flags 
with dashes and underscores'])
   Out[9]: Namespace(flag_with_dash='Same dest is used for flags with dashes 
and underscores')
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226952)
Time Spent: 11h 10m  (was: 11h)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226951
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:44
Start Date: 12/Apr/19 21:44
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275073307
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   This may help clarify argparse behavior:
   ```
   In [1]: import argparse
   
   In [2]: parser=argparse.ArgumentParser()
   
   In [3]: parser.add_argument('--flag-with-dash')
   Out[3]: _StoreAction(option_strings=['--flag-with-dash'], 
dest='flag_with_dash', nargs=None, const=None, default=None, type=None, 
choices=None, help=None, metavar=None)  
   
   In [5]: parser.parse_args(['--flag-with-dash=abc'])
   Out[5]: Namespace(flag_with_dash='abc')
   
   In [6]: parser.parse_args(['--flag_with_dash=I am using underscores instead 
of dashes in flag name!'])
   usage: ipython [-h] [--flag-with-dash FLAG_WITH_DASH]
   ipython: error: unrecognized arguments: --flag_with_dash=I am using 
underscores instead of dashes in flag name!
   An exception has occurred, use %tb to see the full traceback.
   
   In [7]: parser.add_argument('--flag_with_dash')
   Out[7]: _StoreAction(option_strings=['--flag_with_dash'], 
dest='flag_with_dash', nargs=None, const=None, default=None, type=None, 
choices=None, help=None, metavar=None)  
   
   In [8]: parser.parse_args(['--flag_with_dash=Now I can use both dashes and 
underscores'])
   Out[8]: Namespace(flag_with_dash='Now I can use both dashes and underscores')
   
   In [9]: parser.parse_args(['--flag-with-dash=Same store is used for flags 
with dashes and underscores'])
   Out[9]: Namespace(flag_with_dash='Same store is used for flags with dashes 
and underscores')
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226951)
Time Spent: 11h  (was: 10h 50m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226947
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:32
Start Date: 12/Apr/19 21:32
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275070821
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   The two options with dashes were added in 
https://github.com/apache/beam/pull/8082
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226947)
Time Spent: 10h 50m  (was: 10h 40m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226946=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226946
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:31
Start Date: 12/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275070618
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   Yes, this would have been a backwards-incompatible change. Luckily, looks 
like sdk-worker-parallelism didn't make it into 2.12.0: 
https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/options/pipeline_options.py#L719.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226946)
Time Spent: 10h 40m  (was: 10.5h)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7066) Finish reimaging the remaining half of Beam Jenkins workers

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7066?focusedWorklogId=226930=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226930
 ]

ASF GitHub Bot logged work on BEAM-7066:


Author: ASF GitHub Bot
Created on: 12/Apr/19 21:06
Start Date: 12/Apr/19 21:06
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8292: 
[BEAM-7066] Disables Py37 tox suite to reduce test flakiness.
URL: https://github.com/apache/beam/pull/8292
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226930)
Time Spent: 0.5h  (was: 20m)

> Finish reimaging the remaining half of Beam Jenkins workers
> ---
>
> Key: BEAM-7066
> URL: https://issues.apache.org/jira/browse/BEAM-7066
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hi [~yifanzou] filing this to track the remaining work that you identified in 
> https://lists.apache.org/thread.html/4fe2ec304529216d082a860b912b090b958d39474c8040e79ade46d2@%3Cdev.beam.apache.org%3E,
>  since I'll have some changes waiting on the completion of this work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7051) Support LIMIT OFFSET

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7051?focusedWorklogId=226922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226922
 ]

ASF GitHub Bot logged work on BEAM-7051:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:48
Start Date: 12/Apr/19 20:48
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8290: [BEAM-7051] 
Support LIMIT OFFSET
URL: https://github.com/apache/beam/pull/8290#discussion_r275059366
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSortRelTest.java
 ##
 @@ -212,6 +217,20 @@ public void testOrderBy_nullsLast() throws Exception {
 pipeline.run().waitUntilFinish();
   }
 
+  @Test
+  public void testOrderBy_with_offset2() throws Exception {
+Schema schema = Schema.builder().addField("count_star", 
Schema.FieldType.INT64).build();
+
+String sql =
+"INSERT INTO COUNT_TABLE(count_star) "
++ "SELECT COUNT(*) FROM (SELECT * FROM "
++ "(SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) LIMIT 3 OFFSET 
1)";
+
+PCollection rows = compilePipeline(sql, pipeline);
+
PAssert.that(rows).containsInAnyOrder(TestUtils.RowsBuilder.of(schema).addRows(2L).getRows());
 
 Review comment:
   Yes. 2L here is COUNT(*)'s result, which is two rows. 
   
   I should have made the test case more clear by using STRING in SELECT.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226922)
Time Spent: 2.5h  (was: 2h 20m)

> Support LIMIT OFFSET
> 
>
> Key: BEAM-7051
> URL: https://issues.apache.org/jira/browse/BEAM-7051
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=226923=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226923
 ]

ASF GitHub Bot logged work on BEAM-7011:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:48
Start Date: 12/Apr/19 20:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8295: [BEAM-7011] Clean-up 
additional references to removed URN.
URL: https://github.com/apache/beam/pull/8295#issuecomment-482717748
 
 
   CC: @angoenka 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226923)
Time Spent: 3.5h  (was: 3h 20m)

> Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side 
> input type materialization
> ---
>
> Key: BEAM-7011
> URL: https://issues.apache.org/jira/browse/BEAM-7011
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Use the StandardSideInput defined within beam_runner_api.proto:
> https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=226921=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226921
 ]

ASF GitHub Bot logged work on BEAM-7011:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:48
Start Date: 12/Apr/19 20:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8295: [BEAM-7011] Clean-up 
additional references to removed URN.
URL: https://github.com/apache/beam/pull/8295#issuecomment-482717696
 
 
   R: @aljoscha @tweise 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226921)
Time Spent: 3h 20m  (was: 3h 10m)

> Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side 
> input type materialization
> ---
>
> Key: BEAM-7011
> URL: https://issues.apache.org/jira/browse/BEAM-7011
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Use the StandardSideInput defined within beam_runner_api.proto:
> https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=226920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226920
 ]

ASF GitHub Bot logged work on BEAM-7011:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:48
Start Date: 12/Apr/19 20:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8295: [BEAM-7011] 
Clean-up additional references to removed URN.
URL: https://github.com/apache/beam/pull/8295
 
 
   This cleans up a few places which were missed in #8232 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build 

[jira] [Commented] (BEAM-7058) Python SDK metric process_bundle_msecs reported as zero

2019-04-12 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816628#comment-16816628
 ] 

Thomas Weise commented on BEAM-7058:


The test that I pointed to in the email thread should be sufficient to 
reproduce by adding a sleep time into the user code:

[https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181]

It would be nice to assert that the metrics are indeed reported then.

In the Flink runner bundles in streaming mode are by default 1000 elements or 
1s.

 

> Python SDK metric process_bundle_msecs reported as zero
> ---
>
> Key: BEAM-7058
> URL: https://issues.apache.org/jira/browse/BEAM-7058
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Alex Amato
>Priority: Major
>  Labels: portability-flink
>
> With the portable Flink runner, the metric is reported as 0, while the count 
> metric works as expected.
> [https://lists.apache.org/thread.html/25eec8104bda6e4c71cc6c5e9864c335833c3ae2afe225d372479f30@%3Cdev.beam.apache.org%3E]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7051) Support LIMIT OFFSET

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7051?focusedWorklogId=226912=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226912
 ]

ASF GitHub Bot logged work on BEAM-7051:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:34
Start Date: 12/Apr/19 20:34
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #8290: [BEAM-7051] Support 
LIMIT OFFSET
URL: https://github.com/apache/beam/pull/8290#issuecomment-482713554
 
 
   Thanks for valuable suggestions on my mess code. Comments are addressed now. 
PTAL.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226912)
Time Spent: 2h 20m  (was: 2h 10m)

> Support LIMIT OFFSET
> 
>
> Key: BEAM-7051
> URL: https://issues.apache.org/jira/browse/BEAM-7051
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7058) Python SDK metric process_bundle_msecs reported as zero

2019-04-12 Thread Alex Amato (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816622#comment-16816622
 ] 

Alex Amato commented on BEAM-7058:
--

We have an end to end integration test successfully collecting these metrics in 
Dataflow python and several unit tests, These tests do force some sleep times 
though.

So the the python SDK is emitting the metrics in that case. One theory is that 
small bundles may not trigger the state sampler code properly. Or the 
particular test is too small, and executes too fast to exercise the sampling 
code at all (and it really isn't running much). We should test this on a test 
with a high element count.

If that's the case, the state sampler should be setup to trigger intervals 
periodically, and not reset the interval on new bundles.


It could be related to the behaviour of this specific test. We would need 
someone to debug this test, is there a repro?

 

 

> Python SDK metric process_bundle_msecs reported as zero
> ---
>
> Key: BEAM-7058
> URL: https://issues.apache.org/jira/browse/BEAM-7058
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Alex Amato
>Priority: Major
>  Labels: portability-flink
>
> With the portable Flink runner, the metric is reported as 0, while the count 
> metric works as expected.
> [https://lists.apache.org/thread.html/25eec8104bda6e4c71cc6c5e9864c335833c3ae2afe225d372479f30@%3Cdev.beam.apache.org%3E]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226911
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:29
Start Date: 12/Apr/19 20:29
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482712066
 
 
   Run Java Flink PortableValidatesRunner Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226911)
Time Spent: 1h 20m  (was: 1h 10m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226910
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:28
Start Date: 12/Apr/19 20:28
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482711924
 
 
   Run Go PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226910)
Time Spent: 1h 10m  (was: 1h)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226908
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:28
Start Date: 12/Apr/19 20:28
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482711837
 
 
   Run Java PortabilityApi PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226908)
Time Spent: 1h  (was: 50m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226907=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226907
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:27
Start Date: 12/Apr/19 20:27
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482711609
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226907)
Time Spent: 50m  (was: 40m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226906=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226906
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:26
Start Date: 12/Apr/19 20:26
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482711478
 
 
   Run Go PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226906)
Time Spent: 40m  (was: 0.5h)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226903
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:23
Start Date: 12/Apr/19 20:23
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #8225: [BEAM-6942]  Make 
modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#issuecomment-482710445
 
 
   Left one last question related to backward compatibility of 
'sdk-worker-parallelism' change. Otherwise this change looks good and we do not 
need to block on open questions.
   
   Thank you!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226903)
Time Spent: 10.5h  (was: 10h 20m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226902
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:22
Start Date: 12/Apr/19 20:22
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275051771
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -738,13 +801,13 @@ def _add_argparse_args(cls, parser):
   '""} }. All fields in the json are optional except '
   'command.'))
 parser.add_argument(
-'--sdk-worker-parallelism', default=None,
+'--sdk_worker_parallelism', default=None,
 
 Review comment:
   This is a bit confusing.
   
   Would a user's pipeline break if they have been using the 
'sdk-worker-parallelism' version in command line or by directly passing to 
pipeline options in kwargs?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226902)
Time Spent: 10h 20m  (was: 10h 10m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226900
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:21
Start Date: 12/Apr/19 20:21
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275051421
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -246,7 +276,40 @@ def display_data(self):
 return self.get_all_options(True)
 
   def view_as(self, cls):
+"""Returns a view of current object as provided PipelineOption subclass.
+
+Example Usage::
+
+  options = PipelineOptions(['--runner', 'Direct', '--streaming'])
+  standard_options = options.view_as(StandardOptions)
+  if standard_options.streaming:
+# ... start a streaming job ...
+
+Note that options objects may have multiple views, and modifications
+of values in any view-object will apply to current object and other
+view-objects.
+
+Args:
+  cls: PipelineOptions class or any of its subclasses.
+
+Returns:
+  An instance of cls that is intitialized using options contained in 
current
+  object.
+
+"""
 view = cls(self._flags)
+for option_name in view._visible_option_list():
+  # Initialize values of keys defined by a cls.
+  #
+  # Note that we do initialization only once per key to make sure that
+  # values in _all_options dict are not-recreated with each new view.
+  # This is important to make sure that values of multi-options keys are
+  # backed by the same list across multiple views, and that any overrides 
of
+  # pipeline options already stored in _all_options are preserved.
+  if option_name not in self._all_options:
+self._all_options[option_name] = getattr(view._visible_options,
+ option_name)
+# Note that views will still store _all_options of the source object.
 
 Review comment:
   Great, thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226900)
Time Spent: 10h 10m  (was: 10h)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226899
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:21
Start Date: 12/Apr/19 20:21
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275051339
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -150,28 +160,48 @@ def __init__(self, flags=None, **kwargs):
 arguments and then parse the command line specified by flags or by default
 the one obtained from sys.argv.
 
-The subclasses are not expected to require a redefinition of __init__.
+The subclasses of PipelineOptions should not redefine __init__.
 
 Review comment:
   New versions looks good, thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226899)
Time Spent: 10h  (was: 9h 50m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6919) [beam_Release_NightlySnapshot] Cannot publish artifacts to Apache Maven repo

2019-04-12 Thread yifan zou (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816616#comment-16816616
 ] 

yifan zou commented on BEAM-6919:
-

Our beam* nodes got updated Thu Jan 24 13:50:03 2019 +. This change 
specifically removed the nexus credentials from the filesystem because they are 
not supposed to be on external build nodes (nodes Infra does not manage.)  
Working with Infra to have a new role account on the JNLP nodes.

> [beam_Release_NightlySnapshot] Cannot publish artifacts to Apache Maven repo
> 
>
> Key: BEAM-6919
> URL: https://issues.apache.org/jira/browse/BEAM-6919
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Daniel Oliveira
>Assignee: yifan zou
>Priority: Blocker
>  Labels: currently-failing, triaged
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_Release_NightlySnapshot/384/]
>  * [Gradle Build Scan|https://scans.gradle.com/s/osuzhgqygga2o]
> Initial investigation:
> Nightly snapshots are failing in beam:publish due to being unable to publish 
> artifacts to Apache's Maven repo:
> {noformat}
> 12:30:57 * What went wrong:
> 12:30:57 Execution failed for task 
> ':beam-examples-java:publishMavenJavaPublicationToMavenRepository'.
> 12:30:57 > Failed to publish publication 'mavenJava' to repository 'maven'
> 12:30:57> Could not write to resource 
> 'https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-examples-java/2.12.0-SNAPSHOT/beam-examples-java-2.12.0-20190326.191838-10.jar'.
> 12:30:57   > Could not PUT 
> 'https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-examples-java/2.12.0-SNAPSHOT/beam-examples-java-2.12.0-20190326.191838-10.jar'.
>  Received status code 401 from server: Unauthorized
> {noformat}
> This happens for many modules, not just beam-examples-java as in the example 
> above.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=226896=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226896
 ]

ASF GitHub Bot logged work on BEAM-6942:


Author: ASF GitHub Bot
Created on: 12/Apr/19 20:17
Start Date: 12/Apr/19 20:17
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8225: [BEAM-6942]  
Make modifications to pipeline options to be visible to all views.
URL: https://github.com/apache/beam/pull/8225#discussion_r275050282
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -70,9 +70,9 @@ class TemplateUserOptions(PipelineOptions):
   @classmethod
 
   def _add_argparse_args(cls, parser):
-parser.add_value_provider_argument('--vp-arg1', default='start')
-parser.add_value_provider_argument('--vp-arg2')
-parser.add_argument('--non-vp-arg')
+parser.add_value_provider_argument('--vp_arg1', default='start')
 
 Review comment:
   A warning would be good in that case. (We can resolve this comment.)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226896)
Time Spent: 9h 50m  (was: 9h 40m)

> Pipeline options to experiment propagation is not working in Dataflow runner.
> -
>
> Key: BEAM-6942
> URL: https://issues.apache.org/jira/browse/BEAM-6942
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Relevant code: 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388]
> 3 experiments/options are affected. We need to fix it in 2.12.0
> cc: [~altay] [~apilloud]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5120) Support Complex type in BeamSQL

2019-04-12 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned BEAM-5120:
--

Assignee: (was: Rui Wang)

> Support Complex type in BeamSQL
> ---
>
> Key: BEAM-5120
> URL: https://issues.apache.org/jira/browse/BEAM-5120
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>  Labels: triaged
>
> We want have a smooth experience of complex type in BeamSQL.
>  
> For example, BeamSQL could support nested row in arbitrary levels 
> (Row>>) in both read and write from/to arbitrary sources. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5505) Disable Row flattening in Apache Calcite

2019-04-12 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned BEAM-5505:
--

Assignee: (was: Rui Wang)

> Disable Row flattening in Apache Calcite
> 
>
> Key: BEAM-5505
> URL: https://issues.apache.org/jira/browse/BEAM-5505
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>  Labels: triaged
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-7049) BeamUnionRel should work on mutiple input

2019-04-12 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned BEAM-7049:
--

Assignee: (was: Rui Wang)

> BeamUnionRel should work on mutiple input 
> --
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> BeamUnionRel assumes inputs are two and rejects more. So `a UNION b UNION c` 
> will have to be created as UNION(a, UNION(b, c)) and have two shuffles. If 
> BeamUnionRel can handle multiple shuffles, we will have only one shuffle



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3061) BigtableIO should support emitting a sentinel "done" value when a bundle completes

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3061?focusedWorklogId=226876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226876
 ]

ASF GitHub Bot logged work on BEAM-3061:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:36
Start Date: 12/Apr/19 19:36
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7805: [BEAM-3061] Done 
notifications for BigtableIO.Write
URL: https://github.com/apache/beam/pull/7805#issuecomment-482697247
 
 
   I think @chamikaramj is probably best for review, not me.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226876)
Time Spent: 1h 10m  (was: 1h)

> BigtableIO should support emitting a sentinel "done" value when a bundle 
> completes
> --
>
> Key: BEAM-3061
> URL: https://issues.apache.org/jira/browse/BEAM-3061
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Steve Niemitz
>Assignee: Steve Niemitz
>Priority: Major
>  Labels: triaged
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There was some discussion of this on the dev@ mailing list [1].  This 
> approach was taken based on discussion there.
> [1] 
> https://lists.apache.org/thread.html/949b33782f722a9000c9bf9e37042739c6fd0927589b99752b78d7bd@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6957) Spark Portable Runner: Support metrics

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6957?focusedWorklogId=226875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226875
 ]

ASF GitHub Bot logged work on BEAM-6957:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:32
Start Date: 12/Apr/19 19:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #8294: [BEAM-6957] 
Spark portable runner: support metrics
URL: https://github.com/apache/beam/pull/8294
 
 
   This was quite easy to achieve by plugging existing classes.
   Note that this continues to use Spark's deprecated Accumulator, not 
AccumulatorV2. However, when we do get around to upgrading the [underlying 
classes](https://github.com/apache/beam/blob/247e93ef7fbe87dbb1cd82a0ce6b2e6aa5254f7c/runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/MetricsAccumulator.java#L48),
 it shouldn't affect this code much.
   
   R: @angoenka @iemejia 
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/)
 | [![Build 

[jira] [Work logged] (BEAM-5806) Allow to change the PubsubClientFactory when using PubsubIO

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5806?focusedWorklogId=226873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226873
 ]

ASF GitHub Bot logged work on BEAM-5806:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:28
Start Date: 12/Apr/19 19:28
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #6769: WIP: [BEAM-5806] 
Update PubsubIO to be able to change the PubsubClientFactory
URL: https://github.com/apache/beam/pull/6769#issuecomment-482695037
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226873)
Time Spent: 2h 10m  (was: 2h)
Remaining Estimate: 21h 50m  (was: 22h)

> Allow to change the PubsubClientFactory when using PubsubIO
> ---
>
> Key: BEAM-5806
> URL: https://issues.apache.org/jira/browse/BEAM-5806
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Logan HAUSPIE
>Priority: Minor
>   Original Estimate: 24h
>  Time Spent: 2h 10m
>  Remaining Estimate: 21h 50m
>
> By using the PubsubIO to read from or write to Pub/Sub we are obliged to use 
> the PubsubJsonClient to interact with the Pub/Sub API.
> This PubsubJsonClient encode the message in base 64 and increase the size of 
> this one by 30% and there is no way to change the PubsubClient used by 
> PubsubIO.
>  
> What I suggest is to allow developper to change the PubsubClientFactory by 
> specifying it at the definition-time like the following:
> {{^PubsubIO.Read read = PubsubIO.readStrings()
>       .fromTopic(StaticValueProvider.of(topic))^}}
>   .withTimestampAttribute("myTimestamp")^}}
>   .withIdAttribute("myId")^}}
>   *.withClientFactory(PubsubGrpcClient.FACTORY)*;^}}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6943) DataInputOperation object has no attribute 'needs_finalization' in Jupyter

2019-04-12 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6943:
--
Affects Version/s: (was: 2.12.0)

> DataInputOperation object has no attribute 'needs_finalization' in Jupyter 
> ---
>
> Key: BEAM-6943
> URL: https://issues.apache.org/jira/browse/BEAM-6943
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Blocker
> Fix For: 2.12.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226868
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:22
Start Date: 12/Apr/19 19:22
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #8289: [BEAM-7064] Reject 
BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#issuecomment-482693285
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226868)
Time Spent: 2h  (was: 1h 50m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226867
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:21
Start Date: 12/Apr/19 19:21
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275031566
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   I was also wondering where long is generated. If BQ timestamp converts to 
long which contains micros, will that lose precision?
   
   I did some math:
   
   BigQuery doc says TIMESTAMP type range is `0001-01-01 00:00:00 to -12-31 
23:59:59.99 UTC`
   
   so it should ranges from `-6.127488×10¹⁶ to 2.4976512×10¹⁷` in micros, 
calculated by `([0001-01-01 | -12-31] - 1970-01-01) * 12 * 30 * 24 *3600 * 
1000 * 1000`, which falls into INT64's range. 
   
   So no precision loss.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226867)
Time Spent: 1h 50m  (was: 1h 40m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226863
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:13
Start Date: 12/Apr/19 19:13
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275031566
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   I was also wondering where long is generated. If BQ timestamp converts to 
long which contains micros, will that lose precision?
   
   I did some math:
   
   BigQuery doc says TIMESTAMP type range is `0001-01-01 00:00:00 to -12-31 
23:59:59.99 UTC`
   
   so it should ranges from `-6.127488×10¹⁶ to 2.4976512×10¹⁷` in micros, 
calculated by `([0001-01-01 | -12-31] - 1970) * 12 * 30 * 24 *3600 * 1000 * 
1000`, which falls into INT64's range. 
   
   So no precision loss.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226863)
Time Spent: 1h 40m  (was: 1.5h)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226862
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:13
Start Date: 12/Apr/19 19:13
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275031566
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   I was also wondering where long is generated. If BQ timestamp converts to 
long which contains micros, will that lose precision?
   
   I did some math:
   
   BigQuery doc says TIMESTAMP type range is `0001-01-01 00:00:00 to -12-31 
23:59:59.99 UTC`
   
   so it should ranges from `-6.127488×10¹⁶ to 2.4976512×10¹⁷` in micros, 
calculated by `((0001-01-01 / -12-31) - 1970) * 12 * 30 * 24 *3600 * 1000 * 
1000`, which falls into INT64's range. 
   
   So no precision loss.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226862)
Time Spent: 1.5h  (was: 1h 20m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226861
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:12
Start Date: 12/Apr/19 19:12
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275031566
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   I was also wondering where long is generated. If BQ timestamp converts to 
long which contains micros, will that lose precision?
   
   I did some math:
   
   BigQuery doc says TIMESTAMP type range is `0001-01-01 00:00:00 to -12-31 
23:59:59.99 UTC`
   
   so it should ranges from `-6.127488×10¹⁶ to 2.4976512×10¹⁷` in micros, 
calculated by `((0001-01-01 / -12-31) -/+ 1970) * 12 * 30 * 24 *3600 * 1000 
* 1000`, which falls into INT64's range. 
   
   So no precision loss.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226861)
Time Spent: 1h 20m  (was: 1h 10m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7039) Spark portable runner: run validatesRunner tests

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7039?focusedWorklogId=226858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226858
 ]

ASF GitHub Bot logged work on BEAM-7039:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:06
Start Date: 12/Apr/19 19:06
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8285: [BEAM-7039] set up 
validatesPortableRunner tests for Spark
URL: https://github.com/apache/beam/pull/8285#issuecomment-482689060
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226858)
Time Spent: 1h 40m  (was: 1.5h)

> Spark portable runner: run validatesRunner tests
> 
>
> Key: BEAM-7039
> URL: https://issues.apache.org/jira/browse/BEAM-7039
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226857=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226857
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:04
Start Date: 12/Apr/19 19:04
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275029235
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   Yes. I was asking because input is an object.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226857)
Time Spent: 1h 10m  (was: 1h)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226855=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226855
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 19:00
Start Date: 12/Apr/19 19:00
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #8289: 
[BEAM-7064] Reject BigQuery data with sub-millisecond precision, instead of 
losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275027986
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   It is because this `AvroUtils` is part of the BigQuery connector, not 
general. And we know BigQuery is always micros. Is that your question?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226855)
Time Spent: 1h  (was: 50m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6493) examples in Kotlin

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=226851=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226851
 ]

ASF GitHub Bot logged work on BEAM-6493:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:50
Start Date: 12/Apr/19 18:50
Worklog Time Spent: 10m 
  Work Description: harshithdwivedi commented on issue #8291: [BEAM-6493] 
Convert the WordCount samples to Kotlin
URL: https://github.com/apache/beam/pull/8291#issuecomment-482684189
 
 
   Certainly, that would be great!
   I'll ping you on slack to discuss more on this, LMK if that sounds good?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226851)
Time Spent: 10h 10m  (was: 10h)
Remaining Estimate: 494h 50m  (was: 495h)

> examples in Kotlin
> --
>
> Key: BEAM-6493
> URL: https://issues.apache.org/jira/browse/BEAM-6493
> Project: Beam
>  Issue Type: Task
>  Components: examples-java
>Affects Versions: Not applicable
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Minor
>  Labels: documentation, triaged
> Fix For: Not applicable
>
>   Original Estimate: 504h
>  Time Spent: 10h 10m
>  Remaining Estimate: 494h 50m
>
> I have been using Apache Beam for few of my projects in production since the 
> past 6 months and apart from Java, [Kotlin|https://kotlinlang.org/] also 
> seems to work as well with no issues whatsoever.
> But currently, the Github Repository of Apache Beam contains examples only in 
> Java which might be an issue for other developers who want to use Apache Beam 
> SDK with kotlin as there are no sample resources available.
> That said, I would love to go ahead and add kotlin examples alongside the 
> current java examples in the [Beam 
> repository|https://github.com/apache/beam/tree/master/examples/java].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6493) examples in Kotlin

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=226848=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226848
 ]

ASF GitHub Bot logged work on BEAM-6493:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:46
Start Date: 12/Apr/19 18:46
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8291: [BEAM-6493] 
Convert the WordCount samples to Kotlin
URL: https://github.com/apache/beam/pull/8291
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226848)
Time Spent: 9h 50m  (was: 9h 40m)
Remaining Estimate: 495h 10m  (was: 495h 20m)

> examples in Kotlin
> --
>
> Key: BEAM-6493
> URL: https://issues.apache.org/jira/browse/BEAM-6493
> Project: Beam
>  Issue Type: Task
>  Components: examples-java
>Affects Versions: Not applicable
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Minor
>  Labels: documentation, triaged
> Fix For: Not applicable
>
>   Original Estimate: 504h
>  Time Spent: 9h 50m
>  Remaining Estimate: 495h 10m
>
> I have been using Apache Beam for few of my projects in production since the 
> past 6 months and apart from Java, [Kotlin|https://kotlinlang.org/] also 
> seems to work as well with no issues whatsoever.
> But currently, the Github Repository of Apache Beam contains examples only in 
> Java which might be an issue for other developers who want to use Apache Beam 
> SDK with kotlin as there are no sample resources available.
> That said, I would love to go ahead and add kotlin examples alongside the 
> current java examples in the [Beam 
> repository|https://github.com/apache/beam/tree/master/examples/java].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6493) examples in Kotlin

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=226849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226849
 ]

ASF GitHub Bot logged work on BEAM-6493:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:46
Start Date: 12/Apr/19 18:46
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8291: [BEAM-6493] Convert 
the WordCount samples to Kotlin
URL: https://github.com/apache/beam/pull/8291#issuecomment-482682889
 
 
   Done. Thanks so much for your contribution and your patience. if you feel up 
to it, we can add a little blog post to let users check it out : )
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226849)
Time Spent: 10h  (was: 9h 50m)
Remaining Estimate: 495h  (was: 495h 10m)

> examples in Kotlin
> --
>
> Key: BEAM-6493
> URL: https://issues.apache.org/jira/browse/BEAM-6493
> Project: Beam
>  Issue Type: Task
>  Components: examples-java
>Affects Versions: Not applicable
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Minor
>  Labels: documentation, triaged
> Fix For: Not applicable
>
>   Original Estimate: 504h
>  Time Spent: 10h
>  Remaining Estimate: 495h
>
> I have been using Apache Beam for few of my projects in production since the 
> past 6 months and apart from Java, [Kotlin|https://kotlinlang.org/] also 
> seems to work as well with no issues whatsoever.
> But currently, the Github Repository of Apache Beam contains examples only in 
> Java which might be an issue for other developers who want to use Apache Beam 
> SDK with kotlin as there are no sample resources available.
> That said, I would love to go ahead and add kotlin examples alongside the 
> current java examples in the [Beam 
> repository|https://github.com/apache/beam/tree/master/examples/java].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=226842=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226842
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:38
Start Date: 12/Apr/19 18:38
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-482680290
 
 
   Run JavaPortabilityApi PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226842)
Time Spent: 0.5h  (was: 20m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-7058) Python SDK metric process_bundle_msecs reported as zero

2019-04-12 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada reassigned BEAM-7058:
---

Assignee: Pablo Estrada

> Python SDK metric process_bundle_msecs reported as zero
> ---
>
> Key: BEAM-7058
> URL: https://issues.apache.org/jira/browse/BEAM-7058
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: portability-flink
>
> With the portable Flink runner, the metric is reported as 0, while the count 
> metric works as expected.
> [https://lists.apache.org/thread.html/25eec8104bda6e4c71cc6c5e9864c335833c3ae2afe225d372479f30@%3Cdev.beam.apache.org%3E]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-7058) Python SDK metric process_bundle_msecs reported as zero

2019-04-12 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada reassigned BEAM-7058:
---

Assignee: Alex Amato  (was: Pablo Estrada)

> Python SDK metric process_bundle_msecs reported as zero
> ---
>
> Key: BEAM-7058
> URL: https://issues.apache.org/jira/browse/BEAM-7058
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Alex Amato
>Priority: Major
>  Labels: portability-flink
>
> With the portable Flink runner, the metric is reported as 0, while the count 
> metric works as expected.
> [https://lists.apache.org/thread.html/25eec8104bda6e4c71cc6c5e9864c335833c3ae2afe225d372479f30@%3Cdev.beam.apache.org%3E]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7039) Spark portable runner: run validatesRunner tests

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7039?focusedWorklogId=226840=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226840
 ]

ASF GitHub Bot logged work on BEAM-7039:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:35
Start Date: 12/Apr/19 18:35
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8285: [BEAM-7039] set up 
validatesPortableRunner tests for Spark
URL: https://github.com/apache/beam/pull/8285#issuecomment-482679460
 
 
   Run Website PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226840)
Time Spent: 1.5h  (was: 1h 20m)

> Spark portable runner: run validatesRunner tests
> 
>
> Key: BEAM-7039
> URL: https://issues.apache.org/jira/browse/BEAM-7039
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5519) Spark Streaming Duplicated Encoding/Decoding Effort

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5519?focusedWorklogId=226839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226839
 ]

ASF GitHub Bot logged work on BEAM-5519:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:33
Start Date: 12/Apr/19 18:33
Worklog Time Spent: 10m 
  Work Description: kyle-winkelman commented on issue #6511: [BEAM-5519] 
Remove call to groupByKey in Spark Streaming.
URL: https://github.com/apache/beam/pull/6511#issuecomment-482667518
 
 
   I have spent a little bit of time trying to understand the Nexmark 
performance tests. I believe I have narrowed down the issue a little bit. The 
Nexmark BoundedEventSource splits based on the numEventsGenerator (which is 
100). Before this PR, the first time there is a GroupByKey in these queries the 
RDD would be partitioned to the default parallelism (not 100). With this PR, 
the first time there is a GroupByKey in these queries the RDD would stay 
partitioned at 100. I believe this is what is affecting the performance. I 
would say we could accept this degradation in performance because in real use 
when we ask a Source to split itself we expect it will do so in the way we 
asked. Unless there are a large number of sources that tend not to split well.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226839)
Time Spent: 4h 40m  (was: 4.5h)

> Spark Streaming Duplicated Encoding/Decoding Effort
> ---
>
> Key: BEAM-5519
> URL: https://issues.apache.org/jira/browse/BEAM-5519
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
>Priority: Major
>  Labels: spark, spark-streaming, triaged
> Fix For: 2.13.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> When using the SparkRunner in streaming mode. There is a call to groupByKey 
> followed by a call to updateStateByKey. BEAM-1815 fixed an issue where this 
> used to cause 2 shuffles but it still causes 2 encode/decode cycles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226838=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226838
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:31
Start Date: 12/Apr/19 18:31
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #8289: [BEAM-7064] Reject 
BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#issuecomment-482677983
 
 
   Run SQL PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226838)
Time Spent: 50m  (was: 40m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226837=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226837
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:29
Start Date: 12/Apr/19 18:29
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275017781
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   How do we know that that the value returned is not millis but micros?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226837)
Time Spent: 40m  (was: 0.5h)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7064) Conversion of timestamp from BigQuery row to Beam row loses precision

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7064?focusedWorklogId=226836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226836
 ]

ASF GitHub Bot logged work on BEAM-7064:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:28
Start Date: 12/Apr/19 18:28
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #8289: [BEAM-7064] 
Reject BigQuery data with sub-millisecond precision, instead of losing data
URL: https://github.com/apache/beam/pull/8289#discussion_r275017781
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/AvroUtils.java
 ##
 @@ -73,8 +69,19 @@ public static Object convertAvroFormat(Field beamField, 
Object value) {
   default:
 throw new RuntimeException("Does not support converting unknown type 
value");
 }
+  }
 
-return ret;
+  private static ReadableInstant safeToMillis(Object value) {
+long subMilliPrecision = ((long) value) % 1000;
 
 Review comment:
   How do we know that that the value returned is not millis but micro?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226836)
Time Spent: 0.5h  (was: 20m)

> Conversion of timestamp from BigQuery row to Beam row loses precision
> -
>
> Key: BEAM-7064
> URL: https://issues.apache.org/jira/browse/BEAM-7064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the utilities to convert from BigQuery row to Beam row simply 
> truncate timestamps at millisecond precision. This is unacceptable. Instead, 
> an error should be raised indicating that it is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6865) Refactor common portable runner infrastructure for better reuse

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6865?focusedWorklogId=226834=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226834
 ]

ASF GitHub Bot logged work on BEAM-6865:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:23
Start Date: 12/Apr/19 18:23
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #8176: [BEAM-6865] 
Move MetricsApi updates from flink.metrics to core.metrics
URL: https://github.com/apache/beam/pull/8176
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226834)
Time Spent: 3h 10m  (was: 3h)

> Refactor common portable runner infrastructure for better reuse
> ---
>
> Key: BEAM-6865
> URL: https://issues.apache.org/jira/browse/BEAM-6865
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The Flink runner is currently Beam's most mature portable OSS runner. Much of 
> the Flink portable runner's implementation details are not unique to Flink, 
> and yet are confined to the Flink runner code. In order to ease development 
> on other portable runners such as the Spark runner, this reusable code should 
> be moved into some common location.
> I've set this up to track my progress on these ongoing improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7039) Spark portable runner: run validatesRunner tests

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7039?focusedWorklogId=226831=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226831
 ]

ASF GitHub Bot logged work on BEAM-7039:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:04
Start Date: 12/Apr/19 18:04
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #8285: [BEAM-7039] set 
up validatesPortableRunner tests for Spark
URL: https://github.com/apache/beam/pull/8285#discussion_r275010043
 
 

 ##
 File path: runners/spark/job-server/build.gradle
 ##
 @@ -0,0 +1,122 @@
+import org.apache.beam.gradle.BeamModulePlugin
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Spark Runner JobServer build file
+ */
+
+apply plugin: 'org.apache.beam.module'
+apply plugin: 'application'
+// we need to set mainClassName before applying shadow plugin
+mainClassName = "org.apache.beam.runners.spark.SparkJobServerDriver"
+
+applyJavaNature(
+  validateShadowJar: false,
+  exportJavadoc: false,
+  shadowClosure: {
+append "reference.conf"
+  },
+)
+
+def sparkRunnerProject = ":${project.name.replace("-job-server", "")}"
+
+description = project(sparkRunnerProject).description + " :: Job Server"
+
+configurations {
+  validatesPortableRunner
+}
+
+configurations.all {
+  exclude group: "org.slf4j", module: "slf4j-jdk14"
+}
+
+dependencies {
+  compile project(path: sparkRunnerProject, configuration: "shadow")
+  compile project(path: sparkRunnerProject, configuration: "provided")
+  validatesPortableRunner project(path: sparkRunnerProject, configuration: 
"shadowTest")
+  validatesPortableRunner project(path: sparkRunnerProject, configuration: 
"provided")
+  validatesPortableRunner project(path: ":beam-sdks-java-core", configuration: 
"shadowTest")
+  validatesPortableRunner project(path: ":beam-runners-core-java", 
configuration: "shadowTest")
+  validatesPortableRunner project(path: ":beam-runners-reference-java", 
configuration: "shadowTest")
+  compile project(path: 
":beam-sdks-java-extensions-google-cloud-platform-core", configuration: 
"shadow")
+//  TODO: Enable AWS and HDFS file system.
+}
+
+// NOTE: runShadow must be used in order to run the job server. The standard 
run
+// task will not work because the Spark runner classes only exist in the shadow
+// jar.
+runShadow {
+  args = []
+  if (project.hasProperty('jobHost'))
+args += ["--job-host=${project.property('jobHost')}"]
+  if (project.hasProperty('artifactsDir'))
+args += ["--artifacts-dir=${project.property('artifactsDir')}"]
+  if (project.hasProperty('cleanArtifactsPerJob'))
+args += ["--clean-artifacts-per-job"]
 
 Review comment:
   I made a PR to fix this for Flink as well. #8293
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226831)
Time Spent: 1h 20m  (was: 1h 10m)

> Spark portable runner: run validatesRunner tests
> 
>
> Key: BEAM-7039
> URL: https://issues.apache.org/jira/browse/BEAM-7039
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7067) Flink job server: can't disable cleanArtifactsPerJob

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7067?focusedWorklogId=226830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226830
 ]

ASF GitHub Bot logged work on BEAM-7067:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:03
Start Date: 12/Apr/19 18:03
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #8293: [BEAM-7067] make 
cleanArtifactsPerJob configurable for Flink job serv…
URL: https://github.com/apache/beam/pull/8293
 
 
   …er task
   
   R: @mxm
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/)
 | [![Build 

[jira] [Work logged] (BEAM-5519) Spark Streaming Duplicated Encoding/Decoding Effort

2019-04-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5519?focusedWorklogId=226829=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226829
 ]

ASF GitHub Bot logged work on BEAM-5519:


Author: ASF GitHub Bot
Created on: 12/Apr/19 18:00
Start Date: 12/Apr/19 18:00
Worklog Time Spent: 10m 
  Work Description: kyle-winkelman commented on issue #6511: [BEAM-5519] 
Remove call to groupByKey in Spark Streaming.
URL: https://github.com/apache/beam/pull/6511#issuecomment-482667518
 
 
   I have spent a little bit of time trying to understand the Nexmark 
performance tests. I believe I have narrowed down the issue a little bit. The 
Nexmark BoundedEventSource splits based on the numEventsGenerator (which is 
100). Before this PR, the first time there is a GroupByKey in these queries the 
RDD would be partitioned to the default parallelism (not 100). With this PR, 
the first time there is a GroupByKey in these queries the RDD would stay 
partitioned at 100. I believe this is what is affecting the performance. I 
would say we could accept this degradation in performance because in real use 
when we ask a Source to split itself in the way we asked. Unless there are a 
large number of sources that tend not to split well.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 226829)
Time Spent: 4.5h  (was: 4h 20m)

> Spark Streaming Duplicated Encoding/Decoding Effort
> ---
>
> Key: BEAM-5519
> URL: https://issues.apache.org/jira/browse/BEAM-5519
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
>Priority: Major
>  Labels: spark, spark-streaming, triaged
> Fix For: 2.13.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When using the SparkRunner in streaming mode. There is a call to groupByKey 
> followed by a call to updateStateByKey. BEAM-1815 fixed an issue where this 
> used to cause 2 shuffles but it still causes 2 encode/decode cycles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >