[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186133=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186133
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 03:52
Start Date: 17/Jan/19 03:52
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455033175
 
 
   run java precommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 186133)
Time Spent: 50m  (was: 40m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView<String> outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn<String, FileIO.Write.FileNaming> manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName) + shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does occur in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be the DirectRunner using TransformOverrides 
> to rewrite FileIO sinks to use runner-determined sharding
> (see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]),
> but I do not know why this started occurring in 2.9.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186134=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186134
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 03:52
Start Date: 17/Jan/19 03:52
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455033217
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186134)
Time Spent: 1h  (was: 50m)



[jira] [Commented] (BEAM-6451) Portability Pipeline eventually hangs on bundle registration

2019-01-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744628#comment-16744628
 ] 

Kenneth Knowles commented on BEAM-6451:
---

Or, conceivably it should be tacked on right when the network request is 
declared, here: 
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/RegisterAndProcessBundleOperation.java#L269

> Portability Pipeline eventually hangs on bundle registration
> 
>
> Key: BEAM-6451
> URL: https://issues.apache.org/jira/browse/BEAM-6451
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow, sdk-py-harness
>Reporter: Scott Wegner
>Priority: Minor
>  Labels: portability
>
> We've seen jobs using portability start off in a healthy state, but then 
> eventually get stuck and hang on bundle registration. We see error logs from 
> the worker harness:
> {code}
> Processing stuck in step s01 for at least 06h30m00s without outputting or completing in state finish
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
>   at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>   at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
>   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>   at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57)
>   at org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:277)
>   at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
>   at org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:119)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1226)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:141)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:965)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> Looking at [the 
> code|https://github.com/apache/beam/blob/release-2.8.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/RegisterAndProcessBundleOperation.java#L277],
>  it looks like there are no timeouts on the Bundle Registration calls over 
> the FnApi, which contributes to this hanging forever rather than giving a 
> better failure.
> This bug report came from a customer running a python streaming pipeline 
> using the new portability framework on Dataflow. Hopefully we can repro on 
> our own in order to link to the job / logs.





[jira] [Commented] (BEAM-6451) Portability Pipeline eventually hangs on bundle registration

2019-01-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744625#comment-16744625
 ] 

Kenneth Knowles commented on BEAM-6451:
---

Dug into the code.

Should add a timeout to throwIfFailure, here: 
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/RegisterAndProcessBundleOperation.java#L538
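
To illustrate the idea (a minimal sketch, not the actual Beam change): the stack trace in the issue shows the worker parked in `CompletableFuture.get()` with no deadline, and the JDK's `get(long, TimeUnit)` overload bounds that wait. The class `BoundedWait`, the method `getOrFail`, and the choice of `IllegalStateException` are hypothetical names picked for this example; how `throwIfFailure` or `MoreFutures.get` would really be changed is not shown in this thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: waits for a response but gives up after a deadline. */
final class BoundedWait {
  private BoundedWait() {}

  /**
   * Returns the future's value, or throws IllegalStateException if the future
   * fails, the wait is interrupted, or no result arrives within the timeout.
   */
  static <T> T getOrFail(CompletableFuture<T> future, long timeout, TimeUnit unit) {
    try {
      // Unlike the no-argument get(), this overload stops waiting at the deadline.
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      // Cancel so that nothing else keeps waiting on a response that may never arrive.
      future.cancel(true);
      throw new IllegalStateException("No response within " + timeout + " " + unit, e);
    } catch (InterruptedException e) {
      // Restore the interrupt flag before reporting the failure.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for response", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Request failed", e.getCause());
    }
  }
}
```

With a helper like this, a registration response that never arrives surfaces as an exception after the deadline instead of leaving the step stuck indefinitely. What timeout value would be appropriate is a separate question that this sketch does not answer.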



[jira] [Work logged] (BEAM-5987) Spark SideInputReader performance

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5987?focusedWorklogId=186126=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186126
 ]

ASF GitHub Bot logged work on BEAM-5987:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:34
Start Date: 17/Jan/19 02:34
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #7091: 
[BEAM-5987] Spark: Share cached side inputs between tasks.
URL: https://github.com/apache/beam/pull/7091#discussion_r248521765
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/util/CachedSideInputReader.java
 ##
 @@ -86,9 +55,27 @@ private CachedSideInputReader(SideInputReader delegate) {
   @Override
   public <T> T get(PCollectionView<T> view, BoundedWindow window) {
     @SuppressWarnings("unchecked")
-    final Map<Key<T>, T> materializedCasted = (Map) materialized;
-    return materializedCasted.computeIfAbsent(
-        new Key<>(view, window), key -> delegate.get(view, window));
+    final Cache<Key<T>, Optional<T>> materializedCasted =
+        (Cache) SideInputStorage.getMaterializedSideInputs();
+
+    Key<T> sideInputKey = new Key<>(view, window);
+
+    try {
+      Optional<T> optionalResult =
+          materializedCasted.get(
+              sideInputKey,
+              () -> {
+                final T result = delegate.get(view, window);
+                LOG.info(
+                    "Caching de-serialized side input for {} of size [{}B] in memory.",
+                    sideInputKey,
+                    SizeEstimator.estimate(result));
+                return Optional.ofNullable(result);
+              });
+      return optionalResult.orElse(null);
 
 Review comment:
   Ah, thank you for clarifying. That is a good attempt. The problem is that 
these will incorrectly be turned into the same thing:
   
   `Optional.ofNullable(null).orElse(null) == null`
   
   `Optional.ofNullable(Optional.absent()).orElse(null) == null`
   
   The fact that `Optional.of(null)` throws NPE is a mistake in the design 
(both Java and Guava). Maybe the point of the design is to convince people to 
not use `null`, which [is a billion dollar good 
idea](https://en.wikipedia.org/wiki/Null_pointer#History). But it makes 
`Optional` not correctly parametric in `T`.
   
   I think that if you actually convert `null` into 
`Optional.of(Optional.absent())` and other values `v` into 
`Optional.of(Optional.of(v))` you can simulate the behavior it should have had 
in the first place. Or you could make your own little replacement of `Optional`.
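
The "own little replacement of `Optional`" option could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the PR: `Box` and `NullAwareCache` are hypothetical names, and a plain `HashMap` stands in for the Guava `Cache` used in the diff. The point it shows is that a holder which is allowed to contain `null` keeps "cached, and the value is null" distinct from "not cached yet", whatever `T` is (including when `T` is itself an `Optional`).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for Optional that is allowed to hold null. */
final class Box<T> {
  private final T value; // may be null

  Box(T value) {
    this.value = value;
  }

  T get() {
    return value;
  }
}

/** Hypothetical cache that remembers null results instead of recomputing them. */
final class NullAwareCache<K, V> {
  private final Map<K, Box<V>> entries = new HashMap<>();

  /** Returns the cached value for key, calling loader only on the first lookup. */
  synchronized V get(K key, Supplier<V> loader) {
    Box<V> box = entries.get(key);
    if (box == null) {
      // A missing Box always means "not cached", never "cached null".
      box = new Box<>(loader.get());
      entries.put(key, box);
    }
    return box.get();
  }
}
```

For comparison, `Map.computeIfAbsent` records no mapping when the mapping function returns `null`, so a `null` result would be recomputed on every lookup. Wrapping every result in a holder avoids that without giving `null` a second meaning.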
 



Issue Time Tracking
---

Worklog Id: (was: 186126)
Time Spent: 7h  (was: 6h 50m)

> Spark SideInputReader performance
> -
>
> Key: BEAM-5987
> URL: https://issues.apache.org/jira/browse/BEAM-5987
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 2.8.0
>Reporter: David Moravek
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.9.0
>
> Attachments: Screen Shot 2018-11-06 at 13.05.36.png
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> We did some profiling of a spark job and 90% of the application time was 
> spent on side input deserialization.
> For spark, an easy fix is to cache materialized side inputs per bundle. This 
> improved running time of the profiled job from 3 hours to 30 minutes.





[jira] [Work logged] (BEAM-6443) decrease the number of thread for BigQuery streaming insertAll

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6443?focusedWorklogId=186125=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186125
 ]

ASF GitHub Bot logged work on BEAM-6443:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:34
Start Date: 17/Jan/19 02:34
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #7547: [BEAM-6443] 
decrease the number of thread for BigQuery streaming inse…
URL: https://github.com/apache/beam/pull/7547
 
 
   changes the thread pool used by insertAll from unlimited to single
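
As a rough illustration of the difference between an unbounded pool and a single-threaded one (a sketch using only JDK executors, not the BigQueryIO code touched by this PR): the class `PoolDemo` and the method `distinctThreads` are hypothetical, and the assumption that "unlimited" corresponds to something like `Executors.newCachedThreadPool()` is mine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class PoolDemo {
  private PoolDemo() {}

  /** Runs `tasks` short jobs on the pool and returns how many distinct threads ran them. */
  static int distinctThreads(ExecutorService pool, int tasks) {
    Set<Thread> threads = ConcurrentHashMap.newKeySet();
    CountDownLatch done = new CountDownLatch(tasks);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> {
          // Record which worker thread picked up this task.
          threads.add(Thread.currentThread());
          done.countDown();
        });
      }
      if (!done.await(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return threads.size();
  }
}
```

Calling `PoolDemo.distinctThreads(Executors.newSingleThreadExecutor(), 20)` runs all tasks on one worker thread, while `Executors.newCachedThreadPool()` may create a new thread whenever no idle one is available, so the thread count can grow with the number of concurrent submissions.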
   
 



Issue Time Tracking
---

Worklog Id: (was: 186125)

[jira] [Created] (BEAM-6458) Beam site links to grafana dashboard by ip

2019-01-16 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-6458:
-

 Summary: Beam site links to grafana dashboard by ip
 Key: BEAM-6458
 URL: https://issues.apache.org/jira/browse/BEAM-6458
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Kenneth Knowles
Assignee: Mikhail Gryzykhin


I would say this is not a great practice. As a stopgap use a shortlink at 
https://s.apache.org.





[jira] [Work logged] (BEAM-5396) Flink portable runner savepoint / upgrade support

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5396?focusedWorklogId=186123=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186123
 ]

ASF GitHub Bot logged work on BEAM-5396:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:14
Start Date: 17/Jan/19 02:14
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7362: [BEAM-5396] Assign 
portable operator uids
URL: https://github.com/apache/beam/pull/7362#issuecomment-455016389
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186123)
Time Spent: 7h  (was: 6h 50m)

> Flink portable runner savepoint / upgrade support
> -
>
> Key: BEAM-5396
> URL: https://issues.apache.org/jira/browse/BEAM-5396
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> The portable Flink runner needs to support Flink savepoints for production 
> use. It should be possible to upgrade a stateful portable Beam pipeline that 
> runs on Flink, which involves taking a savepoint and then starting the new 
> version of the pipeline from that savepoint. The potential issues with 
> pipeline evolution and migration are similar to those when using the Flink 
> DataStream API (schema / name changes etc.).





[jira] [Work logged] (BEAM-5396) Flink portable runner savepoint / upgrade support

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5396?focusedWorklogId=186124=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186124
 ]

ASF GitHub Bot logged work on BEAM-5396:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:14
Start Date: 17/Jan/19 02:14
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7362: [BEAM-5396] Assign 
portable operator uids
URL: https://github.com/apache/beam/pull/7362#issuecomment-455016468
 
 
   Run Python Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 186124)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (BEAM-5396) Flink portable runner savepoint / upgrade support

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5396?focusedWorklogId=186121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186121
 ]

ASF GitHub Bot logged work on BEAM-5396:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:13
Start Date: 17/Jan/19 02:13
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7362: [BEAM-5396] Assign 
portable operator uids
URL: https://github.com/apache/beam/pull/7362#issuecomment-455016372
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186121)
Time Spent: 6h 40m  (was: 6.5h)



[jira] [Work logged] (BEAM-5396) Flink portable runner savepoint / upgrade support

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5396?focusedWorklogId=186122=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186122
 ]

ASF GitHub Bot logged work on BEAM-5396:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:13
Start Date: 17/Jan/19 02:13
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7362: [BEAM-5396] Assign 
portable operator uids
URL: https://github.com/apache/beam/pull/7362#issuecomment-455016389
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186122)
Time Spent: 6h 50m  (was: 6h 40m)



[jira] [Commented] (BEAM-6405) Improve PortableValidatesRunner test reliability on Jenkins

2019-01-16 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744608#comment-16744608
 ] 

Maximilian Michels commented on BEAM-6405:
--

Streaming execution looks very good: 
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/
Batch remains flaky due to memory problems: 
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/

This PR should get rid of the memory problems: 
https://github.com/apache/beam/pull/7515

> Improve PortableValidatesRunner test reliability on Jenkins
> ---
>
> Key: BEAM-6405
> URL: https://issues.apache.org/jira/browse/BEAM-6405
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> The PVR tests seem to be passing fine and then failing consecutively for no 
> reason: https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/ 
> It looks like the outrageous parallelism, i.e. number of available cores, is 
> responsible for the flakiness if there is additional load on the build 
> slaves. We should lower the parallelism.





[jira] [Work logged] (BEAM-6405) Improve PortableValidatesRunner test reliability on Jenkins

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6405?focusedWorklogId=186120=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186120
 ]

ASF GitHub Bot logged work on BEAM-6405:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:08
Start Date: 17/Jan/19 02:08
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7515: [BEAM-6405] Disable 
parallel test execution of PortableValidatesRunner
URL: https://github.com/apache/beam/pull/7515#issuecomment-455015448
 
 
   Streaming execution looks very good: 
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/
   
   Merging this to see if batch mode improves: 
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/
 



Issue Time Tracking
---

Worklog Id: (was: 186120)
Time Spent: 5h 20m  (was: 5h 10m)



[jira] [Work logged] (BEAM-6405) Improve PortableValidatesRunner test reliability on Jenkins

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6405?focusedWorklogId=186119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186119
 ]

ASF GitHub Bot logged work on BEAM-6405:


Author: ASF GitHub Bot
Created on: 17/Jan/19 02:07
Start Date: 17/Jan/19 02:07
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #7515: [BEAM-6405] 
Disable parallel test execution of PortableValidatesRunner
URL: https://github.com/apache/beam/pull/7515
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186119)
Time Spent: 5h 10m  (was: 5h)

> Improve PortableValidatesRunner test reliability on Jenkins
> ---
>
> Key: BEAM-6405
> URL: https://issues.apache.org/jira/browse/BEAM-6405
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The PVR tests seem to be passing fine and then failing consecutively for no 
> reason: https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/ 
> It looks like the outrageous parallelism, i.e. number of available cores, is 
> responsible for the flakiness if there is additional load on the build 
> slaves. We should lower the parallelism.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6230) Document Flink version support policy

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6230?focusedWorklogId=186118=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186118
 ]

ASF GitHub Bot logged work on BEAM-6230:


Author: ASF GitHub Bot
Created on: 17/Jan/19 01:59
Start Date: 17/Jan/19 01:59
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #7546: [BEAM-6230] 
Clarify Flink version support on Flink Runner page
URL: https://github.com/apache/beam/pull/7546
 
 
   This updates the Flink Runner page in two commits:
   
   1. Flink version support
   2. Documentation of all pipeline options
   
   CC @tweise @angoenka 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186118)
Time Spent: 10m
Remaining Estimate: 0h

> Document Flink version support policy
> -
>
> Key: BEAM-6230
> URL: https://issues.apache.org/jira/browse/BEAM-6230
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink, website
>Reporter: Thomas Weise
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There have been discussions about which version of Flink should be supported 
> in Beam and release 2.10 

[jira] [Work logged] (BEAM-6428) Allow textual selection syntax for schema fields

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6428?focusedWorklogId=186098=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186098
 ]

ASF GitHub Bot logged work on BEAM-6428:


Author: ASF GitHub Bot
Created on: 17/Jan/19 01:31
Start Date: 17/Jan/19 01:31
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7545: [BEAM-6428] 
Add a textual selection syntax for schema fields.
URL: https://github.com/apache/beam/pull/7545
 
 
   This allows the user to write code like Select.fieldNames("a.b", "c.d.*").
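
   To make the dotted syntax concrete, here is a minimal Python sketch of how
paths such as "a.b" and "c.d.*" could be resolved against a nested record. It
is not Beam's implementation: the function name select_fields, the use of
plain dicts as rows, and the rule that a trailing "*" expands exactly one
level are all assumptions made for illustration.

```python
def select_fields(row, *paths):
    """Resolve dotted paths against a nested dict (hypothetical helper).

    A trailing "*" selects every direct child of the addressed field.
    """
    selected = {}
    for path in paths:
        parts = path.split(".")
        wildcard = parts[-1] == "*"
        if wildcard:
            parts = parts[:-1]
        value = row
        for part in parts:
            value = value[part]  # raises KeyError if the field is missing
        prefix = ".".join(parts)
        if wildcard:
            for name, child in value.items():
                key = prefix + "." + name if prefix else name
                selected[key] = child
        else:
            selected[prefix] = value
    return selected


row = {"a": {"b": 1, "x": 2}, "c": {"d": {"e": 3, "f": 4}}}
result = select_fields(row, "a.b", "c.d.*")
```

   With this sample row, "a.b" picks a single nested field and "c.d.*" picks
both children of c.d.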
 



Issue Time Tracking
---

Worklog Id: (was: 186098)
Time Spent: 10m
Remaining Estimate: 0h

> Allow textual selection syntax for schema fields
> 
>
> Key: BEAM-6428
> URL: https://issues.apache.org/jira/browse/BEAM-6428
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Affects Versions: 2.9.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3580) Do not use Go BytesCoder to encode string

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3580?focusedWorklogId=186090=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186090
 ]

ASF GitHub Bot logged work on BEAM-3580:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:56
Start Date: 17/Jan/19 00:56
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7543: [BEAM-3580] Use a 
custom coder for strings.
URL: https://github.com/apache/beam/pull/7543#issuecomment-455001685
 
 
   R: @htyleo @aaltay Hi, Please review! Thanks
 



Issue Time Tracking
---

Worklog Id: (was: 186090)
Time Spent: 1h 50m  (was: 1h 40m)

> Do not use Go BytesCoder to encode string
> -
>
> Key: BEAM-3580
> URL: https://issues.apache.org/jira/browse/BEAM-3580
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We should not use the same built-in coder for two different types. It creates 
> the need for conversions at inopportune times in the runtime.
>  
> One option would be to use a custom coder that shares encoding with bytes, 
> given that bytes are length prefixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3580) Do not use Go BytesCoder to encode string

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3580?focusedWorklogId=186087=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186087
 ]

ASF GitHub Bot logged work on BEAM-3580:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:39
Start Date: 17/Jan/19 00:39
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7543: [BEAM-3580] Use a 
custom coder for strings.
URL: https://github.com/apache/beam/pull/7543#issuecomment-454998131
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 186087)
Time Spent: 1h 40m  (was: 1.5h)

> Do not use Go BytesCoder to encode string
> -
>
> Key: BEAM-3580
> URL: https://issues.apache.org/jira/browse/BEAM-3580
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We should not use the same built-in coder for two different types. It creates 
> the need for conversions at inopportune times in the runtime.
>  
> One option would be to use a custom coder that shares encoding with bytes, 
> given that bytes are length prefixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6431) Add ExecutionTime metrics to the Beam Java SDK

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6431?focusedWorklogId=186089=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186089
 ]

ASF GitHub Bot logged work on BEAM-6431:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:47
Start Date: 17/Jan/19 00:47
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #7507: [BEAM-6431] Refactor, 
Remove references to Dataflow classes in base State Sampling
URL: https://github.com/apache/beam/pull/7507#issuecomment-454999807
 
 
   @Ardagan 
 



Issue Time Tracking
---

Worklog Id: (was: 186089)
Time Spent: 20m  (was: 10m)

> Add ExecutionTime metrics to the Beam Java SDK
> --
>
> Key: BEAM-6431
> URL: https://issues.apache.org/jira/browse/BEAM-6431
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This will be done by using the Dataflow worker's StateSampler code. I have 
> put together a refactoring plan
> [here|https://docs.google.com/document/d/1OlAJf4T_CTL9WRH8lP8uQOfLjWYfm8IpRXSe38g34k4/edit#].
> This will include estimating the processing time for the start-bundle, process, 
> and finish-bundle phases. The Python SDK already has an implementation of this.
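
The sampling idea mentioned in the description can be sketched in a few lines
of Python. This is not the Dataflow worker's StateSampler: the class name,
the fixed period, and the explicit sample() calls (standing in for a
background thread) are all assumptions made to keep the example
deterministic.

```python
from collections import defaultdict


class StateSamplerSketch:
    """Attributes each sampling period to whichever state is active (hypothetical)."""

    def __init__(self, period_ms):
        self.period_ms = period_ms
        self.current_state = None
        self.msecs = defaultdict(int)

    def enter(self, state):
        # Called by the executing code when it moves to a new phase.
        self.current_state = state

    def sample(self):
        # A real sampler would call this from a background thread once per
        # period; here it is called by hand.
        if self.current_state is not None:
            self.msecs[self.current_state] += self.period_ms


sampler = StateSamplerSketch(period_ms=200)
sampler.enter("start_bundle")
sampler.sample()
sampler.enter("process")
for _ in range(3):
    sampler.sample()
sampler.enter("finish_bundle")
sampler.sample()
estimates = dict(sampler.msecs)
```

The estimate for each phase is the number of samples that observed it times
the period, so accuracy depends on the period being short relative to the
phases.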



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6431) Add ExecutionTime metrics to the Beam Java SDK

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6431?focusedWorklogId=186088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186088
 ]

ASF GitHub Bot logged work on BEAM-6431:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:47
Start Date: 17/Jan/19 00:47
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #7507: [BEAM-6431] Refactor, 
Remove references to Dataflow classes in base State Sampling
URL: https://github.com/apache/beam/pull/7507#issuecomment-454999774
 
 
   @swegner @pabloem 
 



Issue Time Tracking
---

Worklog Id: (was: 186088)
Time Spent: 10m
Remaining Estimate: 0h

> Add ExecutionTime metrics to the Beam Java SDK
> --
>
> Key: BEAM-6431
> URL: https://issues.apache.org/jira/browse/BEAM-6431
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This will be done by using the Dataflow worker's StateSampler code. I have 
> put together a refactoring plan
> [here|https://docs.google.com/document/d/1OlAJf4T_CTL9WRH8lP8uQOfLjWYfm8IpRXSe38g34k4/edit#].
> This will include estimating the processing time for the start-bundle, process, 
> and finish-bundle phases. The Python SDK already has an implementation of this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3580) Do not use Go BytesCoder to encode string

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3580?focusedWorklogId=186083=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186083
 ]

ASF GitHub Bot logged work on BEAM-3580:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:33
Start Date: 17/Jan/19 00:33
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #7543: [BEAM-3580] 
Use a custom coder for strings.
URL: https://github.com/apache/beam/pull/7543
 
 
   Use a custom coder for strings in the Go SDK. CustomCoders are already 
length prefixed, so this does the simple conversion.
   This PR also removes every place I could recall that did a manual []byte to 
string conversion, which was a consequence of the previous approach. This 
should yield a mild performance improvement because the conversions are done 
earlier rather than right before invocation. Similarly, []byte values are no 
longer checked to see whether they are strings.
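
   As an illustration of what "length prefixed" means for a string coder, here
is a minimal Python sketch (the PR itself is Go). The function names and the
fixed 4-byte big-endian prefix are assumptions chosen for simplicity; they are
not the Go SDK's code or its wire format.

```python
import struct


def encode_string(s):
    """Encode a string as a 4-byte big-endian length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def decode_string(buf, offset=0):
    """Decode one string starting at offset; return (string, next_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    return buf[start:end].decode("utf-8"), end


# Two strings written back to back can be read without any separator,
# because each one carries its own length.
stream = encode_string("héllo") + encode_string("beam")
first, pos = decode_string(stream)
second, pos = decode_string(stream, pos)
```

   Decoding straight to a string at this point is the kind of early conversion
the description refers to, as opposed to carrying []byte around and
converting right before invocation.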
   
   
   
 

This is an automated message 

[jira] [Work logged] (BEAM-3580) Do not use Go BytesCoder to encode string

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3580?focusedWorklogId=186084=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186084
 ]

ASF GitHub Bot logged work on BEAM-3580:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:33
Start Date: 17/Jan/19 00:33
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7543: [BEAM-3580] Use a 
custom coder for strings.
URL: https://github.com/apache/beam/pull/7543#issuecomment-454996994
 
 
   Run Go PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186084)
Time Spent: 1.5h  (was: 1h 20m)

> Do not use Go BytesCoder to encode string
> -
>
> Key: BEAM-3580
> URL: https://issues.apache.org/jira/browse/BEAM-3580
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We should not use the same built-in coder for two different types. It creates 
> the need for conversions at inopportune times in the runtime.
>  
> One option would be to use a custom coder that shares encoding with bytes, 
> given that bytes are length prefixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186081
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248501754
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -158,3 +161,4 @@ commands =
   coverage report --skip-covered
   # Generate report in xml format
   coverage xml
+  
 
 Review comment:
   Missing newline.
 



Issue Time Tracking
---

Worklog Id: (was: 186081)
Time Spent: 9h  (was: 8h 50m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186078=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186078
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248494914
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io.py
 ##
 @@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""BigTable connector
 
 Review comment:
   Probably rename this file to "bigtableio.py" to be consistent with the other 
connectors in the Python SDK.
 



Issue Time Tracking
---

Worklog Id: (was: 186078)
Time Spent: 8h 40m  (was: 8.5h)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186079=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186079
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248500874
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io.py
 ##
 @@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""BigTable connector
+
+This module implements writing to BigTable tables.
+The default mode is to set row data to write to BigTable tables.
+The syntax supported is described here:
+https://cloud.google.com/bigtable/docs/quickstart-cbt
+
+BigTable connector can be used as main outputs. A main output
+(common case) is expected to be massive and will be split into
+manageable chunks and processed in parallel. In the example below
+we created a list of rows then passed to the GeneratedDirectRows
+DoFn to set the Cells and then we call the WriteToBigtable to insert
+those generated rows in the table.
+
+  main_table = (p
+   | 'Generate Direct Rows' >> GenerateDirectRows(number)
+   | 'Write to BT' >> beam.ParDo(WriteToBigtable(config)))
+"""
+from __future__ import absolute_import
+
+import apache_beam as beam
+from apache_beam.metrics import Metrics
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.transforms.display import DisplayDataItem
+
+try:
+  from google.cloud.bigtable import Client
+  from google.cloud.bigtable.batcher import MutationsBatcher
+except ImportError:
+  pass
+
+
+class WriteToBigtable(beam.DoFn):
+  """ Creates the connector can call and add_row to the batcher using each
+  row in beam pipe line
+
+  :type beam_options: class:`~bigtable_configuration.BigtableConfiguration`
+  :param beam_options: class `~bigtable_configuration.BigtableConfiguration`
+  """
+
+  def __init__(self, beam_options):
+    super(WriteToBigtable, self).__init__(beam_options)
+    self.beam_options = beam_options
+    self.table = None
+    self.batcher = None
+    self.written = Metrics.counter(self.__class__, 'Written Row')
+
+  def start_bundle(self):
+    if self.table is None:
+      client = Client(project=self.beam_options.project_id)
+      instance = client.instance(self.beam_options.instance_id)
+      self.table = instance.table(self.beam_options.table_id)
+    self.batcher = MutationsBatcher(self.table)
+
+  def process(self, row):
+    self.written.inc()
+    self.batcher.mutate(row)
 
 Review comment:
   Please mention de-duping somewhere (for the case where the runner retries 
steps). I think we don't need to dedup explicitly here because, on a retry, we 
will be mutating the same object instead of inserting a new one.
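
   The retry argument can be illustrated with a small Python sketch. FakeTable
and write_bundle are hypothetical stand-ins, not the Bigtable client API, and
the table is modelled as a plain dict keyed by row key, which ignores cell
versions and timestamps.

```python
class FakeTable:
    """In-memory stand-in for a table keyed by row key (hypothetical)."""

    def __init__(self):
        self.rows = {}

    def mutate(self, row_key, column, value):
        # Setting a cell overwrites whatever was there, so repeating the
        # same mutation leaves the table unchanged.
        self.rows.setdefault(row_key, {})[column] = value


def write_bundle(table, bundle):
    for row_key, column, value in bundle:
        table.mutate(row_key, column, value)


bundle = [("row-1", "cf:name", "a"), ("row-2", "cf:name", "b")]
table = FakeTable()

write_bundle(table, bundle)
after_first_attempt = {key: dict(cells) for key, cells in table.rows.items()}

write_bundle(table, bundle)  # the runner retries the same bundle
after_retry = {key: dict(cells) for key, cells in table.rows.items()}
```

   Under these assumptions the retried bundle produces the same table
contents, which is why no explicit de-duplication step is needed in the
sketch.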
 



Issue Time Tracking
---

Worklog Id: (was: 186079)
Time Spent: 8h 50m  (was: 8h 40m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186077=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186077
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248500633
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io.py
 ##
 @@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""BigTable connector
+
+This module implements writing to BigTable tables.
+The default mode is to set row data to write to BigTable tables.
+The syntax supported is described here:
+https://cloud.google.com/bigtable/docs/quickstart-cbt
+
+BigTable connector can be used as main outputs. A main output
+(common case) is expected to be massive and will be split into
+manageable chunks and processed in parallel. In the example below
+we created a list of rows then passed to the GeneratedDirectRows
+DoFn to set the Cells and then we call the WriteToBigtable to insert
+those generated rows in the table.
+
+  main_table = (p
+   | 'Generate Direct Rows' >> GenerateDirectRows(number)
+   | 'Write to BT' >> beam.ParDo(WriteToBigtable(config)))
+"""
+from __future__ import absolute_import
+
+import apache_beam as beam
+from apache_beam.metrics import Metrics
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.transforms.display import DisplayDataItem
+
+try:
+  from google.cloud.bigtable import Client
+  from google.cloud.bigtable.batcher import MutationsBatcher
+except ImportError:
+  pass
+
+
+class WriteToBigtable(beam.DoFn):
+  """ Creates the connector can call and add_row to the batcher using each
+  row in beam pipe line
+
+  :type beam_options: class:`~bigtable_configuration.BigtableConfiguration`
+  :param beam_options: class `~bigtable_configuration.BigtableConfiguration`
+  """
+
+  def __init__(self, beam_options):
 
 Review comment:
   Please extract the properties from the options object at pipeline construction 
time and set them as instance variables in the DoFn. The DoFn object gets pickled, so all 
properties should be picklable.
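
The pickling concern in this comment can be illustrated without Beam. The following is a minimal sketch, not the PR's code: `FakeOptions` is a hypothetical options object carrying an unpicklable member (a thread lock), `WriteFnSketch` is a hypothetical stand-in for the DoFn, and `is_picklable` is an illustrative helper. It uses the standard library `pickle` module rather than Beam's own pickler, so it only shows the general idea.

```python
import pickle
import threading


class FakeOptions(object):
  """Hypothetical stand-in for the options object passed to the DoFn."""

  def __init__(self, project_id, instance_id, table_id):
    self.project_id = project_id
    self.instance_id = instance_id
    self.table_id = table_id
    # A lock is a typical member that the pickle module cannot serialize.
    self._lock = threading.Lock()


class WriteFnSketch(object):
  """Hypothetical stand-in for the DoFn; stores only plain string values."""

  def __init__(self, options):
    # Copy what is needed at construction time; do not keep `options` itself.
    self.project_id = options.project_id
    self.instance_id = options.instance_id
    self.table_id = options.table_id


def is_picklable(obj):
  """Returns True if the standard pickle module can serialize obj."""
  try:
    pickle.dumps(obj)
    return True
  except Exception:  # pickle raises different error types for different objects
    return False


options = FakeOptions('my-project', 'my-instance', 'my-table')
fn = WriteFnSketch(options)
# Round-trip the stand-in DoFn the way a runner would before shipping it.
restored = pickle.loads(pickle.dumps(fn))
```

Because the stand-in holds only strings, it survives the round trip even though the options object it was built from does not.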
 



Issue Time Tracking
---

Worklog Id: (was: 186077)
Time Spent: 8.5h  (was: 8h 20m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186082=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186082
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248500211
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io.py
 ##
 @@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""BigTable connector
+
+This module implements writing to BigTable tables.
+The default mode is to set row data to write to BigTable tables.
+The syntax supported is described here:
+https://cloud.google.com/bigtable/docs/quickstart-cbt
+
+BigTable connector can be used as main outputs. A main output
+(common case) is expected to be massive and will be split into
+manageable chunks and processed in parallel. In the example below
+we created a list of rows then passed to the GeneratedDirectRows
+DoFn to set the Cells and then we call the WriteToBigtable to insert
+those generated rows in the table.
+
+  main_table = (p
+   | 'Generate Direct Rows' >> GenerateDirectRows(number)
+   | 'Write to BT' >> beam.ParDo(WriteToBigtable(config)))
+"""
+from __future__ import absolute_import
+
+import apache_beam as beam
+from apache_beam.metrics import Metrics
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.transforms.display import DisplayDataItem
+
+try:
+  from google.cloud.bigtable import Client
+  from google.cloud.bigtable.batcher import MutationsBatcher
+except ImportError:
+  pass
+
+
+class WriteToBigtable(beam.DoFn):
 
 Review comment:
   Please make this private and rename it (for example, to BigTableWriteFn). 
WriteToBigTable should be the transform that wraps this.
 



Issue Time Tracking
---

Worklog Id: (was: 186082)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186080=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186080
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248499875
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io.py
 ##
 @@ -0,0 +1,120 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""BigTable connector
+
+This module implements writing to BigTable tables.
+The default mode is to set row data to write to BigTable tables.
+The syntax supported is described here:
+https://cloud.google.com/bigtable/docs/quickstart-cbt
+
+BigTable connector can be used as main outputs. A main output
+(common case) is expected to be massive and will be split into
+manageable chunks and processed in parallel. In the example below
+we created a list of rows then passed to the GeneratedDirectRows
+DoFn to set the Cells and then we call the WriteToBigtable to insert
+those generated rows in the table.
+
+  main_table = (p
+   | 'Generate Direct Rows' >> GenerateDirectRows(number)
 
 Review comment:
   Please wrap these in a single WriteToBigTable composite as per the following guide:
   https://beam.apache.org/contribute/ptransform-style-guide/
 



Issue Time Tracking
---

Worklog Id: (was: 186080)
Time Spent: 9h  (was: 8h 50m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186076=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186076
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:26
Start Date: 17/Jan/19 00:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7367: 
[BEAM-3342] Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#discussion_r248501682
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtable_io_test.py
 ##
 @@ -0,0 +1,190 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Unittest for GCP Bigtable testing."""
+from __future__ import absolute_import
+
+import datetime
+import logging
+import random
+import string
+import unittest
+import uuid
+
+import pytz
+
+import apache_beam as beam
+from apache_beam.io.gcp.bigtable_io import BigtableConfiguration
+from apache_beam.io.gcp.bigtable_io import WriteToBigtable
+from apache_beam.metrics.metric import MetricsFilter
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.runners.runner import PipelineState
+from apache_beam.testing.test_pipeline import TestPipeline
+
+# Protect against environments where bigtable library is not available.
+# pylint: disable=wrong-import-order, wrong-import-position
+try:
+  from google.cloud._helpers import _datetime_from_microseconds
+  from google.cloud._helpers import _microseconds_from_datetime
+  from google.cloud._helpers import UTC
+  from google.cloud.bigtable import row, column_family, Client
+except ImportError:
+  Client = None
+  UTC = pytz.utc
+  _microseconds_from_datetime = lambda label_stamp: label_stamp
+  _datetime_from_microseconds = lambda micro: micro
+
+
+EXISTING_INSTANCES = []
+LABEL_KEY = u'python-bigtable-beam'
+label_stamp = datetime.datetime.utcnow().replace(tzinfo=UTC)
+label_stamp_micros = _microseconds_from_datetime(label_stamp)
+LABELS = {LABEL_KEY: str(label_stamp_micros)}
+
+
+def _retry_on_unavailable(exc):
+  """Retry only errors whose status code is 'UNAVAILABLE'."""
+  from grpc import StatusCode
+  return exc.code() == StatusCode.UNAVAILABLE
+
+
+class GenerateDirectRows(beam.PTransform):
+  """ Generates an iterator of DirectRow object to process on beam pipeline.
+
+  """
+  def __init__(self, number, **kwargs):
+    super(GenerateDirectRows, self).__init__(**kwargs)
+    self.number = number
+    self.rand = random.choice(string.ascii_letters + string.digits)
+    self.column_family_id = 'cf1'
+
+  def _generate(self):
+    value = ''.join(self.rand for i in range(100))
+
+    for index in range(self.number):
+      key = "beam_key%s" % ('{0:07}'.format(index))
+      direct_row = row.DirectRow(row_key=key)
+      for column_id in range(10):
+        direct_row.set_cell(self.column_family_id,
+                            ('field%s' % column_id).encode('utf-8'),
+                            value,
+                            datetime.datetime.now())
+      yield direct_row
+
+  def expand(self, pvalue):
+    return (pvalue
+            | beam.Create(self._generate()))
+
+
+@unittest.skipIf(Client is None, 'GCP Bigtable dependencies are not installed')
+class BigtableIOWriteIT(unittest.TestCase):
 
 Review comment:
   Probably better to move the integration test to a separate module. For 
example, 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes_it_test.py
 



Issue Time Tracking
---

Worklog Id: (was: 186076)
Time Spent: 8h 20m  (was: 8h 10m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> 

[jira] [Work logged] (BEAM-4184) S3ResourceIdTest has had a masked failure

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4184?focusedWorklogId=186072=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186072
 ]

ASF GitHub Bot logged work on BEAM-4184:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:14
Start Date: 17/Jan/19 00:14
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #7414: [BEAM-4184] s3 
resource id test has had a masked failure
URL: https://github.com/apache/beam/pull/7414#issuecomment-454993142
 
 
   +1. Thx for taking the time.
 



Issue Time Tracking
---

Worklog Id: (was: 186072)
Time Spent: 2h  (was: 1h 50m)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Labels: sickbay
> Fix For: 2.11.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Work logged] (BEAM-4184) S3ResourceIdTest has had a masked failure

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4184?focusedWorklogId=186071=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186071
 ]

ASF GitHub Bot logged work on BEAM-4184:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:12
Start Date: 17/Jan/19 00:12
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #7414: [BEAM-4184] s3 
resource id test has had a masked failure
URL: https://github.com/apache/beam/pull/7414#issuecomment-454992895
 
 
   Run Java_Examples_Dataflow PreCommit
   
 



Issue Time Tracking
---

Worklog Id: (was: 186071)
Time Spent: 1h 50m  (was: 1h 40m)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Labels: sickbay
> Fix For: 2.11.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186070
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:09
Start Date: 17/Jan/19 00:09
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7536: [BEAM-6452] 
Fix NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186070)
Time Spent: 1.5h  (was: 1h 20m)

> Nested collection types cause NullPointerException when converting to a POJO
> 
>
> Key: BEAM-6452
> URL: https://issues.apache.org/jira/browse/BEAM-6452
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186068=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186068
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 17/Jan/19 00:00
Start Date: 17/Jan/19 00:00
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454990516
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186068)
Time Spent: 7h 10m  (was: 7h)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6447) Spotless paddedCell appears to cause unpredictable behavior

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6447?focusedWorklogId=186064=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186064
 ]

ASF GitHub Bot logged work on BEAM-6447:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:56
Start Date: 16/Jan/19 23:56
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7531: [BEAM-6447] 
Turn off spotless paddedCell
URL: https://github.com/apache/beam/pull/7531
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186064)
Time Spent: 3h 10m  (was: 3h)

> Spotless paddedCell appears to cause unpredictable behavior
> ---
>
> Key: BEAM-6447
> URL: https://issues.apache.org/jira/browse/BEAM-6447
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In https://github.com/apache/beam/pull/7505 spotlessCheck passed even though 
> spotlessApply was not a no-op.
> In https://github.com/apache/beam/pull/7523 and 
> https://github.com/apache/beam/pull/7527 spotlessApply was run and the result 
> _also_ passed spotlessCheck.
> Confirmed that spotlessCheck fails on 
> https://github.com/apache/beam/pull/7505 if paddedCell is turned off. The use 
> of paddedCell is to workaround bugs in the underlying formatter (in our case 
> google-java-format). Currently, there is no known bug affecting us, so we 
> should not be using paddedCell.
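
The invariant described in this issue is that the check should pass only when applying the formatter would change nothing. The following is a minimal sketch of that invariant, assuming a hypothetical toy formatter (`toy_format`, which is not google-java-format) and a hypothetical helper `strict_check`; it does not model how Spotless or paddedCell is implemented.

```python
def toy_format(source):
  """Hypothetical formatter: strips trailing whitespace from each line."""
  return '\n'.join(line.rstrip() for line in source.split('\n'))


def strict_check(source, formatter):
  """Passes only when applying the formatter would be a no-op."""
  return formatter(source) == source


unformatted = 'int x = 1;   \nint y = 2;'
formatted = toy_format(unformatted)

# The check fails before formatting and passes once apply has been run.
check_before = strict_check(unformatted, toy_format)
check_after = strict_check(formatted, toy_format)
```

Under this definition, a check that passes while apply still rewrites the file, as reported for pull request 7505, would indicate that the check is not comparing against the formatter's actual output.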





[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186063=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186063
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:52
Start Date: 16/Jan/19 23:52
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7536: [BEAM-6452] 
Fix NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536#discussion_r248495238
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldValueTypeInformation.java
 ##
 @@ -157,68 +157,94 @@ public FieldValueTypeInformation withName(String name) {
 return toBuilder().setName(name).build();
   }
 
-  private static Type getArrayComponentType(Field field) {
+  private static FieldValueTypeInformation getArrayComponentType(Field field) {
 return getArrayComponentType(TypeDescriptor.of(field.getGenericType()));
   }
 
   @Nullable
-  private static Type getArrayComponentType(TypeDescriptor valueType) {
+  private static FieldValueTypeInformation 
getArrayComponentType(TypeDescriptor valueType) {
+// TODO: Figure out nullable elements.
+TypeDescriptor componentType = null;
 if (valueType.isArray()) {
   Type component = valueType.getComponentType().getType();
   if (!component.equals(byte.class)) {
-return component;
+componentType = TypeDescriptor.of(component);
   }
 } else if (valueType.isSubtypeOf(TypeDescriptor.of(Collection.class))) {
   TypeDescriptor> collection = 
valueType.getSupertype(Collection.class);
   if (collection.getType() instanceof ParameterizedType) {
 ParameterizedType ptype = (ParameterizedType) collection.getType();
 java.lang.reflect.Type[] params = ptype.getActualTypeArguments();
 checkArgument(params.length == 1);
-return params[0];
+componentType = TypeDescriptor.of(params[0]);
   } else {
 throw new RuntimeException("Collection parameter is not 
parameterized!");
   }
 }
-return null;
+if (componentType == null) {
+  return null;
+}
+
+return new AutoValue_FieldValueTypeInformation.Builder()
+.setName("")
+.setNullable(false)
+.setType(componentType)
+.setRawType(componentType.getRawType())
+.setElementType(getArrayComponentType(componentType))
+.setMapKeyType(getMapKeyType(componentType))
+.setMapValueType(getMapValueType(componentType))
+.build();
   }
 
   // If the Field is a map type, returns the key type, otherwise returns a 
null reference.
   @Nullable
-  private static Type getMapKeyType(Field field) {
+  private static FieldValueTypeInformation getMapKeyType(Field field) {
 return getMapKeyType(TypeDescriptor.of(field.getGenericType()));
   }
 
   @Nullable
-  private static Type getMapKeyType(TypeDescriptor typeDescriptor) {
+  private static FieldValueTypeInformation getMapKeyType(TypeDescriptor 
typeDescriptor) {
 return getMapType(typeDescriptor, 0);
   }
 
   // If the Field is a map type, returns the value type, otherwise returns a 
null reference.
   @Nullable
-  private static Type getMapValueType(Field field) {
+  private static FieldValueTypeInformation getMapValueType(Field field) {
 return getMapType(TypeDescriptor.of(field.getGenericType()), 1);
   }
 
   @Nullable
-  private static Type getMapValueType(TypeDescriptor typeDescriptor) {
+  private static FieldValueTypeInformation getMapValueType(TypeDescriptor 
typeDescriptor) {
 return getMapType(typeDescriptor, 1);
   }
 
   // If the Field is a map type, returns the key or value type (0 is key type, 
1 is value).
   // Otherwise returns a null reference.
   @SuppressWarnings("unchecked")
   @Nullable
-  private static Type getMapType(TypeDescriptor valueType, int index) {
+  private static FieldValueTypeInformation getMapType(TypeDescriptor 
valueType, int index) {
+TypeDescriptor mapType = null;
 if (valueType.isSubtypeOf(TypeDescriptor.of(Map.class))) {
   TypeDescriptor> map = valueType.getSupertype(Map.class);
   if (map.getType() instanceof ParameterizedType) {
 ParameterizedType ptype = (ParameterizedType) map.getType();
 java.lang.reflect.Type[] params = ptype.getActualTypeArguments();
-return params[index];
+mapType = TypeDescriptor.of(params[index]);
   } else {
 throw new RuntimeException("Map type is not parameterized! " + map);
   }
 }
-return null;
+if (mapType == null) {
+  return null;
+}
+return new AutoValue_FieldValueTypeInformation.Builder()
+.setName("")
+.setNullable(false)
+.setType(mapType)
+.setRawType(mapType.getRawType())
+.setElementType(getArrayComponentType(mapType))
+

[jira] [Work logged] (BEAM-6447) Spotless paddedCell appears to cause unpredictable behavior

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6447?focusedWorklogId=186057=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186057
 ]

ASF GitHub Bot logged work on BEAM-6447:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:38
Start Date: 16/Jan/19 23:38
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7531: [BEAM-6447] Turn 
off spotless paddedCell
URL: https://github.com/apache/beam/pull/7531#issuecomment-454985815
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186057)
Time Spent: 3h  (was: 2h 50m)

> Spotless paddedCell appears to cause unpredictable behavior
> ---
>
> Key: BEAM-6447
> URL: https://issues.apache.org/jira/browse/BEAM-6447
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In https://github.com/apache/beam/pull/7505 spotlessCheck passed even though 
> spotlessApply was not a no-op.
> In https://github.com/apache/beam/pull/7523 and 
> https://github.com/apache/beam/pull/7527 spotlessApply was run and the result 
> _also_ passed spotlessCheck.
> Confirmed that spotlessCheck fails on 
> https://github.com/apache/beam/pull/7505 if paddedCell is turned off. The use 
> of paddedCell is to workaround bugs in the underlying formatter (in our case 
> google-java-format). Currently, there is no known bug affecting us, so we 
> should not be using paddedCell.





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186060=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186060
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:46
Start Date: 16/Jan/19 23:46
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454987507
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186060)
Time Spent: 7h  (was: 6h 50m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186058=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186058
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:38
Start Date: 16/Jan/19 23:38
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7536: [BEAM-6452] Fix 
NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536#issuecomment-454985891
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186058)
Time Spent: 1h 10m  (was: 1h)

> Nested collection types cause NullPointerException when converting to a POJO
> 
>
> Key: BEAM-6452
> URL: https://issues.apache.org/jira/browse/BEAM-6452
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186056
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:31
Start Date: 16/Jan/19 23:31
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454984235
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186056)
Time Spent: 6h 50m  (was: 6h 40m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186054
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:27
Start Date: 16/Jan/19 23:27
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454982670
 
 
   Run Python PreCommit
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 186054)
Time Spent: 50m  (was: 40m)

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186055
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:27
Start Date: 16/Jan/19 23:27
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454983424
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186055)
Time Spent: 1h  (was: 50m)

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186053
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:27
Start Date: 16/Jan/19 23:27
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454981825
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186053)
Time Spent: 40m  (was: 0.5h)

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186051
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:24
Start Date: 16/Jan/19 23:24
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454982670
 
 
   Run Python PreCommit
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 186051)
Time Spent: 0.5h  (was: 20m)

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186050
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:20
Start Date: 16/Jan/19 23:20
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454981825
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186050)
Time Spent: 20m  (was: 10m)

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186049
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:11
Start Date: 16/Jan/19 23:11
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454979928
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186049)
Time Spent: 6h 40m  (was: 6.5h)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6445) Improve Release Process

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6445?focusedWorklogId=186048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186048
 ]

ASF GitHub Bot logged work on BEAM-6445:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:07
Start Date: 16/Jan/19 23:07
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #7529: [BEAM-6445]: Release 
Guide changes for release process improvement
URL: https://github.com/apache/beam/pull/7529#issuecomment-454978918
 
 
   Run Website PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186048)
Time Spent: 2h 20m  (was: 2h 10m)

> Improve Release Process
> ---
>
> Key: BEAM-6445
> URL: https://issues.apache.org/jira/browse/BEAM-6445
> Project: Beam
>  Issue Type: Improvement
>  Components: project-management
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> This JIRA tracks the improvement of the Beam release process as [discussed in 
> the dev 
> list|https://lists.apache.org/thread.html/d52ffbfca21eee953a230100520bd56d947a359c0029d5c291b736a7@%3Cdev.beam.apache.org%3E].
>  In summary, this change will hopefully increase the greenness of the build 
> by: increasing coverage, adding pre and post commits to release validation, 
> and adding a regular cadence to look at flaky and backlogged tests.





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=186047&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186047
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 16/Jan/19 23:06
Start Date: 16/Jan/19 23:06
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #7367: [BEAM-3342] 
Create a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#issuecomment-454978700
 
 
   Thanks. Reviewing now.
 



Issue Time Tracking
---

Worklog Id: (was: 186047)
Time Spent: 8h 10m  (was: 8h)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6457?focusedWorklogId=186045&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186045
 ]

ASF GitHub Bot logged work on BEAM-6457:


Author: ASF GitHub Bot
Created on: 16/Jan/19 22:56
Start Date: 16/Jan/19 22:56
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7542: [BEAM-6457] 
Refactoring of a few BigQuery classes.
URL: https://github.com/apache/beam/pull/7542#issuecomment-454976168
 
 
   oops fixing the few issues
 



Issue Time Tracking
---

Worklog Id: (was: 186045)
Time Spent: 10m
Remaining Estimate: 0h

> bigquery.py is too large, and some tools are better moved elsewhere
> ---
>
> Key: BEAM-6457
> URL: https://issues.apache.org/jira/browse/BEAM-6457
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186033
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 16/Jan/19 22:45
Start Date: 16/Jan/19 22:45
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #7536: [BEAM-6452] Fix 
NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536#issuecomment-454973086
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186033)
Time Spent: 1h  (was: 50m)

> Nested collection types cause NullPointerException when converting to a POJO
> 
>
> Key: BEAM-6452
> URL: https://issues.apache.org/jira/browse/BEAM-6452
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186032
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 16/Jan/19 22:44
Start Date: 16/Jan/19 22:44
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #7536: [BEAM-6452] 
Fix NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536#discussion_r248479025
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/FieldValueTypeInformation.java
 ##
 @@ -157,68 +157,94 @@ public FieldValueTypeInformation withName(String name) {
 return toBuilder().setName(name).build();
   }
 
-  private static Type getArrayComponentType(Field field) {
+  private static FieldValueTypeInformation getArrayComponentType(Field field) {
 return getArrayComponentType(TypeDescriptor.of(field.getGenericType()));
   }
 
   @Nullable
-  private static Type getArrayComponentType(TypeDescriptor valueType) {
+  private static FieldValueTypeInformation 
getArrayComponentType(TypeDescriptor valueType) {
+// TODO: Figure out nullable elements.
+TypeDescriptor componentType = null;
 if (valueType.isArray()) {
   Type component = valueType.getComponentType().getType();
   if (!component.equals(byte.class)) {
-return component;
+componentType = TypeDescriptor.of(component);
   }
 } else if (valueType.isSubtypeOf(TypeDescriptor.of(Collection.class))) {
   TypeDescriptor> collection = 
valueType.getSupertype(Collection.class);
   if (collection.getType() instanceof ParameterizedType) {
 ParameterizedType ptype = (ParameterizedType) collection.getType();
 java.lang.reflect.Type[] params = ptype.getActualTypeArguments();
 checkArgument(params.length == 1);
-return params[0];
+componentType = TypeDescriptor.of(params[0]);
   } else {
 throw new RuntimeException("Collection parameter is not 
parameterized!");
   }
 }
-return null;
+if (componentType == null) {
+  return null;
+}
+
+return new AutoValue_FieldValueTypeInformation.Builder()
+.setName("")
+.setNullable(false)
+.setType(componentType)
+.setRawType(componentType.getRawType())
+.setElementType(getArrayComponentType(componentType))
+.setMapKeyType(getMapKeyType(componentType))
+.setMapValueType(getMapValueType(componentType))
+.build();
   }
 
   // If the Field is a map type, returns the key type, otherwise returns a 
null reference.
   @Nullable
-  private static Type getMapKeyType(Field field) {
+  private static FieldValueTypeInformation getMapKeyType(Field field) {
 return getMapKeyType(TypeDescriptor.of(field.getGenericType()));
   }
 
   @Nullable
-  private static Type getMapKeyType(TypeDescriptor typeDescriptor) {
+  private static FieldValueTypeInformation getMapKeyType(TypeDescriptor 
typeDescriptor) {
 return getMapType(typeDescriptor, 0);
   }
 
   // If the Field is a map type, returns the value type, otherwise returns a 
null reference.
   @Nullable
-  private static Type getMapValueType(Field field) {
+  private static FieldValueTypeInformation getMapValueType(Field field) {
 return getMapType(TypeDescriptor.of(field.getGenericType()), 1);
   }
 
   @Nullable
-  private static Type getMapValueType(TypeDescriptor typeDescriptor) {
+  private static FieldValueTypeInformation getMapValueType(TypeDescriptor 
typeDescriptor) {
 return getMapType(typeDescriptor, 1);
   }
 
   // If the Field is a map type, returns the key or value type (0 is key type, 
1 is value).
   // Otherwise returns a null reference.
   @SuppressWarnings("unchecked")
   @Nullable
-  private static Type getMapType(TypeDescriptor valueType, int index) {
+  private static FieldValueTypeInformation getMapType(TypeDescriptor 
valueType, int index) {
+TypeDescriptor mapType = null;
 if (valueType.isSubtypeOf(TypeDescriptor.of(Map.class))) {
   TypeDescriptor> map = valueType.getSupertype(Map.class);
   if (map.getType() instanceof ParameterizedType) {
 ParameterizedType ptype = (ParameterizedType) map.getType();
 java.lang.reflect.Type[] params = ptype.getActualTypeArguments();
-return params[index];
+mapType = TypeDescriptor.of(params[index]);
   } else {
 throw new RuntimeException("Map type is not parameterized! " + map);
   }
 }
-return null;
+if (mapType == null) {
+  return null;
+}
+return new AutoValue_FieldValueTypeInformation.Builder()
+.setName("")
+.setNullable(false)
+.setType(mapType)
+.setRawType(mapType.getRawType())
+.setElementType(getArrayComponentType(mapType))
+

[jira] [Created] (BEAM-6457) bigquery.py is too large, and some tools are better moved elsewhere

2019-01-16 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-6457:
---

 Summary: bigquery.py is too large, and some tools are better moved 
elsewhere
 Key: BEAM-6457
 URL: https://issues.apache.org/jira/browse/BEAM-6457
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Pablo Estrada
Assignee: Pablo Estrada


Need to do a bit of refactoring of that file





[jira] [Work logged] (BEAM-6452) Nested collection types cause NullPointerException when converting to a POJO

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6452?focusedWorklogId=186029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186029
 ]

ASF GitHub Bot logged work on BEAM-6452:


Author: ASF GitHub Bot
Created on: 16/Jan/19 22:34
Start Date: 16/Jan/19 22:34
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7536: [BEAM-6452] Fix 
NullPointerException caused by nested collections
URL: https://github.com/apache/beam/pull/7536#issuecomment-454970240
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186029)
Time Spent: 40m  (was: 0.5h)

> Nested collection types cause NullPointerException when converting to a POJO
> 
>
> Key: BEAM-6452
> URL: https://issues.apache.org/jira/browse/BEAM-6452
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=186028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186028
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 22:30
Start Date: 16/Jan/19 22:30
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7521: [BEAM-5953] 
Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248475838
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -591,8 +592,10 @@ def get_coder(self, coder_id):
   return self.context.coders.get_by_id(coder_id)
 else:
   # No URN, assume cloud object encoding json bytes.
-  return operation_specs.get_coder_from_spec(
-  json.loads(coder_proto.spec.spec.payload))
+  payload = coder_proto.spec.spec.payload
+  if isinstance(payload, bytes) and sys.version_info[0] == 3:
 
 Review comment:
   I think we should not be checking whether input is `bytes` on Python 3, and 
should consistently expect the same datatype as input. Can we change this to:
   ```
   if sys.version_info[0] > 2:
     # json.loads() does not accept `bytes` on some versions of Python 3.
     payload = payload.decode('utf-8')
   ```
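
   A minimal, self-contained sketch of the suggestion above, not the code that was merged. It assumes the payload always arrives as `bytes`, as the review comment proposes; `parse_coder_payload` and the sample payload are hypothetical names used only for illustration.

```python
import json
import sys


def parse_coder_payload(payload):
    """Parse a JSON coder spec that is assumed to always arrive as bytes."""
    if sys.version_info[0] > 2:
        # json.loads() does not accept bytes on some Python 3 versions,
        # so decode to text first instead of branching on the input type.
        payload = payload.decode('utf-8')
    return json.loads(payload)


# Hypothetical payload, for illustration only.
print(parse_coder_payload(b'{"@type": "kind:bytes"}'))
```

   Decoding unconditionally on Python 3 keeps the expected input type the same on every call, which is the consistency the comment asks for.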
   
 



Issue Time Tracking
---

Worklog Id: (was: 186028)
Time Spent: 3.5h  (was: 3h 20m)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (BEAM-5612) Add tox suites to exercise unit tests using Python3 interpreter with cython, and with gcp dependencies.

2019-01-16 Thread Mark Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu reassigned BEAM-5612:
--

Assignee: Mark Liu

> Add tox suites to exercise unit tests using Python3 interpreter with cython, 
> and with gcp dependencies.
> ---
>
> Key: BEAM-5612
> URL: https://issues.apache.org/jira/browse/BEAM-5612
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Mark Liu
>Priority: Minor
>






[jira] [Work logged] (BEAM-6454) TypeError in DataflowRunner: dict_values does not support indexing

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6454?focusedWorklogId=186019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186019
 ]

ASF GitHub Bot logged work on BEAM-6454:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:54
Start Date: 16/Jan/19 21:54
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7538: [BEAM-6454] Fix 
dict_values error in DataflowRunner
URL: https://github.com/apache/beam/pull/7538
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186019)
Time Spent: 40m  (was: 0.5h)

> TypeError in DataflowRunner: dict_values does not support indexing
> --
>
> Key: BEAM-6454
> URL: https://issues.apache.org/jira/browse/BEAM-6454
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In Python 3, dict.values() returns a view rather than a list, so it needs to 
> be wrapped in list() before it can be indexed.
> Error in console output:
> {code:java}
> ERROR:root:Error while visiting read/Read/Impulse
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
> "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
> run()
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/examples/wordcount.py",
>  line 115, in run
> result = p.run()
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 405, in run
> self._options).run(False)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 418, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 367, in run_pipeline
> super(DataflowRunner, self).run_pipeline(pipeline, options)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py",
>  line 176, in run_pipeline
> pipeline.visit(RunVisitor(self))
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 446, in visit
> self._root_transform().visit(visitor, self, visited)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 815, in visit
> part.visit(visitor, pipeline, visited)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 815, in visit
> part.visit(visitor, pipeline, visited)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 815, in visit
> part.visit(visitor, pipeline, visited)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py",
>  line 818, in visit
> visitor.visit_transform(self)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py",
>  line 171, in visit_transform
> self.runner.run_transform(transform_node, options)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py",
>  line 214, in run_transform
> return m(transform_node, options)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 532, in run_Impulse
> step.encoding = self._get_encoded_output_coder(transform_node)
>   File 
> "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 461, in _get_encoded_output_coder
> transform_node.outputs.values()[0].pipeline._options)
> TypeError: 'dict_values' object does not support indexing
> {code}
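
The failure in the traceback above can be reproduced with a short sketch. This is an illustration under assumptions, not the patch from the pull request: `outputs` is a hypothetical stand-in for `transform_node.outputs`, and `first_output` is a made-up helper name.

```python
def first_output(outputs):
    """Return the first value of a dict on both Python 2 and Python 3."""
    # dict.values() is a view on Python 3 and cannot be indexed,
    # so materialize it as a list first.
    return list(outputs.values())[0]


# Hypothetical stand-in for transform_node.outputs.
outputs = {None: 'main_output'}

try:
    outputs.values()[0]  # fine on Python 2, raises TypeError on Python 3
except TypeError as exc:
    print('indexing the view failed:', exc)

print(first_output(outputs))
```

When only the first element is needed, `next(iter(outputs.values()))` is an alternative that avoids copying the values into a list.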





[jira] [Resolved] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-6456.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> [community metrics] Update community metrics DB to utilize BigInt for job 
> duration
> --
>
> Key: BEAM-6456
> URL: https://issues.apache.org/jira/browse/BEAM-6456
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The community metrics DB at [http://104.154.241.245|http://104.154.241.245/] 
> stores job duration as an integer, which causes integer overflow on long 
> jobs.
> The current mitigation trims duration to int max. We want to update the DB 
> schema to use bigint for duration.
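
A rough sketch of why trimming loses information and a wider column does not. It assumes the existing column is a 32-bit signed integer and the bigint column is 64-bit; the issue states neither the exact column types nor the units of the duration, so `clamp_duration` and the sample value are hypothetical.

```python
INT32_MAX = 2**31 - 1  # assumed upper bound of the current integer column
INT64_MAX = 2**63 - 1  # upper bound of a 64-bit bigint column


def clamp_duration(duration, upper=INT32_MAX):
    """Mimic the described mitigation by trimming a duration to the column maximum."""
    return min(duration, upper)


# Hypothetical duration; the issue does not state its units.
long_job = 3 * 10**9

print(clamp_duration(long_job))             # trimmed: the real value is lost
print(clamp_duration(long_job, INT64_MAX))  # fits in a bigint unchanged
```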





[jira] [Work logged] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6456?focusedWorklogId=186012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186012
 ]

ASF GitHub Bot logged work on BEAM-6456:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:42
Start Date: 16/Jan/19 21:42
Worklog Time Spent: 10m 
  Work Description: swegner commented on pull request #7541: [BEAM-6456] 
Update jenkins_builds schema to BIGINT for durations
URL: https://github.com/apache/beam/pull/7541
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186012)
Time Spent: 40m  (was: 0.5h)

> [community metrics] Update community metrics DB to utilize BigInt for job 
> duration
> --
>
> Key: BEAM-6456
> URL: https://issues.apache.org/jira/browse/BEAM-6456
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The community metrics DB at [http://104.154.241.245|http://104.154.241.245/] 
> stores job duration as an integer, which causes integer overflow on long 
> jobs.
> The current mitigation trims duration to int max. We want to update the DB 
> schema to use bigint for duration.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=186011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186011
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:40
Start Date: 16/Jan/19 21:40
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454953263
 
 
   Need to name the job Python_**PVR**_Flink because 
beam_PreCommit_Python_VR_Flink_Cron already exists. We can try deleting 
beam_PreCommit_Python_VR_Flink_Cron and then try to use Python_VR_Flink.
   Successful run: 
https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Phrase/1/
 



Issue Time Tracking
---

Worklog Id: (was: 186011)
Time Spent: 3.5h  (was: 3h 20m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6456?focusedWorklogId=186010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186010
 ]

ASF GitHub Bot logged work on BEAM-6456:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:39
Start Date: 16/Jan/19 21:39
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #7541: [BEAM-6456] Update 
jenkins_builds schema to BIGINT for durations
URL: https://github.com/apache/beam/pull/7541#issuecomment-454952973
 
 
   Already migrated.
   Was easier than I anticipated.
   Metrics service is up: 
http://104.154.241.245/d/D81lW0pmk/post-commit-test-reliability?orgId=1
 



Issue Time Tracking
---

Worklog Id: (was: 186010)
Time Spent: 0.5h  (was: 20m)

> [community metrics] Update community metrics DB to utilize BigInt for job 
> duration
> --
>
> Key: BEAM-6456
> URL: https://issues.apache.org/jira/browse/BEAM-6456
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The community metrics site at [http://104.154.241.245|http://104.154.241.245/] 
> stores job durations in an integer DB column. This causes int overflow on long 
> jobs.
> The current mitigation trims the duration to int max. We want to update the DB 
> schema to use bigint for duration. 





[jira] [Work logged] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6456?focusedWorklogId=186009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186009
 ]

ASF GitHub Bot logged work on BEAM-6456:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:39
Start Date: 16/Jan/19 21:39
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #7541: [BEAM-6456] Update 
jenkins_builds schema to BIGINT for durations
URL: https://github.com/apache/beam/pull/7541#issuecomment-454952973
 
 
   Already migrated.
   Was easier than I anticipated.
 



Issue Time Tracking
---

Worklog Id: (was: 186009)
Time Spent: 20m  (was: 10m)

> [community metrics] Update community metrics DB to utilize BigInt for job 
> duration
> --
>
> Key: BEAM-6456
> URL: https://issues.apache.org/jira/browse/BEAM-6456
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The community metrics site at [http://104.154.241.245|http://104.154.241.245/] 
> stores job durations in an integer DB column. This causes int overflow on long 
> jobs.
> The current mitigation trims the duration to int max. We want to update the DB 
> schema to use bigint for duration. 





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186007
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:37
Start Date: 16/Jan/19 21:37
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454949944
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186007)
Time Spent: 6h 20m  (was: 6h 10m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6456?focusedWorklogId=186000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186000
 ]

ASF GitHub Bot logged work on BEAM-6456:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:25
Start Date: 16/Jan/19 21:25
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #7541: [BEAM-6456] Update 
jenkins_builds schema to BIGINT for durations
URL: https://github.com/apache/beam/pull/7541#issuecomment-454948035
 
 
   LGTM. Have you looked into how we will manually migrate the live database? 
Hopefully it will be easy.
 



Issue Time Tracking
---

Worklog Id: (was: 186000)
Time Spent: 10m
Remaining Estimate: 0h

> [community metrics] Update community metrics DB to utilize BigInt for job 
> duration
> --
>
> Key: BEAM-6456
> URL: https://issues.apache.org/jira/browse/BEAM-6456
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The community metrics site at [http://104.154.241.245|http://104.154.241.245/] 
> stores job durations in an integer DB column. This causes int overflow on long 
> jobs.
> The current mitigation trims the duration to int max. We want to update the DB 
> schema to use bigint for duration. 





[jira] [Commented] (BEAM-6352) Watch PTransform is broken

2019-01-16 Thread Scott Wegner (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744473#comment-16744473
 ] 

Scott Wegner commented on BEAM-6352:


Status update:

I made an attempt to extract the extra APIs off of {{GrowthTracker}} into a 
separate class local to the ProcessElement invocation. But it was not as easy 
to isolate as I thought; there's some shared state that's needed for 
checkpointing. Initial attempt here: https://github.com/apache/beam/pull/7520

In parallel, I tried rolling back PR#7520 to see how easy that would be, and it 
seems that everything is passing: https://github.com/apache/beam/pull/7540. So 
we have rollback as an option.

I'm going to spend a little more time grokking the Watch.Growth implementation 
and see if I can come up with a better way of untangling it. If I don't come to 
a solution today, I propose we roll back to unblock the release.

Note: we could also choose to roll back only on the release branch. But that 
keeps master in a broken state, and if we don't find a solution today, it's not 
clear to me how long it will take. My preference is not to keep things broken 
for too long.

> Watch PTransform is broken
> --
>
> Key: BEAM-6352
> URL: https://issues.apache.org/jira/browse/BEAM-6352
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.9.0
>Reporter: Gleb Kanterov
>Assignee: Scott Wegner
>Priority: Blocker
> Fix For: 2.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> List of affected tests:
> org.apache.beam.sdk.transforms.WatchTest > 
> testSinglePollMultipleInputsWithSideInput FAILED
> org.apache.beam.sdk.transforms.WatchTest > testMultiplePollsWithKeyExtractor 
> FAILED
> org.apache.beam.sdk.transforms.WatchTest > testSinglePollMultipleInputs FAILED
> org.apache.beam.sdk.transforms.WatchTest > 
> testMultiplePollsWithTerminationDueToTerminationCondition FAILED
> org.apache.beam.sdk.transforms.WatchTest > testMultiplePollsWithManyResults 
> FAILED
> org.apache.beam.sdk.transforms.WatchTest > testSinglePollWithManyResults 
> FAILED
> org.apache.beam.sdk.transforms.WatchTest > 
> testMultiplePollsStopAfterTimeSinceNewOutput 
> org.apache.beam.sdk.transforms.WatchTest > 
> testMultiplePollsWithTerminationBecauseOutputIsFinal FAILED
> org.apache.beam.sdk.io.AvroIOTest$NeedsRunnerTests > 
> testContinuouslyWriteAndReadMultipleFilepatterns[0: true] FAILED
> org.apache.beam.sdk.io.AvroIOTest$NeedsRunnerTests > 
> testContinuouslyWriteAndReadMultipleFilepatterns[1: false] FAILED
> org.apache.beam.sdk.io.FileIOTest > testMatchWatchForNewFiles FAILED
> org.apache.beam.sdk.io.TextIOReadTest$BasicIOTest > testReadWatchForNewFiles 
> FAILED
> {code}
> java.lang.IllegalArgumentException: 
> org.apache.beam.sdk.transforms.Watch$WatchGrowthFn, @ProcessElement 
> process(ProcessContext, GrowthTracker): Has tracker type 
> Watch.GrowthTracker, but the DoFn's tracker 
> type must be of type RestrictionTracker.
> {code}
> Relevant pull requests:
> - https://github.com/apache/beam/pull/6467
> - https://github.com/apache/beam/pull/7374
> The tests are now marked with @Ignore, referencing this JIRA issue.





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=186004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186004
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:30
Start Date: 16/Jan/19 21:30
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454949944
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186004)
Time Spent: 6h 10m  (was: 6h)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=186001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186001
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:26
Start Date: 16/Jan/19 21:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454948234
 
 
   Run Python_PVR PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186001)
Time Spent: 3h 10m  (was: 3h)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=186002&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186002
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:26
Start Date: 16/Jan/19 21:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454948384
 
 
   Run Python_PVR_Flink PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186002)
Time Spent: 3h 20m  (was: 3h 10m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=185998&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185998
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:23
Start Date: 16/Jan/19 21:23
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454947371
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185998)
Time Spent: 6h  (was: 5h 50m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=185997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185997
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:22
Start Date: 16/Jan/19 21:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7521: [BEAM-5953] 
Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248455123
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operation_specs.py
 ##
 @@ -354,7 +354,10 @@ def get_coder_from_spec(coder_spec):
 
   # We pass coders in the form "<coder_name>$<pickled_data>" to make the job
   # description JSON more readable.
-  return coders.coders.deserialize_coder(coder_spec['@type'])
+  coder = coder_spec['@type']
+  if not isinstance(coder, bytes):
 
 Review comment:
   Do we understand in which codepath we happen to populate coder_spec 
without encoding it to bytes? If so, can we encode at creation time? I 
think it would be easier to reason about SDK internals if we can state that 
this method always expects the same datatype (a bytestring) as input.  
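The suggestion in this review comment can be sketched as follows. This is a minimal illustration for Python 3, not the Beam implementation: `make_coder_spec` and `read_coder_type` are hypothetical helpers, and only the `'@type'` key comes from the diff above.

```python
def make_coder_spec(coder_type):
    """Hypothetical creation-time helper: always store '@type' as bytes."""
    if isinstance(coder_type, str):
        # Encode once, where the spec is created, rather than at every reader.
        coder_type = coder_type.encode('ascii')
    return {'@type': coder_type}


def read_coder_type(coder_spec):
    """Hypothetical reader that accepts exactly one datatype, a bytestring."""
    coder_type = coder_spec['@type']
    if not isinstance(coder_type, bytes):
        raise TypeError(
            'expected bytes for @type, got %s' % type(coder_type).__name__)
    return coder_type


spec = make_coder_spec('kind$payload')
print(read_coder_type(spec))
```

With the conversion done at creation time, the reader no longer needs an `isinstance` branch to handle both text and bytes, which is the invariant the reviewer asks for. Whether every creation path can be changed this way is the open question in the comment.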
 



Issue Time Tracking
---

Worklog Id: (was: 185997)
Time Spent: 3h 10m  (was: 3h)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=185993&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185993
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:13
Start Date: 16/Jan/19 21:13
Worklog Time Spent: 10m 
  Work Description: juan-rael commented on issue #7367: [BEAM-3342] Create 
a Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#issuecomment-454943910
 
 
   @sduskis @chamikaram do the changes and pass the test.
 



Issue Time Tracking
---

Worklog Id: (was: 185993)
Time Spent: 7h 50m  (was: 7h 40m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185995
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:15
Start Date: 16/Jan/19 21:15
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454944712
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 185995)
Time Spent: 3h  (was: 2h 50m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=185994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185994
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:15
Start Date: 16/Jan/19 21:15
Worklog Time Spent: 10m 
  Work Description: sduskis commented on issue #7367: [BEAM-3342] Create a 
Cloud Bigtable Python connector Write
URL: https://github.com/apache/beam/pull/7367#issuecomment-454944610
 
 
   @chamikaramj and @aaltay, what are the next steps?
 



Issue Time Tracking
---

Worklog Id: (was: 185994)
Time Spent: 8h  (was: 7h 50m)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185992
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:10
Start Date: 16/Jan/19 21:10
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454942661
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 185992)
Time Spent: 2h 50m  (was: 2h 40m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185989
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 21:04
Start Date: 16/Jan/19 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454940678
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 185989)
Time Spent: 2h 40m  (was: 2.5h)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-4184) S3ResourceIdTest has had a masked failure

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4184?focusedWorklogId=185983&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185983
 ]

ASF GitHub Bot logged work on BEAM-4184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:44
Start Date: 16/Jan/19 20:44
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7414: [BEAM-4184] s3 
resource id test has had a masked failure
URL: https://github.com/apache/beam/pull/7414#issuecomment-454933689
 
 
   beam1 is not healthy but should be good soon. Since this is not blocking 
release or others' development I am inclined to wait a little bit and get 
everything green, for the record.
 



Issue Time Tracking
---

Worklog Id: (was: 185983)
Time Spent: 1h 40m  (was: 1.5h)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Labels: sickbay
> Fix For: 2.11.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Work logged] (BEAM-4184) S3ResourceIdTest has had a masked failure

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4184?focusedWorklogId=185982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185982
 ]

ASF GitHub Bot logged work on BEAM-4184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:42
Start Date: 16/Jan/19 20:42
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7414: [BEAM-4184] s3 
resource id test has had a masked failure
URL: https://github.com/apache/beam/pull/7414#issuecomment-454932991
 
 
   LGTM. Thanks for the discussion!
 



Issue Time Tracking
---

Worklog Id: (was: 185982)
Time Spent: 1.5h  (was: 1h 20m)

> S3ResourceIdTest has had a masked failure
> -
>
> Key: BEAM-4184
> URL: https://issues.apache.org/jira/browse/BEAM-4184
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Labels: sickbay
> Fix For: 2.11.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Sickbayed in https://github.com/apache/beam/pull/5161, the test should be 
> fixed and no longer ignored.





[jira] [Commented] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744436#comment-16744436
 ] 

Kenneth Knowles commented on BEAM-6407:
---

We have a "rollback first" policy but we do need evidence that the rollback 
solves it. Your comment here that you did a git bisect is pretty good, but it 
would be even better to put your repro into a test suite. Commented on the PR 
to that effect.

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView<String> outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occurring in 2.9.0...





[jira] [Updated] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6407:
--
Priority: Blocker  (was: Major)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occurring in 2.9.0...





[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185977
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:26
Start Date: 16/Jan/19 20:26
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454927557
 
 
   Note that these are certainly distinct hashes: 
https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#hash-java.lang.Object...-
   
   I'm guessing that the check whether every view is set up right also uses 
Objects.hash. So there might be an easy roll-forward.
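
   A minimal sketch of the hypothesis above, not Beam's actual code: the two view classes and the `TaggedView` interface below are hypothetical stand-ins. They compare equal by tag, but one hashes with `Objects.hash(tag)` and the other with `tag.hashCode()`. Since `Objects.hash(tag)` is `31 + tag.hashCode()`, a hash-based lookup of one kind of view never finds the other, which would be consistent with the "Missing" views in the error message.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class MismatchedHashSketch {
  /** Hypothetical common shape: a view identified only by its tag. */
  interface TaggedView {
    String tag();
  }

  /** Hashes the tag through the varargs helper, i.e. 31 + tag.hashCode(). */
  static final class VarargsHashedView implements TaggedView {
    private final String tag;

    VarargsHashedView(String tag) {
      this.tag = tag;
    }

    @Override
    public String tag() {
      return tag;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof TaggedView && tag.equals(((TaggedView) o).tag());
    }

    @Override
    public int hashCode() {
      return Objects.hash(tag);
    }
  }

  /** Same equality rule, but hashes the tag directly. */
  static final class DirectHashedView implements TaggedView {
    private final String tag;

    DirectHashedView(String tag) {
      this.tag = tag;
    }

    @Override
    public String tag() {
      return tag;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof TaggedView && tag.equals(((TaggedView) o).tag());
    }

    @Override
    public int hashCode() {
      return tag.hashCode();
    }
  }

  /** Registers one view as "written" and checks whether the consumed view is found. */
  static boolean writtenViewIsFound(TaggedView written, TaggedView consumed) {
    Set<TaggedView> writtenViews = new HashSet<>();
    writtenViews.add(written);
    return writtenViews.contains(consumed);
  }

  public static void main(String[] args) {
    TaggedView written = new VarargsHashedView("side-input");
    TaggedView consumed = new DirectHashedView("side-input");
    System.out.println(written.equals(consumed)); // prints true
    System.out.println(writtenViewIsFound(written, consumed)); // prints false
    System.out.println(writtenViewIsFound(written, new VarargsHashedView("side-input"))); // prints true
  }
}
```

   If this is what happens in the runner, a roll-forward would only need both sides to compute the hash the same way.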
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 185977)
Time Spent: 40m  (was: 0.5h)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occurring in 2.9.0...





[jira] [Work logged] (BEAM-5933) PCollectionViews$SimplePCollectionView.hashCode allocates memory

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5933?focusedWorklogId=185974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185974
 ]

ASF GitHub Bot logged work on BEAM-5933:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:23
Start Date: 16/Jan/19 20:23
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #6909: BEAM-5933: avoid 
memory allocation in hashCode call
URL: https://github.com/apache/beam/pull/6909#issuecomment-454926181
 
 
   Out of curiosity, did you measure the allocation and its cost? Asking 
because we use this sort of practice a lot, since building your own hashCode is 
a mess and we don't use AutoValue enough. If it is expensive, we should inline 
a few versions of the call for low numbers of varargs.
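
   One way to read "inline a few versions of the call" is sketched below. This is an assumption about the intent, and `FixedArityHash` is a hypothetical helper, not an existing Beam or JDK class. The fixed-arity overloads return the same values as `Objects.hash` for one and two arguments, but take their arguments directly instead of through a varargs array.

```java
import java.util.Objects;

final class FixedArityHash {
  private FixedArityHash() {}

  /** Same value as Objects.hash(a), without creating an Object[1]. */
  static int hash(Object a) {
    return 31 + Objects.hashCode(a);
  }

  /** Same value as Objects.hash(a, b), without creating an Object[2]. */
  static int hash(Object a, Object b) {
    return 31 * (31 + Objects.hashCode(a)) + Objects.hashCode(b);
  }

  public static void main(String[] args) {
    System.out.println(hash("tag") == Objects.hash("tag")); // prints true
    System.out.println(hash("tag", 7) == Objects.hash("tag", 7)); // prints true
    System.out.println(hash((Object) null) == Objects.hash((Object) null)); // prints true
  }
}
```

   Because the results match `Objects.hash`, classes could switch to such overloads without changing their hash codes. Whether the saved allocation matters would still need the measurement asked about above.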
 



Issue Time Tracking
---

Worklog Id: (was: 185974)
Time Spent: 50m  (was: 40m)

> PCollectionViews$SimplePCollectionView.hashCode allocates memory
> 
>
> Key: BEAM-5933
> URL: https://issues.apache.org/jira/browse/BEAM-5933
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Vojtech Janota
>Assignee: Vojtech Janota
>Priority: Trivial
> Fix For: 2.9.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I'm currently profiling memory consumption of our Beam pipeline and have 
> noticed that
>     
> org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.hashCode()
> makes noticeable heap allocations. The implementation is:
>     return Objects.hash(tag);
> That itself translates to:
>     return Arrays.hashCode(values);
> Which performs implicit array creation in order to call:
>     public static int Arrays.hashCode(Object a[]);
> Instead of the helper call, simply calling:
>     tag.hashCode();
> seems more appropriate.
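
The chain of calls described above can be sketched as follows. This is only an illustration: the class name and the tag value are hypothetical, and it shows the JDK behaviour rather than Beam's code. It checks that `Objects.hash(tag)` equals `Arrays.hashCode` over a one-element array, and that the direct `tag.hashCode()` is a different number (offset by 31).

```java
import java.util.Arrays;
import java.util.Objects;

final class HashAllocationSketch {
  // Objects.hash is a varargs method, so the single argument is wrapped in an Object[1].
  static int viaObjectsHash(Object tag) {
    return Objects.hash(tag);
  }

  // The same computation with the array creation written out explicitly.
  static int viaExplicitArray(Object tag) {
    return Arrays.hashCode(new Object[] {tag});
  }

  // The alternative proposed in the issue: no helper call, no array.
  static int viaDirectHashCode(Object tag) {
    return tag.hashCode();
  }

  public static void main(String[] args) {
    String tag = "example-tag"; // hypothetical stand-in for the view's tag
    System.out.println(viaObjectsHash(tag) == viaExplicitArray(tag)); // prints true
    System.out.println(viaObjectsHash(tag) == 31 + viaDirectHashCode(tag)); // prints true
  }
}
```

Note that the proposed change alters the hash value itself, so any other class that must hash consistently with this one has to change in the same way.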





[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185975
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:23
Start Date: 16/Jan/19 20:23
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454926641
 
 
   Can you add a regression test?
 



Issue Time Tracking
---

Worklog Id: (was: 185975)
Time Spent: 0.5h  (was: 20m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occurring in 2.9.0...





[jira] [Work logged] (BEAM-5933) PCollectionViews$SimplePCollectionView.hashCode allocates memory

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5933?focusedWorklogId=185973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185973
 ]

ASF GitHub Bot logged work on BEAM-5933:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:22
Start Date: 16/Jan/19 20:22
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #6909: BEAM-5933: avoid 
memory allocation in hashCode call
URL: https://github.com/apache/beam/pull/6909#issuecomment-454926181
 
 
   Out of curiosity, did you measure the allocation and its cost?
 



Issue Time Tracking
---

Worklog Id: (was: 185973)
Time Spent: 40m  (was: 0.5h)

> PCollectionViews$SimplePCollectionView.hashCode allocates memory
> 
>
> Key: BEAM-5933
> URL: https://issues.apache.org/jira/browse/BEAM-5933
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Vojtech Janota
>Assignee: Vojtech Janota
>Priority: Trivial
> Fix For: 2.9.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I'm currently profiling memory consumption of our Beam pipeline and have 
> noticed that
>     
> org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.hashCode()
> makes noticeable heap allocations. The implementation is:
>     return Objects.hash(tag);
> That itself translates to:
>     return Arrays.hashCode(values);
> Which performs implicit array creation in order to call:
>     public static int Arrays.hashCode(Object a[]);
> Instead of the helper call, simply calling:
>     tag.hashCode();
> seems more appropriate.





[jira] [Created] (BEAM-6456) [community metrics] Update community metrics DB to utilize BigInt for job duration

2019-01-16 Thread Mikhail Gryzykhin (JIRA)
Mikhail Gryzykhin created BEAM-6456:
---

 Summary: [community metrics] Update community metrics DB to 
utilize BigInt for job duration
 Key: BEAM-6456
 URL: https://issues.apache.org/jira/browse/BEAM-6456
 Project: Beam
  Issue Type: New Feature
  Components: build-system
Reporter: Mikhail Gryzykhin
Assignee: Mikhail Gryzykhin


The community metrics located at [http://104.154.241.245|http://104.154.241.245/] 
use a DB that stores job duration as an integer. This causes int overflow on 
long jobs.

The current mitigation trims the duration to int max. We want to update the DB 
schema to use bigint for duration.
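
The overflow and the mitigation can be illustrated with a small sketch. The class name, the millisecond unit, and the 30-day duration are assumptions chosen for the example; the issue does not state the unit the DB uses. A Java `long` stands in for a 64-bit bigint column and an `int` for the current 32-bit integer column.

```java
final class DurationOverflowSketch {
  // Narrowing a long to an int keeps only the low 32 bits, so large values wrap around.
  static int truncateToInt(long durationMillis) {
    return (int) durationMillis;
  }

  // The mitigation described above: trim anything too large to int max.
  static int clampToIntMax(long durationMillis) {
    return (int) Math.min(durationMillis, Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    // Assumed example: a 30-day job measured in milliseconds.
    long thirtyDaysMillis = 30L * 24 * 60 * 60 * 1000;
    System.out.println(thirtyDaysMillis); // prints 2592000000
    System.out.println(truncateToInt(thirtyDaysMillis)); // prints -1702967296
    System.out.println(clampToIntMax(thirtyDaysMillis)); // prints 2147483647
  }
}
```

Clamping avoids the negative value but still loses the real duration, which is why a 64-bit column is the proper fix.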





[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185971
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:17
Start Date: 16/Jan/19 20:17
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454924351
 
 
   CC @janotav @iemejia 
 



Issue Time Tracking
---

Worklog Id: (was: 185971)
Time Spent: 20m  (was: 10m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur with the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occurring in 2.9.0...





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=185965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185965
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:04
Start Date: 16/Jan/19 20:04
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454920457
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 185965)
Time Spent: 5h 40m  (was: 5.5h)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=185970&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185970
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:12
Start Date: 16/Jan/19 20:12
Worklog Time Spent: 10m 
  Work Description: asfgit commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454922790
 
 
   SUCCESS 

   --none--
 



Issue Time Tracking
---

Worklog Id: (was: 185970)
Time Spent: 5h 50m  (was: 5h 40m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=185969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185969
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:08
Start Date: 16/Jan/19 20:08
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #7521: 
[BEAM-5953] Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248433001
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -590,9 +590,11 @@ def get_coder(self, coder_id):
 if coder_proto.spec.spec.urn:
   return self.context.coders.get_by_id(coder_id)
 else:
+  payload = coder_proto.spec.spec.payload
+  if isinstance(payload, bytes):
+    payload = payload.decode('utf-8')
   # No URN, assume cloud object encoding json bytes.
 
 Review comment:
   ms
 



Issue Time Tracking
---

Worklog Id: (was: 185969)
Time Spent: 3h  (was: 2h 50m)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=185967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185967
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:05
Start Date: 16/Jan/19 20:05
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #7521: 
[BEAM-5953] Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248431715
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -590,9 +590,11 @@ def get_coder(self, coder_id):
 if coder_proto.spec.spec.urn:
   return self.context.coders.get_by_id(coder_id)
 else:
+  payload = coder_proto.spec.spec.payload
+  if isinstance(payload, bytes):
 
 Review comment:
   Since this decoding is only needed in Python 3, I prefer to add a version 
check here. The same goes for the changes to operation_specs.
 



Issue Time Tracking
---

Worklog Id: (was: 185967)
Time Spent: 2h 50m  (was: 2h 40m)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=185964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185964
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:04
Start Date: 16/Jan/19 20:04
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #7521: 
[BEAM-5953] Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248431715
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -590,9 +590,11 @@ def get_coder(self, coder_id):
 if coder_proto.spec.spec.urn:
   return self.context.coders.get_by_id(coder_id)
 else:
+  payload = coder_proto.spec.spec.payload
+  if isinstance(payload, bytes):
 
 Review comment:
   Since this decoding should only happen in Python 3, I prefer to add a version 
check here.
 



Issue Time Tracking
---

Worklog Id: (was: 185964)
Time Spent: 2.5h  (was: 2h 20m)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5953) Support DataflowRunner on Python 3

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=185966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185966
 ]

ASF GitHub Bot logged work on BEAM-5953:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:04
Start Date: 16/Jan/19 20:04
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #7521: 
[BEAM-5953] Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248431715
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -590,9 +590,11 @@ def get_coder(self, coder_id):
 if coder_proto.spec.spec.urn:
   return self.context.coders.get_by_id(coder_id)
 else:
+  payload = coder_proto.spec.spec.payload
+  if isinstance(payload, bytes):
 
 Review comment:
   Since this decoding is only needed in Python 3, I prefer to add a version 
check here.
 



Issue Time Tracking
---

Worklog Id: (was: 185966)
Time Spent: 2h 40m  (was: 2.5h)

> Support DataflowRunner on Python 3
> --
>
> Key: BEAM-5953
> URL: https://issues.apache.org/jira/browse/BEAM-5953
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185963
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:03
Start Date: 16/Jan/19 20:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454920089
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 185963)
Time Spent: 2.5h  (was: 2h 20m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185961
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:56
Start Date: 16/Jan/19 19:56
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454917912
 
 
   Run Python_VR_Flink PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185961)
Time Spent: 2h 10m  (was: 2h)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185962
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:57
Start Date: 16/Jan/19 19:57
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454918169
 
 
   Can't test the job as beam1 is down and all runs are scheduled on beam1.
 



Issue Time Tracking
---

Worklog Id: (was: 185962)
Time Spent: 2h 20m  (was: 2h 10m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185960=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185960
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:56
Start Date: 16/Jan/19 19:56
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454917710
 
 
   Run Python_VR_Flink PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185960)
Time Spent: 2h  (was: 1h 50m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6439) Move Python Flink VR tests to PreCommit

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6439?focusedWorklogId=185959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185959
 ]

ASF GitHub Bot logged work on BEAM-6439:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:55
Start Date: 16/Jan/19 19:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7539: [BEAM-6439] Move 
Python Validates Runner Flink test to PreCommit
URL: https://github.com/apache/beam/pull/7539#issuecomment-454917592
 
 
   Run Python_VR PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185959)
Time Spent: 1h 50m  (was: 1h 40m)

> Move Python Flink VR tests to PreCommit
> ---
>
> Key: BEAM-6439
> URL: https://issues.apache.org/jira/browse/BEAM-6439
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, testing
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: Major
> Fix For: Not applicable
>
> Attachments: png.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Now that they're fast and stable, it would be good to catch changes that 
> break this earlier.





[jira] [Work logged] (BEAM-6352) Watch PTransform is broken

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6352?focusedWorklogId=185958=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185958
 ]

ASF GitHub Bot logged work on BEAM-6352:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:50
Start Date: 16/Jan/19 19:50
Worklog Time Spent: 10m 
  Work Description: swegner commented on pull request #7540: [BEAM-6352] 
Revert PR#6467 to fix Watch transform
URL: https://github.com/apache/beam/pull/7540
 
 
 



Issue Time Tracking
---

Worklog Id: (was: 185958)
Time Spent: 20m  (was: 10m)

> Watch PTransform is broken
> --
>
>  

[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=185955=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185955
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:41
Start Date: 16/Jan/19 19:41
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454912555
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185955)
Time Spent: 5.5h  (was: 5h 20m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
> Reporter: Ruoyun Huang
> Assignee: Ruoyun Huang
> Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E





[jira] [Work logged] (BEAM-6444) Equality tests in coders.py are broken

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6444?focusedWorklogId=185954=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185954
 ]

ASF GitHub Bot logged work on BEAM-6444:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:38
Start Date: 16/Jan/19 19:38
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7522: [BEAM-6444] Fix 
equality comparison to be against method invocation, not method object
URL: https://github.com/apache/beam/pull/7522
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 185954)
Time Spent: 40m  (was: 0.5h)

> Equality tests in coders.py are broken
> --
>
> Key: BEAM-6444
> URL: https://issues.apache.org/jira/browse/BEAM-6444
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
> Reporter: Tyler Akidau
> Assignee: Tyler Akidau
> Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Equality checks for TupleSequenceCoder and IterableCoder were broken recently 
> with 
> [https://github.com/apache/beam/commit/ffec4716b6db6802a5ff54ee6d6d7fd4e764e8b6.]
>  
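The PR title for this issue says the fix is to compare against a method invocation rather than a method object. The following is a minimal sketch of that class of bug, not the actual Beam code: `ElemCoder`, `SequenceCoderSketch`, and `broken_eq` are hypothetical names chosen for illustration, and only the `value_coder` accessor idea is borrowed from the PR description.

```python
class ElemCoder:
    """Hypothetical element coder, compared by name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return type(self) == type(other) and self.name == other.name

    def __hash__(self):
        return hash((type(self), self.name))


class SequenceCoderSketch:
    """Hypothetical sequence coder wrapping an element coder."""

    def __init__(self, elem_coder):
        self._elem_coder = elem_coder

    def value_coder(self):
        return self._elem_coder

    def broken_eq(self, other):
        # Bug: other.value_coder is a bound method object, not a coder,
        # so this comparison is False even for identical coders.
        return (type(self) == type(other)
                and self._elem_coder == other.value_coder)

    def __eq__(self, other):
        # Fix: call the method so two coders are compared.
        return (type(self) == type(other)
                and self._elem_coder == other.value_coder())

    def __hash__(self):
        return hash((type(self), self._elem_coder))


a = SequenceCoderSketch(ElemCoder("bytes"))
b = SequenceCoderSketch(ElemCoder("bytes"))
broken_result = a.broken_eq(b)
fixed_result = (a == b)
```

In this sketch `broken_result` is `False` and `fixed_result` is `True`, which is the difference a missing pair of parentheses makes.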





[jira] [Work logged] (BEAM-6444) Equality tests in coders.py are broken

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6444?focusedWorklogId=185953=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185953
 ]

ASF GitHub Bot logged work on BEAM-6444:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:38
Start Date: 16/Jan/19 19:38
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7522: [BEAM-6444] Fix 
equality comparison to be against method invocation, not method object
URL: https://github.com/apache/beam/pull/7522#issuecomment-454911583
 
 
   Flink test failed with a known issue (now fixed: 
https://github.com/apache/beam/pull/7533). The rest of the tests are passing; 
merging.
 



Issue Time Tracking
---

Worklog Id: (was: 185953)
Time Spent: 0.5h  (was: 20m)

> Equality tests in coders.py are broken
> --
>
> Key: BEAM-6444
> URL: https://issues.apache.org/jira/browse/BEAM-6444
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
> Reporter: Tyler Akidau
> Assignee: Tyler Akidau
> Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Equality checks for TupleSequenceCoder and IterableCoder were broken recently 
> with 
> [https://github.com/apache/beam/commit/ffec4716b6db6802a5ff54ee6d6d7fd4e764e8b6.]
>  





[jira] [Work logged] (BEAM-6454) TypeError in DataflowRunner: dict_values does not support indexing

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6454?focusedWorklogId=185951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185951
 ]

ASF GitHub Bot logged work on BEAM-6454:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:30
Start Date: 16/Jan/19 19:30
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #7538: [BEAM-6454] Fix 
dict_values error in DataflowRunner
URL: https://github.com/apache/beam/pull/7538#issuecomment-454908965
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185951)
Time Spent: 0.5h  (was: 20m)

> TypeError in DataflowRunner: dict_values does not support indexing
> --
>
> Key: BEAM-6454
> URL: https://issues.apache.org/jira/browse/BEAM-6454
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
> Reporter: Mark Liu
> Assignee: Mark Liu
> Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In Python 3, dict.values() returns a view rather than a list, so it needs to 
> be wrapped in a list before it can be indexed.
> Error in console output:
> {code:java}
> ERROR:root:Error while visiting read/Read/Impulse
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>
>     run()
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/examples/wordcount.py", line 115, in run
>     result = p.run()
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 405, in run
>     self._options).run(False)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 418, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 367, in run_pipeline
>     super(DataflowRunner, self).run_pipeline(pipeline, options)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py", line 176, in run_pipeline
>     pipeline.visit(RunVisitor(self))
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 446, in visit
>     self._root_transform().visit(visitor, self, visited)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 815, in visit
>     part.visit(visitor, pipeline, visited)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 815, in visit
>     part.visit(visitor, pipeline, visited)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 815, in visit
>     part.visit(visitor, pipeline, visited)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/pipeline.py", line 818, in visit
>     visitor.visit_transform(self)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py", line 171, in visit_transform
>     self.runner.run_transform(transform_node, options)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/runner.py", line 214, in run_transform
>     return m(transform_node, options)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 532, in run_Impulse
>     step.encoding = self._get_encoded_output_coder(transform_node)
>   File "/usr/local/google/home/markliu/tmp/beam4/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 461, in _get_encoded_output_coder
>     transform_node.outputs.values()[0].pipeline._options)
> TypeError: 'dict_values' object does not support indexing
> {code}
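The failure in the traceback above can be reproduced without Beam. The sketch below is illustrative only: the `outputs` dict and its placeholder value are assumptions standing in for `transform_node.outputs`, and it is not the patch from the linked pull request. It shows that indexing a `dict.values()` view raises `TypeError` on Python 3, and two common ways to get the first value.

```python
# Hypothetical stand-in for transform_node.outputs (assumption, not Beam code).
outputs = {"out": "pcollection-placeholder"}

# On Python 3, dict.values() returns a view object that cannot be indexed.
values_view = outputs.values()
try:
    values_view[0]
    indexing_error = None
except TypeError as exc:
    indexing_error = exc

# Option 1: materialize the view as a list, then index it.
first_via_list = list(outputs.values())[0]

# Option 2: take the first item from an iterator without copying the values.
first_via_iter = next(iter(outputs.values()))
```

Wrapping in `list(...)` matches the wording of the issue description; `next(iter(...))` is an alternative when only the first value is needed.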





[jira] [Work logged] (BEAM-6184) PortableRunner dependency missed in wordcount example maven artifact

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6184?focusedWorklogId=185947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185947
 ]

ASF GitHub Bot logged work on BEAM-6184:


Author: ASF GitHub Bot
Created on: 16/Jan/19 19:29
Start Date: 16/Jan/19 19:29
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #7532: [BEAM-6184]Turn on 
javadocmethod checkstyle to report error.
URL: https://github.com/apache/beam/pull/7532#issuecomment-454908415
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 185947)
Time Spent: 5h 20m  (was: 5h 10m)

> PortableRunner dependency missed in wordcount example maven artifact
> 
>
> Key: BEAM-6184
> URL: https://issues.apache.org/jira/browse/BEAM-6184
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
> Reporter: Ruoyun Huang
> Assignee: Ruoyun Huang
> Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
>  
>  
> more context: 
> https://lists.apache.org/thread.html/8dd60395424425f7502d62888c49014430d1d3b06c026606f3db28ab@%3Cuser.beam.apache.org%3E




