[jira] [Work logged] (BEAM-6165) Send metrics to Flink in portable Flink runner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6165?focusedWorklogId=216583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216583
 ]

ASF GitHub Bot logged work on BEAM-6165:


Author: ASF GitHub Bot
Created on: 21/Mar/19 04:06
Start Date: 21/Mar/19 04:06
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #7971: [BEAM-6165] 
Flink portable metrics: get ptransform from MonitoringInfo, not stage name
URL: https://github.com/apache/beam/pull/7971#discussion_r267620804
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java
 ##
 @@ -99,117 +103,145 @@ public MetricsContainer getMetricsContainer(String 
stepName) {
* Update this container with metrics from the passed {@link 
MonitoringInfo}s, and send updates
* along to Flink's internal metrics framework.
*/
-  public void updateMetrics(String stepName, List 
monitoringInfos) {
-MetricsContainer metricsContainer = getMetricsContainer(stepName);
+  public void updateMetrics(List monitoringInfos) {
 monitoringInfos.forEach(
 monitoringInfo -> {
-  if (monitoringInfo.hasMetric()) {
-String urn = monitoringInfo.getUrn();
-MetricName metricName = parseUrn(urn);
-Metric metric = monitoringInfo.getMetric();
-if (metric.hasCounterData()) {
-  CounterData counterData = metric.getCounterData();
-  if (counterData.getValueCase() == 
CounterData.ValueCase.INT64_VALUE) {
-org.apache.beam.sdk.metrics.Counter counter =
-metricsContainer.getCounter(metricName);
-counter.inc(counterData.getInt64Value());
-  } else {
-LOG.warn("Unsupported CounterData type: {}", counterData);
-  }
-} else if (metric.hasDistributionData()) {
-  DistributionData distributionData = metric.getDistributionData();
-  if (distributionData.hasIntDistributionData()) {
-Distribution distribution = 
metricsContainer.getDistribution(metricName);
-IntDistributionData intDistributionData = 
distributionData.getIntDistributionData();
-distribution.update(
-intDistributionData.getSum(),
-intDistributionData.getCount(),
-intDistributionData.getMin(),
-intDistributionData.getMax());
-  } else {
-LOG.warn("Unsupported DistributionData type: {}", 
distributionData);
-  }
-} else if (metric.hasExtremaData()) {
-  ExtremaData extremaData = metric.getExtremaData();
-  LOG.warn("Extrema metric unsupported: {}", extremaData);
+  if (!monitoringInfo.hasMetric()) {
+LOG.info("Skipping metric-less MonitoringInfo: {}", 
monitoringInfo);
+return;
+  }
+  Metric metric = monitoringInfo.getMetric();
+
+  String urn = monitoringInfo.getUrn();
+  MetricName metricName = parseUserUrn(urn);
+  if (metricName == null) {
+LOG.info("Dropping system metric: {}", monitoringInfo);
+return;
+  }
+
+  Map labels = monitoringInfo.getLabelsMap();
+  String ptransform = labels.get(PTRANSFORM_LABEL);
+
+  MetricsContainer metricsContainer = getMetricsContainer(ptransform);
+
+  MetricKey key = MetricKey.create(ptransform, metricName);
+
+  if (metric.hasCounterData()) {
+CounterData counterData = metric.getCounterData();
+if (counterData.getValueCase() == 
CounterData.ValueCase.INT64_VALUE) {
+  long value = counterData.getInt64Value();
+  org.apache.beam.sdk.metrics.Counter counter = 
metricsContainer.getCounter(metricName);
+  counter.inc(value);
+
+  // Update flink
+  updateCounter(key, value);
+} else {
+  LOG.warn("Unsupported CounterData type: {}", counterData);
+}
+  } else if (metric.hasDistributionData()) {
+DistributionData distributionData = metric.getDistributionData();
+if (distributionData.hasIntDistributionData()) {
+  Distribution distribution = 
metricsContainer.getDistribution(metricName);
+  IntDistributionData intDistributionData = 
distributionData.getIntDistributionData();
+  distribution.update(
+  intDistributionData.getSum(),
+  intDistributionData.getCount(),
+  intDistributionData.getMin(),
+  intDistributionData.getMax());
+
+  // Update flink
+  updateDistribution(
+  key,
+  DistributionResult.create(
+ 

[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216560
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:36
Start Date: 21/Mar/19 00:36
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256
 
 
   "./gradlew publish" failed since python36 and 37 are missing on Beam12. This 
is an infra issue and I started [a 
discussion](https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E)
 on dev@. Gradle scan shows that python load test was built successfully.
   
   PTAL @aaltay @tvalentyn @adude3141
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216560)
Time Spent: 9h  (was: 8h 50m)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216561
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:37
Start Date: 21/Mar/19 00:37
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256
 
 
   "./gradlew publish" failed since python36 and 37 are missing on Beam12. This 
is an infra issue and I started [a 
discussion](https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E)
 on dev@. Gradle scan shows that python load test (failed previously and led to 
rollback) was built successfully.
   
   PTAL @aaltay @tvalentyn @adude3141
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216561)
Time Spent: 9h 10m  (was: 9h)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216562
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:37
Start Date: 21/Mar/19 00:37
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475079580
 
 
   Run Gradle Publish
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216562)
Time Spent: 9h 20m  (was: 9h 10m)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216558
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:35
Start Date: 21/Mar/19 00:35
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256
 
 
   "./gradlew publish" failed since python36 and 37 are missing on Beam12. This 
is a infra issue and I start [a discussion] on 
dev@(https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E).
 Gradle scan shows that python load test was built successfully.
   
   PTAL @aaltay @tvalentyn @adude3141
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216558)
Time Spent: 8h 40m  (was: 8.5h)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216559
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:35
Start Date: 21/Mar/19 00:35
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256
 
 
   "./gradlew publish" failed since python36 and 37 are missing on Beam12. This 
is an infra issue and I start [a discussion] on 
dev@(https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E).
 Gradle scan shows that python load test was built successfully.
   
   PTAL @aaltay @tvalentyn @adude3141
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216559)
Time Spent: 8h 50m  (was: 8h 40m)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216555
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:16
Start Date: 21/Mar/19 00:16
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8064: [BEAM-6838] Go SDK: 
Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#issuecomment-475076149
 
 
   There definitely is a lot of repetition with the recursion, even before my 
changes. Though I'm not sure how to avoid that really. While it's a lot of 
information for users, it's important for debugging. Only alternative I can 
think of is adding some debug logs instead of some errors, so users can see the 
path being taken by examining logs.
   
   Here's a real-world example I've been using to test (I replaced the proto 
name though)
   
   > failed to encode custom coder: bad underlying type: bad element: bad field 
type: bad element: bad element: bad field type: bad element: bad field type: 
bad element: bad element: bad field type: bad field type: bad element: bad 
field type: unencodable type: protoapi.extensionMap
   
   And the new error:
   
   > Failed to encode custom coder for type protoapi.Message. Make sure the 
type was registered before calling beam.Init. For example: 
beam.RegisterType(reflect.TypeOf((*TypeName)(nil)).Elem())
   > 
   > Full error: failed to encode custom coder 
*my_package.MyProto[protoapi.Message], bad underlying type: failed to encode 
pointer *my_package.MyProto, bad base type: failed to encode struct 
my_package.MyProto, bad field type: failed to encode pointer 
*my_package.MyProto_MyField, bad base type: failed to encode struct 
my_package.MyProto_MyField, bad field type: unencodable type 
map[int32]protoapi.Message
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216555)
Time Spent: 7h 20m  (was: 7h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216554&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216554
 ]

ASF GitHub Bot logged work on BEAM-6446:


Author: ASF GitHub Bot
Created on: 21/Mar/19 00:11
Start Date: 21/Mar/19 00:11
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up 
checkstyle suppressions.
URL: https://github.com/apache/beam/pull/8091#issuecomment-475075320
 
 
   R: @kennknowles 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216554)
Time Spent: 1h  (was: 50m)

> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread Ruoyun Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-6446 started by Ruoyun Huang.
--
> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216542&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216542
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590495
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
 
 Review comment:
   Done with a little added detail:
   
   ("failed to decode DoFn %v, failed to decode userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216542)
Time Spent: 5h 40m  (was: 5.5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216550&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216550
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590596
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -195,7 +195,7 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
gen := reflectx.FunctionName(u.DynFn.Gen)
t, err := encodeType(u.DynFn.T)
if err != nil {
-   return nil, fmt.Errorf("bad function type: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad function type: %v", u, err)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216550)
Time Spent: 7h  (was: 6h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216545
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590522
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
 
 Review comment:
   Done, with some changes:
   
   ("failed to encode structural DoFn %v, failed to marshal receiver %v: %v", 
u, u.Recv, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216545)
Time Spent: 6h 10m  (was: 6h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216539
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590458
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
return &graph.Fn{Fn: fx}, nil
}
 
t, err := decodeType(u.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
type: %v", u, err)
}
fn, err := reflectx.UnmarshalJSON(t, u.Opt)
if err != nil {
-   return nil, fmt.Errorf("bad struct encoding: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
struct encoding: %v", u, err)
 
 Review comment:
   Is this a copy/paste error, or did you want "bad struct encoding" changed to 
"bad userfn"?
   
   Done for the change from "function proto" -> "structural DoFn".
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216539)
Time Spent: 5h 10m  (was: 5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216551
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590604
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -145,20 +145,20 @@ func encodeCustomCoder(c *coder.CustomCoder) 
(*v1.CustomCoder, error) {
 func decodeCustomCoder(c *v1.CustomCoder) (*coder.CustomCoder, error) {
t, err := decodeType(c.Type)
if err != nil {
-   return nil, fmt.Errorf("decodeCustomCoder bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode custom coder proto %v 
for type %v: %v", c, c.Type, err)
 
 Review comment:
   Done, removed mentions of "proto" in error messages.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216551)
Time Spent: 7h 10m  (was: 7h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216549
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590587
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216549)
Time Spent: 6h 50m  (was: 6h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216548
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590579
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
 
 Review comment:
   Done, with a little more detail:
   
   ("failed to encode structural DoFn %v, failed to create TypeKey for receiver 
type %T", u, u.Recv)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216548)
Time Spent: 6h 40m  (was: 6.5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216547
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590539
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216547)
Time Spent: 6.5h  (was: 6h 20m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216546
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:54
Start Date: 20/Mar/19 23:54
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590532
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
 
 Review comment:
   Done, with a little more detail:
   
   ("Failed to encode structural DoFn %v, failed to encode receiver type %T: 
%v", u, u.Recv, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216546)
Time Spent: 6h 20m  (was: 6h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216543
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590502
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Type: typ, Opt: string(data)}, nil
 
default:
-   panic("empty Fn")
+   panic(fmt.Sprintf("Failed to encode function %v, empty fn", u))
}
 }
 
 func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Dynfn != nil {
gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType)
if err != nil {
-   return nil, fmt.Errorf("bad symbol %v: %v", 
u.Dynfn.Gen, err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad symbol %v: %v", u, u.Dynfn.Gen, err)
}
 
t, err := decodeType(u.Dynfn.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad type: %v", u, err)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216543)
Time Spent: 5h 50m  (was: 5h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216544&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216544
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590515
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Type: typ, Opt: string(data)}, nil
 
default:
-   panic("empty Fn")
+   panic(fmt.Sprintf("Failed to encode function %v, empty fn", u))
}
 }
 
 func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Dynfn != nil {
gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType)
if err != nil {
-   return nil, fmt.Errorf("bad symbol %v: %v", 
u.Dynfn.Gen, err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad symbol %v: %v", u, u.Dynfn.Gen, err)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216544)
Time Spent: 6h  (was: 5h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216540
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590472
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
return &graph.Fn{Fn: fx}, nil
}
 
t, err := decodeType(u.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
type: %v", u, err)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216540)
Time Spent: 5h 20m  (was: 5h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216541
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:53
Start Date: 20/Mar/19 23:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267590477
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
 
 Review comment:
   Done, with a little added detail:
   
   ("failed to decode DoFn %v, failed to construct userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216541)
Time Spent: 5.5h  (was: 5h 20m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216538
 ]

ASF GitHub Bot logged work on BEAM-6872:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:52
Start Date: 20/Mar/19 23:52
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add 
hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#issuecomment-475071702
 
 
   @lukecwik are you a good person to review this PR?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216538)
Time Spent: 20m  (was: 10m)

> Add hook for user-defined JVM initialization in workers
> ---
>
> Key: BEAM-6872
> URL: https://issues.apache.org/jira/browse/BEAM-6872
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Expose an interface for users to run some one-time initialization code when a 
> worker starts up.
> This can be useful for things like overriding the Default ZoneRulesProvider, 
> or setting up custom SSL providers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216521
 ]

ASF GitHub Bot logged work on BEAM-6872:


Author: ASF GitHub Bot
Created on: 20/Mar/19 23:10
Start Date: 20/Mar/19 23:10
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #8104: 
[BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104
 
 
   Adds DataflowWorkerInitializer interface. Workers execute implementations of 
the interface in user code when they start up via ServiceLoader. Currently 
implementations are run immediately after logging is configured.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   See 
[.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md)
 for trigger phrase, status and link of all Jenkins jobs.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216521)
Time Spent: 10m
Remaining Estimate: 0h

> Add hook for user-defined JVM initialization in workers
> ---
>
>

[jira] [Created] (BEAM-6872) Add hook for user-defined JVM initialization

2019-03-20 Thread Brian Hulette (JIRA)
Brian Hulette created BEAM-6872:
---

 Summary: Add hook for user-defined JVM initialization
 Key: BEAM-6872
 URL: https://issues.apache.org/jira/browse/BEAM-6872
 Project: Beam
  Issue Type: New Feature
  Components: runner-dataflow
Reporter: Brian Hulette
Assignee: Brian Hulette


Expose an interface for users to run some one-time initialization code when a 
worker starts up.
This can be useful for things like overriding the Default ZoneRulesProvider, or 
setting up custom SSL providers.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6872) Add hook for user-defined JVM initialization in workers

2019-03-20 Thread Brian Hulette (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-6872:

Summary: Add hook for user-defined JVM initialization in workers  (was: Add 
hook for user-defined JVM initialization)

> Add hook for user-defined JVM initialization in workers
> ---
>
> Key: BEAM-6872
> URL: https://issues.apache.org/jira/browse/BEAM-6872
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Minor
>
> Expose an interface for users to run some one-time initialization code when a 
> worker starts up.
> This can be useful for things like overriding the Default ZoneRulesProvider, 
> or setting up custom SSL providers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-03-20 Thread Pablo Estrada (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797599#comment-16797599
 ] 

Pablo Estrada commented on BEAM-6711:
-

The issue occurs when proto-typed elements go into a combine, because they are 
strictly not-hashable in Python 3. There was a combine in the transform: 
https://github.com/apache/beam/pull/8093/files#diff-c397a2a1f47619267d7c57494de7d835R593

This pr should fix that: https://github.com/apache/beam/pull/8093 - I'll also 
apply changes discussed with [~tvalentyn] regarding default behavior with 
DirectRunner.

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File "/home/jenkins/jenkins-slave/works
> pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216486
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 21:26
Start Date: 20/Mar/19 21:26
Worklog Time Spent: 10m 
  Work Description: dmvk commented on pull request #8042: [BEAM-6812]: 
Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent
URL: https://github.com/apache/beam/pull/8042
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216486)
Time Spent: 2h 50m  (was: 2h 40m)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread David Moravek (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Moravek resolved BEAM-6812.
-
   Resolution: Fixed
Fix Version/s: 2.12.0

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
> Fix For: 2.12.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216476
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 20/Mar/19 21:02
Start Date: 20/Mar/19 21:02
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475026993
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216476)
Time Spent: 8.5h  (was: 8h 20m)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6844) Metrics tests for Portable metrics

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6844?focusedWorklogId=216475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216475
 ]

ASF GitHub Bot logged work on BEAM-6844:


Author: ASF GitHub Bot
Created on: 20/Mar/19 21:01
Start Date: 20/Mar/19 21:01
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #8038: [BEAM-6844] Add 
MetricResultMatcher verification for user metrics, GBK element count, and 
process time to an integration test.
URL: https://github.com/apache/beam/pull/8038#issuecomment-475026699
 
 
   run java postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216475)
Time Spent: 0.5h  (was: 20m)

> Metrics tests for Portable metrics
> --
>
> Key: BEAM-6844
> URL: https://issues.apache.org/jira/browse/BEAM-6844
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Pablo Estrada
>Assignee: Alex Amato
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6866) SamzaRunner: support timers in ParDo

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6866?focusedWorklogId=216473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216473
 ]

ASF GitHub Bot logged work on BEAM-6866:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:58
Start Date: 20/Mar/19 20:58
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on pull request #8092: 
[BEAM-6866]: SamzaRunner: support timers in ParDo
URL: https://github.com/apache/beam/pull/8092
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216473)
Time Spent: 20m  (was: 10m)

> SamzaRunner: support timers in ParDo
> 
>
> Key: BEAM-6866
> URL: https://issues.apache.org/jira/browse/BEAM-6866
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This patch adds support of timers in ParDo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216461
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267530489
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
 
 Review comment:
   This one is trickier:
   ("failed to encode structural DoFn %v, bad receiver type: %T", u, u.Recv)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216461)
Time Spent: 3h 50m  (was: 3h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6862) [beam_PostCommit_Py_ValCont] Test fail with import error

2019-03-20 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin reassigned BEAM-6862:
---

Assignee: Mikhail Gryzykhin  (was: Alex Amato)

> [beam_PostCommit_Py_ValCont] Test fail with import error
> 
>
> Key: BEAM-6862
> URL: https://issues.apache.org/jira/browse/BEAM-6862
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Ahmet Altay
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Labels: currently-failing
>
> PR likely root cause: https://github.com/apache/beam/pull/8038
> Started at: https://builds.apache.org/job/beam_PostCommit_Py_ValCont/2672/
> 14:30:12 ImportError: No module named 
> hamcrest.library.number.ordering_comparison
> 14:30:12 
> 14:30:12  at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 14:30:12  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 14:30:12  at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:281)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123)
> 14:30:12  Suppressed: java.lang.IllegalStateException: Already closed.
> 14:30:12  at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:215)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91)
> 14:30:12  ... 6 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216458
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267529785
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -195,7 +195,7 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
gen := reflectx.FunctionName(u.DynFn.Gen)
t, err := encodeType(u.DynFn.T)
if err != nil {
-   return nil, fmt.Errorf("bad function type: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad function type: %v", u, err)
 
 Review comment:
   ("failed to encode dynamic DoFn %v, bad function type: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216458)
Time Spent: 3.5h  (was: 3h 20m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216466
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267535027
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
return &graph.Fn{Fn: fx}, nil
}
 
t, err := decodeType(u.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
type: %v", u, err)
 
 Review comment:
   ("failed to decode structural DoFn %v, bad type: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216466)
Time Spent: 4h 40m  (was: 4.5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216468
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267534573
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
 
 Review comment:
   ("failed to decode DoFn %v, bad userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216468)
Time Spent: 5h  (was: 4h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216464
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267534727
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
 
 Review comment:
   ("failed to decode DoFn %v, bad userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216464)
Time Spent: 4h 20m  (was: 4h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216467
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267535219
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Fn != nil {
fn, err := decodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
fx, err := funcx.New(reflectx.MakeFunc(fn))
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad userfn: %v", u, err)
}
return &graph.Fn{Fn: fx}, nil
}
 
t, err := decodeType(u.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
type: %v", u, err)
}
fn, err := reflectx.UnmarshalJSON(t, u.Opt)
if err != nil {
-   return nil, fmt.Errorf("bad struct encoding: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto %v, bad 
struct encoding: %v", u, err)
 
 Review comment:
   ("failed to decode structural DoFn %v, bad userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216467)
Time Spent: 4h 50m  (was: 4h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216456
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267530172
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
 
 Review comment:
   ("failed to encode DoFn %v, bad userfn: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216456)
Time Spent: 3h 20m  (was: 3h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216462
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267532889
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
 
 Review comment:
   ("Failed to encode structural DoFn %v, bad receiver type %T: %v", u, u.Recv, 
err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216462)
Time Spent: 4h  (was: 3h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216459
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267531124
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
 
 Review comment:
   ("failed to encode structural DoFn %v, receiver type %v must be registered", 
u, t)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216459)
Time Spent: 3h 40m  (was: 3.5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216465
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267534401
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Type: typ, Opt: string(data)}, nil
 
default:
-   panic("empty Fn")
+   panic(fmt.Sprintf("Failed to encode function %v, empty fn", u))
}
 }
 
 func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Dynfn != nil {
gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType)
if err != nil {
-   return nil, fmt.Errorf("bad symbol %v: %v", 
u.Dynfn.Gen, err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad symbol %v: %v", u, u.Dynfn.Gen, err)
}
 
t, err := decodeType(u.Dynfn.Type)
if err != nil {
-   return nil, fmt.Errorf("bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad type: %v", u, err)
 
 Review comment:
   ("failed to decode dynamic DoFn %v, bad type: %v", u, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216465)
Time Spent: 4.5h  (was: 4h 20m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216463
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267534230
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Type: typ, Opt: string(data)}, nil
 
default:
-   panic("empty Fn")
+   panic(fmt.Sprintf("Failed to encode function %v, empty fn", u))
}
 }
 
 func decodeFn(u *v1.Fn) (*graph.Fn, error) {
if u.Dynfn != nil {
gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType)
if err != nil {
-   return nil, fmt.Errorf("bad symbol %v: %v", 
u.Dynfn.Gen, err)
+   return nil, fmt.Errorf("failed to decode function proto 
%v, bad symbol %v: %v", u, u.Dynfn.Gen, err)
 
 Review comment:
   ("failed to decode dynamic DoFn  %v, bad symbol %v: %v", u, u.Dynfn.Gen, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216463)
Time Spent: 4h 10m  (was: 4h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216460
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267533474
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) {
case u.Fn != nil:
fn, err := encodeUserFn(u.Fn)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
}
return &v1.Fn{Fn: fn}, nil
 
case u.Recv != nil:
t := reflect.TypeOf(u.Recv)
k, ok := runtime.TypeKey(reflectx.SkipPtr(t))
if !ok {
-   return nil, fmt.Errorf("bad recv: %v", u.Recv)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad receiver %v", u, u.Recv)
}
if _, ok := runtime.LookupType(k); !ok {
-   return nil, fmt.Errorf("recv type must be registered: 
%v", t)
+   return nil, fmt.Errorf("failed to encode function %v, 
receiver type %v must be registered", u, t)
}
typ, err := encodeType(t)
if err != nil {
-   panic(fmt.Sprintf("bad recv type: %v", u.Recv))
+   panic(fmt.Sprintf("Failed to encode function %v, bad 
receiver type: %v", u, err))
}
 
data, err := json.Marshal(u.Recv)
if err != nil {
-   return nil, fmt.Errorf("bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode function %v, 
bad userfn: %v", u, err)
 
 Review comment:
   ("Failed to encode structural DoFn %v, %T: %v", u, u.Recv, err)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216460)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216457
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:47
Start Date: 20/Mar/19 20:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267529336
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -145,20 +145,20 @@ func encodeCustomCoder(c *coder.CustomCoder) 
(*v1.CustomCoder, error) {
 func decodeCustomCoder(c *v1.CustomCoder) (*coder.CustomCoder, error) {
t, err := decodeType(c.Type)
if err != nil {
-   return nil, fmt.Errorf("decodeCustomCoder bad type: %v", err)
+   return nil, fmt.Errorf("failed to decode custom coder proto %v 
for type %v: %v", c, c.Type, err)
 
 Review comment:
   Missed removing the "proto " bits throughout this file. I think it's because 
it's in the second file, understandable oversight. That's why we review.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216457)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216446&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216446
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:20
Start Date: 20/Mar/19 20:20
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475011873
 
 
   Run Gradle Publish
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216446)
Time Spent: 8h 20m  (was: 8h 10m)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216443
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:11
Start Date: 20/Mar/19 20:11
Worklog Time Spent: 10m 
  Work Description: kyle-winkelman commented on pull request #8042: 
[BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode 
is consistent
URL: https://github.com/apache/beam/pull/8042#discussion_r267525204
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java
 ##
 @@ -569,8 +569,8 @@ private static Partitioner 
getPartitioner(EvaluationContext context) {
 Long bundleSize =
 
context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize();
 return (bundleSize > 0)
-? null
-: new HashPartitioner(context.getSparkContext().defaultParallelism());
+? new HashPartitioner(context.getSparkContext().defaultParallelism())
+: null;
 
 Review comment:
   I agree that the old functionality seems strange, but I remember (when I had 
the logic backwards) that the performance tests for the spark runner were 
impacted. I think the impact was in streaming mode because if you don't use 
this HashPartitioner then it actually does a double shuffle of the data. I 
tried to clean this up but my I never finished getting PR #6511 merged.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216443)
Time Spent: 2h 40m  (was: 2.5h)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216442&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216442
 ]

ASF GitHub Bot logged work on BEAM-6527:


Author: ASF GitHub Bot
Created on: 20/Mar/19 20:09
Start Date: 20/Mar/19 20:09
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use 
Gradle to parallel Python tox tests
URL: https://github.com/apache/beam/pull/8067#issuecomment-475007822
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216442)
Time Spent: 8h 10m  (was: 8h)

> Parallel tox (unit) tests run on Jenkins
> 
>
> Key: BEAM-6527
> URL: https://issues.apache.org/jira/browse/BEAM-6527
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Existing tox unit test suite (basic, gcp and cython) will be enabled in 
> multiple version of Python 3, which will significantly increase runtime of 
> Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to 
> control the time in a reasonable range (<30mins for PreCommit is desired).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216441
 ]

ASF GitHub Bot logged work on BEAM-6446:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:58
Start Date: 20/Mar/19 19:58
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up 
checkstyle suppressions.
URL: https://github.com/apache/beam/pull/8091#issuecomment-474613132
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216441)
Time Spent: 50m  (was: 40m)

> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216440
 ]

ASF GitHub Bot logged work on BEAM-6446:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:58
Start Date: 20/Mar/19 19:58
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up 
checkstyle suppressions.
URL: https://github.com/apache/beam/pull/8091#issuecomment-474932260
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216440)
Time Spent: 40m  (was: 0.5h)

> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216436
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:53
Start Date: 20/Mar/19 19:53
Worklog Time Spent: 10m 
  Work Description: dmvk commented on issue #8042: [BEAM-6812]: Convert 
keys to ByteArray in Combine.perKey to make sure hashCode is consistent
URL: https://github.com/apache/beam/pull/8042#issuecomment-475002477
 
 
   Run Spark ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216436)
Time Spent: 2.5h  (was: 2h 20m)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216433
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:49
Start Date: 20/Mar/19 19:49
Worklog Time Spent: 10m 
  Work Description: jhalaria commented on pull request #8042: [BEAM-6812]: 
Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent
URL: https://github.com/apache/beam/pull/8042#discussion_r267517341
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java
 ##
 @@ -569,8 +569,8 @@ private static Partitioner 
getPartitioner(EvaluationContext context) {
 Long bundleSize =
 
context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize();
 return (bundleSize > 0)
-? null
-: new HashPartitioner(context.getSparkContext().defaultParallelism());
+? new HashPartitioner(context.getSparkContext().defaultParallelism())
+: null;
 
 Review comment:
   I see. I reverted back the changes made to `getPartition`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216433)
Time Spent: 2h 20m  (was: 2h 10m)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216429
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:45
Start Date: 20/Mar/19 19:45
Worklog Time Spent: 10m 
  Work Description: dmvk commented on pull request #8042: [BEAM-6812]: 
Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent
URL: https://github.com/apache/beam/pull/8042#discussion_r267514847
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java
 ##
 @@ -569,8 +569,8 @@ private static Partitioner 
getPartitioner(EvaluationContext context) {
 Long bundleSize =
 
context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize();
 return (bundleSize > 0)
-? null
-: new HashPartitioner(context.getSparkContext().defaultParallelism());
+? new HashPartitioner(context.getSparkContext().defaultParallelism())
+: null;
 
 Review comment:
   Second thoughts, it was correct.
   
   https://github.com/apache/beam/pull/6884/files#r246077919
   
   This was in order to maintain old functionality (bundleSize == 0, which 
basically means to use predefined parallelism).
   
   I think the old functionality doesn't make much sense as it doesn't scale 
with input data. I guess someone may use this in order to re-scale "downstream" 
stage, but there should be a better mechanism to do this.
   
   Any thoughts? @timrobertson100 @kyle-winkelman 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216429)
Time Spent: 2h 10m  (was: 2h)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6862) [beam_PostCommit_Py_ValCont] Test fail with import error

2019-03-20 Thread Boyuan Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797461#comment-16797461
 ] 

Boyuan Zhang commented on BEAM-6862:


Error: File 
"/usr/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_exercise_metrics_pipeline.py",
 line 24, in  from hamcrest.library.number.ordering_comparison import 
greater_than ImportError: No module named 
hamcrest.library.number.ordering_comparison

Seems like import hamcrest.library.number.ordering_comparison failed. Maybe 
it's because circular dependency.

> [beam_PostCommit_Py_ValCont] Test fail with import error
> 
>
> Key: BEAM-6862
> URL: https://issues.apache.org/jira/browse/BEAM-6862
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Ahmet Altay
>Assignee: Alex Amato
>Priority: Major
>  Labels: currently-failing
>
> PR likely root cause: https://github.com/apache/beam/pull/8038
> Started at: https://builds.apache.org/job/beam_PostCommit_Py_ValCont/2672/
> 14:30:12 ImportError: No module named 
> hamcrest.library.number.ordering_comparison
> 14:30:12 
> 14:30:12  at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 14:30:12  at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 14:30:12  at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:281)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123)
> 14:30:12  Suppressed: java.lang.IllegalStateException: Already closed.
> 14:30:12  at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:215)
> 14:30:12  at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91)
> 14:30:12  ... 6 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread Ankit Jhalaria (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Jhalaria updated BEAM-6812:
-
Description: 
* During calls to Combine.perKey, we want they keys used to have consistent 
hashCode when invoked from different JVM's.
 * However, while testing this in our company we found out that when using 
protobuf as keys during combine, the hashCodes can be different for the same 
key when invoked from different JVMs. This results in duplicates. 
 * `ByteArray` class in Spark has a stable has code when dealing with arrays as 
well. 
 * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
serialization.
 * The fix does something similar when dealing with combines.

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> * During calls to Combine.perKey, we want they keys used to have consistent 
> hashCode when invoked from different JVM's.
>  * However, while testing this in our company we found out that when using 
> protobuf as keys during combine, the hashCodes can be different for the same 
> key when invoked from different JVMs. This results in duplicates. 
>  * `ByteArray` class in Spark has a stable has code when dealing with arrays 
> as well. 
>  * GroupByKey correctly converts keys to `ByteArray` and uses coders for 
> serialization.
>  * The fix does something similar when dealing with combines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread Ankit Jhalaria (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Jhalaria updated BEAM-6812:
-
Comment: was deleted

(was: * When running a Combine.perKey transform on Spark, we noticed duplicate 
results in the output for the same key.
 * On further investigation, it turns out that `StreamingTransformTranslator` 
uses `Spark's` HashPartitioner which does not work correctly when attempting to 
partition on a key which is an array.
 * Note from the code: 
 * /**
 * A [[org.apache.spark.Partitioner]] that implements hash-based partitioning 
using
 * Java's `Object.hashCode`.
 *
 * Java arrays have hashCodes that are based on the arrays' identities rather 
than their contents,
 * so attempting to partition an RDD[Array[_]] or RDD[(Array[_], _)] using a 
HashPartitioner will
 * produce an unexpected or incorrect result.
 */)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark

2019-03-20 Thread Ankit Jhalaria (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Jhalaria updated BEAM-6812:
-
Summary: Convert keys to ByteArray in Combine.perKey for Spark  (was: 
HashPartitioning on Spark with arrays as keys produces unpredictable results)

> Convert keys to ByteArray in Combine.perKey for Spark
> -
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6812) HashPartitioning on Spark with arrays as keys produces unpredictable results

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216409
 ]

ASF GitHub Bot logged work on BEAM-6812:


Author: ASF GitHub Bot
Created on: 20/Mar/19 19:02
Start Date: 20/Mar/19 19:02
Worklog Time Spent: 10m 
  Work Description: jhalaria commented on issue #8042: [BEAM-6812]: Create 
a custom hash partitioner that deals with arrays during spark combines
URL: https://github.com/apache/beam/pull/8042#issuecomment-474985096
 
 
   @dmvk - Thank you for looking at the PR. I removed the custom partitioner 
logic for now. I made the CPK similar to GBK where we convert the key to a 
`ByteArray`. Will create another PR that provides the ability to add a custom 
partitioner.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216409)
Time Spent: 2h  (was: 1h 50m)

> HashPartitioning on Spark with arrays as keys produces unpredictable results
> 
>
> Key: BEAM-6812
> URL: https://issues.apache.org/jira/browse/BEAM-6812
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216393
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:42
Start Date: 20/Mar/19 18:42
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492152
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -217,7 +217,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
// the portable pipeline model directly (BEAM-2885)
if elm.GetSpec().GetSpec().GetUrn() != "" && 
elm.GetSpec().GetSpec().GetUrn() != urnCustomCoder {
// TODO(herohde) 11/17/2017: revisit this restriction
-   return nil, fmt.Errorf("expected length prefix of 
custom coder only: %v", elm)
+   return nil, fmt.Errorf("could not make coder from proto 
%v, expected length prefix of custom coder only: %v", *c, elm)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216393)
Time Spent: 2h 10m  (was: 2h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216406
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:48
Start Date: 20/Mar/19 18:48
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267494692
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -157,7 +157,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
 
case urnKVCoder:
if len(components) != 2 {
-   return nil, fmt.Errorf("bad pair: %v", c)
+   return nil, fmt.Errorf("could not make coder from proto 
%v, incorrect number of components for pair", *c)
 
 Review comment:
   That's really useful to know actually. I checked the existing String 
functions and removed all the dereferences I could (which was all of them).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216406)
Time Spent: 3h 10m  (was: 3h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216405
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:47
Start Date: 20/Mar/19 18:47
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267494071
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -122,15 +122,15 @@ func DecodeMultiEdge(edge *v1.MultiEdge) (graph.Opcode, 
*graph.Fn, *window.Fn, [
 func encodeCustomCoder(c *coder.CustomCoder) (*v1.CustomCoder, error) {
t, err := encodeType(c.Type)
if err != nil {
-   return nil, fmt.Errorf("bad underlying type: %v", err)
+   return nil, fmt.Errorf("failed to encode custom coder %v, bad 
underlying type: %v", *c, err)
 
 Review comment:
   Yeah, it's a bit tricky. Your suggestion is better than anything I thought 
of, so I went with it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216405)
Time Spent: 3h  (was: 2h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216402
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:46
Start Date: 20/Mar/19 18:46
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267493684
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go
 ##
 @@ -43,14 +43,14 @@ func EncodeMultiEdge(edge *graph.MultiEdge) 
(*v1.MultiEdge, error) {
if edge.DoFn != nil {
ref, err := encodeFn((*graph.Fn)(edge.DoFn))
if err != nil {
-   return nil, fmt.Errorf("encode: bad userfn: %v", err)
+   return nil, fmt.Errorf("failed to encode multi edge %v, 
bad userfn: %v", *edge, err)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216402)
Time Spent: 2h 50m  (was: 2h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216401
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:44
Start Date: 20/Mar/19 18:44
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492965
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -366,7 +370,7 @@ func (b *CoderMarshaller) Add(c *coder.Coder) string {
return b.internBuiltInCoder(urnVarIntCoder)
 
default:
-   panic(fmt.Sprintf("Unexpected coder kind: %v", c.Kind))
+   panic(fmt.Sprintf("failed to marshal custom coder %v, 
unexpected coder kind: %v", *c, c.Kind))
 
 Review comment:
   Capitalized them all for consistency. I was aware of errors being 
lowercased, but wasn't sure about panic messages.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216401)
Time Spent: 2h 40m  (was: 2.5h)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216396
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:43
Start Date: 20/Mar/19 18:43
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492258
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -248,7 +248,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
return &coder.Coder{Kind: coder.WindowedValue, T: t, 
Components: []*coder.Coder{elm}, Window: w}, nil
 
case streamType:
-   return nil, fmt.Errorf("stream must be pair value: %v", c)
+   return nil, fmt.Errorf("could not make coder from proto %v, 
stream must be pair value", *c)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216396)
Time Spent: 2.5h  (was: 2h 20m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216391
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:42
Start Date: 20/Mar/19 18:42
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492046
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -157,7 +157,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
 
case urnKVCoder:
if len(components) != 2 {
-   return nil, fmt.Errorf("bad pair: %v", c)
+   return nil, fmt.Errorf("could not make coder from proto 
%v, incorrect number of components for pair", *c)
 
 Review comment:
   Useful advice, modified all the error msgs in this function to match.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216391)
Time Spent: 1h 50m  (was: 1h 40m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216392
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:42
Start Date: 20/Mar/19 18:42
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492086
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -206,7 +206,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
 
case urnLengthPrefixCoder:
if len(components) != 1 {
-   return nil, fmt.Errorf("bad length prefix: %v", c)
+   return nil, fmt.Errorf("could not make coder from proto 
%v, bad length prefix", *c)
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216392)
Time Spent: 2h  (was: 1h 50m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216394&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216394
 ]

ASF GitHub Bot logged work on BEAM-6838:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:42
Start Date: 20/Mar/19 18:42
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8064: [BEAM-6838] 
Go SDK: Improving error msgs in pipeline serialization.
URL: https://github.com/apache/beam/pull/8064#discussion_r267492213
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go
 ##
 @@ -233,7 +233,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) 
(*coder.Coder, error) {
 
case urnWindowedValueCoder:
if len(components) != 2 {
-   return nil, fmt.Errorf("bad windowed value: %v", c)
+   return nil, fmt.Errorf("could not make coder from proto 
%v, bad windowed value", *c)
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216394)
Time Spent: 2h 20m  (was: 2h 10m)

> Improve error messages for unregistered protos
> --
>
> Key: BEAM-6838
> URL: https://issues.apache.org/jira/browse/BEAM-6838
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When users have a type that can't be serialized (it has a function Field or 
> similar) or more complex protocol buffers, the error should be improved to 
> display the type being registered, and better yet, the beam.RegisterType(...) 
> code they can use to enable use of their type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6778) Enable Bundle Finalization in Python SDK

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6778?focusedWorklogId=216379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216379
 ]

ASF GitHub Bot logged work on BEAM-6778:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:15
Start Date: 20/Mar/19 18:15
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #7937: [BEAM-6778] Enable 
Bundle Finalization in Python SDK harness over FnApi
URL: https://github.com/apache/beam/pull/7937#issuecomment-474967111
 
 
   All tests have passed. Thanks for your time and help! @robertwb 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216379)
Time Spent: 9h 20m  (was: 9h 10m)

> Enable Bundle Finalization in Python SDK
> 
>
> Key: BEAM-6778
> URL: https://issues.apache.org/jira/browse/BEAM-6778
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216378
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:14
Start Date: 20/Mar/19 18:14
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267479836
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py
 ##
 @@ -53,7 +53,7 @@
 class LeaderBoardIT(unittest.TestCase):
 
   # Input event containing user, team, score, processing time, window start.
-  INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224'
+  INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224'
 
 Review comment:
   So the question is then, to the humans who read and edit this code, should 
INPUT_EVENT be a textual date or encoded data? I think it's easier to consider 
it text, up until it's time to feed it to pubsub, then when we can encode it to 
bytes.
   
   I'd keep INPUT_EVENT as is change publishing to: 
   
   ```
   event = self.INPUT_EVENT % self._test_timestamp
   self.pub_client.publish(event.encode('utf-8'))
   ```
   I would expect users to follow a similar pattern in their pipelines, and 
they might refer to beam examples for guidance, so I suggest to change this.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216378)
Time Spent: 20h 50m  (was: 20h 40m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6772) Select transform has non-intuitive semantics

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6772?focusedWorklogId=216376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216376
 ]

ASF GitHub Bot logged work on BEAM-6772:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:13
Start Date: 20/Mar/19 18:13
Worklog Time Spent: 10m 
  Work Description: kanterov commented on issue #8006: [BEAM-6772] Change 
Select semantics to match what a user expects
URL: https://github.com/apache/beam/pull/8006#issuecomment-474966459
 
 
   Sounds good. I will take a look tomorrow
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216376)
Time Spent: 5.5h  (was: 5h 20m)

> Select transform has non-intuitive semantics
> 
>
> Key: BEAM-6772
> URL: https://issues.apache.org/jira/browse/BEAM-6772
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Consider the following schema:
> User:
>     name: STRING
>     location: Location
>  
> Location:
>     latitude: DOUBLE
>     longitude: DOUBLE
>  
> If you apply Select.fieldNames("location"), most users expect to get back a 
> row matching the Location schema. Instead you get back an outer schema with a 
> single location field in it. Select should instead unnest the output up to 
> the point where multiple fields are selected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6861) Select fields and computed fields in CassandraIO.Read

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6861?focusedWorklogId=216369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216369
 ]

ASF GitHub Bot logged work on BEAM-6861:


Author: ASF GitHub Bot
Created on: 20/Mar/19 18:04
Start Date: 20/Mar/19 18:04
Worklog Time Spent: 10m 
  Work Description: stankiewicz commented on issue #8090: [BEAM-6861] 
Select fields and computed fields in CassandraIO.Read
URL: https://github.com/apache/beam/pull/8090#issuecomment-474962082
 
 
   Hi @srfrnk! 
   I don't like the idea of forking driver as we will end up maintaining it. 
   I haven't thought about it but I like the idea of programmatic generation of 
query instead of doing String joins and formats. Challenge here is what we want 
to expose instead of withWhere() or withQuery() - withQueryBuilder()? If yes it 
would be nice alternative because we can then just add key ranges dynamically. 
But then the challenge is how to support value provider for it. 
   
   If we figure out serialisation then we can clean this up a little bit. In 
the meantime I would try to merge this. 
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216369)
Time Spent: 4h  (was: 3h 50m)

> Select fields and computed fields in CassandraIO.Read 
> --
>
> Key: BEAM-6861
> URL: https://issues.apache.org/jira/browse/BEAM-6861
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Labels: features
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> CassandraIO.Read currently selects all fields and maps them to POJO.
> To make this component more flexible, it should be possible to select only 
> subset of fields or computed fields to allow reading things like write 
> Timestamp or using other functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216367
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:59
Start Date: 20/Mar/19 17:59
Worklog Time Spent: 10m 
  Work Description: Juta commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267473739
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py
 ##
 @@ -53,7 +53,7 @@
 class LeaderBoardIT(unittest.TestCase):
 
   # Input event containing user, team, score, processing time, window start.
-  INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224'
+  INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224'
 
 Review comment:
   yes `self.pub_client.publish` requires bytes. In this case I think 
specifying the input event as bytes is what is expected because it is directly 
passed to the client. What do you think?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216367)
Time Spent: 20h 40m  (was: 20.5h)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216366
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:59
Start Date: 20/Mar/19 17:59
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267473468
 
 

 ##
 File path: sdks/python/apache_beam/examples/complete/game/leader_board.py
 ##
 @@ -120,6 +120,8 @@ def __init__(self):
 
   def process(self, elem):
 try:
+  if isinstance(elem, bytes):
 
 Review comment:
   I am not sure I follow - could you please elaborate - what exactly in either 
pubsub or unit test makes it so that the input type is not consistent here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216366)
Time Spent: 20.5h  (was: 20h 20m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216363
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:54
Start Date: 20/Mar/19 17:54
Worklog Time Spent: 10m 
  Work Description: Juta commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267471449
 
 

 ##
 File path: sdks/python/apache_beam/examples/complete/game/leader_board.py
 ##
 @@ -120,6 +120,8 @@ def __init__(self):
 
   def process(self, elem):
 try:
+  if isinstance(elem, bytes):
 
 Review comment:
   The IT test uses pubsub as input source while the unit test currently passes 
strings. That's why we cannot always decode
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216363)
Time Spent: 20h 20m  (was: 20h 10m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6868) Flink runner supports Bundle Finalization

2019-03-20 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797386#comment-16797386
 ] 

Maximilian Michels commented on BEAM-6868:
--

Thanks for clarifying. Indeed, the Flink Runner does not send 
{{FinalizeBundleRequest}}. I thought you were referring to the bundle 
processing of the legacy Runner.

> Flink runner supports Bundle Finalization
> -
>
> Key: BEAM-6868
> URL: https://issues.apache.org/jira/browse/BEAM-6868
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Boyuan Zhang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6870) python 3 test_hourly_team_score_it fails with bigquery job id already exists

2019-03-20 Thread Juta Staes (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797387#comment-16797387
 ] 

Juta Staes commented on BEAM-6870:
--

The test also fails in isolation because the retry is triggered somehow. I did 
not yet look into why this happens.

> python 3 test_hourly_team_score_it fails with bigquery job id already exists
> 
>
> Key: BEAM-6870
> URL: https://issues.apache.org/jira/browse/BEAM-6870
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Juta Staes
>Priority: Major
>
> This IT test on python 3 fails with "Already exists job ..."
> {code:java}
> 14:42:03 
> ==
> 14:42:03 ERROR: test_hourly_team_score_it 
> (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT)
> 14:42:03 
> --
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py",
>  line 89, in test_hourly_team_score_it
> 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py",
>  line 295, in run
> 14:42:03 }, options.view_as(GoogleCloudOptions).project))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 426, in __exit__
> 14:42:03 self.run().wait_until_finish()
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 14:42:03 self._options).run(False)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 14:42:03 return self.runner.run_pipeline(self, self._options)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 64, in run_pipeline
> 14:42:03 self.result.wait_until_finish(duration=wait_duration)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1217, in wait_until_finish
> 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), 
> self)
> 14:42:03 
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File "apache_beam/runners/common.py", line 732, in 
> apache_beam.runners.common.DoFnRunner.process
> 14:42:03   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
> 14:42:03   File "apache_beam/runners/common.py", line 624, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
> 14:42:03   File "apache_beam/runners/common.py", line 816, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File "apache_beam/runners/common.py", line 831, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 385, in process
> 14:42:03 create_disposition=self.create_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 560, in perform_load_job
> 14:42:03 write_disposition=write_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line 
> 195, in wrapper
> 14:42:03 return fun(*args, **kwargs)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 327, in _insert_load_job
> 14:42:03 response = self.client.jobs.Insert(request)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py",
>  line 342, in Insert
> 14:42:03 upload=upload, upload_config=upload_config)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line 
> 731, in _RunMethod
> 14:42:03 return self.ProcessHttpResponse(method_config, http_response, 
> request)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line 
> 737, in ProcessHttpResponse
> 1

[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-03-20 Thread Pablo Estrada (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797385#comment-16797385
 ] 

Pablo Estrada commented on BEAM-6711:
-

Yes. I'll work on this.

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File "/home/jenkins/jenkins-slave/works
> pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6865) Refactor common portable runner infrastructure for better reuse

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6865?focusedWorklogId=216341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216341
 ]

ASF GitHub Bot logged work on BEAM-6865:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:22
Start Date: 20/Mar/19 17:22
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8085: [BEAM-6865] share 
non-Flink-specific pipeline helper utils
URL: https://github.com/apache/beam/pull/8085#issuecomment-474172661
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216341)
Time Spent: 1h  (was: 50m)

> Refactor common portable runner infrastructure for better reuse
> ---
>
> Key: BEAM-6865
> URL: https://issues.apache.org/jira/browse/BEAM-6865
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Flink runner is currently Beam's most mature portable OSS runner. Much of 
> the Flink portable runner's implementation details are not unique to Flink, 
> and yet are confined to the Flink runner code. In order to ease development 
> on other portable runners such as the Spark runner, this reusable code should 
> be moved into some common location.
> I've set this up to track my progress on these ongoing improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-03-20 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797376#comment-16797376
 ] 

Valentyn Tymofieiev commented on BEAM-6711:
---

Thanks Juta. Pablo, will you have cycles to look into this, since this is 
likely related to your recent changes? Also, I think these changes happened 
after Beam 2.11, and I'd like us to fix them before Beam 2.12, which is to be 
cut a week from now. 

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File "/home/jenkins/jenkins-slave/works
> pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6870) python 3 test_hourly_team_score_it fails with bigquery job id already exists

2019-03-20 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6870:
--
Summary: python 3 test_hourly_team_score_it fails with bigquery job id 
already exists  (was: python 3 test_hourly_team_score_it fails with dataflow 
job id already exists)

> python 3 test_hourly_team_score_it fails with bigquery job id already exists
> 
>
> Key: BEAM-6870
> URL: https://issues.apache.org/jira/browse/BEAM-6870
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Juta Staes
>Priority: Major
>
> This IT test on python 3 fails with "Already exists job ..."
> {code:java}
> 14:42:03 
> ==
> 14:42:03 ERROR: test_hourly_team_score_it 
> (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT)
> 14:42:03 
> --
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py",
>  line 89, in test_hourly_team_score_it
> 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py",
>  line 295, in run
> 14:42:03 }, options.view_as(GoogleCloudOptions).project))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 426, in __exit__
> 14:42:03 self.run().wait_until_finish()
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 14:42:03 self._options).run(False)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 14:42:03 return self.runner.run_pipeline(self, self._options)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 64, in run_pipeline
> 14:42:03 self.result.wait_until_finish(duration=wait_duration)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1217, in wait_until_finish
> 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), 
> self)
> 14:42:03 
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File "apache_beam/runners/common.py", line 732, in 
> apache_beam.runners.common.DoFnRunner.process
> 14:42:03   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
> 14:42:03   File "apache_beam/runners/common.py", line 624, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
> 14:42:03   File "apache_beam/runners/common.py", line 816, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File "apache_beam/runners/common.py", line 831, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 385, in process
> 14:42:03 create_disposition=self.create_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 560, in perform_load_job
> 14:42:03 write_disposition=write_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line 
> 195, in wrapper
> 14:42:03 return fun(*args, **kwargs)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 327, in _insert_load_job
> 14:42:03 response = self.client.jobs.Insert(request)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py",
>  line 342, in Insert
> 14:42:03 upload=upload, upload_config=upload_config)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line 
> 731, in _RunMethod
> 14:42:03 return self.ProcessHttpResponse(method_config, http_response, 
> request)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line 
> 737, in Pr

[jira] [Commented] (BEAM-6870) python 3 test_hourly_team_score_it fails with dataflow job id already exists

2019-03-20 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797372#comment-16797372
 ] 

Valentyn Tymofieiev commented on BEAM-6870:
---

Interesting, looks like the error is coming from BQ, and is referring to some 
BQ export job.

We need to understand whether the test fails because the same test running in 
parallel in a different suite, and there is some naming conflict in BQ.  Does 
this test pass when we run it in isolation? If the test passes in isolation, we 
need to understand how BQ job name is generated.

> python 3 test_hourly_team_score_it fails with dataflow job id already exists
> 
>
> Key: BEAM-6870
> URL: https://issues.apache.org/jira/browse/BEAM-6870
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Juta Staes
>Priority: Major
>
> This IT test on python 3 fails with "Already exists job ..."
> {code:java}
> 14:42:03 
> ==
> 14:42:03 ERROR: test_hourly_team_score_it 
> (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT)
> 14:42:03 
> --
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py",
>  line 89, in test_hourly_team_score_it
> 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py",
>  line 295, in run
> 14:42:03 }, options.view_as(GoogleCloudOptions).project))
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 426, in __exit__
> 14:42:03 self.run().wait_until_finish()
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 14:42:03 self._options).run(False)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 14:42:03 return self.runner.run_pipeline(self, self._options)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 64, in run_pipeline
> 14:42:03 self.result.wait_until_finish(duration=wait_duration)
> 14:42:03   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1217, in wait_until_finish
> 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), 
> self)
> 14:42:03 
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:
> 14:42:03 Traceback (most recent call last):
> 14:42:03   File "apache_beam/runners/common.py", line 732, in 
> apache_beam.runners.common.DoFnRunner.process
> 14:42:03   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
> 14:42:03   File "apache_beam/runners/common.py", line 624, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
> 14:42:03   File "apache_beam/runners/common.py", line 816, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File "apache_beam/runners/common.py", line 831, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 385, in process
> 14:42:03 create_disposition=self.create_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 560, in perform_load_job
> 14:42:03 write_disposition=write_disposition)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line 
> 195, in wrapper
> 14:42:03 return fun(*args, **kwargs)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 327, in _insert_load_job
> 14:42:03 response = self.client.jobs.Insert(request)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py",
>  line 342, in Insert
> 14:42:03 upload=upload, upload_config=upload_config)
> 14:42:03   File 
> "/usr/local/lib/python3.5/site-pa

[jira] [Work logged] (BEAM-6861) Select fields and computed fields in CassandraIO.Read

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6861?focusedWorklogId=216342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216342
 ]

ASF GitHub Bot logged work on BEAM-6861:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:23
Start Date: 20/Mar/19 17:23
Worklog Time Spent: 10m 
  Work Description: srfrnk commented on issue #8090: [BEAM-6861] Select 
fields and computed fields in CassandraIO.Read
URL: https://github.com/apache/beam/pull/8090#issuecomment-474942509
 
 
   Sorry for barging in on this.
   If there is a need for a more robust query building mechanism we might 
consider using [Datastux 
QueryBuilder](https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/querybuilder/QueryBuilder.html).
   I tryed to go for that approach when I added `withWhere` but since their 
classes are not `Serializable` I could not.
   I attempted to fix this problem 
[here](https://github.com/datastax/java-driver/pull/1163) but it was rejected.
   Maybe with enough influence we can manage to make the change?
   If not maybe this repo can be forked into Beam and so a working QueryBuilder 
can be used without having to rebuild it?
   
   WDYT?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216342)
Time Spent: 3h 50m  (was: 3h 40m)

> Select fields and computed fields in CassandraIO.Read 
> --
>
> Key: BEAM-6861
> URL: https://issues.apache.org/jira/browse/BEAM-6861
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Labels: features
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> CassandraIO.Read currently selects all fields and maps them to POJO.
> To make this component more flexible, it should be possible to select only 
> subset of fields or computed fields to allow reading things like write 
> Timestamp or using other functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6865) Refactor common portable runner infrastructure for better reuse

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6865?focusedWorklogId=216340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216340
 ]

ASF GitHub Bot logged work on BEAM-6865:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:22
Start Date: 20/Mar/19 17:22
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #8085: [BEAM-6865] share 
non-Flink-specific pipeline helper utils
URL: https://github.com/apache/beam/pull/8085#issuecomment-474496559
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216340)
Time Spent: 50m  (was: 40m)

> Refactor common portable runner infrastructure for better reuse
> ---
>
> Key: BEAM-6865
> URL: https://issues.apache.org/jira/browse/BEAM-6865
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The Flink runner is currently Beam's most mature portable OSS runner. Much of 
> the Flink portable runner's implementation details are not unique to Flink, 
> and yet are confined to the Flink runner code. In order to ease development 
> on other portable runners such as the Spark runner, this reusable code should 
> be moved into some common location.
> I've set this up to track my progress on these ongoing improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6695) Latest transform for Python SDK

2019-03-20 Thread Tanay Tummalapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797368#comment-16797368
 ] 

Tanay Tummalapalli commented on BEAM-6695:
--

Thank you [~altay]!
I've been sick for the past week. So, progress on this has been slow. Will get 
back to you with any more questions I have.

> Latest transform for Python SDK
> ---
>
> Key: BEAM-6695
> URL: https://issues.apache.org/jira/browse/BEAM-6695
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Tanay Tummalapalli
>Priority: Minor
>
> Add a PTransform} and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6871) apache_beam.io.parquetio_test.TestParquet is very flaky

2019-03-20 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-6871:
-

 Summary: apache_beam.io.parquetio_test.TestParquet is very flaky
 Key: BEAM-6871
 URL: https://issues.apache.org/jira/browse/BEAM-6871
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Valentyn Tymofieiev
Assignee: Heejong Lee


Hi [~heejong], I think you added these tests - could you please take a look?

Symptoms:
{noformat}
07:47:19 ==
07:47:19 ERROR: test_sink_transform_compressed_0 
(apache_beam.io.parquetio_test.TestParquet)
07:47:19 --
07:47:19 Traceback (most recent call last):
07:47:19   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/io/parquetio_test.py",
 line 105, in tearDown
07:47:19 shutil.rmtree(self.temp_dir)
07:47:19   File "/usr/lib/python2.7/shutil.py", line 239, in rmtree
07:47:19 onerror(os.listdir, path, sys.exc_info())
07:47:19   File "/usr/lib/python2.7/shutil.py", line 237, in rmtree
07:47:19 names = os.listdir(path)
07:47:19 OSError: [Errno 2] No such file or directory: '/tmp/tmpEXNCgA'
{noformat}
Examples:
 [https://builds.apache.org/job/beam_PreCommit_Python_Commit/5186/consoleFull]
 [https://builds.apache.org/job/beam_PreCommit_Python_Commit/5185/consoleFull]

 

CC: [~Juta], [~boyuanz], [~udim]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216326
 ]

ASF GitHub Bot logged work on BEAM-6446:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:01
Start Date: 20/Mar/19 17:01
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up 
checkstyle suppressions.
URL: https://github.com/apache/beam/pull/8091#issuecomment-474932260
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216326)
Time Spent: 0.5h  (was: 20m)

> Clean up suppression rules in checkstyle suppressions.xml
> -
>
> Key: BEAM-6446
> URL: https://issues.apache.org/jira/browse/BEAM-6446
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Labels: triaged
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When violations are addressed, clean up suppression rules in checkstyle 
> suppressions.xml



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6824) Plumb FnApi ElementCount metrics in Dataflow Runner.

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6824?focusedWorklogId=216329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216329
 ]

ASF GitHub Bot logged work on BEAM-6824:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:07
Start Date: 20/Mar/19 17:07
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #8095: [BEAM-6824] Add 
Element count transformer for Java worker FnApi
URL: https://github.com/apache/beam/pull/8095#issuecomment-474934867
 
 
   @pabloem 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216329)
Time Spent: 1.5h  (was: 1h 20m)

> Plumb FnApi ElementCount metrics in Dataflow Runner.
> 
>
> Key: BEAM-6824
> URL: https://issues.apache.org/jira/browse/BEAM-6824
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We want to provide FnApi metrics to Dataflow Users.
> This bug covers ElementCount metric.
> Current approach utilizes mapping of PCollectionID based on WorkItem and 
> ProcessBundle graphs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6824) Plumb FnApi ElementCount metrics in Dataflow Runner.

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6824?focusedWorklogId=216331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216331
 ]

ASF GitHub Bot logged work on BEAM-6824:


Author: ASF GitHub Bot
Created on: 20/Mar/19 17:07
Start Date: 20/Mar/19 17:07
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #8095: [BEAM-6824] Add 
Element count transformer for Java worker FnApi
URL: https://github.com/apache/beam/pull/8095#issuecomment-474934929
 
 
   @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216331)
Time Spent: 1h 40m  (was: 1.5h)

> Plumb FnApi ElementCount metrics in Dataflow Runner.
> 
>
> Key: BEAM-6824
> URL: https://issues.apache.org/jira/browse/BEAM-6824
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We want to provide FnApi metrics to Dataflow Users.
> This bug covers ElementCount metric.
> Current approach utilizes mapping of PCollectionID based on WorkItem and 
> ProcessBundle graphs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6868) Flink runner supports Bundle Finalization

2019-03-20 Thread Boyuan Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797356#comment-16797356
 ] 

Boyuan Zhang commented on BEAM-6868:


Here is the design doc: https://s.apache.org/beam-finalizing-bundles
and Python SDK implementation: https://issues.apache.org/jira/browse/BEAM-6778. 
I checked the code base, flink runner doesn't send the 
[FinalizeBundleRequest](https://github.com/apache/beam/search?q=FinalizeBundleRequest&unscoped_q=FinalizeBundleRequest)
 to SDK. But it's not a required feature.

> Flink runner supports Bundle Finalization
> -
>
> Key: BEAM-6868
> URL: https://issues.apache.org/jira/browse/BEAM-6868
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Boyuan Zhang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (BEAM-6324) CassandraIO.Read - Add the ability to provide a filter to the query

2019-03-20 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-6324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reopened BEAM-6324:


I am reopening this one following the API discussions on BEAM-6861

> CassandraIO.Read - Add the ability to provide a filter to the query
> ---
>
> Key: BEAM-6324
> URL: https://issues.apache.org/jira/browse/BEAM-6324
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.9.0
>Reporter: Shahar Frank
>Assignee: Shahar Frank
>Priority: Major
>  Labels: performance, pull-request-available, triaged
> Fix For: 2.12.0
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> CassandraIO.Read doesn't support using WHERE to filter the input at the 
> source (In Cassandra) which might provide great performance boost.
> Already implemented by:
> https://github.com/apache/beam/pull/7340



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216310
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:34
Start Date: 20/Mar/19 16:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267431019
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py
 ##
 @@ -53,7 +53,7 @@
 class LeaderBoardIT(unittest.TestCase):
 
   # Input event containing user, team, score, processing time, window start.
-  INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224'
+  INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224'
 
 Review comment:
   I am not sure if this should be bytes. Does `self.pub_client.publish` 
require bytes? If so, we should encode before passing data to that method.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216310)
Time Spent: 20h  (was: 19h 50m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6869) Beam dependency check report failed to generate

2019-03-20 Thread yifan zou (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797338#comment-16797338
 ] 

yifan zou commented on BEAM-6869:
-

An unit test failed and interrupted the report. 

> Beam dependency check report failed to generate
> ---
>
> Key: BEAM-6869
> URL: https://issues.apache.org/jira/browse/BEAM-6869
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies, test-failures
>Reporter: Lukasz Gajowy
>Assignee: yifan zou
>Priority: Major
>
> I think the tests in dependency_check_report_generator_test.py have failed 
> preventing the script to generate the report.
> Link for the logs: 
> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Dependency_Check/182/console



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216297&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216297
 ]

ASF GitHub Bot logged work on BEAM-6854:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:13
Start Date: 20/Mar/19 16:13
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #8083: [BEAM-6854] 
Update amazon-web-services aws-sdk dependency to version 1.11.519
URL: https://github.com/apache/beam/pull/8083#discussion_r267423370
 
 

 ##
 File path: 
sdks/java/io/amazon-web-services/src/test/java/org/apache/beam/sdk/io/aws/sqs/SqsIOTest.java
 ##
 @@ -117,7 +117,6 @@ protected void before() {
   new AWSStaticCredentialsProvider(new 
BasicAWSCredentials(accessKey, secretKey)))
   .withEndpointConfiguration(
   new AwsClientBuilder.EndpointConfiguration(endpoint, region))
-  .withRegion(region)
 
 Review comment:
   Yes and now the amazon sdk validates this so the test failed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216297)
Time Spent: 40m  (was: 0.5h)

> Update amazon-web-services aws-sdk dependency to version 1.11.519
> -
>
> Key: BEAM-6854
> URL: https://issues.apache.org/jira/browse/BEAM-6854
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> An update to the AWS dependency to include new regions.
> Additionally this unifies the AWS SDK use among modules as a follow up of 
> BEAM-6330



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216311
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:34
Start Date: 20/Mar/19 16:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267431454
 
 

 ##
 File path: sdks/python/apache_beam/examples/complete/game/game_stats_it_test.py
 ##
 @@ -52,7 +52,7 @@
 class GameStatsIT(unittest.TestCase):
 
   # Input events containing user, team, score, processing time, window start.
-  INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224'
+  INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224'
 
 Review comment:
   I am not sure if this should be bytes. Does self.pub_client.publish require 
bytes? If so, we should encode before passing data to that method, 
   
   To quote https://docs.python.org/3/howto/pyporting.html ,  `Decode binary 
data to text as soon as possible, encode text as binary data as late as 
possible`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216311)
Time Spent: 20h 10m  (was: 20h)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 20h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216309
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:34
Start Date: 20/Mar/19 16:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267428713
 
 

 ##
 File path: sdks/python/apache_beam/examples/complete/game/leader_board.py
 ##
 @@ -120,6 +120,8 @@ def __init__(self):
 
   def process(self, elem):
 try:
+  if isinstance(elem, bytes):
 
 Review comment:
   Can we always decode here? It is better to have a clear expectation of input 
arguments as much as possible on Python 3: either always encoded bytes, or 
always strings, but not mixing the two.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216309)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216308&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216308
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:34
Start Date: 20/Mar/19 16:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] 
[BEAM-6593] Add bigquery integration tests to postcommit
URL: https://github.com/apache/beam/pull/8076#discussion_r267428124
 
 

 ##
 File path: sdks/python/apache_beam/examples/complete/game/game_stats.py
 ##
 @@ -111,6 +111,8 @@ def __init__(self):
 
   def process(self, elem):
 try:
+  if isinstance(elem, bytes):
 
 Review comment:
   Can we always decode here? It is better to have a clear expectation of input 
arguments as much as possible on Python 3: either always encoded bytes, or 
always strings, but not mixing the two.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216308)
Time Spent: 19h 50m  (was: 19h 40m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216304
 ]

ASF GitHub Bot logged work on BEAM-6854:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:27
Start Date: 20/Mar/19 16:27
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #8083: 
[BEAM-6854] Update amazon-web-services aws-sdk dependency to version 1.11.519
URL: https://github.com/apache/beam/pull/8083
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216304)
Time Spent: 1h  (was: 50m)

> Update amazon-web-services aws-sdk dependency to version 1.11.519
> -
>
> Key: BEAM-6854
> URL: https://issues.apache.org/jira/browse/BEAM-6854
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> An update to the AWS dependency to include new regions.
> Additionally this unifies the AWS SDK use among modules as a follow up of 
> BEAM-6330



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519

2019-03-20 Thread Alexey Romanenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Romanenko resolved BEAM-6854.

   Resolution: Fixed
Fix Version/s: 2.12.0

> Update amazon-web-services aws-sdk dependency to version 1.11.519
> -
>
> Key: BEAM-6854
> URL: https://issues.apache.org/jira/browse/BEAM-6854
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> An update to the AWS dependency to include new regions.
> Additionally this unifies the AWS SDK use among modules as a follow up of 
> BEAM-6330



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519

2019-03-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216303
 ]

ASF GitHub Bot logged work on BEAM-6854:


Author: ASF GitHub Bot
Created on: 20/Mar/19 16:26
Start Date: 20/Mar/19 16:26
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #8083: [BEAM-6854] 
Update amazon-web-services aws-sdk dependency to version 1.11.519
URL: https://github.com/apache/beam/pull/8083#issuecomment-474913945
 
 
   LGTM
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 216303)
Time Spent: 50m  (was: 40m)

> Update amazon-web-services aws-sdk dependency to version 1.11.519
> -
>
> Key: BEAM-6854
> URL: https://issues.apache.org/jira/browse/BEAM-6854
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> An update to the AWS dependency to include new regions.
> Additionally this unifies the AWS SDK use among modules as a follow up of 
> BEAM-6330



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >