[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201132&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201132
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:48
Start Date: 20/Feb/19 07:48
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465462811
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 201132)
Time Spent: 15h  (was: 14h 50m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 15h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201128
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258339552
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -553,11 +514,9 @@ message IntDistributionData {
   int64 max = 4;
 }
 
-message DoubleDistributionData {
-  int64 count = 1;
-  double sum = 2;
-  double min = 3;
-  double max = 4;
+message IntGaugeData {
 
 Review comment:
   Yea, in #7823 I support all the types here (counters, distributions, gauges) 
everywhere in Java; I *think* Python already supports them as well, but need to 
dig in to that further.
 



Issue Time Tracking
---

Worklog Id: (was: 201128)
Time Spent: 14.5h  (was: 14h 20m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201125
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258357699
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MonitoringInfoMetricName.java
 ##
 @@ -113,11 +136,15 @@ public boolean equals(Object o) {
   }
 
   @Override
-  public String toString() {
+  public String toString(String delimiter) {
+if (getNamespace() != null && getName() != null) {
+  return super.toString(delimiter);
+}
 StringBuilder builder = new StringBuilder();
-builder.append(this.urn.toString());
-builder.append(" ");
-builder.append(this.labels.toString());
+if (labels.containsKey(PCOLLECTION_LABEL)) {
 
 Review comment:
   I wrote things that way in several places initially, but ran into 
interesting problems that we'll need to discuss eventually, so I'll describe 
them below.
   
   Here's what happens in this PR if I naively append the label kv-pairs: 
https://github.com/ryan-williams/beam/commit/557628627ae7438a6b688514e81205594238ed0f.
 
   
   Note that Flink receives a metric with the unseemly name 
[`step.PTRANSFORM.step.PCOLLECTION.pcoll.beam.metric.element_count.v1`](https://github.com/ryan-williams/beam/commit/557628627ae7438a6b688514e81205594238ed0f#diff-682be67e606ae40ded3e0087df5f9f8bR136)
 (which goes e.g. to the Flink web UI, which we've taken pains to make more 
human-readable).
   
   The first `step` there is added by `MetricKey.toString`, which calls this 
`MetricName.toString` routine, expecting info about the metric's "name" (URN) 
only.
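
A minimal sketch of how that doubled-up name can arise, assuming a name-level `toString(delimiter)` that appends each label key and value ahead of the URN, and a key-level `toString` that prefixes the step name. `NaiveMetricNames`, `nameString`, and `keyString` are hypothetical stand-ins, not Beam's actual classes, and the `beam:metric:element_count:v1` URN form is inferred from the name quoted above.

```java
import java.util.Map;
import java.util.StringJoiner;

final class NaiveMetricNames {
  /** Hypothetical name-level rendering: label kv-pairs first, then the URN. */
  static String nameString(String urn, Map<String, String> labels, String delimiter) {
    StringJoiner joiner = new StringJoiner(delimiter);
    for (Map.Entry<String, String> label : labels.entrySet()) {
      joiner.add(label.getKey()).add(label.getValue());
    }
    // Assumption: the URN's ':' separators are replaced by the delimiter.
    joiner.add(urn.replace(":", delimiter));
    return joiner.toString();
  }

  /** Hypothetical key-level rendering: the step name is prefixed unconditionally. */
  static String keyString(
      String stepName, String urn, Map<String, String> labels, String delimiter) {
    return stepName + delimiter + nameString(urn, labels, delimiter);
  }
}
```

With labels `PTRANSFORM=step` and `PCOLLECTION=pcoll` (in that order) and a `.` delimiter, `keyString("step", ...)` repeats the step name, because the key-level call expects the name-level call to render only the URN.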
   
   ## MetricKey vs MetricName
   
   The issue, in my view, is that `MonitoringInfoMetricName` is actually more 
of a `MetricKey` implementation:
   - `MetricKey` ≈ `MonitoringInfoMetricName` ({URN, labels})
     - `stepName` ≈ [labels map with just a `PTRANSFORM` entry]
     - `MetricName` ≈ URN
   - `MetricResult` ≈ `MonitoringInfo` ({key, metric value})
   
   I ultimately lay things out in this manner in #7823, folding MIMN into 
MetricKey.
   
   ## Special-casing single-label ptransform/pcollection-scoped metrics
   
   A bigger issue is that I think runners and endpoints are going to want to 
hard-code [the only 2 cases that we will have for the foreseeable future]: 
`..` and `.` 
(`.` folds into this nicely as well).
   
   They'll do this for backwards-compat reasons, and/or to make things look 
nice in e.g. web UIs. I've touched code like this in at least the Samza, Flink, 
and Spark runners (and that `MetricsHttpSink` we were looking at).
   
   I think it's great that the wire format is extensible to many+new labels, 
but ended up feeling that the SDK should provide the APIs that deal 
specifically with the shapes all calling code will care about atm.
   
   Sorry that's a lot! I don't care much what we do in this case, but I think 
we'll have some harder similar choices shortly 😲 lmk what you think.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 201125)
Time Spent: 14h  (was: 13h 50m)


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201126
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258344137
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/LabeledMetrics.java
 ##
 @@ -29,7 +30,7 @@
   /**
    * Create a metric that can be incremented and decremented, and is aggregated by taking the sum.
    */
-  public static Counter counter(MonitoringInfoMetricName metricName) {
+  public static Counter counter(MetricName metricName) {
 
 Review comment:
   Quick explanation for why this changed:
   
   Before this PR, `MetricName.named` allowed creating (user-)metrics from 
{namespace,name}, while 
[`MonitoringInfoMetricName.named`](https://github.com/apache/beam/pull/7876/files#diff-ba1b936a7d4c8d43789e81a816b9bb90L92)
 created metrics from {URN, labels}.
   
   The latter can create user- or system-metrics, but naively creating 
`MonitoringInfoMetricName`s for user-metrics is dangerous: they don't 
hash/equal the "same" metric created via `MetricName.named`, which I ran into 
in tests.
   
   Meanwhile, user code wants to make metrics from arbitrary MonitoringInfos, 
so a single API for that makes sense, but it should return a `MetricName` 
(AutoValue) for user-URNs, and a `MonitoringInfoMetricName` otherwise; [I've 
changed the `MonitoringInfoMetricName` constructor to do that in this 
PR](https://github.com/apache/beam/pull/7876/files#diff-ba1b936a7d4c8d43789e81a816b9bb90R106).
 That means that it returns a `MetricName`, not a `MonitoringInfoMetricName`.
   
   Ultimately (#7823), I merge these two into something that looks like 
`MonitoringInfoMetricName` but is called `MetricName`. In the meantime, there's 
little cost to using `MetricName` as the super-type of both.
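
A rough sketch of the single-entry-point idea described above, under stated assumptions: all `Sketch*` names are hypothetical, and the user-URN layout `beam:metric:user:<namespace>:<name>` is an assumption made for illustration. The factory returns the plain {namespace, name} type for user URNs, so the result hashes and equals a metric created directly from {namespace, name}, and returns the {URN, labels} subtype otherwise.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical {namespace, name} metric name with value equality. */
class SketchMetricName {
  final String namespace;
  final String name;

  SketchMetricName(String namespace, String name) {
    this.namespace = namespace;
    this.name = name;
  }

  @Override
  public boolean equals(Object o) {
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    SketchMetricName that = (SketchMetricName) o;
    return Objects.equals(namespace, that.namespace) && Objects.equals(name, that.name);
  }

  @Override
  public int hashCode() {
    return Objects.hash(namespace, name);
  }
}

/** Hypothetical {URN, labels} metric name; never equal to a plain SketchMetricName. */
final class SketchUrnMetricName extends SketchMetricName {
  final String urn;
  final Map<String, String> labels;

  SketchUrnMetricName(String urn, Map<String, String> labels) {
    super(null, null);
    this.urn = urn;
    this.labels = labels;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SketchUrnMetricName)) {
      return false;
    }
    SketchUrnMetricName that = (SketchUrnMetricName) o;
    return urn.equals(that.urn) && labels.equals(that.labels);
  }

  @Override
  public int hashCode() {
    return Objects.hash(urn, labels);
  }
}

final class SketchMetricNames {
  // Assumption: user metrics share this URN prefix, followed by "<namespace>:<name>".
  static final String USER_PREFIX = "beam:metric:user:";

  /** Single factory: the declared return type is the super-type of both cases. */
  static SketchMetricName named(String urn, Map<String, String> labels) {
    if (urn.startsWith(USER_PREFIX)) {
      String[] parts = urn.substring(USER_PREFIX.length()).split(":", 2);
      if (parts.length == 2) {
        return new SketchMetricName(parts[0], parts[1]);
      }
    }
    return new SketchUrnMetricName(urn, labels);
  }
}
```

Callers that only need "some metric name" can accept the super-type and stay indifferent to whether the URN was a user or a system one.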
   
   So, re: your comment here, I could create an additional 
`MonitoringInfoMetricName` constructor that's only to be used for system metric 
URNs, and have `ElementCountFnDataReceiver` call that 
[here](https://github.com/apache/beam/pull/7876/files#diff-85f1cc117121c923d328b4d98374c18aR53)
 instead, and then pass the `MonitoringInfoMetricName` it got to 
`LabeledMetrics.counter` here, but I don't think that is worth it.
   
   That new constructor would feel unsafe if it didn't validate that the passed 
URN was in fact a system-metric URN, and that validation would start to be a lot 
of overhead, when `LabeledMetrics.counter` (and other callers) don't actually 
care whether they are dealing with a system- or user-metric.
   
   So that's the whole story 😄 ! lmk what you think.
 



Issue Time Tracking
---

Worklog Id: (was: 201126)
Time Spent: 14h 10m  (was: 14h)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201130
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258362473
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java
 ##
 @@ -94,121 +93,128 @@ public MetricsContainer getMetricsContainer(String stepName) {
         : null;
   }
 
+  public MetricsContainer getUnboundMetricsContainer() {
+    return metricsAccumulator != null
+        ? metricsAccumulator.getLocalValue().getUnboundContainer()
+        : null;
+  }
+
   /**
    * Update this container with metrics from the passed {@link MonitoringInfo}s, and send updates
    * along to Flink's internal metrics framework.
    */
-  public void updateMetrics(String stepName, List monitoringInfos) {
-    MetricsContainer metricsContainer = getMetricsContainer(stepName);
+  public void updateMetrics(List monitoringInfos) {
 
 Review comment:
   Interesting point. To paraphrase: ingesting MonitoringInfos into a 
`MetricsContainerStepMap` (which other runners also use) will be a common path, 
and code for that should get factored out.
   
   I experimented with keeping fn-api metrics natively in MonitoringInfos in 
MCSM, alongside traditional Java-SDK-formatted ones, but ended up needing to do 
a lot of other cleanup in there (#7890) before I could see straight 😄.
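
One way the factored-out ingestion path might look, as a sketch only: each incoming info is routed to a per-step container by its `PTRANSFORM` label, and infos without that label fall into an "unbound" container (mirroring the `getUnboundMetricsContainer` accessor in the diff above). `SketchInfo` and `SketchStepMap` are hypothetical simplifications that carry only an integer counter; they are not Beam types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, counter-only stand-in for a MonitoringInfo. */
final class SketchInfo {
  final String urn;
  final Map<String, String> labels;
  final long counter;

  SketchInfo(String urn, Map<String, String> labels, long counter) {
    this.urn = urn;
    this.labels = labels;
    this.counter = counter;
  }
}

/** Hypothetical step map: one counter container per step, plus an unbound one. */
final class SketchStepMap {
  static final String PTRANSFORM_LABEL = "PTRANSFORM";

  private final Map<String, Map<String, Long>> perStep = new HashMap<>();
  private final Map<String, Long> unbound = new HashMap<>();

  /** Shared ingestion path: the caller no longer passes a step name. */
  void ingest(List<SketchInfo> infos) {
    for (SketchInfo info : infos) {
      String step = info.labels.get(PTRANSFORM_LABEL);
      Map<String, Long> container =
          step == null ? unbound : perStep.computeIfAbsent(step, s -> new HashMap<>());
      // Counters aggregate by sum.
      container.merge(info.urn, info.counter, Long::sum);
    }
  }

  long counter(String step, String urn) {
    Map<String, Long> container = perStep.get(step);
    return container == null ? 0L : container.getOrDefault(urn, 0L);
  }

  long unboundCounter(String urn) {
    return unbound.getOrDefault(urn, 0L);
  }
}
```

A runner-specific container could then delegate to one such helper and only add its own reporting on top.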
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 201130)
Time Spent: 14h 50m  (was: 14h 40m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201129
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258362774
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java
 ##
 @@ -94,121 +93,128 @@ public MetricsContainer getMetricsContainer(String stepName) {
         : null;
   }
 
+  public MetricsContainer getUnboundMetricsContainer() {
+    return metricsAccumulator != null
+        ? metricsAccumulator.getLocalValue().getUnboundContainer()
+        : null;
+  }
+
   /**
    * Update this container with metrics from the passed {@link MonitoringInfo}s, and send updates
    * along to Flink's internal metrics framework.
    */
-  public void updateMetrics(String stepName, List monitoringInfos) {
-    MetricsContainer metricsContainer = getMetricsContainer(stepName);
+  public void updateMetrics(List monitoringInfos) {
+    LOG.info("Flink updating metrics with {} monitoring infos", monitoringInfos.size());
     monitoringInfos.forEach(
         monitoringInfo -> {
-          if (monitoringInfo.hasMetric()) {
-            String urn = monitoringInfo.getUrn();
-            MetricName metricName = parseUrn(urn);
-            Metric metric = monitoringInfo.getMetric();
-            if (metric.hasCounterData()) {
-              CounterData counterData = metric.getCounterData();
-              if (counterData.getValueCase() == CounterData.ValueCase.INT64_VALUE) {
-                org.apache.beam.sdk.metrics.Counter counter =
-                    metricsContainer.getCounter(metricName);
-                counter.inc(counterData.getInt64Value());
-              } else {
-                LOG.warn("Unsupported CounterData type: {}", counterData);
-              }
-            } else if (metric.hasDistributionData()) {
-              DistributionData distributionData = metric.getDistributionData();
-              if (distributionData.hasIntDistributionData()) {
-                Distribution distribution = metricsContainer.getDistribution(metricName);
-                IntDistributionData intDistributionData = distributionData.getIntDistributionData();
-                distribution.update(
-                    intDistributionData.getSum(),
-                    intDistributionData.getCount(),
-                    intDistributionData.getMin(),
-                    intDistributionData.getMax());
-              } else {
-                LOG.warn("Unsupported DistributionData type: {}", distributionData);
-              }
-            } else if (metric.hasExtremaData()) {
-              ExtremaData extremaData = metric.getExtremaData();
-              LOG.warn("Extrema metric unsupported: {}", extremaData);
-            }
+          if (!monitoringInfo.hasMetric()) {
 
 Review comment:
   Interesting. It seems in this case like it's easier and better to just 
handle all the metrics that come through conforming to the standard format? Or 
maybe I'm not understanding what you mean.
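
For reference, the removed branch above folded each integer distribution into a container cell through a four-argument `update(sum, count, min, max)` call. A minimal sketch of how such an update can merge partial results is below; `SketchDistribution` is a hypothetical class written for illustration, not Beam's `Distribution` implementation.

```java
/** Hypothetical integer distribution cell that merges partial (sum, count, min, max) updates. */
final class SketchDistribution {
  private long sum = 0;
  private long count = 0;
  private long min = Long.MAX_VALUE; // identity for Math.min
  private long max = Long.MIN_VALUE; // identity for Math.max

  /** Folds one partial result into the running aggregate. */
  void update(long sum, long count, long min, long max) {
    this.sum += sum;
    this.count += count;
    this.min = Math.min(this.min, min);
    this.max = Math.max(this.max, max);
  }

  long getSum() {
    return sum;
  }

  long getCount() {
    return count;
  }

  long getMin() {
    return min;
  }

  long getMax() {
    return max;
  }
}
```

Because every field combines associatively, updates from several monitoring infos can be applied in any order.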
 



Issue Time Tracking
---

Worklog Id: (was: 201129)
Time Spent: 14h 40m  (was: 14.5h)


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201127&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201127
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 07:40
Start Date: 20/Feb/19 07:40
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7876: 
[BEAM-4775] Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258344412
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -501,48 +501,9 @@ message MonitoringInfoTypeUrns {
 message Metric {
   // (Required) The data for this metric.
   oneof data {
-CounterData counter_data = 1;
-DistributionData distribution_data = 2;
-ExtremaData extrema_data = 3;
-  }
-}
-
-// Data associated with a Counter or Gauge metric.
-// This is designed to be compatible with metric collection
-// systems such as DropWizard.
-message CounterData {
-  oneof value {
-int64 int64_value = 1;
-double double_value = 2;
-string string_value = 3;
-  }
-}
-
-// Extrema messages are used for calculating
-// Top-N/Bottom-N metrics.
-message ExtremaData {
-  oneof extrema {
-IntExtremaData int_extrema_data = 1;
-DoubleExtremaData double_extrema_data = 2;
-  }
-}
-
-message IntExtremaData {
-  repeated int64 int_values = 1;
-}
-
-message DoubleExtremaData {
-  repeated double double_values = 2;
-}
-
-// Data associated with a distribution metric.
-// This is based off of the current DistributionData metric.
-// This is not a stackdriver or dropwizard compatible
-// style of distribution metric.
-message DistributionData {
-  oneof distribution {
-IntDistributionData int_distribution_data = 1;
-DoubleDistributionData double_distribution_data = 2;
+int64 counter = 1;
 
 Review comment:
   Thanks for the info and pointers!
   
   I was just going off of what seemed to be already supported in the Java and 
Python SDKs.
   
   In particular, existing Java metrics are all in terms of `long`, and I 
thought I discussed keeping that and dropping these other types with someone 
(@robertwb?) in the course of planning this work.
   
   I'll read those links and we can discuss further.
 



Issue Time Tracking
---

Worklog Id: (was: 201127)
Time Spent: 14h 20m  (was: 14h 10m)



[jira] [Created] (BEAM-6715) ./gradlew :beam-runners-google-cloud-dataflow-java:validatesRunner fails due to invalid GCS path

2019-02-19 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-6715:
-

 Summary: ./gradlew 
:beam-runners-google-cloud-dataflow-java:validatesRunner fails due to invalid 
GCS path
 Key: BEAM-6715
 URL: https://issues.apache.org/jira/browse/BEAM-6715
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


The default root is gs://temp-storage-for-end-to-end-tests/ and the code adds a 
slash and then the test case name, so it results in two slashes in a row, 
rejected by the filesystem implementation.
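
A small sketch of the failure mode and one possible guard, assuming the path is built by plain string concatenation. `GcsPaths.join` is a hypothetical helper and `MyTestCase` a made-up test name; neither comes from the Beam build.

```java
final class GcsPaths {
  /** Joins root and child with exactly one '/' between them. */
  static String join(String root, String child) {
    String left = root.endsWith("/") ? root.substring(0, root.length() - 1) : root;
    String right = child.startsWith("/") ? child.substring(1) : child;
    return left + "/" + right;
  }

  /** The naive form described above: always adds a slash, even after a trailing one. */
  static String naiveJoin(String root, String child) {
    return root + "/" + child;
  }
}
```

With the default root, `naiveJoin` produces `gs://temp-storage-for-end-to-end-tests//MyTestCase`, while `join` produces `gs://temp-storage-for-end-to-end-tests/MyTestCase`.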





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201103&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201103
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:26
Start Date: 20/Feb/19 05:26
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465343573
 
 
   > Just curious, what were you hoping to get out of this change in 
particular? This is largely a bunch of renamings, was it too confusing before?
   
   There are 2 renamings:
   - 
https://github.com/apache/beam/pull/7868/commits/0022bf9a6e12ae35b33d4bf06e0b53e9e0e7f1ad:
 `MonitoringInfoLabels.{TRANSFORM → PTRANSFORM}`, per [this 
`TODO(ajamato)`](https://github.com/apache/beam/pull/7868/commits/737c9e25ebbbf2654d2294705d5d3f14e17a841d#diff-4d226ebf3f70cf18449c29e82511a67eL439)
 😄  
   - https://github.com/apache/beam/pull/7868/commits/0715c199031548b218c425adbd7870d7460cbab2: `USER_COUNTER_URN_PREFIX` → `USER_METRIC_URN_PREFIX`
     - (in `SimpleMonitoringInfoBuilder` and `MonitoringInfoUrns.Enum`)
     - we need either this change, or to add new user-URN-prefixes for distributions/gauges, afaict.
   
   There are also 6 other commits here that each constitute minor forward 
progress, I think:
   - 
https://github.com/apache/beam/pull/7868/commits/9d1e237ee5ae0182f43cfc9280a84021c1230d31:
 `UserMonitoringInfoToCounterUpdateTransformer` had [a redundant definition of  
`USER_COUNTER_URN_PREFIX`](https://github.com/apache/beam/pull/7868/commits/08108ccdfad500bb68241e588bb525e36e0efa3d#diff-a2189a7c97dace5f44f04e16a71b6468L56)
   - 
https://github.com/apache/beam/pull/7868/commits/84abbfafd9d75c7169af2625b5e72f539da6ede0:
 consolidate two (different!) implementations of [parsing a MetricName from a 
URN]
   - etc.
   
   Sorry if it just feels like dust. I thought these were all thematically 
similar and a reasonable chunk to pull out of #7823 and reckon with 
independently.
 



Issue Time Tracking
---

Worklog Id: (was: 201103)
Time Spent: 13.5h  (was: 13h 20m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201108&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201108
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:31
Start Date: 20/Feb/19 05:31
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465343573
 
 
   > Just curious, what were you hoping to get out of this change in 
particular? This is largely a bunch of renamings, was it too confusing before?
   
   There are 2 renamings:
   - 
https://github.com/apache/beam/pull/7868/commits/0022bf9a6e12ae35b33d4bf06e0b53e9e0e7f1ad:
 `MonitoringInfoLabels.{TRANSFORM → PTRANSFORM}`, per [this 
`TODO(ajamato)`](https://github.com/apache/beam/commit/0022bf9a6e12ae35b33d4bf06e0b53e9e0e7f1ad#diff-4d226ebf3f70cf18449c29e82511a67eL439)
 😄  
   - https://github.com/apache/beam/pull/7868/commits/0715c199031548b218c425adbd7870d7460cbab2: `USER_COUNTER_URN_PREFIX` → `USER_METRIC_URN_PREFIX`
     - (in `SimpleMonitoringInfoBuilder` and `MonitoringInfoUrns.Enum`)
     - we need either this change, or to add new user-URN-prefixes for distributions/gauges, afaict.
   
   There are also 6 other commits here that each constitute minor forward 
progress, I think:
   - 
https://github.com/apache/beam/pull/7868/commits/9d1e237ee5ae0182f43cfc9280a84021c1230d31:
 `UserMonitoringInfoToCounterUpdateTransformer` had [a redundant definition of  
`USER_COUNTER_URN_PREFIX`](https://github.com/apache/beam/commit/9d1e237ee5ae0182f43cfc9280a84021c1230d31#diff-a2189a7c97dace5f44f04e16a71b6468L56)
   - 
https://github.com/apache/beam/pull/7868/commits/84abbfafd9d75c7169af2625b5e72f539da6ede0:
 consolidate two (different!) implementations of [parsing a MetricName from a 
URN]
   - etc.
   
   Sorry if it just feels like dust. I thought these were all thematically 
similar and a reasonable chunk to pull out of #7823 and reckon with 
independently.
 



Issue Time Tracking
---

Worklog Id: (was: 201108)
Time Spent: 13h 50m  (was: 13h 40m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201107&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201107
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:30
Start Date: 20/Feb/19 05:30
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465343573
 
 
   > Just curious, what were you hoping to get out of this change in particular? This is largely a bunch of renamings, was it too confusing before?
   
   There are 2 renamings:
   - https://github.com/apache/beam/pull/7868/commits/0022bf9a6e12ae35b33d4bf06e0b53e9e0e7f1ad:
     `MonitoringInfoLabels.{TRANSFORM → PTRANSFORM}`, per [this `TODO(ajamato)`](https://github.com/apache/beam/commit/0022bf9a6e12ae35b33d4bf06e0b53e9e0e7f1ad#diff-4d226ebf3f70cf18449c29e82511a67eL439) 😄
   - https://github.com/apache/beam/pull/7868/commits/0715c199031548b218c425adbd7870d7460cbab2:
     `USER_COUNTER_URN_PREFIX` → `USER_METRIC_URN_PREFIX`
     - (in `SimpleMonitoringInfoBuilder` and `MonitoringInfoUrns.Enum`)
     - we need either this change, or to add new user-URN-prefixes for distributions/gauges, afaict.
   
   There are also 6 other commits here that each constitute minor forward 
progress, I think:
   - https://github.com/apache/beam/pull/7868/commits/9d1e237ee5ae0182f43cfc9280a84021c1230d31:
     `UserMonitoringInfoToCounterUpdateTransformer` had [a redundant definition of `USER_COUNTER_URN_PREFIX`](https://github.com/apache/beam/pull/7868/commits/08108ccdfad500bb68241e588bb525e36e0efa3d#diff-a2189a7c97dace5f44f04e16a71b6468L56)
   - https://github.com/apache/beam/pull/7868/commits/84abbfafd9d75c7169af2625b5e72f539da6ede0:
     consolidate two (different!) implementations of [parsing a MetricName from a URN]
   - etc.
   
   Sorry if it just feels like dust. I thought these were all thematically 
similar and a reasonable chunk to pull out of #7823 and reckon with 
independently.
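
For illustration only, here is a minimal sketch of what "parsing a MetricName from a URN" could look like once there is a single user-metric prefix. It is not the code from the PR: the function name `parse_user_metric_urn` is hypothetical, and the layout `<prefix><namespace>:<name>` (with the name taken as everything after the last colon) is an assumption. Only the prefix string `beam:metric:user:` comes from the proto diff quoted elsewhere in this thread.

```python
# Prefix string as it appears in the proto diff quoted in this thread.
USER_METRIC_URN_PREFIX = "beam:metric:user:"


def parse_user_metric_urn(urn):
    """Return (namespace, name) for a user-metric URN, or None otherwise.

    Assumption (not confirmed by the thread): the URN has the form
    "<prefix><namespace>:<name>", and the name is whatever follows the
    last ':' in the remainder.
    """
    if not urn.startswith(USER_METRIC_URN_PREFIX):
        return None
    rest = urn[len(USER_METRIC_URN_PREFIX):]
    namespace, sep, name = rest.rpartition(":")
    # Reject remainders that lack a separator or have an empty part.
    if not sep or not namespace or not name:
        return None
    return namespace, name
```

Having one such function, rather than two slightly different ones, is presumably the point of the consolidation commit: every caller then agrees on where the namespace ends and the name begins.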
 



Issue Time Tracking
---

Worklog Id: (was: 201107)
Time Spent: 13h 40m  (was: 13.5h)



[jira] [Work logged] (BEAM-4076) Schema followups

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201104
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:28
Start Date: 20/Feb/19 05:28
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7635: [BEAM-4076] 
Generalize schema inputs to ParDo
URL: https://github.com/apache/beam/pull/7635#issuecomment-465430283
 
 
   Thanks! I'll also point out that the code in ParDo.java was refactored and
   simplified in the PR that follows this one. Let me know if you'd prefer me
   to backport that refactoring to this PR.
   
   On Tue, Feb 19, 2019 at 9:04 PM Kenn Knowles wrote:
   
   > FYI, I'm using reviewable.io to check through the files and I'm halfway
   > there. Most are pretty trivial plumbing so I want to be able to focus on
   > the real changes.
   
 



Issue Time Tracking
---

Worklog Id: (was: 201104)
Time Spent: 19h 10m  (was: 19h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 19h 10m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201097
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:18
Start Date: 20/Feb/19 05:18
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465428337
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 201097)
Time Spent: 12h 50m  (was: 12h 40m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201101
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:22
Start Date: 20/Feb/19 05:22
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7868: 
[BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258337315
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -337,24 +337,19 @@ message Annotation {
 // MonitoringInfo protos.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(ajamato): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user:",
-  type_urn: "beam:metrics:sum_int_64",
-}];
-
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+ELEMENT_COUNT = 0 [(monitoring_info_spec) = {
   urn: "beam:metric:element_count:v1",
   type_urn: "beam:metrics:sum_int_64",
+  // TODO(ryan): we currently also generate this metric with ["PTRANSFORM"] labels, but it fails validation in
 
 Review comment:
   OK, I'll leave this for now. Perhaps I can remove it before this gets
   merged if @Ardagan fixes it first; otherwise, lmk what you suggest!
 



Issue Time Tracking
---

Worklog Id: (was: 201101)
Time Spent: 13h 20m  (was: 13h 10m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201100
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:20
Start Date: 20/Feb/19 05:20
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7868: 
[BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258255289
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -311,13 +311,13 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }
 
-// A specification containing required set of fields and labels required
-// to be set on a MonitoringInfo for the specific URN for SDK->RunnerHarness
+// A specification containing required set of fields and labels required to be
+// set on a MonitoringInfo for a given URN-type, for SDK->RunnerHarness
 
 Review comment:
   Yea, "type" was not what I meant here; context was [discussion w/ 
Robert](https://github.com/apache/beam/pull/7868#discussion_r257795396) about 
whether these specs describe "specific URNs" or "URN categories" (e.g. "user 
metrics").
   
   (reworded at `HEAD`)
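
To make the "specific URNs" versus "URN categories" distinction concrete, here is a rough sketch of a spec lookup that handles both. Everything in it is illustrative: the `SPECS` table, the `find_spec` name, and the convention that a key ending in `:` denotes a category (prefix) are assumptions, not Beam's actual validation code. The two URN strings are the ones shown in the diff above.

```python
# Hypothetical spec table. Assumed convention: a key ending in ':' is a URN
# category (matched by prefix); any other key is a specific URN (exact match).
SPECS = {
    "beam:metric:element_count:v1": "ELEMENT_COUNT",
    "beam:metric:user:": "USER_COUNTER",
}


def find_spec(urn):
    """Return the spec name that applies to `urn`, or None if there is none."""
    # Specific URN: must match a non-category key exactly.
    if urn in SPECS and not urn.endswith(":"):
        return SPECS[urn]
    # URN category: the URN must extend a prefix key with a non-empty suffix.
    for key, spec in SPECS.items():
        if key.endswith(":") and urn.startswith(key) and len(urn) > len(key):
            return spec
    return None
```

Under this reading, `ELEMENT_COUNT` describes exactly one URN, while the user-metric spec describes a whole family of URNs that share a prefix, which is why wording like "the specific URN" does not fit every spec.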
 



Issue Time Tracking
---

Worklog Id: (was: 201100)
Time Spent: 13h 10m  (was: 13h)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201099
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:19
Start Date: 20/Feb/19 05:19
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465428537
 
 
   ```
   :beam-sdks-java-io-cassandra:test FAILED
   --
    
   org.apache.beam.sdk.io.cassandra.CassandraIOTest > classMethod FAILED
   java.lang.AssertionError at CassandraIOTest.java:100
    
   org.apache.beam.sdk.io.cassandra.CassandraIOTest > classMethod FAILED
   java.lang.NullPointerException at CassandraIOTest.java:125
    
   6 tests completed, 2 failed
   ```
   
   
[scan](https://scans.gradle.com/s/btppkeky63a5g/console-log?task=:beam-sdks-java-io-cassandra:test)
 



Issue Time Tracking
---

Worklog Id: (was: 201099)
Time Spent: 13h  (was: 12h 50m)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201093
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:09
Start Date: 20/Feb/19 05:09
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7867: [BEAM-4775] key 
MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#issuecomment-465426463
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 201093)
Time Spent: 12h 40m  (was: 12.5h)



[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201092
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:09
Start Date: 20/Feb/19 05:09
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7867: [BEAM-4775] key 
MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#issuecomment-465426438
 
 
   hm, I see a bunch of 
   ```
   BeamIOError: src and dst files do not exist. src: 
/tmp/beam-temp-py-wordcount-direct-42fabb3a34cb11e984ac42010a8d/1c504dc3-20b6-4f8c-88ef-15912e953404.py-wordcount-direct,
 dst: /tmp/py-wordcount-direct-0-of-2 with exceptions None [while 
running 'write/Write/WriteImpl/FinalizeWrite
   --
   '] with exceptions None
   ```
   [in the 
scan](https://scans.gradle.com/s/5qa6sgdele2rs/console-log?task=:beam-sdks-python:portableWordCountBatch#L54)
 for `Portable_Python` (batch; streaming seems to have passed).
 



Issue Time Tracking
---

Worklog Id: (was: 201092)
Time Spent: 12.5h  (was: 12h 20m)



[jira] [Work logged] (BEAM-4076) Schema followups

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201089
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:04
Start Date: 20/Feb/19 05:04
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7635: [BEAM-4076] 
Generalize schema inputs to ParDo
URL: https://github.com/apache/beam/pull/7635#issuecomment-465425645
 
 
   FYI, I'm using reviewable.io to check through the files and I'm halfway
   there. Most are pretty trivial plumbing, so I want to be able to focus on the
   real changes.
 



Issue Time Tracking
---

Worklog Id: (was: 201089)
Time Spent: 19h  (was: 18h 50m)



[jira] [Work logged] (BEAM-6639) ClickHouseIOTest flakey failure failing in precomiits

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6639?focusedWorklogId=201078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201078
 ]

ASF GitHub Bot logged work on BEAM-6639:


Author: ASF GitHub Bot
Created on: 20/Feb/19 04:47
Start Date: 20/Feb/19 04:47
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #7797: 
[BEAM-6639] Retry pulling clickhouse-server docker image
URL: https://github.com/apache/beam/pull/7797
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 201078)
Time Spent: 4h 40m  (was: 4.5h)

> ClickHouseIOTest flakey failure failing in precomiits
> -
>
> Key: BEAM-6639
> URL: https://issues.apache.org/jira/browse/BEAM-6639
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PreCommit_Java_Commit/4166/testReport/junit/org.apache.beam.sdk.io.clickhouse/ClickHouseIOTest/classMethod/]
>  
> h3. Error Message
> org.testcontainers.containers.ContainerLaunchException: Container startup 
> failed
> h3. Stacktrace
> org.testcontainers.containers.ContainerLaunchException: Container startup 
> failed at 
> org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:221)
>  at 
> org.testcontainers.containers.GenericContainer.start(GenericContainer.java:203)
>  at 
> org.apache.beam.sdk.io.clickhouse.BaseClickHouseTest.setup(BaseClickHouseTest.java:68)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) 
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:396) at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>  at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>  at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>  at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>  at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>  at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>  at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>  at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>  at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>  at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>  at 
> org.gradle.internal.rem

[jira] [Work logged] (BEAM-5795) Can SQL Query 5 be simplified?

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5795?focusedWorklogId=201086&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201086
 ]

ASF GitHub Bot logged work on BEAM-5795:


Author: ASF GitHub Bot
Created on: 20/Feb/19 05:03
Start Date: 20/Feb/19 05:03
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #6757: [BEAM-5795] 
Simplify SQL Query 5
URL: https://github.com/apache/beam/pull/6757#issuecomment-465425496
 
 
   This pull request is no longer marked as stale.
   
 



Issue Time Tracking
---

Worklog Id: (was: 201086)
Time Spent: 50m  (was: 40m)

> Can SQL Query 5 be simplified?
> --
>
> Key: BEAM-5795
> URL: https://issues.apache.org/jira/browse/BEAM-5795
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The original CQL query uses the ALL operator over the set of rows that are 
> within a certain period from the watermark. We instead have a fancy join, due 
> to windowing. Nonetheless, can this be simplified?





[jira] [Commented] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image

2019-02-19 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772602#comment-16772602
 ] 

Kenneth Knowles commented on BEAM-6706:
---

[~tvalentyn] any ideas?

> User reports trouble downloading 2.10.0 Dataflow worker image
> -
>
> Key: BEAM-6706
> URL: https://issues.apache.org/jira/browse/BEAM-6706
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
>
> DataFlow however is throwing all sorts of errors.  For example:
> * Handler for GET 
> /v1.27/images/gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0/json 
> returned error: No such image: 
> gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0"
> * while reading 'google-dockercfg' metadata: http status code: 404 while 
> fetching url 
> http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg";
> * Error syncing pod..."
> The job gets stuck after starting a worker and after an hour or so it gives 
> up with a failure.  2.9.0 runs fine.





[jira] [Work logged] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?focusedWorklogId=201063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201063
 ]

ASF GitHub Bot logged work on BEAM-6459:


Author: ASF GitHub Bot
Created on: 20/Feb/19 03:34
Start Date: 20/Feb/19 03:34
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #7897: 
[BEAM-6459] Don't run setupVirtualenv during clean
URL: https://github.com/apache/beam/pull/7897
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 201063)
Time Spent: 40m  (was: 0.5h)

> gradle clean depends on setupVirtualEnv, which installs lots of stuff
> -
>
> Key: BEAM-6459
> URL: https://issues.apache.org/jira/browse/BEAM-6459
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Udi Meiri
>Priority: Major
>  Labels: starter
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> "This seems to be on purpose [1]
> AFAIU setup is done to be able to call into setup.py clean. We probably 
> should work around that."
> Thread: 
> https://lists.apache.org/thread.html/c87640aa1ffb654fd5b477f61cee1d4d53aa24f545e50013543427b1@%3Cdev.beam.apache.org%3E





[jira] [Work logged] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?focusedWorklogId=201064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201064
 ]

ASF GitHub Bot logged work on BEAM-6459:


Author: ASF GitHub Bot
Created on: 20/Feb/19 03:34
Start Date: 20/Feb/19 03:34
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7897: [BEAM-6459] Don't 
run setupVirtualenv during clean
URL: https://github.com/apache/beam/pull/7897#issuecomment-465407948
 
 
   Thanks!
 



Issue Time Tracking
---

Worklog Id: (was: 201064)
Time Spent: 50m  (was: 40m)

> gradle clean depends on setupVirtualEnv, which installs lots of stuff
> -
>
> Key: BEAM-6459
> URL: https://issues.apache.org/jira/browse/BEAM-6459
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Udi Meiri
>Priority: Major
>  Labels: starter
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> "This seems to be on purpose [1]
> AFAIU setup is done to be able to call into setup.py clean. We probably 
> should work around that."
> Thread: 
> https://lists.apache.org/thread.html/c87640aa1ffb654fd5b477f61cee1d4d53aa24f545e50013543427b1@%3Cdev.beam.apache.org%3E





[jira] [Work logged] (BEAM-4776) Java PortableRunner should support metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4776?focusedWorklogId=201059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201059
 ]

ASF GitHub Bot logged work on BEAM-4776:


Author: ASF GitHub Bot
Created on: 20/Feb/19 03:14
Start Date: 20/Feb/19 03:14
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7890: [BEAM-4776] 
consolidate MetricResult implementations
URL: https://github.com/apache/beam/pull/7890#issuecomment-465404199
 
 
   [Looks like flaky tests](https://scans.gradle.com/s/njubqln3d6mxc/failure#top=0):
   
   ```
   :beam-runners-google-cloud-dataflow-java-examples-streaming:preCommit FAILED
   --
    
   org.apache.beam.examples.WordCountIT > testE2EWordCount FAILED
   java.lang.RuntimeException at WordCountIT.java:69
   Caused by: java.io.IOException at WordCountIT.java:69
    
   1 test completed, 1 failed
   ```
   
   ```
   
   :beam-runners-google-cloud-dataflow-java-examples:preCommitLegacyWorker FAILED
   -- 
   org.apache.beam.examples.WordCountIT > testE2EWordCount FAILED
   java.lang.RuntimeException at WordCountIT.java:69
   Caused by: java.io.IOException at WordCountIT.java:69
    
   4 tests completed, 1 failed
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 201059)
Time Spent: 1.5h  (was: 1h 20m)

> Java PortableRunner should support metrics
> --
>
> Key: BEAM-4776
> URL: https://issues.apache.org/jira/browse/BEAM-4776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> BEAM-4775 concerns adding metrics to the JobService API; the current issue is 
> about making PortableRunner understand them.





[jira] [Work logged] (BEAM-4776) Java PortableRunner should support metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4776?focusedWorklogId=201058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201058
 ]

ASF GitHub Bot logged work on BEAM-4776:


Author: ASF GitHub Bot
Created on: 20/Feb/19 03:12
Start Date: 20/Feb/19 03:12
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7890: [BEAM-4776] 
consolidate MetricResult implementations
URL: https://github.com/apache/beam/pull/7890#issuecomment-465403763
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 201058)
Time Spent: 1h 20m  (was: 1h 10m)

> Java PortableRunner should support metrics
> --
>
> Key: BEAM-4776
> URL: https://issues.apache.org/jira/browse/BEAM-4776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> BEAM-4775 concerns adding metrics to the JobService API; the current issue is 
> about making PortableRunner understand them.





[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=201033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201033
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 20/Feb/19 02:03
Start Date: 20/Feb/19 02:03
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258307310
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java
 ##
 @@ -170,4 +174,193 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
FlatMapWithExceptions} instance to
+   * define an exception handler. If the handler does not provide sufficient 
type information, the
+   * user must also call {@code into} to define a type descriptor for the 
error collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public FlatMapWithExceptions withExceptions() {
+return new FlatMapWithExceptions<>(
+fn, originalFnForDisplayData, inputType, outputType, null, null);
+  }
+
+  /** Implementation of {@link FlatMapElements#withExceptions()}. */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public static class FlatMapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+@Nullable private final Contextful>> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+FlatMapWithExceptions(
+@Nullable Contextful>> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link FlatMapWithExceptions} transform with the given 
type descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  FlatMapWithExceptions 
into(
+TypeDescriptor failureTypeDescriptor) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String>> result = words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .into(TypeDescriptors.strings())
+ * .via(ee -> ee.exception().getMessage()));
+ * PCollection errors = result.errors();
+ * }
+ */
+public FlatMapWithExceptions via(
+ProcessFunction, FailureT> exceptionHandler) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, 
exceptionHandler, failureType);
+}
+
+/**
+ * Like {@link #via(ProcessFunction)}, but takes advantage of the type 
information provided by
+ * {@link InferableFunction}, meaning that a call to {@link 
#into(TypeDescriptor)} may not be
+ * necessary.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, KV>> result = 
words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .via(new WithExceptions.ExceptionAsMapHandler() {}));
+ * PCol
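
The Javadoc in the diff above describes catching exceptions raised while mapping elements and passing the exception plus the offending input element through a handler into a separate error collection. As a rough illustration of that pattern only, here is a plain-Python sketch. It is not the Beam API: `flat_map_with_exceptions` and its parameters are hypothetical names, and the choice to emit nothing for an element that fails partway through is an assumption of this sketch.

```python
def flat_map_with_exceptions(fn, exception_handler, elements):
    """Apply fn to each element, splitting results into outputs and errors.

    fn(element) returns an iterable of outputs; exception_handler(element, exc)
    turns a failure into a value for the error list. Both are hypothetical
    stand-ins for the mapping function and handler described in the Javadoc.
    """
    outputs, errors = [], []
    for element in elements:
        try:
            # Materialize first, so a failure partway through fn emits
            # nothing for this element (an assumption of this sketch).
            results = list(fn(element))
        except Exception as exc:
            errors.append(exception_handler(element, exc))
        else:
            outputs.extend(results)
    return outputs, errors
```

Usage might look like `flat_map_with_exceptions(lambda line: [int(t) for t in line.split()], lambda el, exc: (el, type(exc).__name__), lines)`, which keeps the main output clean while recording which input failed and why.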

[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=201028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201028
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:58
Start Date: 20/Feb/19 01:58
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7872: [BEAM-6619] 
[BEAM-6593] Add more integration tests to postcommit
URL: https://github.com/apache/beam/pull/7872#issuecomment-465388155
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 201028)
Time Spent: 3h 10m  (was: 3h)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=201027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201027
 ]

ASF GitHub Bot logged work on BEAM-6619:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:57
Start Date: 20/Feb/19 01:57
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7872: [BEAM-6619] 
[BEAM-6593] Add more integration tests to postcommit
URL: https://github.com/apache/beam/pull/7872#issuecomment-465388087
 
 
   Run Python3 PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 201027)
Time Spent: 3h  (was: 2h 50m)

> Add PostCommit suite for integration tests on DataflowRunner
> 
>
> Key: BEAM-6619
> URL: https://issues.apache.org/jira/browse/BEAM-6619
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-6545) NPE when decoding null base 64 strings

2019-02-19 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772515#comment-16772515
 ] 

Kenneth Knowles commented on BEAM-6545:
---

I'm not sure about compatibility, but pinning google-http-library to 1.27.0 or 
another version that is null-accepting might work.

> NPE when decoding null base 64 strings
> --
>
> Key: BEAM-6545
> URL: https://issues.apache.org/jira/browse/BEAM-6545
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.9.0
>Reporter: Ahmet Altay
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ByteArrayShufflePosition.fromBase64 is marked with a @Nullable argument; 
> however, it does not properly handle null inputs, resulting in an NPE.
> This seems like an unintended change we picked up from the dependency: 
> google-http-java-client switched from Apache Commons to Guava 
> ([https://github.com/googleapis/google-http-java-client/commit/990c534f0e5103a142b0639c12c90cb990a00cfd#diff-97264fba16d690a26d63fbbc992af937]),
> and decodeBase64 behaves differently in the two cases: the former handles null 
> by returning null, while the latter throws an NPE.
>  
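
To illustrate the behavioural difference described in the issue, here is a minimal Python analogy. It does not use the Java libraries themselves, and both function names are made up for this sketch. One decoder passes a null value through, the other rejects it, so a caller that relied on the lenient behaviour needs an explicit guard after the switch.

```python
import base64


def decode_base64_lenient(value):
    """Null-tolerant decode: returns None when given None."""
    if value is None:
        return None
    return base64.b64decode(value)


def decode_base64_strict(value):
    """Null-rejecting decode: raises TypeError when given None."""
    return base64.b64decode(value)
```

A call site that may receive a missing value can either keep using a lenient wrapper like the first function or check for the missing value before calling the strict one.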





[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=201026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201026
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:47
Start Date: 20/Feb/19 01:47
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258304373
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java
 ##
 @@ -170,4 +174,193 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
FlatMapWithExceptions} instance to
+   * define an exception handler. If the handler does not provide sufficient 
type information, the
+   * user must also call {@code into} to define a type descriptor for the 
error collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public FlatMapWithExceptions withExceptions() {
+return new FlatMapWithExceptions<>(
+fn, originalFnForDisplayData, inputType, outputType, null, null);
+  }
+
+  /** Implementation of {@link FlatMapElements#withExceptions()}. */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public static class FlatMapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+@Nullable private final Contextful>> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+FlatMapWithExceptions(
+@Nullable Contextful>> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link FlatMapWithExceptions} transform with the given 
type descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  FlatMapWithExceptions 
into(
+TypeDescriptor failureTypeDescriptor) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String>> result = words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .into(TypeDescriptors.strings())
+ * .via(ee -> ee.exception().getMessage()));
+ * PCollection errors = result.errors();
+ * }
+ */
+public FlatMapWithExceptions via(
+ProcessFunction, FailureT> exceptionHandler) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, 
exceptionHandler, failureType);
+}
+
+/**
+ * Like {@link #via(ProcessFunction)}, but takes advantage of the type 
information provided by
+ * {@link InferableFunction}, meaning that a call to {@link 
#into(TypeDescriptor)} may not be
+ * necessary.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, KV>> result = 
words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .via(new WithExceptions.ExceptionAsMapHandler() {}));
+ * PColle

[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=201025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201025
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:46
Start Date: 20/Feb/19 01:46
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7874: [release-2.11.0] 
Backport for BEAM-6650 and BEAM-6678
URL: https://github.com/apache/beam/pull/7874#issuecomment-465385561
 
 
   There are still some missing containers (Dataflow runner harness). The Python 
containers are published.
 



Issue Time Tracking
---

Worklog Id: (was: 201025)
Time Spent: 5h 50m  (was: 5h 40m)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.





[jira] [Work logged] (BEAM-5627) Investigate why test_split_at_fraction_exhaustive consistently fails to split after 101 attempts on Python 3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5627?focusedWorklogId=201024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201024
 ]

ASF GitHub Bot logged work on BEAM-5627:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:45
Start Date: 20/Feb/19 01:45
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7878: 
[BEAM-5627][BEAM-6569] run gcsio tests
URL: https://github.com/apache/beam/pull/7878#discussion_r258304010
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -109,7 +109,7 @@ setenv =
   RUN_SKIPPED_PY3_TESTS=0
 extras = test,gcp
 modules =
-  
apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems_test,apache_beam.io.sources_test,apache_beam.transforms,apache_beam.testing,apache_beam.io.filesystemio_test,apache_beam.io.localfilesystem_test,apache_beam.io.range_trackers_test,apache_beam.io.restriction_trackers_test,apache_beam.io.source_test_utils_test,apache_beam.io.concat_source_test,apache_beam.io.filebasedsink_test,apache_beam.io.filebasedsource_test,apache_beam.io.textio_test,apache_beam.io.tfrecordio_test,apache_beam.examples.wordcount_debugging_test,apache_beam.examples.wordcount_minimal_test,apache_beam.examples.wordcount_test,apache_beam.io.parquetio_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_integration_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_tools_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.gcp.pubsub_test,apache_beam.io.gcp.datastore,apache_beam.io.gcp.datastore_write_it_test
+  
apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems_test,apache_beam.io.sources_test,apache_beam.transforms,apache_beam.testing,apache_beam.io.filesystemio_test,apache_beam.io.localfilesystem_test,apache_beam.io.range_trackers_test,apache_beam.io.restriction_trackers_test,apache_beam.io.source_test_utils_test,apache_beam.io.concat_source_test,apache_beam.io.filebasedsink_test,apache_beam.io.filebasedsource_test,apache_beam.io.textio_test,apache_beam.io.tfrecordio_test,apache_beam.examples.wordcount_debugging_test,apache_beam.examples.wordcount_minimal_test,apache_beam.examples.wordcount_test,apache_beam.io.parquetio_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_integration_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_tools_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.gcp.pubsub_test,apache_beam.io.gcp.datastore,apache_beam.io.gcp.datastore_write_it_test,apache_beam.io.gcp.gcsio_test
 
 Review comment:
   Can this be simplified to run all the tests?
 



Issue Time Tracking
---

Worklog Id: (was: 201024)
Time Spent: 4h  (was: 3h 50m)

> Investigate why test_split_at_fraction_exhaustive consistently fails to split 
> after 101 attempts on Python 3
> 
>
> Key: BEAM-5627
> URL: https://issues.apache.org/jira/browse/BEAM-5627
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Robbe
>Priority: Minor
>  Labels: triaged
> Fix For: Not applicable
>

[jira] [Commented] (BEAM-6545) NPE when decoding null base 64 strings

2019-02-19 Thread Chris Chow (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772510#comment-16772510
 ] 

Chris Chow commented on BEAM-6545:
--

How would you recommend avoiding this bug if we can't upgrade to 2.10.0?

> NPE when decoding null base 64 strings
> --
>
> Key: BEAM-6545
> URL: https://issues.apache.org/jira/browse/BEAM-6545
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.9.0
>Reporter: Ahmet Altay
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ByteArrayShufflePosition.fromBase64 is marked with a @Nullable argument; 
> however, it does not properly handle null inputs, resulting in an NPE.
> This seems like an unintended change we picked up from the dependency: 
> google-http-java-client switched from Apache Commons to Guava 
> ([https://github.com/googleapis/google-http-java-client/commit/990c534f0e5103a142b0639c12c90cb990a00cfd#diff-97264fba16d690a26d63fbbc992af937]),
> and decodeBase64 behaves differently in the two cases: the former handles null 
> by returning null, while the latter throws an NPE.
>  





[jira] [Work logged] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?focusedWorklogId=201020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201020
 ]

ASF GitHub Bot logged work on BEAM-6459:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:29
Start Date: 20/Feb/19 01:29
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7897: [BEAM-6459] Don't run 
setupVirtualenv during clean
URL: https://github.com/apache/beam/pull/7897#issuecomment-465381813
 
 
   CC: @tvalentyn @charlesccychen 
 



Issue Time Tracking
---

Worklog Id: (was: 201020)
Time Spent: 0.5h  (was: 20m)

> gradle clean depends on setupVirtualEnv, which installs lots of stuff
> -
>
> Key: BEAM-6459
> URL: https://issues.apache.org/jira/browse/BEAM-6459
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Udi Meiri
>Priority: Major
>  Labels: starter
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> "This seems to be on purpose [1]
> AFAIU setup is done to be able to call into setup.py clean. We probably 
> should work around that."
> Thread: 
> https://lists.apache.org/thread.html/c87640aa1ffb654fd5b477f61cee1d4d53aa24f545e50013543427b1@%3Cdev.beam.apache.org%3E





[jira] [Created] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver

2019-02-19 Thread Kyle Weaver (JIRA)
Kyle Weaver created BEAM-6714:
-

 Summary: Move runner-agnostic code out of FlinkJobServerDriver
 Key: BEAM-6714
 URL: https://issues.apache.org/jira/browse/BEAM-6714
 Project: Beam
  Issue Type: Task
  Components: runner-flink, runner-spark
Reporter: Kyle Weaver
Assignee: Kyle Weaver


[FlinkJobServerDriver|https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java]
 contains quite a bit of code that is not actually specific to the Flink 
runner. This runner-agnostic code should be shared so that other runners (e.g., 
Spark) developing portability support can leverage it.





[jira] [Work logged] (BEAM-6709) Typehinting depends on typing changes in Python 3.5.3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6709?focusedWorklogId=201014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201014
 ]

ASF GitHub Bot logged work on BEAM-6709:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:09
Start Date: 20/Feb/19 01:09
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #7873: [BEAM-6709] 
Check tuple typing failure.
URL: https://github.com/apache/beam/pull/7873#issuecomment-465377205
 
 
   Thanks, this LGTM.
 



Issue Time Tracking
---

Worklog Id: (was: 201014)
Time Spent: 1h  (was: 50m)

> Typehinting depends on typing changes in Python 3.5.3
> -
>
> Key: BEAM-6709
> URL: https://issues.apache.org/jira/browse/BEAM-6709
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> On Python versions < 3.5.3, the Tuple and Union types from typing do not have 
> an `__args__` attribute; instead, Tuple has `__tuple_params__`, and Union has 
> `__union_params__` and `__union_set_params__`.
> The current implementation fails on versions < 3.5.3 because it depends on the 
> `__args__` attribute.
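
One way the fallback described above could look, as a minimal sketch rather than the code in the PR: try `__args__` first and fall back to the older attribute names only when it is missing. The function name `get_type_arg` is hypothetical, and the sketch assumes only Tuple and Union hints are passed in.

```python
import typing


def get_type_arg(typ, index):
    """Return the index-th type argument of a Tuple or Union type hint."""
    # Newer typing modules expose the type arguments as __args__.
    args = getattr(typ, '__args__', None)
    if args is None:
        # Older typing modules (per the issue, Python < 3.5.3) used
        # type-specific attribute names instead.
        if hasattr(typ, '__tuple_params__'):
            args = typ.__tuple_params__
        elif hasattr(typ, '__union_params__'):
            args = typ.__union_params__
        else:
            raise TypeError(
                'Only Union or Tuple type hints are expected, got %r' % (typ,))
    return args[index]
```

Checking for the attribute rather than comparing `sys.version_info` keeps the branch independent of the exact release in which the attribute names changed.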





[jira] [Work logged] (BEAM-6709) Typehinting depends on typing changes in Python 3.5.3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6709?focusedWorklogId=201015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201015
 ]

ASF GitHub Bot logged work on BEAM-6709:


Author: ASF GitHub Bot
Created on: 20/Feb/19 01:09
Start Date: 20/Feb/19 01:09
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on pull request #7873: 
[BEAM-6709] Check tuple typing failure.
URL: https://github.com/apache/beam/pull/7873
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 201015)
Time Spent: 1h 10m  (was: 1h)

> Typehinting depends on typing changes in Python 3.5.3
> -
>
> Key: BEAM-6709
> URL: https://issues.apache.org/jira/browse/BEAM-6709
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> On Python versions < 3.5.3, the Tuple and Union types from typing do not have 
> an `__args__` attribute. Tuple has `__tuple_params__` instead, and Union has 
> `__union_params__` and `__union_set_params__`.
> The current implementation fails on versions < 3.5.3 because it depends on 
> the `__args__` attribute.





[jira] [Work logged] (BEAM-6709) Typehinting depends on typing changes in Python 3.5.3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6709?focusedWorklogId=201000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201000
 ]

ASF GitHub Bot logged work on BEAM-6709:


Author: ASF GitHub Bot
Created on: 20/Feb/19 00:47
Start Date: 20/Feb/19 00:47
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7873: [BEAM-6709] 
Check tuple typing failure.
URL: https://github.com/apache/beam/pull/7873#discussion_r258291272
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -37,14 +38,33 @@
 
 def _get_arg(typ, index):
   """Returns the index-th argument to the given type."""
-  return typ.__args__[index]
+  try:
+    return typ.__args__[index]
+  except AttributeError:
+    if (3, 0, 0) <= sys.version_info[0:3] < (3, 5, 3):
+      # On Python versions < 3.5.3, the Tuple and Union type from typing do
+      # not have an __args__ attribute, but a __tuple_params__, and a
+      # __union_params__ and __union_set_params__ argument respectively.
+      args = next(value for key, value in typ.__dict__.items()
 
 Review comment:
   This looks correct, but it is hard to read. I suggest we rewrite this to be 
more explicit here:
   
   ```
   if hasattr(typ, '__tuple_params__'):
     args = typ.__tuple_params__
   elif hasattr(typ, '__union_params__'):
     args = typ.__union_params__
   else:
     assert False, "Only Union or Tuple metatypes are expected here."
   return args[index]
   ```
   We can do a similar change below.
   
   Also R: @charlesccychen to take another look on this change, thank you.
   
   FYI, 3.5.2 seems to use this revision of typing.py 
https://github.com/python/cpython/blob/3b557991d4a7626cf12baf2277bc87acbc439744/Lib/typing.py,
 changed in 
https://github.com/python/cpython/commit/5fc25a873cfdec27e46f71e62c9b65df5667c1b4#diff-015978a768b517a38abbc0ecdea87f5a.
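For illustration only, the explicit lookup suggested in this comment could be sketched as a standalone function. This is a minimal sketch under stated assumptions, not the code in the PR: the name `get_type_arg` and the `TypeError` fallback are hypothetical, and the legacy attribute names are the ones given in the issue description.

```python
import typing


def get_type_arg(typ, index):
  """Returns the index-th type argument of a typing Tuple or Union.

  Hypothetical helper: prefers __args__ and falls back to the legacy
  attribute names mentioned in the issue description.
  """
  args = getattr(typ, '__args__', None)
  if args is None:
    if hasattr(typ, '__tuple_params__'):
      args = typ.__tuple_params__
    elif hasattr(typ, '__union_params__'):
      args = typ.__union_params__
    else:
      raise TypeError('Only Union or Tuple types are expected here.')
  return args[index]


class LegacyTupleStandIn(object):
  """Stand-in that mimics a type exposing only __tuple_params__."""
  __tuple_params__ = (int, str)


second_of_tuple = get_type_arg(typing.Tuple[int, str], 1)
first_of_union = get_type_arg(typing.Union[int, str], 0)
first_of_legacy = get_type_arg(LegacyTupleStandIn, 0)
```

Checking for the attributes by name, rather than scanning `typ.__dict__`, keeps the two supported cases visible to the reader.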
  
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 201000)
Time Spent: 50m  (was: 40m)

> Typehinting depends on typing changes in Python 3.5.3
> -
>
> Key: BEAM-6709
> URL: https://issues.apache.org/jira/browse/BEAM-6709
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> On Python versions < 3.5.3, the Tuple and Union types from typing do not have 
> an `__args__` attribute. Tuple has `__tuple_params__` instead, and Union has 
> `__union_params__` and `__union_set_params__`.
> The current implementation fails on versions < 3.5.3 because it depends on 
> the `__args__` attribute.





[jira] [Work logged] (BEAM-6709) Typehinting depends on typing changes in Python 3.5.3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6709?focusedWorklogId=200999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200999
 ]

ASF GitHub Bot logged work on BEAM-6709:


Author: ASF GitHub Bot
Created on: 20/Feb/19 00:41
Start Date: 20/Feb/19 00:41
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7873: [BEAM-6709] 
Check tuple typing failure.
URL: https://github.com/apache/beam/pull/7873#discussion_r258291272
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -37,14 +38,33 @@
 
 def _get_arg(typ, index):
   """Returns the index-th argument to the given type."""
-  return typ.__args__[index]
+  try:
+    return typ.__args__[index]
+  except AttributeError:
+    if (3, 0, 0) <= sys.version_info[0:3] < (3, 5, 3):
+      # On Python versions < 3.5.3, the Tuple and Union type from typing do
+      # not have an __args__ attribute, but a __tuple_params__, and a
+      # __union_params__ and __union_set_params__ argument respectively.
+      args = next(value for key, value in typ.__dict__.items()
 
 Review comment:
   This looks correct, but it is hard to read. I suggest we rewrite this to be 
more explicit here:
   
   ```
   if hasattr(typ, '__tuple_params__'):
     args = typ.__tuple_params__
   elif hasattr(typ, '__union_params__'):
     args = typ.__union_params__
   else:
     assert False, "Only Union or Tuple metatypes are expected here."
   return args[index]
   ```
   We can do a similar change below.
   
   Also R: @charlesccychen to take another look on this change, thank you.
   
   FYI, 3.5.2 seems to use this revision of typing.py 
https://github.com/python/cpython/blob/3b557991d4a7626cf12baf2277bc87acbc439744/Lib/typing.py,
 changed in 
https://github.com/python/cpython/commit/5fc25a873cfdec27e46f71e62c9b65df5667c1b4#diff-015978a768b517a38abbc0ecdea87f5a.
  
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 200999)
Time Spent: 40m  (was: 0.5h)

> Typehinting depends on typing changes in Python 3.5.3
> -
>
> Key: BEAM-6709
> URL: https://issues.apache.org/jira/browse/BEAM-6709
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> On Python versions < 3.5.3, the Tuple and Union types from typing do not have 
> an `__args__` attribute. Tuple has `__tuple_params__` instead, and Union has 
> `__union_params__` and `__union_set_params__`.
> The current implementation fails on versions < 3.5.3 because it depends on 
> the `__args__` attribute.





[jira] [Work logged] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?focusedWorklogId=200987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200987
 ]

ASF GitHub Bot logged work on BEAM-6459:


Author: ASF GitHub Bot
Created on: 20/Feb/19 00:23
Start Date: 20/Feb/19 00:23
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7897: [BEAM-6459] Don't run 
setupVirtualenv during clean
URL: https://github.com/apache/beam/pull/7897#issuecomment-465367157
 
 
   R: @kennknowles 
 



Issue Time Tracking
---

Worklog Id: (was: 200987)
Time Spent: 20m  (was: 10m)

> gradle clean depends on setupVirtualEnv, which installs lots of stuff
> -
>
> Key: BEAM-6459
> URL: https://issues.apache.org/jira/browse/BEAM-6459
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Udi Meiri
>Priority: Major
>  Labels: starter
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> "This seems to be on purpose [1]
> AFAIU setup is done to be able to call into setup.py clean. We probably 
> should work around that."
> Thread: 
> https://lists.apache.org/thread.html/c87640aa1ffb654fd5b477f61cee1d4d53aa24f545e50013543427b1@%3Cdev.beam.apache.org%3E





[jira] [Work logged] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?focusedWorklogId=200986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200986
 ]

ASF GitHub Bot logged work on BEAM-6459:


Author: ASF GitHub Bot
Created on: 20/Feb/19 00:23
Start Date: 20/Feb/19 00:23
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #7897: [BEAM-6459] Don't 
run setupVirtualenv during clean
URL: https://github.com/apache/beam/pull/7897
 
 
   `cleanPython` no longer depends on `setupVirtualenv`.
   
   
   
   
 



Iss

[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=200981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200981
 ]

ASF GitHub Bot logged work on BEAM-6158:


Author: ASF GitHub Bot
Created on: 20/Feb/19 00:08
Start Date: 20/Feb/19 00:08
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #7888: [BEAM-6158] 
Remove test_wordcount_without_save_main_session
URL: https://github.com/apache/beam/pull/7888#issuecomment-465363983
 
 
   +R: @charlesccychen @tvalentyn 
 



Issue Time Tracking
---

Worklog Id: (was: 200981)
Time Spent: 0.5h  (was: 20m)

> Enable support for save_main_session in Python 3
> 
>
> Key: BEAM-6158
> URL: https://issues.apache.org/jira/browse/BEAM-6158
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Mark Liu
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Labels: triaged
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This happened when I ran the wordcount example with the portable Dataflow runner in 
> Python 3.5. The failure shows in the worker log (unfortunately unformatted) of 
> [this 
> job|https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-29_11_47_38-6731484595556255542?project=google.com:clouddfe]:
> {code:java}
> Could not load main session: Traceback (most recent call last): File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py",
>  line 125, in main _load_main_session(semi_persistent_directory) File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py",
>  line 201, in _load_main_session pickler.load_session(session_file) File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 269, in load_session return dill.load_session(file_path) File 
> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in 
> load_session module = unpickler.load() File 
> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in 
> find_class return StockUnpickler.find_class(self, module, name) 
> AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from 
> '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'>
>  Traceback (most recent call last): File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py",
>  line 125, in main _load_main_session(semi_persistent_directory) File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py",
>  line 201, in _load_main_session pickler.load_session(session_file) File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 269, in load_session return dill.load_session(file_path) File 
> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in 
> load_session module = unpickler.load() File 
> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in 
> find_class return StockUnpickler.find_class(self, module, name) 
> AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from 
> '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'>
> {code}
> Looks like saved main session didn't work properly in Python 3.
> +cc: [~tvalentyn] [~robertwb] [~altay]





[jira] [Assigned] (BEAM-6459) gradle clean depends on setupVirtualEnv, which installs lots of stuff

2019-02-19 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-6459:
---

Assignee: Udi Meiri

> gradle clean depends on setupVirtualEnv, which installs lots of stuff
> -
>
> Key: BEAM-6459
> URL: https://issues.apache.org/jira/browse/BEAM-6459
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Udi Meiri
>Priority: Major
>  Labels: starter
>
> "This seems to be on purpose [1]
> AFAIU setup is done to be able to call into setup.py clean. We probably 
> should work around that."
> Thread: 
> https://lists.apache.org/thread.html/c87640aa1ffb654fd5b477f61cee1d4d53aa24f545e50013543427b1@%3Cdev.beam.apache.org%3E





[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=200978&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200978
 ]

ASF GitHub Bot logged work on BEAM-6711:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:58
Start Date: 19/Feb/19 23:58
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6711] 
[BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465361612
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 200978)
Time Spent: 1.5h  (was: 1h 20m)

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}
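
A minimal sketch of one way this kind of error can arise in Python 3, assuming the class defines `__eq__` without also defining `__hash__` (the traceback itself does not confirm that this is the cause here). `FakeTableReference` and `table_key` are hypothetical names used only for illustration; they are not Beam or BigQuery API.

```python
class FakeTableReference(object):
  """Hypothetical stand-in for a message class that only defines __eq__."""

  def __init__(self, project, dataset, table):
    self.project = project
    self.dataset = dataset
    self.table = table

  def __eq__(self, other):
    # In Python 3, defining __eq__ without __hash__ sets __hash__ to None,
    # so instances can no longer be used as dict keys or set members.
    return (isinstance(other, FakeTableReference) and
            table_key(self) == table_key(other))


def table_key(ref):
  """Builds a hashable string key from the reference's fields."""
  return '%s:%s.%s' % (ref.project, ref.dataset, ref.table)


ref = FakeTableReference('my-project', 'my_dataset', 'my_table')
writers = {}

try:
  ref in writers  # The membership test hashes the key first.
  lookup_failed = False
except TypeError:
  lookup_failed = True

# Keying the dict by a plain string avoids hashing the object itself.
writers[table_key(ref)] = 'writer-0'
```

Python 2 keeps the default identity-based `__hash__` in this situation, which would be consistent with the failure showing up only in the Python 3 suite.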





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200977
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:49
Start Date: 19/Feb/19 23:49
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258280077
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -501,48 +501,9 @@ message MonitoringInfoTypeUrns {
 message Metric {
   // (Required) The data for this metric.
   oneof data {
-    CounterData counter_data = 1;
-    DistributionData distribution_data = 2;
-    ExtremaData extrema_data = 3;
-  }
-}
-
-// Data associated with a Counter or Gauge metric.
-// This is designed to be compatible with metric collection
-// systems such as DropWizard.
-message CounterData {
-  oneof value {
-    int64 int64_value = 1;
-    double double_value = 2;
-    string string_value = 3;
-  }
-}
-
-// Extrema messages are used for calculating
-// Top-N/Bottom-N metrics.
-message ExtremaData {
-  oneof extrema {
-    IntExtremaData int_extrema_data = 1;
-    DoubleExtremaData double_extrema_data = 2;
-  }
-}
-
-message IntExtremaData {
-  repeated int64 int_values = 1;
-}
-
-message DoubleExtremaData {
-  repeated double double_values = 2;
-}
-
-// Data associated with a distribution metric.
-// This is based off of the current DistributionData metric.
-// This is not a stackdriver or dropwizard compatible
-// style of distribution metric.
-message DistributionData {
-  oneof distribution {
-    IntDistributionData int_distribution_data = 1;
-    DoubleDistributionData double_distribution_data = 2;
+    int64 counter = 1;
 
 Review comment:
   I don't have a strong opinion about removing CounterData, but I'm concerned 
about making the counter an int64. In my opinion, a double would be much more flexible.
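
As a side note on the trade-off raised here, the following is a small sketch, not taken from the PR, of how the two representations behave for large counts. It assumes a proto `double` maps to an IEEE 754 double, which is also what a Python `float` is; the variable names are illustrative only.

```python
import struct

# A double has a 53-bit significand, so above 2**53 not every integer
# can be represented and an increment by one can be lost.
limit = 2 ** 53
as_double = float(limit)
increment_lost = (as_double + 1.0 == as_double)

# An int64 counts exactly up to 2**63 - 1, but cannot hold fractions.
max_int64 = 2 ** 63 - 1
round_tripped = struct.unpack('<q', struct.pack('<q', max_int64))[0]
```

So a double accepts fractional values, while an int64 stays exact for large integer counts.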
 



Issue Time Tracking
---

Worklog Id: (was: 200977)
Time Spent: 12h 20m  (was: 12h 10m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-6698) Portable Validates Runner Tests on Flink flaky after update to gradle5

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6698?focusedWorklogId=200976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200976
 ]

ASF GitHub Bot logged work on BEAM-6698:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:37
Start Date: 19/Feb/19 23:37
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #7877: [DO NOT MERGE] - 
[BEAM-6698] increase maxHeapSize to prevent OutOfMemoryError (direct …
URL: https://github.com/apache/beam/pull/7877#issuecomment-465356872
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 200976)
Time Spent: 2h 50m  (was: 2h 40m)

> Portable Validates Runner Tests on Flink flaky after update to gradle5 
> ---
>
> Key: BEAM-6698
> URL: https://issues.apache.org/jira/browse/BEAM-6698
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Michael Luckey
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> After upgrade to gradle 5 [1], the two portable runner test projects on 
> jenkins
>  - beam_PostCommit_Java_PVR_Flink_Streaming [2]
>  - beam_PostCommit_Java_PVR_Flink_Batch [3]
> became flaky.
> An initial investigation suggests that the tests fail on direct buffer 
> memory (e.g. [4]) while staging files,
> although I am unsure whether this is really the root cause or something 
> that shows up after some other failure.
> {noformat}
> INFO: Transport failed
> org.apache.beam.vendor.grpc.v1p13p1.io.netty.util.internal.OutOfDirectMemoryError:
>  failed to allocate 16777216 byte(s) of direct memory (used: 1895825695, max: 
> 1908932608)
> {noformat}
> As far as I know, we do not set `-XX:MaxDirectMemorySize` anywhere in our 
> setup, neither does gradle itself. At least on my machine both gradle 4 and 
> gradle 5 stick to the same jvm default
> {noformat}
> ###
> sun.misc.VM.maxDirectMemory(): 1908932608 Bytes
> sun.misc.VM.maxDirectMemory(): 1820 MB
> ###
> {noformat}
>  
> Unfortunately this does not reproduce on my local machine. We might try to 
> work around it by setting `-XX:MaxDirectMemorySize=3G`, but this would 
> probably only hide the problem. It might still be helpful to increase it 
> temporarily on a branch, just to confirm that this is indeed the root cause.
> [1] https://issues.apache.org/jira/browse/BEAM-6630
>  [2] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/]
>  [3] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/]
>  [4] 
> [https://scans.gradle.com/s/tpo3yffjznfxa/tests/yobvrae4rwsg4-go44ti5iq45vq]
>  
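
The numbers in the error message and in the JVM output above can be checked with a short calculation. This is only a sketch of the arithmetic, using the values quoted in the issue; the variable names are illustrative.

```python
# Values copied from the OutOfDirectMemoryError message quoted above.
requested_bytes = 16777216
used_bytes = 1895825695
max_bytes = 1908932608

MIB = 2 ** 20

# Direct memory still available when the allocation was attempted.
remaining_bytes = max_bytes - used_bytes

# The request is 16 MiB, which is more than what was left, so it fails.
requested_mib = requested_bytes // MIB
allocation_fits = requested_bytes <= remaining_bytes

# The limit matches the 1820 MB reported by sun.misc.VM.maxDirectMemory().
max_mib = max_bytes // MIB
```

About 12.5 MiB of direct memory was left when the 16 MiB request arrived, so the limit was almost fully used at that point.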





[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=200974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200974
 ]

ASF GitHub Bot logged work on BEAM-6711:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:36
Start Date: 19/Feb/19 23:36
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6711] 
[BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465356737
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 200974)
Time Spent: 1h 20m  (was: 1h 10m)

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}





[jira] [Work logged] (BEAM-6698) Portable Validates Runner Tests on Flink flaky after update to gradle5

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6698?focusedWorklogId=200975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200975
 ]

ASF GitHub Bot logged work on BEAM-6698:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:37
Start Date: 19/Feb/19 23:37
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #7877: [DO NOT MERGE] - 
[BEAM-6698] increase maxHeapSize to prevent OutOfMemoryError (direct …
URL: https://github.com/apache/beam/pull/7877#issuecomment-465356814
 
 
   Run Java Flink PortableValidatesRunner Batch
 



Issue Time Tracking
---

Worklog Id: (was: 200975)
Time Spent: 2h 40m  (was: 2.5h)

> Portable Validates Runner Tests on Flink flaky after update to gradle5 
> ---
>
> Key: BEAM-6698
> URL: https://issues.apache.org/jira/browse/BEAM-6698
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Michael Luckey
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> After upgrade to gradle 5 [1], the two portable runner test projects on 
> jenkins
>  - beam_PostCommit_Java_PVR_Flink_Streaming [2]
>  - beam_PostCommit_Java_PVR_Flink_Batch [3]
> became flaky.
> An initial investigation suggests that the tests fail on direct buffer 
> memory (e.g. [4]) while staging files,
> although I am unsure whether this is really the root cause or something 
> that shows up after some other failure.
> {noformat}
> INFO: Transport failed
> org.apache.beam.vendor.grpc.v1p13p1.io.netty.util.internal.OutOfDirectMemoryError:
>  failed to allocate 16777216 byte(s) of direct memory (used: 1895825695, max: 
> 1908932608)
> {noformat}
> As far as I know, we do not set `-XX:MaxDirectMemorySize` anywhere in our 
> setup, and neither does Gradle itself. At least on my machine, both Gradle 4 
> and Gradle 5 stick to the same JVM default:
> {noformat}
> ###
> sun.misc.VM.maxDirectMemory(): 1908932608 Bytes
> sun.misc.VM.maxDirectMemory(): 1820 MB
> ###
> {noformat}
>  
> Unfortunately this does not reproduce on (my) local machine. We might try to 
> work around this by setting `-XX:MaxDirectMemorySize=3G`, but this would 
> probably only hide the problem. It might still be helpful to increase it 
> temporarily on a branch, just to be sure that this is indeed the root cause.
> [1] https://issues.apache.org/jira/browse/BEAM-6630
>  [2] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/]
>  [3] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/]
>  [4] 
> [https://scans.gradle.com/s/tpo3yffjznfxa/tests/yobvrae4rwsg4-go44ti5iq45vq]
>  





[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=200968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200968
 ]

ASF GitHub Bot logged work on BEAM-6711:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:20
Start Date: 19/Feb/19 23:20
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6711] 
[BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465352712
 
 
   Run Python PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200968)
Time Spent: 1h 10m  (was: 1h)

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}
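
For readers unfamiliar with this error, the following is a minimal sketch of one common way it arises on Python 3. It assumes that TableReference defines `__eq__` without `__hash__`, which the traceback above does not confirm. `FakeTableReference` and `destination_key` are hypothetical names used only for illustration; they are not part of the Beam SDK and this is not the fix applied in the PR.

```python
class FakeTableReference(object):
    """Hypothetical stand-in for a message class that defines only __eq__."""

    def __init__(self, project, dataset, table):
        self.project = project
        self.dataset = dataset
        self.table = table

    # On Python 3, defining __eq__ without __hash__ sets __hash__ to None,
    # so instances cannot be used as dict keys or in dict membership tests.
    def __eq__(self, other):
        return (self.project, self.dataset, self.table) == (
            other.project, other.dataset, other.table)


def destination_key(ref):
    """Builds a hashable string key from a table reference (illustrative)."""
    return '%s:%s.%s' % (ref.project, ref.dataset, ref.table)


ref = FakeTableReference('my-project', 'my_dataset', 'my_table')
writers = {}

try:
    ref in writers  # The dict must hash the key first, which fails here.
    membership_failed = False
except TypeError as exc:
    membership_failed = True
    error_message = str(exc)

# One possible workaround: key the dict by a hashable representation.
writers[destination_key(ref)] = 'file-writer-placeholder'
key_found = destination_key(ref) in writers
```

On Python 2 the same class would still be hashable by identity, which may explain why the failure shows up only in the Python 3 suite.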





[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=200965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200965
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:10
Start Date: 19/Feb/19 23:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258252131
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java
 ##
 @@ -170,4 +174,193 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
FlatMapWithExceptions} instance to
+   * define an exception handler. If the handler does not provide sufficient 
type information, the
+   * user must also call {@code into} to define a type descriptor for the 
error collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public FlatMapWithExceptions withExceptions() {
+return new FlatMapWithExceptions<>(
+fn, originalFnForDisplayData, inputType, outputType, null, null);
+  }
+
+  /** Implementation of {@link FlatMapElements#withExceptions()}. */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public static class FlatMapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+@Nullable private final Contextful>> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+FlatMapWithExceptions(
+@Nullable Contextful>> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link FlatMapWithExceptions} transform with the given 
type descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  FlatMapWithExceptions 
into(
+TypeDescriptor failureTypeDescriptor) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String>> result = words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .into(TypeDescriptors.strings())
+ * .via(ee -> ee.exception().getMessage()));
+ * PCollection errors = result.errors();
+ * }
+ */
+public FlatMapWithExceptions via(
+ProcessFunction, FailureT> exceptionHandler) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, 
exceptionHandler, failureType);
+}
+
+/**
+ * Like {@link #via(ProcessFunction)}, but takes advantage of the type 
information provided by
+ * {@link InferableFunction}, meaning that a call to {@link 
#into(TypeDescriptor)} may not be
+ * necessary.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, KV>> result = 
words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .via(new WithExceptions.ExceptionAsMapHandler() {}));
+ * PCol

[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200963
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:06
Start Date: 19/Feb/19 23:06
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7823: [DO NOT MERGE] 
[BEAM-4775] Second take on portable metrics over the job-server API
URL: https://github.com/apache/beam/pull/7823#issuecomment-465348737
 
 
   > I'm not sure if you want me to review this one or not. This PR is a bit 
massive.
   
   Yea, there are probably 4 chunks I will still pull out of this into separate 
PRs, that are hopefully more manageable:
   - consolidating MetricResult implementations
 - already out at #7890, though I need to fix a couple tests
 - I haven't had a chance to factor those changes out of this PR yet
   - fully supporting MonitoringInfo data types ({URN, labels}) inside 
`MetricKey` (comprised of `MetricName`,`MetricLabels`)
   - adding the job-API metrics RPC
   - python support
   
   In the meantime, [this link will show all the changes that still live only 
in this 
PR](https://github.com/ryan-williams/beam/compare/octo...ryan-williams:mk); 
this PR's main view still displays all the changes it is rebased on top of.
   
   > Is this the one that introduces the new layer to extract the new 
MonitoringInfo metrics over an RPC?
   https://s.apache.org/get-metrics-api
   
   Yea, that is in here! Swimming to the surface…
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200963)
Time Spent: 12h 10m  (was: 12h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=200964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200964
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:10
Start Date: 19/Feb/19 23:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258254032
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/MapElements.java
 ##
 @@ -156,4 +160,190 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
MapWithExceptions} instance to define
+   * an exception handler. If the handler does not provide sufficient type 
information, the user
+   * must also call {@code into} to define a type descriptor for the error 
collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  @Experimental(Kind.WITH_EXCEPTIONS)
+  public MapWithExceptions withExceptions() {
+return new MapWithExceptions<>(fn, originalFnForDisplayData, inputType, 
outputType, null, null);
+  }
+
+  /** Implementation of {@link MapElements#withExceptions()}. */
+  @Experimental(Kind.WITH_EXCEPTIONS)
+  public static class MapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+private final Contextful> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+MapWithExceptions(
+Contextful> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link MapWithExceptions} transform with the given type 
descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  MapWithExceptions into(
+TypeDescriptor failureTypeDescriptor) {
+  return new MapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String> result = words.apply(
+ * MapElements.into(TypeDescriptors.integers())
+ *.via((String word) -> 1 / word.length())
+ *.withExceptions()
+ *.into(TypeDescriptors.strings())
+ *.via(ee -> ee.exception().getMessage()));
+ * PCollection errors = result.errors();
+ * }
+ */
+public MapWithExceptions via(
+ProcessFunction, FailureT> exceptionHandler) {
+  return new MapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, 
exceptionHandler, failureType);
+}
+
+/**
+ * Like {@link #via(ProcessFunction)}, but takes advantage of the type 
information provided by
+ * {@link InferableFunction}, meaning that a call to {@link 
#into(TypeDescriptor)} may not be
+ * necessary.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, KV>> result = 
words.apply(
+ * MapElements.into(TypeDescriptors.integers())
+ *.via((String word) -> 1 / word.length())
+ *.withExceptions()
+ *.via(new WithExceptions.ExceptionAsMapHandler() 
{}));
+ * PCollection>> errors = result.errors();
+ * }
+ */
+public  MapWithExceptions via(
+InferableFunction, NewFailureT> 
exceptionH

[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=200966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200966
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:10
Start Date: 19/Feb/19 23:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258208269
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java
 ##
 @@ -170,4 +174,191 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
FlatMapWithExceptions} instance to
+   * define an exception handler. If the handler does not provide sufficient 
type information, the
+   * user must also call {@code into} to define a type descriptor for the 
error collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  public FlatMapWithExceptions withExceptions() {
+return new FlatMapWithExceptions<>(
+fn, originalFnForDisplayData, inputType, outputType, null, null);
+  }
+
+  /** Implementation of {@link FlatMapElements#withExceptions()}. */
+  public static class FlatMapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+@Nullable private final Contextful>> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+FlatMapWithExceptions(
+@Nullable Contextful>> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link FlatMapWithExceptions} transform with the given 
type descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  FlatMapWithExceptions 
into(
+TypeDescriptor failureTypeDescriptor) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String>> result = words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .into(TypeDescriptors.strings())
+ * .via(ee -> ee.exception().getMessage()));
 
 Review comment:
   I'm thinking that we might be better off renaming the via. In the above 
example there are two calls to via() which is confusing; you have to read 
through the code and realize that the call to withExceptions returns a new 
object. Maybe withExceptionHandler instead?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200966)
Time Spent: 7h 40m  (was: 7.5h)
Remaining Estimate: 160h 20m  (was: 160.5h)

> Add exception handling to single message transforms in Java SDK
> ---
>
> Key: BEAM-5638
> URL: https://issues.apache.org/jira/browse/

[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=200967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200967
 ]

ASF GitHub Bot logged work on BEAM-5638:


Author: ASF GitHub Bot
Created on: 19/Feb/19 23:10
Start Date: 19/Feb/19 23:10
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7736: [BEAM-5638] 
Exception handling for Java MapElements and FlatMapElements
URL: https://github.com/apache/beam/pull/7736#discussion_r258251871
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java
 ##
 @@ -170,4 +174,193 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
   builder.include("fn", (HasDisplayData) originalFnForDisplayData);
 }
   }
+
+  /**
+   * Return a modified {@code PTransform} that catches exceptions raised while 
mapping elements.
+   *
+   * The user must call {@code via} on the returned {@link 
FlatMapWithExceptions} instance to
+   * define an exception handler. If the handler does not provide sufficient 
type information, the
+   * user must also call {@code into} to define a type descriptor for the 
error collection.
+   *
+   * See {@link WithExceptions} documentation for usage patterns of the 
returned {@link
+   * WithExceptions.Result}.
+   *
+   * @return a {@link WithExceptions.Result} wrapping the output and error 
collections
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public FlatMapWithExceptions withExceptions() {
+return new FlatMapWithExceptions<>(
+fn, originalFnForDisplayData, inputType, outputType, null, null);
+  }
+
+  /** Implementation of {@link FlatMapElements#withExceptions()}. */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public static class FlatMapWithExceptions
+  extends PTransform<
+  PCollection, WithExceptions.Result, 
FailureT>> {
+
+private final transient TypeDescriptor inputType;
+private final transient TypeDescriptor outputType;
+@Nullable private final transient TypeDescriptor failureType;
+private final transient Object originalFnForDisplayData;
+@Nullable private final Contextful>> fn;
+@Nullable private final ProcessFunction, 
FailureT> exceptionHandler;
+
+FlatMapWithExceptions(
+@Nullable Contextful>> fn,
+Object originalFnForDisplayData,
+TypeDescriptor inputType,
+TypeDescriptor outputType,
+@Nullable ProcessFunction, FailureT> 
exceptionHandler,
+@Nullable TypeDescriptor failureType) {
+  this.fn = fn;
+  this.originalFnForDisplayData = originalFnForDisplayData;
+  this.inputType = inputType;
+  this.outputType = outputType;
+  this.exceptionHandler = exceptionHandler;
+  this.failureType = failureType;
+}
+
+/**
+ * Returns a new {@link FlatMapWithExceptions} transform with the given 
type descriptor for the
+ * error collection, but the exception handler yet to be specified using 
{@link
+ * #via(ProcessFunction)}.
+ */
+public  FlatMapWithExceptions 
into(
+TypeDescriptor failureTypeDescriptor) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, null, 
failureTypeDescriptor);
+}
+
+/**
+ * Returns a {@code PTransform} that catches exceptions raised while 
mapping elements, passing
+ * the raised exception instance and the input element being processed 
through the given {@code
+ * exceptionHandler} and emitting the result to an error collection.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, String>> result = words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .into(TypeDescriptors.strings())
+ * .via(ee -> ee.exception().getMessage()));
+ * PCollection errors = result.errors();
+ * }
+ */
+public FlatMapWithExceptions via(
+ProcessFunction, FailureT> exceptionHandler) {
+  return new FlatMapWithExceptions<>(
+  fn, originalFnForDisplayData, inputType, outputType, 
exceptionHandler, failureType);
+}
+
+/**
+ * Like {@link #via(ProcessFunction)}, but takes advantage of the type 
information provided by
+ * {@link InferableFunction}, meaning that a call to {@link 
#into(TypeDescriptor)} may not be
+ * necessary.
+ *
+ * Example usage:
+ *
+ * {@code
+ * Result, KV>> result = 
words.apply(
+ * FlatMapElements.into(TypeDescriptors.strings())
+ * .via((String line) -> 
Arrays.asList(Arrays.copyOfRange(line.split(" "), 1, 5)))
+ * .withExceptions()
+ * .via(new WithExceptions.ExceptionAsMapHandler() {}));
+ * PCol

[jira] [Work logged] (BEAM-6709) Typehinting depends on typing changes in Python 3.5.3

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6709?focusedWorklogId=200961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200961
 ]

ASF GitHub Bot logged work on BEAM-6709:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:57
Start Date: 19/Feb/19 22:57
Worklog Time Spent: 10m 
  Work Description: RobbeSneyders commented on issue #7873: [BEAM-6709] 
Check tuple typing failure.
URL: https://github.com/apache/beam/pull/7873#issuecomment-465345841
 
 
   Fixed the lint error. PTAL
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200961)
Time Spent: 0.5h  (was: 20m)

> Typehinting depends on typing changes in Python 3.5.3
> -
>
> Key: BEAM-6709
> URL: https://issues.apache.org/jira/browse/BEAM-6709
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> On Python versions < 3.5.3, the Tuple and Union types from typing do not have 
> an `__args__` attribute; instead they have `__tuple_params__`, and 
> `__union_params__` and `__union_set_params__` attributes, respectively.
> The current implementation fails on versions < 3.5.3 since it depends on the 
> `__args__` attribute.
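
As an illustration of the compatibility problem described above, here is a minimal sketch of a version-tolerant lookup. The helper name `get_type_params` is hypothetical and this is not necessarily how the PR addresses the issue; the legacy attribute names are taken from the issue description, and that fallback branch is only exercised on Python < 3.5.3.

```python
import typing


def get_type_params(type_hint):
    """Returns the parameters of a Tuple/Union hint across typing versions."""
    # Python >= 3.5.3 exposes the parameters as __args__.
    args = getattr(type_hint, '__args__', None)
    if args is not None:
        return tuple(args)
    # Fallbacks for Python < 3.5.3, per the issue description.
    for legacy_attr in ('__tuple_params__', '__union_params__'):
        params = getattr(type_hint, legacy_attr, None)
        if params is not None:
            return tuple(params)
    # Not a parameterized hint (or no parameters recorded).
    return ()


tuple_params = get_type_params(typing.Tuple[int, str])
union_params = get_type_params(typing.Union[int, str])
```

Using `getattr` with a default avoids an AttributeError on either side of the 3.5.3 boundary, at the cost of silently returning an empty tuple for hints that carry no parameters.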





[jira] [Commented] (BEAM-6530) Strange character on website (contact page)

2019-02-19 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772419#comment-16772419
 ] 

Ruoyun Huang commented on BEAM-6530:


+ [~melap] to check if this is expected? 

 

 

> Strange character on website (contact page)
> ---
>
> Key: BEAM-6530
> URL: https://issues.apache.org/jira/browse/BEAM-6530
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ruoyun Huang
>Priority: Minor
> Attachments: Screen Shot 2019-01-28 at 6.22.11 PM.png
>
>
> See the attached screenshot. 
>  
> This looks like an HTML error somewhere, although looking at the source I 
> don't see any strange redundant characters: 
> [https://github.com/apache/beam/blob/master/website/src/community/contact-us.md]
>  
> Someone who knows more about how the web pages are organized might want to 
> take a look. 





[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=200960&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200960
 ]

ASF GitHub Bot logged work on BEAM-6711:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6711] 
[BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465345118
 
 
   Run Python PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200960)
Time Spent: 1h  (was: 50m)

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200958&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200958
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258264891
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java
 ##
 @@ -94,121 +93,128 @@ public MetricsContainer getMetricsContainer(String 
stepName) {
 : null;
   }
 
+  public MetricsContainer getUnboundMetricsContainer() {
+return metricsAccumulator != null
+? metricsAccumulator.getLocalValue().getUnboundContainer()
+: null;
+  }
+
   /**
* Update this container with metrics from the passed {@link 
MonitoringInfo}s, and send updates
* along to Flink's internal metrics framework.
*/
-  public void updateMetrics(String stepName, List<MonitoringInfo> 
monitoringInfos) {
-MetricsContainer metricsContainer = getMetricsContainer(stepName);
+  public void updateMetrics(List<MonitoringInfo> monitoringInfos) {
+LOG.info("Flink updating metrics with {} monitoring infos", 
monitoringInfos.size());
 monitoringInfos.forEach(
 monitoringInfo -> {
-  if (monitoringInfo.hasMetric()) {
-String urn = monitoringInfo.getUrn();
-MetricName metricName = parseUrn(urn);
-Metric metric = monitoringInfo.getMetric();
-if (metric.hasCounterData()) {
-  CounterData counterData = metric.getCounterData();
-  if (counterData.getValueCase() == 
CounterData.ValueCase.INT64_VALUE) {
-org.apache.beam.sdk.metrics.Counter counter =
-metricsContainer.getCounter(metricName);
-counter.inc(counterData.getInt64Value());
-  } else {
-LOG.warn("Unsupported CounterData type: {}", counterData);
-  }
-} else if (metric.hasDistributionData()) {
-  DistributionData distributionData = metric.getDistributionData();
-  if (distributionData.hasIntDistributionData()) {
-Distribution distribution = 
metricsContainer.getDistribution(metricName);
-IntDistributionData intDistributionData = 
distributionData.getIntDistributionData();
-distribution.update(
-intDistributionData.getSum(),
-intDistributionData.getCount(),
-intDistributionData.getMin(),
-intDistributionData.getMax());
-  } else {
-LOG.warn("Unsupported DistributionData type: {}", 
distributionData);
-  }
-} else if (metric.hasExtremaData()) {
-  ExtremaData extremaData = metric.getExtremaData();
-  LOG.warn("Extrema metric unsupported: {}", extremaData);
-}
+  if (!monitoringInfo.hasMetric()) {
 
 Review comment:
   Whitelisting the URNs you support would be the preferred method of doing 
this.
   
   What you've done here will allow various SDK system metrics which get added 
later on, without the devs of this runner getting any sort of heads up. The 
intention of the design is to allow the runner to whitelist the specific URNs 
it wishes to support. You may just wish to support the user metric prefixes. 
Doing that would drop the element count, execution time, etc. metrics.
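
The allowlisting approach suggested in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `UrnAllowlist` is a hypothetical helper, not a Beam class, and the prefix and URN strings are assumed values used for demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: UrnAllowlist is not part of Beam.
final class UrnAllowlist {
  // Assumed user-metric prefix, for illustration only.
  static final String USER_METRIC_URN_PREFIX = "beam:metric:user:";

  private final List<String> allowedPrefixes;

  UrnAllowlist(List<String> allowedPrefixes) {
    this.allowedPrefixes = new ArrayList<>(allowedPrefixes);
  }

  /** True only for URNs that start with a prefix this runner opted into. */
  boolean isSupported(String urn) {
    for (String prefix : allowedPrefixes) {
      if (urn.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  /** Keeps supported URNs in their original order and drops everything else. */
  List<String> filter(List<String> urns) {
    List<String> kept = new ArrayList<>();
    for (String urn : urns) {
      if (isSupported(urn)) {
        kept.add(urn);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    UrnAllowlist allowlist = new UrnAllowlist(Arrays.asList(USER_METRIC_URN_PREFIX));
    List<String> incoming =
        Arrays.asList("beam:metric:user:my_namespace:my_counter", "beam:metric:element_count:v1");
    // Only the user metric survives; the system metric is dropped.
    System.out.println(allowlist.filter(incoming));
  }
}
```

With this shape, a system metric added to the SDK later is ignored by the runner until someone extends the allowlist on purpose.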
 



Issue Time Tracking
---

Worklog Id: (was: 200958)
Time Spent: 11h 40m  (was: 11.5h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h

[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200962
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:59
Start Date: 19/Feb/19 22:59
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #7823: [DO NOT MERGE] 
[BEAM-4775] Second take on portable metrics over the job-server API
URL: https://github.com/apache/beam/pull/7823#issuecomment-465346491
 
 
   I'm not sure if you want me to review this one or not. This PR is a bit 
massive.
   
   Is this the one that introduces the new layer to extract the new 
MonitoringInfo metrics over an RPC?
   https://s.apache.org/get-metrics-api
   
 



Issue Time Tracking
---

Worklog Id: (was: 200962)
Time Spent: 12h  (was: 11h 50m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 12h
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200952
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:50
Start Date: 19/Feb/19 22:50
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#issuecomment-465343573
 
 
   > Just curious, what were you hoping to get out of this change in 
particular? This is largely a bunch of renamings, was it too confusing before?
   
   There are 2 renamings:
   - 
https://github.com/apache/beam/pull/7868/commits/737c9e25ebbbf2654d2294705d5d3f14e17a841d:
 `MonitoringInfoLabels.{TRANSFORM → PTRANSFORM}`, per [this 
`TODO(ajamato)`](https://github.com/apache/beam/pull/7868/commits/737c9e25ebbbf2654d2294705d5d3f14e17a841d#diff-4d226ebf3f70cf18449c29e82511a67eL439)
 😄  
   - 
https://github.com/apache/beam/pull/7868/commits/c17694b3486e1e0f03f5dfd63b6c108dc68caf02:
  `USER_COUNTER_URN_PREFIX` → `USER_METRIC_URN_PREFIX`
 - (in `SimpleMonitoringInfoBuilder` and `MonitoringInfoUrns.Enum`)
 - we need either this change, or to add new user-URN-prefixes for 
distributions/gauges, afaict.
   
   There are also 6 other commits here that each constitute minor forward 
progress, I think:
   - 
https://github.com/apache/beam/pull/7868/commits/08108ccdfad500bb68241e588bb525e36e0efa3d:
 `UserMonitoringInfoToCounterUpdateTransformer` had [a redundant definition of  
`USER_COUNTER_URN_PREFIX`](https://github.com/apache/beam/pull/7868/commits/08108ccdfad500bb68241e588bb525e36e0efa3d#diff-a2189a7c97dace5f44f04e16a71b6468L56)
   - 
https://github.com/apache/beam/pull/7868/commits/937a9c68d7d81af28d503ad9b3fc4056bb0253ab:
 consolidate two (different!) implementations of [parsing a MetricName from a 
URN]
   - etc.
   
   Sorry if it just feels like dust. I thought these were all thematically 
similar and a reasonable chunk to pull out of #7823 and reckon with 
independently.
 



Issue Time Tracking
---

Worklog Id: (was: 200952)
Time Spent: 10h 50m  (was: 10h 40m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200953
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258253497
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -553,11 +514,9 @@ message IntDistributionData {
   int64 max = 4;
 }
 
-message DoubleDistributionData {
-  int64 count = 1;
-  double sum = 2;
-  double min = 3;
-  double max = 4;
+message IntGaugeData {
 
 Review comment:
   Is there a use case for this at the moment? Are you implementing these right 
now?
 



Issue Time Tracking
---

Worklog Id: (was: 200953)
Time Spent: 11h  (was: 10h 50m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200954
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258261366
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -501,48 +501,9 @@ message MonitoringInfoTypeUrns {
 message Metric {
   // (Required) The data for this metric.
   oneof data {
-CounterData counter_data = 1;
-DistributionData distribution_data = 2;
-ExtremaData extrema_data = 3;
-  }
-}
-
-// Data associated with a Counter or Gauge metric.
-// This is designed to be compatible with metric collection
-// systems such as DropWizard.
-message CounterData {
-  oneof value {
-int64 int64_value = 1;
-double double_value = 2;
-string string_value = 3;
-  }
-}
-
-// Extrema messages are used for calculating
-// Top-N/Bottom-N metrics.
-message ExtremaData {
-  oneof extrema {
-IntExtremaData int_extrema_data = 1;
-DoubleExtremaData double_extrema_data = 2;
-  }
-}
-
-message IntExtremaData {
-  repeated int64 int_values = 1;
-}
-
-message DoubleExtremaData {
-  repeated double double_values = 2;
-}
-
-// Data associated with a distribution metric.
-// This is based off of the current DistributionData metric.
-// This is not a stackdriver or dropwizard compatible
-// style of distribution metric.
-message DistributionData {
-  oneof distribution {
-IntDistributionData int_distribution_data = 1;
-DoubleDistributionData double_distribution_data = 2;
+int64 counter = 1;
 
 Review comment:
   There was a long debate when we introduced this design, and the existing 
protos were what we could agree on. While I am personally fine with the change 
you are proposing, I'd like you to get buy-in on the dev list.
   
   https://s.apache.org/beam-fn-api-metrics
   
   https://lists.apache.org/list.html?d...@beam.apache.org:2018-2
   
   I think most of the discussion was in the comments of the doc, so 
unfortunately I cannot point you to much more.
   
 



Issue Time Tracking
---

Worklog Id: (was: 200954)
Time Spent: 11h 10m  (was: 11h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200957&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200957
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258262204
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MonitoringInfoMetricName.java
 ##
 @@ -113,11 +136,15 @@ public boolean equals(Object o) {
   }
 
   @Override
-  public String toString() {
+  public String toString(String delimiter) {
+if (getNamespace() != null && getName() != null) {
+  return super.toString(delimiter);
+}
 StringBuilder builder = new StringBuilder();
-builder.append(this.urn.toString());
-builder.append(" ");
-builder.append(this.labels.toString());
+if (labels.containsKey(PCOLLECTION_LABEL)) {
 
 Review comment:
   Just iterate over the keys and values and add them to the return string, 
instead of inspecting the labels for specific keys.
   
   Then there is no need to update this code for new labels.
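
A minimal sketch of the generic iteration proposed in this comment, assuming the labels are a plain `Map<String, String>`. `LabelFormatter` and its `format` method are hypothetical names, and the label keys in the demo are illustrative. Sorting by key is an added choice to keep the output stable, not something the review asked for.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: not the actual MonitoringInfoMetricName implementation.
final class LabelFormatter {
  /** Appends every label as key=value, so new label keys need no code changes here. */
  static String format(String urn, Map<String, String> labels, String delimiter) {
    StringBuilder builder = new StringBuilder(urn);
    // TreeMap copy gives a deterministic, key-sorted order regardless of the input map type.
    for (Map.Entry<String, String> label : new TreeMap<>(labels).entrySet()) {
      builder.append(delimiter).append(label.getKey()).append('=').append(label.getValue());
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    Map<String, String> labels = new HashMap<>();
    labels.put("PTRANSFORM", "step1");
    labels.put("PCOLLECTION", "pc1");
    System.out.println(format("beam:metric:element_count:v1", labels, " "));
  }
}
```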
 



Issue Time Tracking
---

Worklog Id: (was: 200957)
Time Spent: 11.5h  (was: 11h 20m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11.5h
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200959&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200959
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #7876: [BEAM-4775] Clean up 
metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#issuecomment-465345038
 
 
   @Ardagan You might want to take a look at this as well, to see the proto 
changes.
 



Issue Time Tracking
---

Worklog Id: (was: 200959)
Time Spent: 11h 50m  (was: 11h 40m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200955&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200955
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258264884
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java
 ##
 @@ -94,121 +93,128 @@ public MetricsContainer getMetricsContainer(String 
stepName) {
 : null;
   }
 
+  public MetricsContainer getUnboundMetricsContainer() {
+return metricsAccumulator != null
+? metricsAccumulator.getLocalValue().getUnboundContainer()
+: null;
+  }
+
   /**
* Update this container with metrics from the passed {@link 
MonitoringInfo}s, and send updates
* along to Flink's internal metrics framework.
*/
-  public void updateMetrics(String stepName, List<MonitoringInfo> 
monitoringInfos) {
-MetricsContainer metricsContainer = getMetricsContainer(stepName);
+  public void updateMetrics(List<MonitoringInfo> monitoringInfos) {
 
 Review comment:
   Question about the overall design here. Much of this class doesn't seem 
Flink-specific. If it's just aggregating the values in memory, perhaps it should 
just aggregate the MonitoringInfos, and then have some method to extract them 
in a Flink-specific format (possibly in another class).
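
One way to read this suggestion is sketched below: a runner-agnostic class aggregates in memory, and the runner-specific export is pushed into a caller-supplied sink. This is a simplified illustration that handles counters only. `InMemoryCounterAggregator`, `update`, and `exportTo` are hypothetical names, not Beam or Flink APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: the names here do not come from Beam or Flink.
final class InMemoryCounterAggregator {
  private final Map<String, Long> totals = new HashMap<>();

  /** Runner-agnostic step: fold a counter delta into the running total for its key. */
  void update(String metricKey, long delta) {
    totals.merge(metricKey, delta, Long::sum);
  }

  /** Runner-specific behavior lives in the sink, e.g. an adapter in a separate Flink class. */
  void exportTo(BiConsumer<String, Long> sink) {
    totals.forEach(sink);
  }

  public static void main(String[] args) {
    InMemoryCounterAggregator aggregator = new InMemoryCounterAggregator();
    aggregator.update("my_namespace:my_counter", 3);
    aggregator.update("my_namespace:my_counter", 4);
    // Stand-in sink; a Flink-specific class would replace this lambda.
    aggregator.exportTo((key, total) -> System.out.println(key + " = " + total));
  }
}
```

Keeping the aggregation free of runner types would let other runners reuse it and supply only their own sink.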
 



Issue Time Tracking
---

Worklog Id: (was: 200955)
Time Spent: 11h 20m  (was: 11h 10m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200956
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:54
Start Date: 19/Feb/19 22:54
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7876: [BEAM-4775] 
Clean up metric protos; support integer distributions, gauges
URL: https://github.com/apache/beam/pull/7876#discussion_r258254412
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/LabeledMetrics.java
 ##
 @@ -29,7 +30,7 @@
   /**
* Create a metric that can be incremented and decremented, and is 
aggregated by taking the sum.
*/
-  public static Counter counter(MonitoringInfoMetricName metricName) {
+  public static Counter counter(MetricName metricName) {
 
 Review comment:
   Let's keep it as MonitoringInfoMetricName, since this is meant to be an 
implementation used only by SDK and RunnerHarness authors, for creating 
system-style metrics with URN+labels.
 



Issue Time Tracking
---

Worklog Id: (was: 200956)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200933&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200933
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:15
Start Date: 19/Feb/19 22:15
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258251631
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricUrns.java
 ##
 @@ -29,14 +30,17 @@
*
* Should be consistent with {@code parse_namespace_and_name} in 
monitoring_infos.py.
*/
+  @Nullable
   public static MetricName parseUrn(String urn) {
-if (urn.startsWith(USER_COUNTER_URN_PREFIX)) {
-  urn = urn.substring(USER_COUNTER_URN_PREFIX.length());
+if (urn.startsWith(USER_METRIC_URN_PREFIX)) {
+  urn = urn.substring(USER_METRIC_URN_PREFIX.length());
+} else {
+  return null;
 
 Review comment:
   Did you encounter this somewhere? Hopefully this isn't called if the URN 
doesn't actually start with USER_METRIC_URN_PREFIX.
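
The nullable return questioned here implies a contract in which lenient callers check for `null` and strict callers turn it into an error. The sketch below illustrates that contract only. `UserMetricUrns`, `parseOrNull`, and `parseOrThrow` are hypothetical names, the prefix string is assumed, and splitting namespace from name at the last `:` is an assumption rather than the PR's actual parsing rule.

```java
// Hypothetical sketch of a nullable parser with a strict wrapper; not the PR's code.
final class UserMetricUrns {
  // Assumed prefix, for illustration only.
  static final String USER_METRIC_URN_PREFIX = "beam:metric:user:";

  /** Returns {namespace, name}, or null when the URN is not a well-formed user-metric URN. */
  static String[] parseOrNull(String urn) {
    if (!urn.startsWith(USER_METRIC_URN_PREFIX)) {
      return null;
    }
    String rest = urn.substring(USER_METRIC_URN_PREFIX.length());
    int split = rest.lastIndexOf(':'); // assumption: the name is the final segment
    if (split < 0) {
      return null;
    }
    return new String[] {rest.substring(0, split), rest.substring(split + 1)};
  }

  /** Strict variant for callers that must have a user metric. */
  static String[] parseOrThrow(String urn) {
    String[] parsed = parseOrNull(urn);
    if (parsed == null) {
      throw new IllegalArgumentException("Not a user-metric URN: " + urn);
    }
    return parsed;
  }

  public static void main(String[] args) {
    String[] parsed = parseOrThrow("beam:metric:user:my_namespace:my_counter");
    System.out.println(parsed[0] + " / " + parsed[1]);
  }
}
```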
 



Issue Time Tracking
---

Worklog Id: (was: 200933)
Time Spent: 10h  (was: 9h 50m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-6672) Make bundle execution with ExecutableStage support user states

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6672?focusedWorklogId=200948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200948
 ]

ASF GitHub Bot logged work on BEAM-6672:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:30
Start Date: 19/Feb/19 22:30
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #7847: 
[BEAM-6672] Add the StateRequestHandlerImpl and test
URL: https://github.com/apache/beam/pull/7847
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 200948)
Time Spent: 3h 10m  (was: 3h)

> Make bundle execution with ExecutableStage support user states
> --
>
> Key: BEAM-6672
> URL: https://issues.apache.org/jira/browse/BEAM-6672
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.

2019-02-19 Thread Pablo Estrada (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772392#comment-16772392
 ] 

Pablo Estrada commented on BEAM-6711:
-

Rolled back, and working on a fix: https://github.com/apache/beam/pull/7892

> Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> First failure was observed in 
> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after 
> https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766
>  was merged. 
> [~pabloem], could you please take a look? I suggest we do a rollback + 
> rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle 
> , 
> due to an exception.
>  Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 727, in process
> return self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 556, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 622, in _invoke_per_window
> self.process_method(*args_for_process, **kwargs_for_process))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py",
>  line 823, in process_outputs
> for result in results:
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py",
>  line 191, in process
> if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}
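The `TypeError` that ends the traceback above is what Python raises when a dict membership test is given an object that cannot be hashed. In Python 3, a class that defines `__eq__` without `__hash__` becomes unhashable; whether that is the cause here is not confirmed by the thread. Below is a minimal sketch of the failure mode and one possible workaround. `TableRef` and `table_key` are hypothetical stand-ins, not the real `TableReference` class or the fix in the linked PR.

```python
class TableRef:
    """Hypothetical stand-in for a table reference message."""

    def __init__(self, project, dataset, table):
        self.project = project
        self.dataset = dataset
        self.table = table

    # Defining __eq__ without __hash__ makes instances unhashable in Python 3.
    def __eq__(self, other):
        return (isinstance(other, TableRef)
                and (self.project, self.dataset, self.table)
                == (other.project, other.dataset, other.table))


def table_key(ref):
    """Hypothetical helper: derive a hashable string key from a reference."""
    return "%s:%s.%s" % (ref.project, ref.dataset, ref.table)


writers = {}
ref = TableRef("my-project", "my_dataset", "my_table")

try:
    ref in writers  # same shape as the failing membership test
    raised = False
    message = ""
except TypeError as err:
    raised = True
    message = str(err)

# Keying the dict by a string avoids hashing the object itself.
writers[table_key(ref)] = "writer-0"
found = table_key(ref) in writers
```

Keying by a derived string is only one option; giving the class a `__hash__` consistent with its `__eq__` would be another.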





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200946
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:28
Start Date: 19/Feb/19 22:28
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7868: 
[BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258257940
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricUrns.java
 ##
 @@ -29,14 +30,17 @@
*
* Should be consistent with {@code parse_namespace_and_name} in 
monitoring_infos.py.
*/
+  @Nullable
   public static MetricName parseUrn(String urn) {
-if (urn.startsWith(USER_COUNTER_URN_PREFIX)) {
-  urn = urn.substring(USER_COUNTER_URN_PREFIX.length());
+if (urn.startsWith(USER_METRIC_URN_PREFIX)) {
+  urn = urn.substring(USER_METRIC_URN_PREFIX.length());
+} else {
+  return null;
 
 Review comment:
   As of this PR, this helper is called in two places:
   - 
[`MonitoringInfoMetricName.parseUrn`](https://github.com/apache/beam/pull/7868/files#diff-ba1b936a7d4c8d43789e81a816b9bb90R55):
 - use the parsed user-`MetricName`, if possible
 - `throw` if it's not a user-metric (signaled by `null` return here)
   - 
[`FlinkMetricContainer.updateMetrics`](https://github.com/apache/beam/pull/7868/files#diff-bd906385f3e3ff74094985af80da5a41R108):
 - use the parsed user-`MetricName`, if possible
 - skip the metric if it's not a user-metric (again, signaled by `null` 
return)
   - in later PRs I make Flink support system-metrics as well.
   
   This was the most direct way to support those uses. The Flink metrics do 
actually see system metrics, and should gracefully skip them instead of 
`throw`ing (for now).
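The nullable-return contract described in this comment can be sketched in Python, the language of the `monitoring_infos.py` counterpart mentioned in the diff. This is an illustration only: the prefix value is assumed from the `"beam:metric:user:"` URN shown elsewhere in this PR, the `<namespace>:<name>` layout of the remainder is an assumption, and all function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Assumed value; the thread only shows "beam:metric:user:" as the old user-counter URN.
USER_METRIC_URN_PREFIX = "beam:metric:user:"


def parse_urn(urn: str) -> Optional[Tuple[str, str]]:
    """Return (namespace, name) for a user-metric URN, or None otherwise."""
    if not urn.startswith(USER_METRIC_URN_PREFIX):
        return None
    rest = urn[len(USER_METRIC_URN_PREFIX):]
    # Assumption: the remainder has the form "<namespace>:<name>".
    namespace, _, name = rest.partition(":")
    return namespace, name


def parse_urn_strict(urn: str) -> Tuple[str, str]:
    """Caller that treats a non-user metric as an error."""
    parsed = parse_urn(urn)
    if parsed is None:
        raise ValueError("not a user metric: %s" % urn)
    return parsed


def collect_user_metrics(values_by_urn: Dict[str, int]) -> Dict[Tuple[str, str], int]:
    """Caller that skips non-user (system) metrics instead of failing."""
    result = {}
    for urn, value in values_by_urn.items():
        parsed = parse_urn(urn)
        if parsed is None:
            continue
        result[parsed] = value
    return result


updates = {
    "beam:metric:user:my_namespace:my_counter": 3,
    "beam:metric:element_count:v1": 10,
}
user_metrics = collect_user_metrics(updates)
```

Here the strict caller plays the role of `MonitoringInfoMetricName.parseUrn` and the lenient one the role of `FlinkMetricContainer.updateMetrics`, as described above.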
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200946)
Time Spent: 10h 40m  (was: 10.5h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200934&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200934
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:15
Start Date: 19/Feb/19 22:15
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258248491
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -337,24 +337,19 @@ message Annotation {
 // MonitoringInfo protos.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(ajamato): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user:",
-  type_urn: "beam:metrics:sum_int_64",
-}];
-
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+ELEMENT_COUNT = 0 [(monitoring_info_spec) = {
   urn: "beam:metric:element_count:v1",
   type_urn: "beam:metrics:sum_int_64",
+  // TODO(ryan): we currently also generate this metric with 
["PTRANSFORM"] labels, but it fails validation in
 
 Review comment:
   That's in Python; @Ardagan is fixing that implementation to emit the 
PCollection name. This is the one spec runners should support.
 



Issue Time Tracking
---

Worklog Id: (was: 200934)
Time Spent: 10h 10m  (was: 10h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200944
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:20
Start Date: 19/Feb/19 22:20
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7868: 
[BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258255289
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -311,13 +311,13 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }
 
-// A specification containing required set of fields and labels required
-// to be set on a MonitoringInfo for the specific URN for SDK->RunnerHarness
+// A specification containing required set of fields and labels required to be
+// set on a MonitoringInfo for a given URN-type, for SDK->RunnerHarness
 
 Review comment:
   Yea, "type" was not what I meant here; context was [discussion w/ 
Robert](https://github.com/apache/beam/pull/7868#discussion_r257795396) about 
whether these specs describe "specific URNs" or "URN categories" (e.g. "user 
metrics").
 



Issue Time Tracking
---

Worklog Id: (was: 200944)
Time Spent: 10.5h  (was: 10h 20m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200941
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:18
Start Date: 19/Feb/19 22:18
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7868: 
[BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258254620
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -337,24 +337,19 @@ message Annotation {
 // MonitoringInfo protos.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(ajamato): Add the PTRANSFORM name as a required label after
 
 Review comment:
   hm, @robertwb [wanted it 
removed](https://github.com/apache/beam/pull/7868#discussion_r257840224), IIUC.
 



Issue Time Tracking
---

Worklog Id: (was: 200941)
Time Spent: 10h 20m  (was: 10h 10m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=200938&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200938
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:17
Start Date: 19/Feb/19 22:17
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6553] A Python 
SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465333677
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 200938)
Time Spent: 11h 10m  (was: 11h)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200931
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:15
Start Date: 19/Feb/19 22:15
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258248278
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -311,13 +311,13 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }
 
-// A specification containing required set of fields and labels required
-// to be set on a MonitoringInfo for the specific URN for SDK->RunnerHarness
+// A specification containing required set of fields and labels required to be
+// set on a MonitoringInfo for a given URN-type, for SDK->RunnerHarness
 
 Review comment:
   That's not correct. Each specific URN has its own set of required fields:
   ElementCount has a PCollection.
   ExecutionTimes have a PTransform.
   
   URN-type refers to the value type: counter int64, counter double, gauge 
string, gauge int64, etc.
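The distinction drawn here (a spec is keyed by a specific URN and lists that URN's required labels, while the type URN only names the value type) can be sketched as a small validator. The URN and type strings for element count come from the diff under review; the `SPECS` table layout and the `validate` function are hypothetical and are not Beam's actual validation code.

```python
# Hypothetical spec table: urn -> (type_urn, set of required label names).
SPECS = {
    "beam:metric:element_count:v1": ("beam:metrics:sum_int_64", {"PCOLLECTION"}),
}


def validate(urn, type_urn, labels):
    """Return a list of problems; an empty list means the metric matches its spec."""
    spec = SPECS.get(urn)
    if spec is None:
        return ["unknown urn: %s" % urn]
    expected_type, required = spec
    problems = []
    if type_urn != expected_type:
        problems.append("expected type %s, got %s" % (expected_type, type_urn))
    missing = sorted(required - set(labels))
    if missing:
        problems.append("missing labels: %s" % ", ".join(missing))
    return problems


ok = validate("beam:metric:element_count:v1",
              "beam:metrics:sum_int_64",
              {"PCOLLECTION": "pc1"})
bad = validate("beam:metric:element_count:v1",
               "beam:metrics:sum_int_64",
               {"PTRANSFORM": "step1"})
```

The second call mirrors the case raised elsewhere in this review, where an element-count metric carries a `PTRANSFORM` label instead of the required `PCOLLECTION` one.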
 



Issue Time Tracking
---

Worklog Id: (was: 200931)
Time Spent: 9h 40m  (was: 9.5h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200930
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:15
Start Date: 19/Feb/19 22:15
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258248313
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -311,13 +311,13 @@ message ProcessBundleProgressRequest {
   string instruction_reference = 1;
 }
 
-// A specification containing required set of fields and labels required
-// to be set on a MonitoringInfo for the specific URN for SDK->RunnerHarness
+// A specification containing required set of fields and labels required to be
+// set on a MonitoringInfo for a given URN-type, for SDK->RunnerHarness
 // ProcessBundleResponse reporting.
 message MonitoringInfoSpec {
   string urn = 1;
   string type_urn = 2;
-  // The list of required
+  // The list of required labels for this URN-type
 
 Review comment:
   ditto, should be URN
 



Issue Time Tracking
---

Worklog Id: (was: 200930)
Time Spent: 9.5h  (was: 9h 20m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200932
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:15
Start Date: 19/Feb/19 22:15
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7868: [BEAM-4775] 
MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258248851
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -337,24 +337,19 @@ message Annotation {
 // MonitoringInfo protos.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(ajamato): Add the PTRANSFORM name as a required label after
 
 Review comment:
   Let's keep this one in, please. @Ardagan's validation is going to pull in 
these strings soon. We are hoping to remove the MonitoringInfoUrns section and 
just have MonitoringInfoSpecs.
 



Issue Time Tracking
---

Worklog Id: (was: 200932)
Time Spent: 9h 50m  (was: 9h 40m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200927
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:09
Start Date: 19/Feb/19 22:09
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7867: 
[BEAM-4775] key MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#discussion_r258251288
 
 

 ##
 File path: 
runners/extensions-java/metrics/src/main/java/org/apache/beam/runners/extensions/metrics/MetricsHttpSink.java
 ##
 @@ -17,20 +17,28 @@
  */
 package org.apache.beam.runners.extensions.metrics;
 
+import com.fasterxml.jackson.core.JsonGenerator;
 import com.fasterxml.jackson.databind.JsonMappingException;
 import com.fasterxml.jackson.databind.MapperFeature;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.fasterxml.jackson.databind.SerializationFeature;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.module.SimpleModule;
 import com.fasterxml.jackson.databind.ser.impl.SimpleBeanPropertyFilter;
 import com.fasterxml.jackson.databind.ser.impl.SimpleFilterProvider;
+import com.fasterxml.jackson.databind.ser.std.StdSerializer;
 import com.fasterxml.jackson.datatype.joda.JodaModule;
 import java.io.DataOutputStream;
+import java.io.IOException;
 import java.net.HttpURLConnection;
 import java.net.URL;
 import java.nio.charset.StandardCharsets;
 import javax.xml.ws.http.HTTPException;
 import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.metrics.MetricKey;
+import org.apache.beam.sdk.metrics.MetricName;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
+import org.apache.beam.sdk.metrics.MetricResult;
 import org.apache.beam.sdk.metrics.MetricsOptions;
 import org.apache.beam.sdk.metrics.MetricsSink;
 
 
 Review comment:
   Yea, those would be nice to have 😀 I don't really know anything about it, 
and just made these changes to preserve the existing format. 
   
   I'll do some digging and try to formalize / document it better, though I 
assume that [not rocking the wire-format boat with whatever is going on here] 
is going to be what we want to do, in the short term at least.
 



Issue Time Tracking
---

Worklog Id: (was: 200927)
Time Spent: 9h 20m  (was: 9h 10m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>


[jira] [Updated] (BEAM-6713) FileIO and TextIO unable to alter WriteFiles maxNumWritersPerBundl

2019-02-19 Thread Kyle Winkelman (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Winkelman updated BEAM-6713:
-
Summary: FileIO and TextIO unable to alter WriteFiles maxNumWritersPerBundl 
 (was: FileIO and TextIO Unable to alter WriteFiles maxNumWritersPerBundle)

> FileIO and TextIO unable to alter WriteFiles maxNumWritersPerBundl
> --
>
> Key: BEAM-6713
> URL: https://issues.apache.org/jira/browse/BEAM-6713
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kyle Winkelman
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When attempting to run a batch workflow with a FileIO.write() I was getting 
> job failures due to WriteFiles.DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE causing a 
> significant amount of data to be shuffled. My issues would be solved by 
> increasing this and luckily WriteFiles already has withMaxNumWritersPerBundle 
> but unfortunately FileIO and TextIO do not.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200920&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200920
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:05
Start Date: 19/Feb/19 22:05
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7867: 
[BEAM-4775] key MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#discussion_r258250130
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/SparkBeamMetric.java
 ##
 @@ -17,27 +17,30 @@
  */
 package org.apache.beam.runners.spark.metrics;
 
+import static java.util.stream.Collectors.toList;
 import static 
org.apache.beam.runners.core.metrics.MetricsContainerStepMap.asAttemptedOnlyMetricResults;
 
 import com.codahale.metrics.Metric;
+import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.Map;
 import org.apache.beam.runners.core.metrics.MetricsContainerStepMap;
 import org.apache.beam.sdk.metrics.DistributionResult;
 import org.apache.beam.sdk.metrics.GaugeResult;
+import org.apache.beam.sdk.metrics.MetricKey;
 import org.apache.beam.sdk.metrics.MetricName;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
 import org.apache.beam.sdk.metrics.MetricResult;
 import org.apache.beam.sdk.metrics.MetricResults;
 import org.apache.beam.sdk.metrics.MetricsFilter;
 import 
org.apache.beam.vendor.guava.v20_0.com.google.common.annotations.VisibleForTesting;
+import 
org.apache.beam.vendor.guava.v20_0.com.google.common.collect.ImmutableList;
 
 /**
  * An adapter between the {@link MetricsContainerStepMap} and Codahale's 
{@link Metric} interface.
  */
 class SparkBeamMetric implements Metric {
 
 Review comment:
   Agreed. Most of what I've done so far (in #7823) is [expanding the Java SDK 
metrics classes to support the wider MonitoringInfo data model] while 
[preserving semantics of existing code that assumes / requires user-metrics].
   
   In this case, [I end up adapting 
this](https://github.com/apache/beam/pull/7823/files#diff-2b144bc5b4cec16779fa6126425c6fe3R71)
 to use [a `MetricLabels.value()` 
API](https://github.com/apache/beam/pull/7823/files#diff-245a1fd7e743b2f382c978a12028f8c2R46)
 that assumes there is exactly one label set on the MonitoringInfo 
(`PTRANSFORM` xor `PCOLLECTION`) and uses that, so it should be reasonably 
"future-proofed"
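The "exactly one of `PTRANSFORM` xor `PCOLLECTION`" assumption behind the `MetricLabels.value()` API can be sketched as follows. This is a Python illustration of the idea only; the function name and error message are hypothetical, and the real API is the Java one linked above.

```python
def single_label_value(labels):
    """Return the value of the single PTRANSFORM or PCOLLECTION label.

    Raises ValueError when neither or both labels are present.
    """
    present = [key for key in ("PTRANSFORM", "PCOLLECTION") if key in labels]
    if len(present) != 1:
        raise ValueError(
            "expected exactly one of PTRANSFORM/PCOLLECTION, got %r" % (present,))
    return labels[present[0]]


step = single_label_value({"PTRANSFORM": "step1"})
pcoll = single_label_value({"PCOLLECTION": "pc1"})
```

Failing loudly when the assumption is violated means the helper would surface, rather than hide, any later change to the label model.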
 



Issue Time Tracking
---

Worklog Id: (was: 200920)
Time Spent: 9h 10m  (was: 9h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200916
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 22:01
Start Date: 19/Feb/19 22:01
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7867: [BEAM-4775] key 
MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#issuecomment-465328868
 
 
   Thanks Alex. The good news is I think we agree about everything you've said.
   
   In #7823 I basically turn the existing Java SDK metrics classes into 
wrappers for corresponding MonitoringInfo structures:
   - `MetricKey`: {URN, labels}
   - `MetricName`: URN
 - I basically folded `MonitoringInfoMetricName` into `MetricName`, which 
becomes the only implementation of that concept
   - `MetricLabels`: labels map
 - thin wrapper over the raw map, that also exposes some APIs that encode 
the fact that today we expect MonitoringInfos to have exactly one of 
{`PTRANSFORM`,`PCOLLECTION`}.
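The wrapper structure listed above, with a key made of a URN plus a labels map, can be sketched as a hashable value type so that metric results can be stored in a dict by key. This is a Python illustration under assumptions: the field names, the `of` constructor, and the sorted-tuple representation of labels are hypothetical and do not reflect the Java classes in #7823.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class MetricKey:
    """Hypothetical key: a URN plus its labels, stored in a hashable form."""
    urn: str
    labels: Tuple[Tuple[str, str], ...]

    @staticmethod
    def of(urn: str, labels: Mapping[str, str]) -> "MetricKey":
        # Sorting makes the key independent of the mapping's insertion order.
        return MetricKey(urn, tuple(sorted(labels.items())))


urn = "beam:metric:element_count:v1"
key_a = MetricKey.of(urn, {"PCOLLECTION": "pc1"})
key_b = MetricKey.of(urn, {"PCOLLECTION": "pc1"})

# Results keyed by MetricKey; equal keys address the same entry.
results = {key_a: 10}
results[key_b] = results[key_b] + 5
```

A frozen dataclass gets `__eq__` and `__hash__` generated from its fields, which is what lets two independently built keys address the same entry.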
   
   So that is the larger vision, which this PR is a small step toward. LMK if 
that sounds off to you. I'm in a constant process of pulling stuff out of #7823 
so that the real work will be in clearer relief there.
   
   A different question that [came up with @mxm on 
#7866](https://github.com/apache/beam/pull/7866#issuecomment-465241925) is 
around what packages things should live in. I don't think it's too bad if 
`MetricKey` goes into java/core here, but where it gets complicated is that 
inevitably, MetricResult → MetricKey → {fn-api protos}, which adds [java/core → 
model-fn-execution], which gives me some pause.
   
   Hopefully we can push on other parts of this (and related PRs) while I think 
more about that question. Ofc any thoughts are welcome.
 



Issue Time Tracking
---

Worklog Id: (was: 200916)
Time Spent: 9h  (was: 8h 50m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200914
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:58
Start Date: 19/Feb/19 21:58
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7867: [BEAM-4775] 
key MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#discussion_r258244655
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/metrics/MetricResultsMatchers.java
 ##
 @@ -194,28 +192,13 @@ protected void describeMismatchSafely(
   String namespace,
   String name,
   String step) {
-if (!Objects.equals(namespace, item.getName().getNamespace())) {
+MetricKey key = MetricKey.create(step, MetricName.named(namespace, name));
+if (!Objects.equals(key, item.getKey())) {
   mismatchDescription
-  .appendText("inNamespace: ")
-  .appendValue(namespace)
+  .appendText("inKey: ")
+  .appendValue(key)
 
 Review comment:
   Please check whether `toString` is implemented in MonitoringInfoMetricKey 
and MetricKey.
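   
   A small illustration of why this matters, as a sketch only. 
`KeyWithToString` and `KeyWithoutToString` are hypothetical stand-ins rather 
than Beam classes. The assumption (worth verifying against Hamcrest itself) is 
that `appendValue` renders an arbitrary object through its string form, so a 
key without `toString` would show up in the mismatch text as a class name and 
hash code.

```java
// Hypothetical stand-ins for a metric key; neither is Beam's MetricKey.
final class KeyWithToString {
  private final String step;
  private final String namespace;
  private final String name;

  KeyWithToString(String step, String namespace, String name) {
    this.step = step;
    this.namespace = namespace;
    this.name = name;
  }

  // A readable form makes test-failure messages self-explanatory.
  @Override public String toString() {
    return "KeyWithToString{step=" + step + ", namespace=" + namespace + ", name=" + name + "}";
  }
}

final class KeyWithoutToString {
  private final String step;

  KeyWithoutToString(String step) {
    this.step = step;
  }

  String step() { return step; }
  // No toString override: Object.toString() yields "<class name>@<hex hash code>".
}

final class ToStringDemo {
  // Mimics what a description builder would do with an arbitrary value.
  static String render(Object value) {
    return String.valueOf(value);
  }
}
```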
 



Issue Time Tracking
---

Worklog Id: (was: 200914)
Time Spent: 8h 50m  (was: 8h 40m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200912&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200912
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:58
Start Date: 19/Feb/19 21:58
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7867: [BEAM-4775] 
key MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#discussion_r258247534
 
 

 ##
 File path: 
runners/extensions-java/metrics/src/main/java/org/apache/beam/runners/extensions/metrics/MetricsHttpSink.java
 ##
 @@ -17,20 +17,28 @@
  */
 package org.apache.beam.runners.extensions.metrics;
 
+import com.fasterxml.jackson.core.JsonGenerator;
 import com.fasterxml.jackson.databind.JsonMappingException;
 import com.fasterxml.jackson.databind.MapperFeature;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.fasterxml.jackson.databind.SerializationFeature;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import com.fasterxml.jackson.databind.module.SimpleModule;
 import com.fasterxml.jackson.databind.ser.impl.SimpleBeanPropertyFilter;
 import com.fasterxml.jackson.databind.ser.impl.SimpleFilterProvider;
+import com.fasterxml.jackson.databind.ser.std.StdSerializer;
 import com.fasterxml.jackson.datatype.joda.JodaModule;
 import java.io.DataOutputStream;
+import java.io.IOException;
 import java.net.HttpURLConnection;
 import java.net.URL;
 import java.nio.charset.StandardCharsets;
 import javax.xml.ws.http.HTTPException;
 import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.metrics.MetricKey;
+import org.apache.beam.sdk.metrics.MetricName;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
+import org.apache.beam.sdk.metrics.MetricResult;
 import org.apache.beam.sdk.metrics.MetricsOptions;
 import org.apache.beam.sdk.metrics.MetricsSink;
 
 
 Review comment:
   Do you have any more details for this class? A link to a doc? An explanation 
of how to use it? Example code for handling the POST request, etc.? What is the 
wire format used? Can it be included in this header?
 



Issue Time Tracking
---

Worklog Id: (was: 200912)
Time Spent: 8.5h  (was: 8h 20m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200913
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:58
Start Date: 19/Feb/19 21:58
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #7867: [BEAM-4775] 
key MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#discussion_r258246677
 
 

 ##
 File path: 
runners/spark/src/main/java/org/apache/beam/runners/spark/metrics/SparkBeamMetric.java
 ##
 @@ -17,27 +17,30 @@
  */
 package org.apache.beam.runners.spark.metrics;
 
+import static java.util.stream.Collectors.toList;
 import static 
org.apache.beam.runners.core.metrics.MetricsContainerStepMap.asAttemptedOnlyMetricResults;
 
 import com.codahale.metrics.Metric;
+import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.Map;
 import org.apache.beam.runners.core.metrics.MetricsContainerStepMap;
 import org.apache.beam.sdk.metrics.DistributionResult;
 import org.apache.beam.sdk.metrics.GaugeResult;
+import org.apache.beam.sdk.metrics.MetricKey;
 import org.apache.beam.sdk.metrics.MetricName;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
 import org.apache.beam.sdk.metrics.MetricResult;
 import org.apache.beam.sdk.metrics.MetricResults;
 import org.apache.beam.sdk.metrics.MetricsFilter;
 import 
org.apache.beam.vendor.guava.v20_0.com.google.common.annotations.VisibleForTesting;
+import 
org.apache.beam.vendor.guava.v20_0.com.google.common.collect.ImmutableList;
 
 /**
  * An adapter between the {@link MetricsContainerStepMap} and Codahale's 
{@link Metric} interface.
  */
 class SparkBeamMetric implements Metric {
 
 Review comment:
   I also notice the Metric class will only handle user metrics (name, 
namespace). If Spark only cares about user metrics, that's okay I suppose, but 
it won't handle more generalized MonitoringInfos.
 



Issue Time Tracking
---

Worklog Id: (was: 200913)
Time Spent: 8h 40m  (was: 8.5h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Updated] (BEAM-6713) FileIO and TextIO unable to alter WriteFiles maxNumWritersPerBundle

2019-02-19 Thread Kyle Winkelman (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Winkelman updated BEAM-6713:
-
Summary: FileIO and TextIO unable to alter WriteFiles 
maxNumWritersPerBundle  (was: FileIO and TextIO unable to alter WriteFiles 
maxNumWritersPerBundl)

> FileIO and TextIO unable to alter WriteFiles maxNumWritersPerBundle
> ---
>
> Key: BEAM-6713
> URL: https://issues.apache.org/jira/browse/BEAM-6713
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kyle Winkelman
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When attempting to run a batch workflow with a FileIO.write() I was getting 
> job failures due to WriteFiles.DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE causing a 
> significant amount of data to be shuffled. My issues would be solved by 
> increasing this and luckily WriteFiles already has withMaxNumWritersPerBundle 
> but unfortunately FileIO and TextIO do not.





[jira] [Work logged] (BEAM-6713) FileIO and TextIO Unable to alter WriteFiles maxNumWritersPerBundle

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6713?focusedWorklogId=200907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200907
 ]

ASF GitHub Bot logged work on BEAM-6713:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:55
Start Date: 19/Feb/19 21:55
Worklog Time Spent: 10m 
  Work Description: kyle-winkelman commented on pull request #7893: 
[BEAM-6713] Add withMaxNumWritersPerBundle from WriteFiles to FileIO …
URL: https://github.com/apache/beam/pull/7893
 
 
   …and TextIO.
   
   The WriteFiles.DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE is not ideal for my use 
case and is difficult to change when using FileIO and TextIO. This PR adds 
pass-through setters to these classes to make this configuration more accessible.
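   
   The pass-through idea can be sketched as follows. This is an illustration 
under assumptions, not the code in this PR: `SketchWriteFiles` and 
`SketchTextWrite` are hypothetical stand-ins for WriteFiles and the TextIO / 
FileIO write transforms, and the default value shown is a placeholder chosen 
for the sketch.

```java
// Hypothetical stand-in for the inner transform that owns the setting.
final class SketchWriteFiles {
  // Placeholder default for this sketch only.
  static final int DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE = 20;

  private final int maxNumWritersPerBundle;

  private SketchWriteFiles(int maxNumWritersPerBundle) {
    this.maxNumWritersPerBundle = maxNumWritersPerBundle;
  }

  static SketchWriteFiles create() {
    return new SketchWriteFiles(DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE);
  }

  // Immutable builder style: return a modified copy.
  SketchWriteFiles withMaxNumWritersPerBundle(int value) {
    return new SketchWriteFiles(value);
  }

  int getMaxNumWritersPerBundle() { return maxNumWritersPerBundle; }
}

// Hypothetical stand-in for the outer, user-facing transform.
final class SketchTextWrite {
  // Null means "not set by the user; keep the inner default".
  private final Integer maxNumWritersPerBundle;

  private SketchTextWrite(Integer maxNumWritersPerBundle) {
    this.maxNumWritersPerBundle = maxNumWritersPerBundle;
  }

  static SketchTextWrite create() {
    return new SketchTextWrite(null);
  }

  // The pass-through setter: it only records the value.
  SketchTextWrite withMaxNumWritersPerBundle(int value) {
    return new SketchTextWrite(value);
  }

  // When the outer transform builds the inner one, it forwards the value if present.
  SketchWriteFiles buildInner() {
    SketchWriteFiles inner = SketchWriteFiles.create();
    if (maxNumWritersPerBundle != null) {
      inner = inner.withMaxNumWritersPerBundle(maxNumWritersPerBundle);
    }
    return inner;
  }
}
```

   Keeping the outer field nullable means the default stays defined in one 
place, on the inner transform.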
   
   @lukecwik 
   
   
   
   
 


[jira] [Work logged] (BEAM-4775) JobService should support returning metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=200905&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200905
 ]

ASF GitHub Bot logged work on BEAM-4775:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:49
Start Date: 19/Feb/19 21:49
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #7867: [BEAM-4775] key 
MetricResult by a MetricKey
URL: https://github.com/apache/beam/pull/7867#issuecomment-465324701
 
 
   One thing I don't like about MetricKey is that it's only applicable to user 
metrics, which are identified by the following fields:
   - Step Name
   - Namespace
   - User provided Metric Name
   
   MonitoringInfos are defined by
   - URN
   - Arbitrary set of Key value pair labels. 
   - (Doesn't necessarily have a step name at all).
   
   MetricKey is insufficient today to describe any metric which isn't defined 
by (step, namespace, metric name).
   
   Take a look at MonitoringInfoMetricName (extends MetricName), which I added.
   
https://github.com/apache/beam/blob/5b1700db834cf949547a20a30f3ee1270710585a/runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MonitoringInfoMetricName.java
   
   This can contain a URN and an arbitrary set of key-value pair labels. I 
think this is the more generalized way to key a Metric and I would like to land 
on this in the future. I don't think we really need two separate classes, 
MetricName and MetricKey, either.
   
   We could just have one class in the end, similar to this, which represents 
the MetricKey.
   
   The challenge I see with this, though, is that querying the metrics and 
storing the metrics internally use the same classes today (MetricKey). It 
would be better to keep the user-facing API more similar to how the user 
defines a metric (explicitly being able to access the name and namespace 
fields). In the end we might have an internal MonitoringInfoMetricKey separate 
from a (User)MetricKey the user can use in the querying API.
   
   If this change unblocks your work now, then I think it's fine to proceed. But 
I wanted to give a bit more context here on how I think this could be 
simplified in the end. I think we have too many classes right now for 
identifying metrics.
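   
   The asymmetry between the two keying schemes can be sketched like this. 
Everything here is hypothetical: `UserKeySketch` and `GeneralKeySketch` are 
not Beam classes, the `example:user:` URN layout is a placeholder rather than 
Beam's real scheme, and using a `PTRANSFORM` label to carry the step name is an 
assumption borrowed from elsewhere in this thread.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical user-metric key: (step, namespace, name).
final class UserKeySketch {
  final String step;
  final String namespace;
  final String name;

  UserKeySketch(String step, String namespace, String name) {
    this.step = step;
    this.namespace = namespace;
    this.name = name;
  }
}

// Hypothetical generalized key: a URN plus arbitrary key-value labels.
final class GeneralKeySketch {
  final String urn;
  final Map<String, String> labels;

  GeneralKeySketch(String urn, Map<String, String> labels) {
    this.urn = urn;
    this.labels = Collections.unmodifiableMap(new TreeMap<>(labels));
  }

  // Every user key can be expressed in the general form...
  static GeneralKeySketch fromUserKey(UserKeySketch key) {
    Map<String, String> labels = new TreeMap<>();
    labels.put("PTRANSFORM", key.step);
    // Placeholder URN layout, for illustration only.
    return new GeneralKeySketch("example:user:" + key.namespace + ":" + key.name, labels);
  }

  // ...but a general key may have no step at all, so the reverse mapping is partial.
  boolean hasStep() {
    return labels.containsKey("PTRANSFORM");
  }
}
```

   A key labelled only with, say, a PCollection has no (step, namespace, name) 
equivalent, which is the gap described above.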
   
 



Issue Time Tracking
---

Worklog Id: (was: 200905)
Time Spent: 8h 20m  (was: 8h 10m)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
>  currently doesn't appear to have a way for JobService to return metrics to a 
> user, even though 
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
>  includes support for reporting SDK metrics to the runner harness.
>  
> Metrics are apparently necessary to run any ValidatesRunner tests because 
> PAssert needs to validate that the assertions succeeded. However, this 
> statement should be double-checked: perhaps it's possible to somehow work 
> with PAssert without metrics support.





[jira] [Created] (BEAM-6713) FileIO and TextIO Unable to alter WriteFiles maxNumWritersPerBundle

2019-02-19 Thread Kyle Winkelman (JIRA)
Kyle Winkelman created BEAM-6713:


 Summary: FileIO and TextIO Unable to alter WriteFiles 
maxNumWritersPerBundle
 Key: BEAM-6713
 URL: https://issues.apache.org/jira/browse/BEAM-6713
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Kyle Winkelman


When attempting to run a batch workflow with a FileIO.write() I was getting job 
failures due to WriteFiles.DEFAULT_MAX_NUM_WRITERS_PER_BUNDLE causing a 
significant amount of data to be shuffled. My issues would be solved by 
increasing this and luckily WriteFiles already has withMaxNumWritersPerBundle 
but unfortunately FileIO and TextIO do not.





[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=200902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200902
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:46
Start Date: 19/Feb/19 21:46
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7874: [release-2.11.0] 
Backport for BEAM-6650 and BEAM-6678
URL: https://github.com/apache/beam/pull/7874#issuecomment-465323737
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 200902)
Time Spent: 5h 40m  (was: 5.5h)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Elements emitted during the finalizeBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.





[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=200898&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200898
 ]

ASF GitHub Bot logged work on BEAM-5381:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:39
Start Date: 19/Feb/19 21:39
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7889: [BEAM-5381] Fix two 
issues with generating the pipeline plan for Go SDK
URL: https://github.com/apache/beam/pull/7889#issuecomment-465321441
 
 
   Nice catch on the transform scoping! It worked since it was still pointing 
at the root inject transform and the graph continued from there. I agree that 
the consistency checker is correct as well.
   
   Consider adding (or amending) the integration test cases to prevent 
regressions in the future: 
https://github.com/apache/beam/tree/master/sdks/go/test/integration 
   This is making the assumption you have a pipeline shape that you were using 
when you discovered these errors.
 



Issue Time Tracking
---

Worklog Id: (was: 200898)
Time Spent: 40m  (was: 0.5h)

> Dataflow runner creates duplicate CoGBK step IDs
> 
>
> Key: BEAM-5381
> URL: https://issues.apache.org/jira/browse/BEAM-5381
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Cody Schroeder
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297
> If the attached {{beam_dataflow_err.go}} pipeline is executed with the 
> {{dataflow}} runner, GCP reports the following error:
> {code}
> Step with name e5 already exists. Duplicates are not allowed.
> {code}
> Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed 
> duplicated.  If the CoGBK in the pipeline is not scoped, the duplication is 
> fixed.





[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=200886&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200886
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:14
Start Date: 19/Feb/19 21:14
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7892: [BEAM-6553] A Python 
SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892#issuecomment-465312506
 
 
   Run Python PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 200886)
Time Spent: 11h  (was: 10h 50m)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 11h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=200885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200885
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:14
Start Date: 19/Feb/19 21:14
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7892: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ (#7655)
URL: https://github.com/apache/beam/pull/7892
 
 
   Fixing issues with BQ Sink for File Loads that were not found earlier.
 



Issue Time Tracking
---

Worklog Id: (was: 200885)
Time Spent: 10h 50m  (was: 10h 40m)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=200882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200882
 ]

ASF GitHub Bot logged work on BEAM-5381:


Author: ASF GitHub Bot
Created on: 19/Feb/19 21:10
Start Date: 19/Feb/19 21:10
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7889: [BEAM-5381] Fix two 
issues with generating the pipeline plan for Go SDK
URL: https://github.com/apache/beam/pull/7889#issuecomment-465311087
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 200882)
Time Spent: 0.5h  (was: 20m)

> Dataflow runner creates duplicate CoGBK step IDs
> 
>
> Key: BEAM-5381
> URL: https://issues.apache.org/jira/browse/BEAM-5381
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Cody Schroeder
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297
> If the attached {{beam_dataflow_err.go}} pipeline is executed with the 
> {{dataflow}} runner, GCP reports the following error:
> {code}
> Step with name e5 already exists. Duplicates are not allowed.
> {code}
> Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed 
> duplicated.  If the CoGBK in the pipeline is not scoped, the duplication is 
> fixed.





[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=200880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200880
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 19/Feb/19 20:59
Start Date: 19/Feb/19 20:59
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7887: Revert 
"[BEAM-6553] A Python SDK sink that supports File Loads into B…
URL: https://github.com/apache/beam/pull/7887
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 200880)
Time Spent: 10h 40m  (was: 10.5h)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4776) Java PortableRunner should support metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4776?focusedWorklogId=200879&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200879
 ]

ASF GitHub Bot logged work on BEAM-4776:


Author: ASF GitHub Bot
Created on: 19/Feb/19 20:59
Start Date: 19/Feb/19 20:59
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on pull request #7890: 
[BEAM-4776] consolidate MetricResult implementations
URL: https://github.com/apache/beam/pull/7890
 
 
   (factored out of #7823)
   
   This simplifies something I found most confusing while learning how the Java 
SDK metrics worked: there were many implementations of `MetricResult` lying 
around, all with *almost* the same semantics (and a fair amount of copy-paste 
having happened between them).
   
   I think I've reduced this to a minimal interface here:
   - keyed by "step name" (`@Nullable`, for pcollection-scoped system metrics; 
first-class support of these is coming in #7823) and `MetricName` (namespace, 
name)
   - containing an "attempted" value and (possibly `null` / "unsupported") 
"committed" value
 - `throw`s `UnsupportedOperationException` on attempting to access 
absent/unsupported "committed" value
   - buildable from `MetricUpdate`s, including a transient interim step where 
"attempted" but not "committed" may be set, for a `MetricResult` that will 
ultimately contain both.
 - `MetricsContainerStepMap` in particular had [one `MetricResult` 
implementation](https://github.com/apache/beam/compare/master...ryan-williams:ac?expand=1#diff-7fd541906a9e771c3312b75947e969daL413)
 (that was already basically redundant with others in the codebase), and 
[another just like 
it](https://github.com/apache/beam/compare/master...ryan-williams:ac?expand=1#diff-7fd541906a9e771c3312b75947e969daL460)
 that acted as an unofficial sort of "builder"; both are gone now.
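
The interface described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: the class name `SketchMetricResult`, its factory methods, and the flattening of `MetricName` into two strings are all hypothetical. It only shows the semantics described above: a nullable step name, an always-present "attempted" value, and a "committed" value whose getter throws when the value is absent or unsupported.

```java
import java.util.Objects;

/** Illustrative only: a hypothetical stand-in, not Beam's actual MetricResult. */
final class SketchMetricResult<T> {
    private final String step;      // null for pcollection-scoped system metrics
    private final String namespace; // namespace + name stand in for MetricName
    private final String name;
    private final T attempted;
    private final T committed;      // null means absent or unsupported

    private SketchMetricResult(String step, String namespace, String name, T attempted, T committed) {
        this.step = step;
        this.namespace = Objects.requireNonNull(namespace, "namespace");
        this.name = Objects.requireNonNull(name, "name");
        this.attempted = Objects.requireNonNull(attempted, "attempted");
        this.committed = committed;
    }

    /** Interim state: only the "attempted" value is known so far. */
    static <T> SketchMetricResult<T> attempted(String step, String namespace, String name, T value) {
        return new SketchMetricResult<>(step, namespace, name, value, null);
    }

    /** Returns a copy that also carries a "committed" value. */
    SketchMetricResult<T> withCommitted(T value) {
        return new SketchMetricResult<>(
            step, namespace, name, attempted, Objects.requireNonNull(value, "committed"));
    }

    String getStep() { return step; }
    String getKey() { return namespace + ":" + name; }
    T getAttempted() { return attempted; }
    boolean hasCommitted() { return committed != null; }

    /** Throws if the "committed" value is absent or unsupported. */
    T getCommitted() {
        if (committed == null) {
            throw new UnsupportedOperationException("committed value is absent or unsupported");
        }
        return committed;
    }
}
```

In this sketch a result built with `attempted(...)` models the transient interim step, and `withCommitted(...)` produces the final form that carries both values.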
   
   R: @robertwb, @ajamato 
   
   I'll CC @mxm @tweise as well since I've sent a bunch to Robert/Alex on this 
topic, and this one doesn't directly involve the new metrics API design
   

[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=200876&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200876
 ]

ASF GitHub Bot logged work on BEAM-5381:


Author: ASF GitHub Bot
Created on: 19/Feb/19 20:51
Start Date: 19/Feb/19 20:51
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7889: [BEAM-5381] Fix two 
issues with generating the pipeline plan for Go SDK
URL: https://github.com/apache/beam/pull/7889#issuecomment-465304595
 
 
   R: @lostluck 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200876)
Time Spent: 20m  (was: 10m)

> Dataflow runner creates duplicate CoGBK step IDs
> 
>
> Key: BEAM-5381
> URL: https://issues.apache.org/jira/browse/BEAM-5381
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Cody Schroeder
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297
> If the attached {{beam_dataflow_err.go}} pipeline is executed with the 
> {{dataflow}} runner, GCP reports the following error:
> {code}
> Step with name e5 already exists. Duplicates are not allowed.
> {code}
> Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed 
> duplicated.  If the CoGBK in the pipeline is not scoped, the duplication is 
> fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4776) Java PortableRunner should support metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4776?focusedWorklogId=200866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200866
 ]

ASF GitHub Bot logged work on BEAM-4776:


Author: ASF GitHub Bot
Created on: 19/Feb/19 20:36
Start Date: 19/Feb/19 20:36
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7883: [BEAM-4776] Add 
MetricQueryResults.allMetrics() helper
URL: https://github.com/apache/beam/pull/7883#issuecomment-465299563
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200866)
Time Spent: 1h  (was: 50m)

> Java PortableRunner should support metrics
> --
>
> Key: BEAM-4776
> URL: https://issues.apache.org/jira/browse/BEAM-4776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> BEAM-4775 concerns adding metrics to the JobService API; the current issue is 
> about making PortableRunner understand them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4776) Java PortableRunner should support metrics

2019-02-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4776?focusedWorklogId=200865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-200865
 ]

ASF GitHub Bot logged work on BEAM-4776:


Author: ASF GitHub Bot
Created on: 19/Feb/19 20:35
Start Date: 19/Feb/19 20:35
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #7883: [BEAM-4776] Add 
MetricQueryResults.allMetrics() helper
URL: https://github.com/apache/beam/pull/7883#issuecomment-465299519
 
 
   [seems like a flaky 
test](https://scans.gradle.com/s/m2vawlhlp25ze/console-log?task=:beam-sdks-java-io-cassandra:test#L6):
   
   ```
   
   org.apache.beam.sdk.io.cassandra.CassandraIOTest > classMethod FAILED
   --
   java.lang.AssertionError at CassandraIOTest.java:100
   18:50:38.650 [StorageServiceShutdownHook] INFO  
o.a.cassandra.hints.HintsService - Paused hints dispatch
    
   org.apache.beam.sdk.io.cassandra.CassandraIOTest > classMethod FAILED
   java.lang.NullPointerException at CassandraIOTest.java:125
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 200865)
Time Spent: 50m  (was: 40m)

> Java PortableRunner should support metrics
> --
>
> Key: BEAM-4776
> URL: https://issues.apache.org/jira/browse/BEAM-4776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> BEAM-4775 concerns adding metrics to the JobService API; the current issue is 
> about making PortableRunner understand them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6712) Change Row's internal implementation of DATETIME

2019-02-19 Thread Rui Wang (JIRA)
Rui Wang created BEAM-6712:
--

 Summary: Change Row's internal implementation of DATETIME
 Key: BEAM-6712
 URL: https://issues.apache.org/jira/browse/BEAM-6712
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Rui Wang
Assignee: Rui Wang


Row's DATETIME should use java.time.Instant for nanosecond precision (Java's 
Instant uses a long for epoch-seconds and an int for nanosecond-of-second).
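
As a small illustration of the precision point above, the sketch below builds a `java.time.Instant` from its two components and shows what is lost when the same instant is reduced to epoch milliseconds. The class and method names (`InstantPrecisionDemo`, `fromParts`, and so on) are hypothetical and chosen only for this example; it says nothing about how Row is or will be implemented.

```java
import java.time.Instant;

/** Illustrative only: contrasts Instant's nanosecond field with a millisecond view. */
final class InstantPrecisionDemo {
    /** Builds an Instant from epoch-seconds (long) and nanosecond-of-second (int). */
    static Instant fromParts(long epochSeconds, int nanoOfSecond) {
        return Instant.ofEpochSecond(epochSeconds, nanoOfSecond);
    }

    /** The same instant as epoch milliseconds; sub-millisecond digits are dropped. */
    static long truncatedToMillis(Instant instant) {
        return instant.toEpochMilli();
    }

    /** The nanoseconds that a millisecond-based representation cannot hold. */
    static int lostNanos(Instant instant) {
        return instant.getNano() % 1_000_000;
    }

    public static void main(String[] args) {
        Instant t = fromParts(1550620800L, 123456789);
        System.out.println(t.getEpochSecond() + " s + " + t.getNano() + " ns");
        System.out.println("millis view: " + truncatedToMillis(t));
        System.out.println("nanos lost:  " + lostNanos(t));
    }
}
```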



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

