[jira] [Work logged] (BEAM-5995) Create Jenkins jobs to run the load tests

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5995?focusedWorklogId=233293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233293
 ]

ASF GitHub Bot logged work on BEAM-5995:


Author: ASF GitHub Bot
Created on: 26/Apr/19 06:47
Start Date: 26/Apr/19 06:47
Worklog Time Spent: 10m 
  Work Description: kkucharc commented on issue #8151: [BEAM-5995] add 
Jenkins job with GBK Python load tests
URL: https://github.com/apache/beam/pull/8151#issuecomment-486946186
 
 
   Run Python Load Tests Smoke
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233293)
Time Spent: 31h 50m  (was: 31h 40m)

> Create Jenkins jobs to run the load tests
> -
>
> Key: BEAM-5995
> URL: https://issues.apache.org/jira/browse/BEAM-5995
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kasia Kucharczyk
>Assignee: Kasia Kucharczyk
>Priority: Major
>  Labels: triaged
>  Time Spent: 31h 50m
>  Remaining Estimate: 0h
>
> (/) Add SMOKE test 
>  Add GBK load tests.
> Add CoGBK load tests.
> Add Pardo load tests.
> Add SideInput tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5995) Create Jenkins jobs to run the load tests

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5995?focusedWorklogId=233291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233291
 ]

ASF GitHub Bot logged work on BEAM-5995:


Author: ASF GitHub Bot
Created on: 26/Apr/19 06:41
Start Date: 26/Apr/19 06:41
Worklog Time Spent: 10m 
  Work Description: kkucharc commented on issue #8151: [BEAM-5995] add 
Jenkins job with GBK Python load tests
URL: https://github.com/apache/beam/pull/8151#issuecomment-486944634
 
 
   run seed job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233291)
Time Spent: 31h 40m  (was: 31.5h)

> Create Jenkins jobs to run the load tests
> -
>
> Key: BEAM-5995
> URL: https://issues.apache.org/jira/browse/BEAM-5995
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kasia Kucharczyk
>Assignee: Kasia Kucharczyk
>Priority: Major
>  Labels: triaged
>  Time Spent: 31h 40m
>  Remaining Estimate: 0h
>
> (/) Add SMOKE test 
>  Add GBK load tests.
> Add CoGBK load tests.
> Add Pardo load tests.
> Add SideInput tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6627) Use Metrics API in IO performance tests

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6627?focusedWorklogId=233285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233285
 ]

ASF GitHub Bot logged work on BEAM-6627:


Author: ASF GitHub Bot
Created on: 26/Apr/19 06:22
Start Date: 26/Apr/19 06:22
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #8400: [BEAM-6627] Added 
byte and item counting metrics to integration tests
URL: https://github.com/apache/beam/pull/8400#issuecomment-486940668
 
 
   I fixed the commit history to be cleaner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233285)
Time Spent: 10h 40m  (was: 10.5h)

> Use Metrics API in IO performance tests
> ---
>
> Key: BEAM-6627
> URL: https://issues.apache.org/jira/browse/BEAM-6627
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7141) Expose kv and window parameters for on_timer

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7141?focusedWorklogId=233277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233277
 ]

ASF GitHub Bot logged work on BEAM-7141:


Author: ASF GitHub Bot
Created on: 26/Apr/19 05:57
Start Date: 26/Apr/19 05:57
Worklog Time Spent: 10m 
  Work Description: rakeshcusat commented on issue #8408: [BEAM-7141] Add 
window & timestamp in timer callback method argument
URL: https://github.com/apache/beam/pull/8408#issuecomment-486935666
 
 
   This is still work in progress. I am mostly done with implementation but 
still working on my test case. I thought to open the PR so you can provide me 
early feedback.
   
   R: @tweise @robertwb @pabloem 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233277)
Time Spent: 20m  (was: 10m)

> Expose kv and window parameters for on_timer
> 
>
> Key: BEAM-7141
> URL: https://issues.apache.org/jira/browse/BEAM-7141
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We would like to have access to key and window inside the timer callback. 
> Without, it is also difficult to debug. We run into this while working on 
> BEAM-7112



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6695) Latest transform for Python SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=233274&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233274
 ]

ASF GitHub Bot logged work on BEAM-6695:


Author: ASF GitHub Bot
Created on: 26/Apr/19 05:56
Start Date: 26/Apr/19 05:56
Worklog Time Spent: 10m 
  Work Description: ttanay commented on issue #8206: [BEAM-6695] Latest 
PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8206#issuecomment-486935471
 
 
   Thank you @robinyqiu! :smile: 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233274)
Time Spent: 6h 20m  (was: 6h 10m)

> Latest transform for Python SDK
> ---
>
> Key: BEAM-6695
> URL: https://issues.apache.org/jira/browse/BEAM-6695
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Add a PTransform} and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7141) Expose kv and window parameters for on_timer

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7141?focusedWorklogId=233269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233269
 ]

ASF GitHub Bot logged work on BEAM-7141:


Author: ASF GitHub Bot
Created on: 26/Apr/19 05:54
Start Date: 26/Apr/19 05:54
Worklog Time Spent: 10m 
  Work Description: rakeshcusat commented on pull request #8408: 
[BEAM-7141] Add window & timestamp in timer callback method argument
URL: https://github.com/apache/beam/pull/8408
 
 
   Timer callback argument didn't have timestamp and window parameter. This 
change will allow callback method to specify window and timestamp parameters 
and system will be able to pass these values.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   
--

[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233240&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233240
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 26/Apr/19 02:49
Start Date: 26/Apr/19 02:49
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8391: [BEAM-7139] 
Blogpost for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233240)
Time Spent: 1h 20m  (was: 1h 10m)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7155) Provide access to within DoFn.OnTimerContext

2019-04-25 Thread Reza ardeshir rokni (JIRA)
Reza ardeshir rokni created BEAM-7155:
-

 Summary: Provide access to  within DoFn.OnTimerContext 
 Key: BEAM-7155
 URL: https://issues.apache.org/jira/browse/BEAM-7155
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Affects Versions: 2.11.0
Reporter: Reza ardeshir rokni


org.apache.beam.sdk.transforms.DoFn.OnTimerContext Does not support access to 
Key. Current workaround:

@ProcessElement 

@StateId("dataKey") ValueState key

{ key.write(key)}

 

...

 

@OnTimer("processTimer") public void OnTimer(OnTimerContext otc,
@StateId("dataKey") ValueState key,

{key.read()}

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7087) Create an errors package for Go SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7087?focusedWorklogId=233220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233220
 ]

ASF GitHub Bot logged work on BEAM-7087:


Author: ASF GitHub Bot
Created on: 26/Apr/19 01:05
Start Date: 26/Apr/19 01:05
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8406: [BEAM-7087] 
Updating Go SDK errors in base beam directory
URL: https://github.com/apache/beam/pull/8406#discussion_r278779819
 
 

 ##
 File path: sdks/go/pkg/beam/combine.go
 ##
 @@ -54,25 +57,26 @@ func TryCombinePerKey(s Scope, combinefn interface{}, col 
PCollection) (PCollect
ValidateKVType(col)
col, err := TryGroupByKey(s, col)
if err != nil {
-   return PCollection{}, fmt.Errorf("failed to group by key: %v", 
err)
+   return PCollection{}, addCombinePerKeyCtx(err, s)
 
 Review comment:
   Just checking: Does the root error here from TryGroupByKey reveal that it's 
failing at adding a GroupByKey to the graph? Otherwise it feels like we're 
losing some context. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233220)
Time Spent: 1h 20m  (was: 1h 10m)

> Create an errors package for Go SDK
> ---
>
> Key: BEAM-7087
> URL: https://issues.apache.org/jira/browse/BEAM-7087
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We can't import an established errors package such as github.com/pkg/errors 
> due to how the Go SDK is currently implemented, so instead we should have our 
> own package. This package should provide the following to be worth replacing 
> the basic errors we currently use.
>  # A way to wrap other errors without losing the original error, as in 
> [https://github.com/pkg/errors]
>  # Making printed errors more legible than currently. (For ex. having 
> different wrapped errors on different lines, differentiating between errors 
> and context, etc.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7087) Create an errors package for Go SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7087?focusedWorklogId=233219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233219
 ]

ASF GitHub Bot logged work on BEAM-7087:


Author: ASF GitHub Bot
Created on: 26/Apr/19 01:05
Start Date: 26/Apr/19 01:05
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #8406: [BEAM-7087] 
Updating Go SDK errors in base beam directory
URL: https://github.com/apache/beam/pull/8406#discussion_r278780082
 
 

 ##
 File path: sdks/go/pkg/beam/coder.go
 ##
 @@ -26,6 +26,7 @@ import (
"github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec"
"github.com/apache/beam/sdks/go/pkg/beam/core/typex"
"github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx"
+   "github.com/apache/beam/sdks/go/pkg/beam/internal"
 
 Review comment:
   Shouldn't this be ".../beam/internal/errors" ? (same in other files.)
   
   I missed this in the previous PR. The errors package files should be under a 
subdirectory in the internal folder to have the package properly be "errors" 
rather than "internal".
   Note how code in the "beam" package is all in a folder called "beam"
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233219)
Time Spent: 1h 10m  (was: 1h)

> Create an errors package for Go SDK
> ---
>
> Key: BEAM-7087
> URL: https://issues.apache.org/jira/browse/BEAM-7087
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We can't import an established errors package such as github.com/pkg/errors 
> due to how the Go SDK is currently implemented, so instead we should have our 
> own package. This package should provide the following to be worth replacing 
> the basic errors we currently use.
>  # A way to wrap other errors without losing the original error, as in 
> [https://github.com/pkg/errors]
>  # Making printed errors more legible than currently. (For ex. having 
> different wrapped errors on different lines, differentiating between errors 
> and context, etc.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=233206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233206
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:42
Start Date: 26/Apr/19 00:42
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r275425153
 
 

 ##
 File path: 
runners/core-java/src/test/java/org/apache/beam/runners/core/metrics/MetricsContainerImplTest.java
 ##
 @@ -178,6 +178,34 @@ public void 
testMonitoringInfosArePopulatedForUserCounters() {
 assertThat(actualMonitoringInfos, containsInAnyOrder(builder1.build(), 
builder2.build()));
   }
 
+  @Test
+  public void testMonitoringInfosArePopulatedForUserDistributions() {
+MetricsContainerImpl testObject = new MetricsContainerImpl("step1");
+DistributionCell c1 = testObject.getDistribution(MetricName.named("ns", 
"name1"));
+DistributionCell c2 = testObject.getDistribution(MetricName.named("ns", 
"name2"));
+c1.update(5L);
+c2.update(4L);
+
+SimpleMonitoringInfoBuilder builder1 = new SimpleMonitoringInfoBuilder();
+builder1.setUrnForUserDistribution("ns", "name1");
+builder1.setInt64DistributionValue(DistributionData.create(5, 1, 5, 5));
+builder1.setPTransformLabel("step1");
+builder1.build();
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233206)
Time Spent: 9h 20m  (was: 9h 10m)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=233205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233205
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:42
Start Date: 26/Apr/19 00:42
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r275424257
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MonitoringInfoConstants.java
 ##
 @@ -41,7 +41,7 @@
 public static final String TOTAL_MSECS = 
extractUrn(MonitoringInfoSpecs.Enum.TOTAL_MSECS);
 public static final String USER_COUNTER_PREFIX =
 extractUrn(MonitoringInfoSpecs.Enum.USER_COUNTER);
-public static final String USER_DISTRIBUTION_COUNTER_PREFIX =
+public static final String USER_DISTRIBUTION_PREFIX =
 
 Review comment:
   I removed references to calling it a "counter" whereever possible, because 
it is a distribution, not a counter. This approach is still consistent. 
USER_X_PREFIX.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233205)
Time Spent: 9h 10m  (was: 9h)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6138) Add User Metric Support to Java SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6138?focusedWorklogId=233207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233207
 ]

ASF GitHub Bot logged work on BEAM-6138:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:42
Start Date: 26/Apr/19 00:42
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #8280: [BEAM-6138] 
Update java SDK to report user distribution tuple metrics over the FN API
URL: https://github.com/apache/beam/pull/8280#discussion_r275425479
 
 

 ##
 File path: 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricUrns.java
 ##
 @@ -31,6 +31,9 @@ public static MetricName parseUrn(String urn) {
 if (urn.startsWith(MonitoringInfoConstants.Urns.USER_COUNTER_PREFIX)) {
   urn = 
urn.substring(MonitoringInfoConstants.Urns.USER_COUNTER_PREFIX.length());
 }
+if (urn.startsWith(MonitoringInfoConstants.Urns.USER_DISTRIBUTION_PREFIX)) 
{
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233207)
Time Spent: 9.5h  (was: 9h 20m)

> Add User Metric Support to Java SDK
> ---
>
> Key: BEAM-6138
> URL: https://issues.apache.org/jira/browse/BEAM-6138
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution
>Reporter: Alex Amato
>Assignee: Alex Amato
>Priority: Major
>  Labels: triaged
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7087) Create an errors package for Go SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7087?focusedWorklogId=233194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233194
 ]

ASF GitHub Bot logged work on BEAM-7087:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:14
Start Date: 26/Apr/19 00:14
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8406: [BEAM-7087] Updating 
Go SDK errors in base beam directory
URL: https://github.com/apache/beam/pull/8406#issuecomment-486881210
 
 
   R: @lostluck 
   CC: @aaltay Ahmet I may be sending some of the later PRs to you for review, 
so wanted to keep you in the loop.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233194)
Time Spent: 1h  (was: 50m)

> Create an errors package for Go SDK
> ---
>
> Key: BEAM-7087
> URL: https://issues.apache.org/jira/browse/BEAM-7087
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We can't import an established errors package such as github.com/pkg/errors 
> due to how the Go SDK is currently implemented, so instead we should have our 
> own package. This package should provide the following to be worth replacing 
> the basic errors we currently use.
>  # A way to wrap other errors without losing the original error, as in 
> [https://github.com/pkg/errors]
>  # Making printed errors more legible than currently. (For ex. having 
> different wrapped errors on different lines, differentiating between errors 
> and context, etc.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5709) Tests in BeamFnControlServiceTest are flaky.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5709?focusedWorklogId=233193&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233193
 ]

ASF GitHub Bot logged work on BEAM-5709:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:13
Start Date: 26/Apr/19 00:13
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #8395: [BEAM-5709] 
Changing sleeps to CountdownLatch in test
URL: https://github.com/apache/beam/pull/8395
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233193)
Time Spent: 2h  (was: 1h 50m)

> Tests in BeamFnControlServiceTest are flaky.
> 
>
> Key: BEAM-5709
> URL: https://issues.apache.org/jira/browse/BEAM-5709
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/fn/BeamFnControlServiceTest.java
> Tests for BeamFnControlService are currently flaky. The test attempts to 
> verify that onCompleted was called on the mock streams, but that function 
> gets called on a separate thread, so occasionally the function will not have 
> been called yet, despite the server being shut down.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7087) Create an errors package for Go SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7087?focusedWorklogId=233192&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233192
 ]

ASF GitHub Bot logged work on BEAM-7087:


Author: ASF GitHub Bot
Created on: 26/Apr/19 00:12
Start Date: 26/Apr/19 00:12
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8406: [BEAM-7087] 
Updating Go SDK errors in base beam directory
URL: https://github.com/apache/beam/pull/8406
 
 
   This change updates code in sdks/go/pkg/beam to use the new errors package, 
and (mostly) does not change code in any sub-directories. It should be the 
first of several PRs to update all our Go SDK code to use the new errors where 
possible.
   
   That said, this change isn't as "clean" as I'd like it to be. During the 
course of changing the calls I decided to change some of the messages slightly 
to more accurately match the intended style of errors that this package 
promotes (i.e. context should say "doing this" instead of "something broke"). 
You can see this in the spots where I completely changed error messages. But in 
order to retain all the useful information, I sometimes had to go and edit the 
calls that generated the error and so forth. So this PR is a mix of actual 
error improvements and just changing calls.
   
   For future PRs in this effort I'm definitely going to keep it more strict 
and not change any error implementations (I can note those down and put them in 
their own PR). Unfortunately I was too far into this change to separate it out 
easily, so whoever reviews this gets the short end of the stick.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_

[jira] [Created] (BEAM-7154) Switch internal Go SDK error code to use new package

2019-04-25 Thread Daniel Oliveira (JIRA)
Daniel Oliveira created BEAM-7154:
-

 Summary: Switch internal Go SDK error code to use new package
 Key: BEAM-7154
 URL: https://issues.apache.org/jira/browse/BEAM-7154
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


I added a new package for errors in the Go SDK: 
[https://github.com/apache/beam/pull/8369]

This issue tracks progress on modifying existing error code, which mostly uses 
fmt.Errorf, to use this new package.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7087) Create an errors package for Go SDK

2019-04-25 Thread Daniel Oliveira (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-7087.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Create an errors package for Go SDK
> ---
>
> Key: BEAM-7087
> URL: https://issues.apache.org/jira/browse/BEAM-7087
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We can't import an established errors package such as github.com/pkg/errors 
> due to how the Go SDK is currently implemented, so instead we should have our 
> own package. This package should provide the following to be worth replacing 
> the basic errors we currently use.
>  # A way to wrap other errors without losing the original error, as in 
> [https://github.com/pkg/errors]
>  # Making printed errors more legible than currently. (For ex. having 
> different wrapped errors on different lines, differentiating between errors 
> and context, etc.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7153) pydoc broken for @deprecated annotations

2019-04-25 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-7153:
---

 Summary: pydoc broken for @deprecated annotations
 Key: BEAM-7153
 URL: https://issues.apache.org/jira/browse/BEAM-7153
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Udi Meiri


It seems this is a similar issue to 
https://issues.apache.org/jira/browse/BEAM-7062
but I haven't investigated.

Example: 
https://beam.apache.org/releases/pydoc/2.12.0/apache_beam.io.gcp.bigquery.html#apache_beam.io.gcp.bigquery.BigQuerySink



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7043) Add DynamoDBIO

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7043?focusedWorklogId=233178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233178
 ]

ASF GitHub Bot logged work on BEAM-7043:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:34
Start Date: 25/Apr/19 23:34
Worklog Time Spent: 10m 
  Work Description: cmachgodaddy commented on issue #8390:  [BEAM-7043] Add 
DynamoDBIO
URL: https://github.com/apache/beam/pull/8390#issuecomment-486874220
 
 
   @chamikaramj , @lukecwik , @mxm , can you guys review this PR? Thanks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233178)
Time Spent: 3h 20m  (was: 3h 10m)

> Add DynamoDBIO
> --
>
> Key: BEAM-7043
> URL: https://issues.apache.org/jira/browse/BEAM-7043
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: Cam Mach
>Assignee: Cam Mach
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Currently we don't have any feature to write data to AWS DynamoDB. This 
> feature will enable us to send data to DynamoDB



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7043) Add DynamoDBIO

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7043?focusedWorklogId=233175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233175
 ]

ASF GitHub Bot logged work on BEAM-7043:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:29
Start Date: 25/Apr/19 23:29
Worklog Time Spent: 10m 
  Work Description: cmachgodaddy commented on pull request #8390:  
[BEAM-7043] Add DynamoDBIO
URL: https://github.com/apache/beam/pull/8390#discussion_r278766976
 
 

 ##
 File path: 
sdks/java/io/amazon-web-services/src/test/java/org/apache/beam/sdk/io/aws/dynamodb/AwsClientProviderMock.java
 ##
 @@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.aws.dynamodb;
+
+import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
+import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
+import org.mockito.Mockito;
+
+/**
+ * Mocking AwsClientProvider instance so it can take i different {@link 
AmazonDynamoDB} object type.
+ */
+public class AwsClientProviderMock implements AwsClientsProvider {
 
 Review comment:
   @iemejia , thanks for your comments. Working to address them
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233175)
Time Spent: 3h 10m  (was: 3h)

> Add DynamoDBIO
> --
>
> Key: BEAM-7043
> URL: https://issues.apache.org/jira/browse/BEAM-7043
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: Cam Mach
>Assignee: Cam Mach
>Priority: Minor
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently we don't have any feature to write data to AWS DynamoDB. This 
> feature will enable us to send data to DynamoDB



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7152) public class TestPipeline fails with java.lang.IllegalStateException

2019-04-25 Thread Hil (JIRA)
Hil created BEAM-7152:
-

 Summary: public class TestPipeline fails with 
java.lang.IllegalStateException
 Key: BEAM-7152
 URL: https://issues.apache.org/jira/browse/BEAM-7152
 Project: Beam
  Issue Type: Bug
  Components: testing
Affects Versions: 2.11.0
 Environment: hil@xeon:~/mavenwave/mw-finack/deploy$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
hil@xeon:~/mavenwave/mw-finack/deploy$ uname -a
Linux xeon 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64 
x86_64 x86_64 GNU/Linux
hil@xeon:~/mavenwave/mw-finack/deploy$
Reporter: Hil


I tried version 2.12 and 2.11 for [creating unit 
tests|[https://cwiki.apache.org/confluence/display/BEAM/Contribution+Testing+Guide#ContributionTestingGuide-EffectiveuseoftheTestPipelineJUnitrule]]:

 
{code:java}
public class FinackPipelineTest {

@Rule
public final TestPipeline pipeline = TestPipeline.create();

@Test
@Category(NeedsRunner.class)
public void myPipelineTest() throws Exception {
String WORDS = "words";
String WHATEVER = "whatever";
final PCollection pCollection =
pipeline
.apply("Create", 
Create.of(WORDS).withCoder(StringUtf8Coder.of()))
.apply(
"Map1",
MapElements.via(
new SimpleFunction() {

@Override
public String apply(final String input) 
{
return WHATEVER;
}
}));

PAssert.that(pCollection).containsInAnyOrder(WHATEVER);

pipeline.run();
}
}{code}
When I run the unit tests in Intellij, I got the following error:

 

{color:#FF}java.lang.IllegalStateException: Is your TestPipeline 
declaration missing a @Rule annotation? Usage: @Rule public final transient 
TestPipeline pipeline = TestPipeline.create();{color}

{color:#FF}at 
org.apache.beam.vendor.guava.v20_0.com.google.common.base.Preconditions.checkState(Preconditions.java:444){color}
{color:#FF} at 
org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:337){color}
{color:#FF} at 
org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331){color}
{color:#FF} at 
com.finack.app.FinackPipelineTest.myPipelineTest(FinackPipelineTest.java:86){color}
{color:#FF} at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method){color}
{color:#FF} at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){color}
{color:#FF} at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){color}
{color:#FF} at java.lang.reflect.Method.invoke(Method.java:498){color}
{color:#FF} at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:628){color}
{color:#FF} at 
org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:117){color}
{color:#FF} at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:184){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73){color}
{color:#FF} at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:180){color}
{color:#FF} at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:127){color}
{color:#FF} at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:135){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122){color}
{color:#FF} at 
org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80){color}
{color:#FF} at java.util.ArrayLi

[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=233172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233172
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:20
Start Date: 25/Apr/19 23:20
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-486871641
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233172)
Time Spent: 11.5h  (was: 11h 20m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=233171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233171
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:19
Start Date: 25/Apr/19 23:19
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-486871489
 
 
   Run Dataflow PortabilityApi ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233171)
Time Spent: 11h 20m  (was: 11h 10m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=233170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233170
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:19
Start Date: 25/Apr/19 23:19
Worklog Time Spent: 10m 
  Work Description: adude3141 commented on issue #8194: [DO NOT MERGE] 
[BEAM-4046] decouple gradle project names and maven artifact ids
URL: https://github.com/apache/beam/pull/8194#issuecomment-486871424
 
 
   Run Dataflow ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233170)
Time Spent: 11h 10m  (was: 11h)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7043) Add DynamoDBIO

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7043?focusedWorklogId=233168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233168
 ]

ASF GitHub Bot logged work on BEAM-7043:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:16
Start Date: 25/Apr/19 23:16
Worklog Time Spent: 10m 
  Work Description: cmachgodaddy commented on issue #8390:  [BEAM-7043] Add 
DynamoDBIO
URL: https://github.com/apache/beam/pull/8390#issuecomment-486870805
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233168)
Time Spent: 3h  (was: 2h 50m)

> Add DynamoDBIO
> --
>
> Key: BEAM-7043
> URL: https://issues.apache.org/jira/browse/BEAM-7043
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: Cam Mach
>Assignee: Cam Mach
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently we don't have any feature to write data to AWS DynamoDB. This 
> feature will enable us to send data to DynamoDB



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=233155&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233155
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 25/Apr/19 23:00
Start Date: 25/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-486867567
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233155)
Time Spent: 4h  (was: 3h 50m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=233137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233137
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 25/Apr/19 22:38
Start Date: 25/Apr/19 22:38
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java 
serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396#issuecomment-486862571
 
 
   I'll look into what's wrong in the test.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233137)
Time Spent: 50m  (was: 40m)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> don't replace Java serialized coder with byte array coder in length-prefixed 
> wire coder construction.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=233136&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233136
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 25/Apr/19 22:37
Start Date: 25/Apr/19 22:37
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8396: [BEAM-7138] keep Java 
serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396#issuecomment-486862457
 
 
   @mxm I encountered the `ClassCastException` when I tried to use 
`.withMaxNumRecords` in `KafkaIO`. It generates a `PCollection` of 
`AutoValue_BoundedReadFromUnboundedSource_Shard` and current implementation of 
WireCoder doesn't support Java serialized coder which this `PCollection` needs 
to use.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233136)
Time Spent: 40m  (was: 0.5h)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> don't replace Java serialized coder with byte array coder in length-prefixed 
> wire coder construction.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233133
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 25/Apr/19 22:17
Start Date: 25/Apr/19 22:17
Worklog Time Spent: 10m 
  Work Description: harshithdwivedi commented on issue #8391: [BEAM-7139] 
Blogpost for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391#issuecomment-486857625
 
 
   No worries, done with the changes!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233133)
Time Spent: 1h 10m  (was: 1h)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-7141) Expose kv and window parameters for on_timer

2019-04-25 Thread Rakesh Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Kumar reassigned BEAM-7141:
--

Assignee: Rakesh Kumar

> Expose kv and window parameters for on_timer
> 
>
> Key: BEAM-7141
> URL: https://issues.apache.org/jira/browse/BEAM-7141
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Rakesh Kumar
>Priority: Major
>
> We would like to have access to key and window inside the timer callback. 
> Without, it is also difficult to debug. We run into this while working on 
> BEAM-7112



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (BEAM-7141) Expose kv and window parameters for on_timer

2019-04-25 Thread Rakesh Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7141 started by Rakesh Kumar.
--
> Expose kv and window parameters for on_timer
> 
>
> Key: BEAM-7141
> URL: https://issues.apache.org/jira/browse/BEAM-7141
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Rakesh Kumar
>Priority: Major
>
> We would like to have access to key and window inside the timer callback. 
> Without, it is also difficult to debug. We run into this while working on 
> BEAM-7112



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=233125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233125
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:56
Start Date: 25/Apr/19 21:56
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-486852153
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233125)
Time Spent: 3h 50m  (was: 3h 40m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7121) Provide deterministic version of Python's ProtoCoder

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7121?focusedWorklogId=233126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233126
 ]

ASF GitHub Bot logged work on BEAM-7121:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:56
Start Date: 25/Apr/19 21:56
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #8377: [BEAM-7121] 
Add deterministic proto coder
URL: https://github.com/apache/beam/pull/8377#issuecomment-486852299
 
 
   Also note that precommits fail because of lint:
   
   ```
   11:31:44 > Task :beam-sdks-python:lintPy27 FAILED
   11:31:44 * Module apache_beam.coders.coders_test
   11:31:44 C: 95, 0: Line too long (102/80) (line-too-long)
   11:31:44 
   11:31:44 
   11:31:44 Your code has been rated at 10.00/10 (previous run: 10.00/10, -0.00)
   11:31:44 
   11:31:44 Command exited with non-zero status 16
   11:31:44 601.37user 16.20system 2:47.52elapsed 368%CPU (0avgtext+0avgdata 
422480maxresident)k
   11:31:44 0inputs+224outputs (0major+828313minor)pagefaults 0swaps
   11:31:44 ERROR: InvocationError for command '/usr/bin/time 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/build/srcs/sdks/python/scripts/run_pylint.sh'
 (exited with code 16)
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233126)
Time Spent: 1h 10m  (was: 1h)

> Provide deterministic version of Python's ProtoCoder
> 
>
> Key: BEAM-7121
> URL: https://issues.apache.org/jira/browse/BEAM-7121
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Yifan Mai
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Passing deterministic=true to proto's 
> [SerializeToString|https://github.com/protocolbuffers/protobuf/blob/60b66a119d17f0a2a595a231bea87cd4f4cf2689/python/google/protobuf/message.py#L189-L204]
>  will result in deterministic encoding of maps in protos. This can be used to 
> provide a deterministic version of ProtoCoder.
> This would allow protos to be used as a key for grouping by key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=233123&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233123
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:55
Start Date: 25/Apr/19 21:55
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-486852057
 
 
   Run Java Flink PortableValidatesRunner Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233123)
Time Spent: 3h 40m  (was: 3.5h)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7121) Provide deterministic version of Python's ProtoCoder

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7121?focusedWorklogId=233124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233124
 ]

ASF GitHub Bot logged work on BEAM-7121:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:55
Start Date: 25/Apr/19 21:55
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #8377: [BEAM-7121] 
Add deterministic proto coder
URL: https://github.com/apache/beam/pull/8377#issuecomment-486852058
 
 
   Thanks.
   
   @yifanmai @aaltay @robertwb What is the rationale for keeping both 
ProtoCoder and DeterministicProtoCoder going forward?  Is there so much of a 
performance difference?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233124)
Time Spent: 1h  (was: 50m)

> Provide deterministic version of Python's ProtoCoder
> 
>
> Key: BEAM-7121
> URL: https://issues.apache.org/jira/browse/BEAM-7121
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Yifan Mai
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Passing deterministic=true to proto's 
> [SerializeToString|https://github.com/protocolbuffers/protobuf/blob/60b66a119d17f0a2a595a231bea87cd4f4cf2689/python/google/protobuf/message.py#L189-L204]
>  will result in deterministic encoding of maps in protos. This can be used to 
> provide a deterministic version of ProtoCoder.
> This would allow protos to be used as a key for grouping by key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=233122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233122
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:55
Start Date: 25/Apr/19 21:55
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-486851997
 
 
   Run Java PortabilityApi PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233122)
Time Spent: 3.5h  (was: 3h 20m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7012) Support TestStream in FlinkRunner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7012?focusedWorklogId=233116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233116
 ]

ASF GitHub Bot logged work on BEAM-7012:


Author: ASF GitHub Bot
Created on: 25/Apr/19 21:30
Start Date: 25/Apr/19 21:30
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #8383: [BEAM-7012] 
Support TestStream in streaming Flink Runner
URL: https://github.com/apache/beam/pull/8383#issuecomment-486845015
 
 
   Nice! It is great to have an example of this for a runner other than the 
direct runner. I will read this later for sure. Sorry I didn't have time to 
review it promptly.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233116)
Time Spent: 2h 20m  (was: 2h 10m)

> Support TestStream in FlinkRunner
> -
>
> Key: BEAM-7012
> URL: https://issues.apache.org/jira/browse/BEAM-7012
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> TestStream is a primitive transform which is only supported by the 
> DirectRunner. It might be useful to also implement it in the Flink Runner to 
> run similar kind of tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6621) Exercise Dataflow runner integration tests in a postcommit suite for Python 3.5 and 3.6

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826479#comment-16826479
 ] 

Valentyn Tymofieiev commented on BEAM-6621:
---

It's partially completed, I removed the tag.

> Exercise Dataflow runner integration tests in a postcommit suite for Python 
> 3.5 and 3.6
> ---
>
> Key: BEAM-6621
> URL: https://issues.apache.org/jira/browse/BEAM-6621
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/7756] adds a test suite with a single 
> test. Once merged, we need to extend the suite to all available integration 
> tests. If some tests are not passing, we can track any remaining 
> parallelizable work in separate JIRAs.  
> cc: [~Juta]
> cc: [~markflyhigh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6621) Exercise Dataflow runner integration tests in a postcommit suite for Python 3.5 and 3.6

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6621:
--
Fix Version/s: (was: 2.12.0)

> Exercise Dataflow runner integration tests in a postcommit suite for Python 
> 3.5 and 3.6
> ---
>
> Key: BEAM-6621
> URL: https://issues.apache.org/jira/browse/BEAM-6621
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/7756] adds a test suite with a single 
> test. Once merged, we need to extend the suite to all available integration 
> tests. If some tests are not passing, we can track any remaining 
> parallelizable work in separate JIRAs.  
> cc: [~Juta]
> cc: [~markflyhigh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233099
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:41
Start Date: 25/Apr/19 20:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8262: 
[BEAM-4543] Python Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#discussion_r278722641
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -137,11 +137,14 @@ def get_version():
 ]
 
 GCP_REQUIREMENTS = [
+'cachetools>=3.1.0,<4',
 # google-apitools 0.5.23 and above has important Python 3 supports.
 'google-apitools>=0.5.26,<0.5.27',
-'proto-google-cloud-datastore-v1>=0.90.0,<=0.90.4',
 
 Review comment:
   Don't we need proto-google-cloud-datastore-v1 for v1/datastoreio ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233099)
Time Spent: 4h 20m  (was: 4h 10m)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233098
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:41
Start Date: 25/Apr/19 20:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8262: 
[BEAM-4543] Python Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#discussion_r278721359
 
 

 ##
 File path: 
sdks/python/apache_beam/io/gcp/datastore/v1new/datastore_write_it_test.py
 ##
 @@ -0,0 +1,75 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""An integration test for datastore_write_it_pipeline
+
+This test creates entities and writes them to Cloud Datastore. Subsequently,
+these entities are read from Cloud Datastore, compared to the expected value
+for the entity, and deleted.
+
+There is no output; instead, we use `assert_that` transform to verify the
+results in the pipeline.
+"""
+
+from __future__ import absolute_import
+
+import logging
+import random
+import unittest
+from datetime import datetime
+
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
+
+from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+  from apache_beam.io.gcp.datastore.v1new import datastore_write_it_pipeline
+# TODO(BEAM-4543): Remove TypeError once googledatastore dependency is removed.
+except (ImportError, TypeError):
+  datastore_write_it_pipeline = None
+
+
+class DatastoreWriteIT(unittest.TestCase):
+
+  NUM_ENTITIES = 1001
+  LIMIT = 500
+
+  def run_datastore_write(self, limit=None):
+test_pipeline = TestPipeline(is_integration_test=True)
+current_time = datetime.now().strftime("%m%d%H%M%S")
+seed = random.randint(0, 10)
+kind = 'testkind%s%d' % (current_time, seed)
+pipeline_verifiers = [PipelineStateMatcher()]
+extra_opts = {'kind': kind,
+  'num_entities': self.NUM_ENTITIES,
+  'on_success_matcher': all_of(*pipeline_verifiers)}
+if limit is not None:
+  extra_opts['limit'] = limit
+
+datastore_write_it_pipeline.run(test_pipeline.get_full_options_as_args(
+**extra_opts))
+
+  @attr('IT')
+  def test_datastore_write_limit(self):
 
 Review comment:
   Can you add a IT for read as well ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233098)
Time Spent: 4h 10m  (was: 4h)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233101
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:41
Start Date: 25/Apr/19 20:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8262: 
[BEAM-4543] Python Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#discussion_r278692493
 
 

 ##
 File path: 
sdks/python/apache_beam/io/gcp/datastore/v1new/query_splitter_test.py
 ##
 @@ -173,5 +163,9 @@ def test_client_key_sort_key(self):
 self.assertEqual(expected_sort, keys)
 
 
+# Hide base class from collection by nose.
+del QuerySplitterTestBase
 
 Review comment:
   But the actual tests are in base class, right ? Do they still get collected ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233101)
Time Spent: 4.5h  (was: 4h 20m)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233097
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:41
Start Date: 25/Apr/19 20:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8262: 
[BEAM-4543] Python Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#discussion_r278693190
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/datastore/v1/datastoreio.py
 ##
 @@ -45,6 +50,10 @@
 try:
   from google.cloud.proto.datastore.v1 import datastore_pb2
   from googledatastore import helper as datastore_helper
+  logging.warning(
+  'Using deprecated Datastore client.\n'
+  'This client will be removed in the next Beam major release.\n'
 
 Review comment:
   "removed in Beam 3.0 (next Beam major release)"
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233097)
Time Spent: 4h  (was: 3h 50m)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233100
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:41
Start Date: 25/Apr/19 20:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8262: 
[BEAM-4543] Python Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#discussion_r278722835
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -137,11 +137,14 @@ def get_version():
 ]
 
 GCP_REQUIREMENTS = [
+'cachetools>=3.1.0,<4',
 # google-apitools 0.5.23 and above has important Python 3 supports.
 'google-apitools>=0.5.26,<0.5.27',
-'proto-google-cloud-datastore-v1>=0.90.0,<=0.90.4',
-# [BEAM-4543] Datastore IO is not supported in Python 3.
+# [BEAM-4543] googledatastore is not supported in Python 3.
 
 Review comment:
   Can both google-cloud-datastore and googledatastore installed in the same 
env without conflicts ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233100)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=233095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233095
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:24
Start Date: 25/Apr/19 20:24
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #8399: [BEAM-7112] Timer race 
with state cleanup - take two
URL: https://github.com/apache/beam/pull/8399#issuecomment-486824637
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233095)
Time Spent: 5h 50m  (was: 5h 40m)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
> Fix For: 2.13.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6621) Exercise Dataflow runner integration tests in a postcommit suite for Python 3.5 and 3.6

2019-04-25 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826440#comment-16826440
 ] 

Kenneth Knowles commented on BEAM-6621:
---

This is listed as part of 2.12.0. Voting has just finished. Is it done?

> Exercise Dataflow runner integration tests in a postcommit suite for Python 
> 3.5 and 3.6
> ---
>
> Key: BEAM-6621
> URL: https://issues.apache.org/jira/browse/BEAM-6621
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Critical
>  Labels: triaged
> Fix For: 2.12.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/7756] adds a test suite with a single 
> test. Once merged, we need to extend the suite to all available integration 
> tests. If some tests are not passing, we can track any remaining 
> parallelizable work in separate JIRAs.  
> cc: [~Juta]
> cc: [~markflyhigh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7151) Support conjunction clause when it's only equi-join

2019-04-25 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7151:
--

 Summary: Support conjunction clause when it's only equi-join
 Key: BEAM-7151
 URL: https://issues.apache.org/jira/browse/BEAM-7151
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-sql
Reporter: Rui Wang


conjunction_clause: function_call(function_parameter, ...) | field_access | 
column
function_parameter: function_call | field_access

In Beam, equi-join is implemented by CoGBK, which requires both join inputs 
(assume binary join) to build PCollection of KV, where the key is 
join key.

For equi-join, conjunction clause is essentially an equation. In order to build 
KV, it requires that columns from different sides of equation should 
come from different join input. For example, a + b = 2 cannot be used to build 
join key but a = 2 - b can. So rewriting is required for clauses when it does 
not satisfy this property. 

It also implies that not every clause is rewritable. Say the clause is f(a, b) 
= 3, in which a is from left input and b is from right input. If this function 
f is not splittable, such that we cannot move a or b to right side of equation, 
then we cannot support this clause in BeamSQL's  join.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7138) keep Java serialized coder in length-prefixed wire coder construction

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7138?focusedWorklogId=233092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233092
 ]

ASF GitHub Bot logged work on BEAM-7138:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:03
Start Date: 25/Apr/19 20:03
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #8396: [BEAM-7138] keep Java 
serialized coder in wire coder construction
URL: https://github.com/apache/beam/pull/8396#issuecomment-486818011
 
 
   Actually, just came across this as well when implementing TestStream for 
PVR. We need to look into why the Flink tests fail here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233092)
Time Spent: 0.5h  (was: 20m)

> keep Java serialized coder in length-prefixed wire coder construction
> -
>
> Key: BEAM-7138
> URL: https://issues.apache.org/jira/browse/BEAM-7138
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> don't replace Java serialized coder with byte array coder in length-prefixed 
> wire coder construction.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233090
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:02
Start Date: 25/Apr/19 20:02
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8391: [BEAM-7139] Blogpost 
for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391#issuecomment-486817686
 
 
   btw, to see the staged version of the website, check the jenkins result for 
the Website_Stage_GCS test task.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233090)
Time Spent: 1h  (was: 50m)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233089
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 25/Apr/19 20:01
Start Date: 25/Apr/19 20:01
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8391: [BEAM-7139] Blogpost 
for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391#issuecomment-486817554
 
 
   Sorry about the trouble. It seems that this wont work for kotlin: 
http://apache-beam-website-pull-requests.storage.googleapis.com/8391/blog/2019/04/25/beam-kotlin.html
   
   You may be able to just let it highlight as Java.  Thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233089)
Time Spent: 50m  (was: 40m)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7150) Support nested AND

2019-04-25 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-7150:
---
Description: 
There might be nested AND cases:

a = b AND (c = d AND (e = f)).


Such cases should be flattened as a = b AND c = d AND e = f. 


This is useful because after done so, we can easily build keys for join inputs: 
each conjunction clause contributes a field of key (key is Row in SQL 
equi-join) for join inputs.

  was:
There might be nested AND cases:

a = b AND (c = d AND (e = f)).


Such cases should be flattened as a = b AND c = d AND e = f. 


> Support nested AND
> --
>
> Key: BEAM-7150
> URL: https://issues.apache.org/jira/browse/BEAM-7150
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> There might be nested AND cases:
> a = b AND (c = d AND (e = f)).
> Such cases should be flattened as a = b AND c = d AND e = f. 
> This is useful because after done so, we can easily build keys for join 
> inputs: each conjunction clause contributes a field of key (key is Row in SQL 
> equi-join) for join inputs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6815) Add last year speakers to the Beam summit site

2019-04-25 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-6815.
-
   Resolution: Fixed
 Assignee: Aizhamal Nurmamat kyzy  (was: Pablo Estrada)
Fix Version/s: Not applicable

> Add last year speakers to the Beam summit site
> --
>
> Key: BEAM-6815
> URL: https://issues.apache.org/jira/browse/BEAM-6815
> Project: Beam
>  Issue Type: Task
>  Components: beam-events
>Reporter: Aizhamal Nurmamat kyzy
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Minor
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6830) Direct runner throws error "http 404 not found" when reading Bigquery tables from any other region than "US"

2019-04-25 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada updated BEAM-6830:

Component/s: (was: beam-events)
 io-java-gcp

> Direct runner throws error "http 404 not found" when reading Bigquery tables 
> from any other region than "US"
> 
>
> Key: BEAM-6830
> URL: https://issues.apache.org/jira/browse/BEAM-6830
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Suraj
>Assignee: Pablo Estrada
>Priority: Major
>
> When trying to read bigquery table located in region "asia-southeast-1" using 
> DirectRunner ,it throws error "http 404 not found" but same code works when 
> run using DataflowRunner



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7136) Beam documentation inconsistent

2019-04-25 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada updated BEAM-7136:

Component/s: (was: beam-events)
 website

> Beam documentation inconsistent
> ---
>
> Key: BEAM-7136
> URL: https://issues.apache.org/jira/browse/BEAM-7136
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, website
>Reporter: Daniel Collins
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Major
>
> Apache beam documentation is inconsistent about Python usage of Pub/Sub.  The 
> chart [here|[https://beam.apache.org/documentation/io/built-in/]] claims that 
> Pub/Sub is available for usage in Batch Pipelines, but the source [from 
> clicking on that 
> link|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py]]
>  explicitly says only streaming is supported.  Both of these cannot be 
> correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7150) Support nested AND

2019-04-25 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7150:
--

 Summary: Support nested AND
 Key: BEAM-7150
 URL: https://issues.apache.org/jira/browse/BEAM-7150
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-sql
Reporter: Rui Wang


There might be nested AND cases:

a = b AND (c = d AND (e = f)).


Such cases should be flattened as a = b AND c = d AND e = f. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7149) support OR

2019-04-25 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7149:
--

 Summary: support OR
 Key: BEAM-7149
 URL: https://issues.apache.org/jira/browse/BEAM-7149
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-sql
Reporter: Rui Wang


OR supposed to be implement in rules. 

Basically when LogicalRel is generated and being converted to PhysicalRel, OR 
should be handled by a separate converter rule: 

all conjunction clauses of OR should be converted to independent JOIN and their 
results are combined by UNION rel with deduplication.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6112) SQL could support equijoins on complex expressions, not just column refs

2019-04-25 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned BEAM-6112:
--

Assignee: (was: Rui Wang)

> SQL could support equijoins on complex expressions, not just column refs
> 
>
> Key: BEAM-6112
> URL: https://issues.apache.org/jira/browse/BEAM-6112
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>
> Currently a join such as {{CAST(A.x AS BIGINT) = B.y}} will fail, along with 
> other similar simple expressions. We only support joining directly on 
> immediate column references. It can be worked around by inserting a WITH 
> clause for the intermediate value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-7147) Several PR Postcommits failing: java.lang.OutOfMemoryError

2019-04-25 Thread yifan zou (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou closed BEAM-7147.
---
   Resolution: Information Provided
Fix Version/s: Not applicable

> Several PR Postcommits failing: java.lang.OutOfMemoryError
> --
>
> Key: BEAM-7147
> URL: https://issues.apache.org/jira/browse/BEAM-7147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Daniel Oliveira
>Assignee: yifan zou
>Priority: Major
>  Labels: currently-failing
> Fix For: Not applicable
>
>
> While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) 
> I tried to run a few Postcommits and got the same error on each one. Not sure 
> if this is related to the work being done to move to new machines or 
> something else. Can someone help diagnose?
> From this run 
> ([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
> {noformat}
> *17:10:36* Error occurred during initialization of VM
> *17:10:36* java.lang.OutOfMemoryError: unable to create new native thread
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233031
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:38
Start Date: 25/Apr/19 18:38
Worklog Time Spent: 10m 
  Work Description: harshithdwivedi commented on issue #8391: [BEAM-7139] 
Blogpost for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391#issuecomment-486790554
 
 
   Oh neat, I didn't know that!
   Done 🙂 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233031)
Time Spent: 40m  (was: 0.5h)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6880) Deprecate Java Portable Reference Runner

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6880?focusedWorklogId=233023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233023
 ]

ASF GitHub Bot logged work on BEAM-6880:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:20
Start Date: 25/Apr/19 18:20
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #8380: [BEAM-6880] Remove 
deprecated Reference Runner code.
URL: https://github.com/apache/beam/pull/8380#issuecomment-486783882
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233023)
Time Spent: 3h 20m  (was: 3h 10m)

> Deprecate Java Portable Reference Runner
> 
>
> Key: BEAM-6880
> URL: https://issues.apache.org/jira/browse/BEAM-6880
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct, test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> This ticket is about deprecating Java Portable Reference runner.
>  
> Discussion is happening in [this 
> thread|[https://lists.apache.org/thread.html/0b68efce9b7f2c5297b32d09e5d903e9b354199fe2ce446fbcd240bc@%3Cdev.beam.apache.org%3E]]
>  
>  
> Current summary is: disable beam_PostCommit_Java_PVR_Reference job.
> Keeping or removing reference runner code is still under discussion. It is 
> suggested to create PR that removes relevant code and start voting there.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7148) LTS backport: AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-7148:


 Summary: LTS backport: AfterProcessingTime trigger doesn't fire 
reliably
 Key: BEAM-7148
 URL: https://issues.apache.org/jira/browse/BEAM-7148
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Affects Versions: 2.1.0, 2.2.0, 2.3.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 2.13.0


*Issue*

Beam AfterProcessingTime trigger doesn't fire always reliably after a 
configured delay.

The following job triggers should fire after watermark passes the end of the 
window and then every 5 seconds for late data and the finally at the end of 
allowed lateness.

*Expected behaviour*

Late firing after processing time trigger should fire after 5 seconds since 
first late records arrive in the pane.

*Actual behaviour*

>From my testings late triggers works for some keys but not for the other - 
>it's pretty random which keys are affected. The DummySource generates 15 
>distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
>late record. In case late trigger firing is missed it won't fire until the 
>allowed lateness period. 

*Job code*
{code:java}
String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};

FlinkPipelineOptions options = 
PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
Pipeline pipeline = Pipeline.create(options);
PCollection apply = pipeline.apply(Read.from(new DummySource()))

.apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
.triggering(AfterWatermark.pastEndOfWindow()
.withLateFirings(
AfterProcessingTime

.pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
.accumulatingFiredPanes()
.withAllowedLateness(Duration.standardMinutes(2), 
Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
);
apply.apply(Count.perElement())
.apply(ParDo.of(new DoFn, Long>() {
@ProcessElement
public void process(ProcessContext context, BoundedWindow window) {
LOG.info("Count: {}. For window {}, Pane {}", 
context.element(), window, context.pane());
}
}));

pipeline.run().waitUntilFinish();{code}
 

*How can you replicate the issue?*

 I've created a github repo 
[https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
above. Please check out the README file for details how to replicate the issue.

*What's is causing the issue?*

I explained the cause in PR.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7148) LTS backport: AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-7148:
-
Fix Version/s: (was: 2.13.0)
   2.7.1

> LTS backport: AfterProcessingTime trigger doesn't fire reliably
> ---
>
> Key: BEAM-7148
> URL: https://issues.apache.org/jira/browse/BEAM-7148
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: triaged
> Fix For: 2.7.1
>
>
> *Issue*
> Beam AfterProcessingTime trigger doesn't fire always reliably after a 
> configured delay.
> The following job triggers should fire after watermark passes the end of the 
> window and then every 5 seconds for late data and the finally at the end of 
> allowed lateness.
> *Expected behaviour*
> Late firing after processing time trigger should fire after 5 seconds since 
> first late records arrive in the pane.
> *Actual behaviour*
> From my testings late triggers works for some keys but not for the other - 
> it's pretty random which keys are affected. The DummySource generates 15 
> distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
> late record. In case late trigger firing is missed it won't fire until the 
> allowed lateness period. 
> *Job code*
> {code:java}
> String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
> Pipeline pipeline = Pipeline.create(options);
> PCollection apply = pipeline.apply(Read.from(new DummySource()))
> 
> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
> .triggering(AfterWatermark.pastEndOfWindow()
> .withLateFirings(
> AfterProcessingTime
> 
> .pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
> .accumulatingFiredPanes()
> .withAllowedLateness(Duration.standardMinutes(2), 
> Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> );
> apply.apply(Count.perElement())
> .apply(ParDo.of(new DoFn, Long>() {
> @ProcessElement
> public void process(ProcessContext context, BoundedWindow window) 
> {
> LOG.info("Count: {}. For window {}, Pane {}", 
> context.element(), window, context.pane());
> }
> }));
> pipeline.run().waitUntilFinish();{code}
>  
> *How can you replicate the issue?*
>  I've created a github repo 
> [https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
> above. Please check out the README file for details how to replicate the 
> issue.
> *What's is causing the issue?*
> I explained the cause in PR.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-3863) AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-3863.
--
   Resolution: Fixed
Fix Version/s: 2.13.0

This should be resolved now. Thanks again for the great report!

> AfterProcessingTime trigger doesn't fire reliably
> -
>
> Key: BEAM-3863
> URL: https://issues.apache.org/jira/browse/BEAM-3863
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Pawel Bartoszek
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: triaged
> Fix For: 2.13.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue*
> Beam AfterProcessingTime trigger doesn't fire always reliably after a 
> configured delay.
> The following job triggers should fire after watermark passes the end of the 
> window and then every 5 seconds for late data and the finally at the end of 
> allowed lateness.
> *Expected behaviour*
> Late firing after processing time trigger should fire after 5 seconds since 
> first late records arrive in the pane.
> *Actual behaviour*
> From my testings late triggers works for some keys but not for the other - 
> it's pretty random which keys are affected. The DummySource generates 15 
> distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
> late record. In case late trigger firing is missed it won't fire until the 
> allowed lateness period. 
> *Job code*
> {code:java}
> String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
> Pipeline pipeline = Pipeline.create(options);
> PCollection apply = pipeline.apply(Read.from(new DummySource()))
> 
> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
> .triggering(AfterWatermark.pastEndOfWindow()
> .withLateFirings(
> AfterProcessingTime
> 
> .pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
> .accumulatingFiredPanes()
> .withAllowedLateness(Duration.standardMinutes(2), 
> Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> );
> apply.apply(Count.perElement())
> .apply(ParDo.of(new DoFn, Long>() {
> @ProcessElement
> public void process(ProcessContext context, BoundedWindow window) 
> {
> LOG.info("Count: {}. For window {}, Pane {}", 
> context.element(), window, context.pane());
> }
> }));
> pipeline.run().waitUntilFinish();{code}
>  
> *How can you replicate the issue?*
>  I've created a github repo 
> [https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
> above. Please check out the README file for details how to replicate the 
> issue.
> *What's is causing the issue?*
> I explained the cause in PR.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3863) AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3863?focusedWorklogId=233020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233020
 ]

ASF GitHub Bot logged work on BEAM-3863:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:13
Start Date: 25/Apr/19 18:13
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8366:  [BEAM-3863] 
Ensure correct firing of processing time timers
URL: https://github.com/apache/beam/pull/8366
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233020)
Time Spent: 2h 40m  (was: 2.5h)

> AfterProcessingTime trigger doesn't fire reliably
> -
>
> Key: BEAM-3863
> URL: https://issues.apache.org/jira/browse/BEAM-3863
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Pawel Bartoszek
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: triaged
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue*
> Beam AfterProcessingTime trigger doesn't fire always reliably after a 
> configured delay.
> The following job triggers should fire after watermark passes the end of the 
> window and then every 5 seconds for late data and the finally at the end of 
> allowed lateness.
> *Expected behaviour*
> Late firing after processing time trigger should fire after 5 seconds since 
> first late records arrive in the pane.
> *Actual behaviour*
> From my testings late triggers works for some keys but not for the other - 
> it's pretty random which keys are affected. The DummySource generates 15 
> distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
> late record. In case late trigger firing is missed it won't fire until the 
> allowed lateness period. 
> *Job code*
> {code:java}
> String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
> Pipeline pipeline = Pipeline.create(options);
> PCollection apply = pipeline.apply(Read.from(new DummySource()))
> 
> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
> .triggering(AfterWatermark.pastEndOfWindow()
> .withLateFirings(
> AfterProcessingTime
> 
> .pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
> .accumulatingFiredPanes()
> .withAllowedLateness(Duration.standardMinutes(2), 
> Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> );
> apply.apply(Count.perElement())
> .apply(ParDo.of(new DoFn, Long>() {
> @ProcessElement
> public void process(ProcessContext context, BoundedWindow window) 
> {
> LOG.info("Count: {}. For window {}, Pane {}", 
> context.element(), window, context.pane());
> }
> }));
> pipeline.run().waitUntilFinish();{code}
>  
> *How can you replicate the issue?*
>  I've created a github repo 
> [https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
> above. Please check out the README file for details how to replicate the 
> issue.
> *What's is causing the issue?*
> I explained the cause in PR.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4543) Remove dependency on googledatastore in favor of google-cloud-datastore.

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4543?focusedWorklogId=233019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233019
 ]

ASF GitHub Bot logged work on BEAM-4543:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:13
Start Date: 25/Apr/19 18:13
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8262: [BEAM-4543] Python 
Datastore IO using google-cloud-datastore
URL: https://github.com/apache/beam/pull/8262#issuecomment-486781398
 
 
   run python postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233019)
Time Spent: 3h 50m  (was: 3h 40m)

> Remove dependency on googledatastore in favor of google-cloud-datastore.
> 
>
> Key: BEAM-4543
> URL: https://issues.apache.org/jira/browse/BEAM-4543
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Minor
>  Labels: triaged
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> apache-beam[gcp] package depends [1] on googledatastore package [2]. We 
> should replace this dependency with google-cloud-datastore [3] which is 
> officially supported, has better release cadence and also has Python 3 
> support.
> [1] 
> https://github.com/apache/beam/blob/fad655462f8fadfdfaab0b7a09cab538f076f94e/sdks/python/setup.py#L126
> [2] [https://pypi.org/project/googledatastore/]
> [3] [https://pypi.org/project/google-cloud-datastore/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7147) Several PR Postcommits failing: java.lang.OutOfMemoryError

2019-04-25 Thread yifan zou (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826315#comment-16826315
 ] 

yifan zou commented on BEAM-7147:
-

the beam4 is offline. The OOM is misleading, it is actually caused by some 
thread leaking. A major leak was fixed in 
[https://issues.apache.org/jira/browse/BEAM-7109], but seems like there are 
still other tests eating the thread pool of the VM. I'll keep beam4 offline for 
investigation. You can rerun the test and it could be assigned to a healthy 
node.

> Several PR Postcommits failing: java.lang.OutOfMemoryError
> --
>
> Key: BEAM-7147
> URL: https://issues.apache.org/jira/browse/BEAM-7147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Daniel Oliveira
>Assignee: yifan zou
>Priority: Major
>  Labels: currently-failing
>
> While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) 
> I tried to run a few Postcommits and got the same error on each one. Not sure 
> if this is related to the work being done to move to new machines or 
> something else. Can someone help diagnose?
> From this run 
> ([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
> {noformat}
> *17:10:36* Error occurred during initialization of VM
> *17:10:36* java.lang.OutOfMemoryError: unable to create new native thread
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=233018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233018
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:10
Start Date: 25/Apr/19 18:10
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278670159
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
 
 Review comment:
   Actually, I've moved the method to the root class because I didn't know 
whether this would be useful beyond this scope. That way at least both inner 
classes can use it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233018)
Time Spent: 13h 10m  (was: 13h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7139) Blog post announcing Kotlin Sample addition to Beam

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7139?focusedWorklogId=233015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233015
 ]

ASF GitHub Bot logged work on BEAM-7139:


Author: ASF GitHub Bot
Created on: 25/Apr/19 18:04
Start Date: 25/Apr/19 18:04
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8391: [BEAM-7139] Blogpost 
for Kotlin Samples
URL: https://github.com/apache/beam/pull/8391#issuecomment-486778315
 
 
   Thank you Harshith. This looks good to me : ) just one more thing: If you 
add the language to the code snippet like so: ` ```java  `, the website 
will add syntax highlighting to your code samples.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233015)
Time Spent: 0.5h  (was: 20m)

> Blog post announcing Kotlin Sample addition to Beam
> ---
>
> Key: BEAM-7139
> URL: https://issues.apache.org/jira/browse/BEAM-7139
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Harshit Dwivedi
>Assignee: Harshit Dwivedi
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Publishing a quick blog post that lets the users know that the Beam samples 
> now have kotlin support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826287#comment-16826287
 ] 

Chamikara Jayalath commented on BEAM-7137:
--

+1 for encoding header to bytes (using UTF-8). We have clearly documented that 
header should be a string here: 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py#L599]

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Major
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7147) Several PR Postcommits failing: java.lang.OutOfMemoryError

2019-04-25 Thread Daniel Oliveira (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-7147:
--
Description: 
While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) I 
tried to run a few Postcommits and got the same error on each one. Not sure if 
this is related to the work being done to move to new machines or something 
else. Can someone help diagnose?

>From this run 
>([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):


{noformat}
*17:10:36* Error occurred during initialization of VM
*17:10:36* java.lang.OutOfMemoryError: unable to create new native thread
{noformat}



  was:
While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) I 
tried to run a few Postcommits and got the same error on each one. Not sure if 
this is related to the work being done to move to new machines or something 
else. Can someone help diagnose?

>From this run 
>([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
*17:10:36* Error occurred during initialization of VM*17:10:36* 
java.lang.OutOfMemoryError: unable to create new native thread


> Several PR Postcommits failing: java.lang.OutOfMemoryError
> --
>
> Key: BEAM-7147
> URL: https://issues.apache.org/jira/browse/BEAM-7147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Daniel Oliveira
>Assignee: yifan zou
>Priority: Major
>  Labels: currently-failing
>
> While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) 
> I tried to run a few Postcommits and got the same error on each one. Not sure 
> if this is related to the work being done to move to new machines or 
> something else. Can someone help diagnose?
> From this run 
> ([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
> {noformat}
> *17:10:36* Error occurred during initialization of VM
> *17:10:36* java.lang.OutOfMemoryError: unable to create new native thread
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7147) Several PR Postcommits failing: java.lang.OutOfMemoryError

2019-04-25 Thread Daniel Oliveira (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826285#comment-16826285
 ] 

Daniel Oliveira commented on BEAM-7147:
---

Yifan, I'm adding you as assignee since you've been working on the migration to 
new machines, so you might know if this is related or not.

> Several PR Postcommits failing: java.lang.OutOfMemoryError
> --
>
> Key: BEAM-7147
> URL: https://issues.apache.org/jira/browse/BEAM-7147
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Daniel Oliveira
>Assignee: yifan zou
>Priority: Major
>  Labels: currently-failing
>
> While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) 
> I tried to run a few Postcommits and got the same error on each one. Not sure 
> if this is related to the work being done to move to new machines or 
> something else. Can someone help diagnose?
> From this run 
> ([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
> *17:10:36* Error occurred during initialization of VM*17:10:36* 
> java.lang.OutOfMemoryError: unable to create new native thread



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7147) Several PR Postcommits failing: java.lang.OutOfMemoryError

2019-04-25 Thread Daniel Oliveira (JIRA)
Daniel Oliveira created BEAM-7147:
-

 Summary: Several PR Postcommits failing: java.lang.OutOfMemoryError
 Key: BEAM-7147
 URL: https://issues.apache.org/jira/browse/BEAM-7147
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Daniel Oliveira
Assignee: yifan zou


While working on my PR ([PR #6880|https://github.com/apache/beam/pull/8380]) I 
tried to run a few Postcommits and got the same error on each one. Not sure if 
this is related to the work being done to move to new machines or something 
else. Can someone help diagnose?

>From this run 
>([https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch_PR/38/]):
*17:10:36* Error occurred during initialization of VM*17:10:36* 
java.lang.OutOfMemoryError: unable to create new native thread



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826279#comment-16826279
 ] 

Valentyn Tymofieiev commented on BEAM-7137:
---

[~yoshiki.obata], if you don't plan to work on this issue, feel free to 
reassign it to me and I will try to find an owner.

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Major
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826278#comment-16826278
 ] 

Valentyn Tymofieiev edited comment on BEAM-7137 at 4/25/19 5:51 PM:


[~yoshiki.obata], thanks a lot for trying out Beam on Python 3 and reporting 
this!
 We have to either encode header to bytes 
[here|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L393],
 or require header to be bytes in the first place. Looking through the IO 
codebase, it seems that we assume header to be a string in quite a few places 
starting from [WriteToText 
PTransform|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L599],
 so encoding header to bytes may be the path of least resistance. We can revise 
this if we find a strong reason to require header to be bytes.

cc: [~chamikara]

Also, we should find out why this was not caught by our postcommit integration 
tests, and improve test coverage so that we have confidence that that both 
write and read path work correctly.

cc: [~Juta]


was (Author: tvalentyn):
[~yoshiki.obata], thanks a lot for trying out Beam on Python 3 and reporting 
this!
 We have to either encode header to bytes 
[here|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L393],
 or require header to be bytes in the first place. Looking through the IO 
codebase, it seems that we assume header to be a string in quite a few places 
starting from [WriteToText 
PTransform|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L599],
 so encoding header to bytes may be the past of least resistance. We can revise 
this if we find a strong reason to require header to be bytes.

cc: [~chamikara]

Also, we should find out why this was not caught by our postcommit integration 
tests, and improve test coverage so that we have confidence that that both 
write and read path work correctly.

cc: [~Juta]

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Major
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7137:
--
Issue Type: Sub-task  (was: Bug)
Parent: BEAM-1251

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Minor
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7137:
--
Priority: Major  (was: Minor)

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Major
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7137) TypeError caused by using str variable as header argument in apache_beam.io.textio.WriteToText

2019-04-25 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826278#comment-16826278
 ] 

Valentyn Tymofieiev commented on BEAM-7137:
---

[~yoshiki.obata], thanks a lot for trying out Beam on Python 3 and reporting 
this!
 We have to either encode header to bytes 
[here|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L393],
 or require header to be bytes in the first place. Looking through the IO 
codebase, it seems that we assume header to be a string in quite a few places 
starting from [WriteToText 
PTransform|https://github.com/apache/beam/blob/2cb44a81b258a64544fd8ca387305b2d5ccce13b/sdks/python/apache_beam/io/textio.py#L599],
 so encoding header to bytes may be the past of least resistance. We can revise 
this if we find a strong reason to require header to be bytes.

cc: [~chamikara]

Also, we should find out why this was not caught by our postcommit integration 
tests, and improve test coverage so that we have confidence that that both 
write and read path work correctly.

cc: [~Juta]

> TypeError caused by using str variable as header argument in 
> apache_beam.io.textio.WriteToText
> --
>
> Key: BEAM-7137
> URL: https://issues.apache.org/jira/browse/BEAM-7137
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: Python 3.5.6
> macOS Mojave 10.14.4
>Reporter: yoshiki obata
>Assignee: yoshiki obata
>Priority: Minor
>
> Using str header to apache_beam.io.textio.WriteToText as argument cause 
> TypeError with Python 3.5.6 - or maybe higher - despite docstring says header 
> is str.
>  This error occurred by writing header to file without encoding to bytes at 
> apache_beam.io.textio._TextSink.open.
>  
> {code:java}
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 727, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 555, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 625, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_per_window
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/iobase.py",
>  line 1033, in process
>     self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/options/value_provider.py",
>  line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py",
>  line 389, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File 
> "/Users/yob/.local/share/virtualenvs/test/lib/python3.5/site-packages/apache_beam/io/textio.py",
>  line 393, in open
>     file_handle.write(self._header)
> TypeError: a bytes-like object is required, not 'str'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=233005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233005
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:39
Start Date: 25/Apr/19 17:39
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-486769282
 
 
   Passing postcommit: 
https://builds.apache.org/job/beam_PostCommit_Python_Verify_PR/593/
   PAssing py3 postcommit: 
https://builds.apache.org/job/beam_PostCommit_Python3_Verify_PR/230/
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233005)
Time Spent: 40m  (was: 0.5h)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor, triaged
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=233004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233004
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:36
Start Date: 25/Apr/19 17:36
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278657310
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
 
 Review comment:
   +1
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233004)
Time Spent: 13h  (was: 12h 50m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=233003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233003
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:35
Start Date: 25/Apr/19 17:35
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278657210
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
+try {
+  return Class.forName(className);
+} catch (ClassNotFoundException e) {
+  throw new RuntimeException("Could not find Serializer class: " + 
className);
+}
+  }
+
+  private static String utf8String(byte[] bytes) {
 
 Review comment:
   I think it makes sense not to repeat this. Let me see if I can find 
something. `StringUtils` seems like a good place otherwise.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233003)
Time Spent: 12h 50m  (was: 12h 40m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=233000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233000
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:34
Start Date: 25/Apr/19 17:34
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278656684
 
 

 ##
 File path: sdks/python/apache_beam/io/external/kafka.py
 ##
 @@ -118,35 +118,112 @@ def expand(self, pbegin):
 payload.SerializeToString(),
 self.expansion_service))
 
-  @staticmethod
-  def _encode_map(dict_obj):
-kv_list = [(key.encode('utf-8'), val.encode('utf-8'))
-   for key, val in dict_obj.items()]
-coder = IterableCoder(TupleCoder(
-[LengthPrefixCoder(BytesCoder()), LengthPrefixCoder(BytesCoder())]))
-coder_urns = ['beam:coder:iterable:v1',
-  'beam:coder:kv:v1',
-  'beam:coder:bytes:v1',
-  'beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(kv_list))
-
-  @staticmethod
-  def _encode_list(list_obj):
-encoded_list = [val.encode('utf-8') for val in list_obj]
-coder = IterableCoder(LengthPrefixCoder(BytesCoder()))
-coder_urns = ['beam:coder:iterable:v1',
-  'beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(encoded_list))
-
-  @staticmethod
-  def _encode_str(str_obj):
-encoded_str = str_obj.encode('utf-8')
-coder = LengthPrefixCoder(BytesCoder())
-coder_urns = ['beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(encoded_str))
+
+class WriteToKafka(ptransform.PTransform):
+  """
+An external PTransform which writes KV data to a specified Kafka topic.
+If no Kafka Serializer for key/value is provided, then key/value are
+assumed to be byte arrays.
+
+Note: To use this transform, you need to start the Java expansion service.
+Please refer to the portability documentation on how to do that. The
+expansion service address has to be provided when instantiating this
+transform. During pipeline translation this transform will be replaced by
+the Java SDK's KafkaIO.
 
 Review comment:
   Wasn't sure how this is displayed in IDEs for Python development but seems 
fair not to repeat it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233000)
Time Spent: 12h 40m  (was: 12.5h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232998&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232998
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:33
Start Date: 25/Apr/19 17:33
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278656442
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/flink_runner_test.py
 ##
 @@ -189,12 +190,28 @@ def test_external_transforms(self):
value_deserializer='org.apache.kafka.'
   'common.serialization.'
   'LongDeserializer',
-   expansion_service=expansion_address))
+   expansion_service=get_expansion_service()))
   self.assertTrue('No resolvable bootstrap urls given in bootstrap.servers'
   in str(ctx.exception),
   'Expected to fail due to invalid bootstrap.servers, but '
   'failed due to:\n%s' % str(ctx.exception))
 
+  # We just test the expansion but do not execute.
 
 Review comment:
   Yes, I saw the PR (#8397). That will be great to extend our test coverage 
here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232998)
Time Spent: 12.5h  (was: 12h 20m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3863) AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3863?focusedWorklogId=232994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232994
 ]

ASF GitHub Bot logged work on BEAM-3863:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:29
Start Date: 25/Apr/19 17:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #8366:  [BEAM-3863] Ensure 
correct firing of processing time timers
URL: https://github.com/apache/beam/pull/8366#issuecomment-486765910
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232994)
Time Spent: 2.5h  (was: 2h 20m)

> AfterProcessingTime trigger doesn't fire reliably
> -
>
> Key: BEAM-3863
> URL: https://issues.apache.org/jira/browse/BEAM-3863
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Pawel Bartoszek
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: triaged
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> *Issue*
> Beam AfterProcessingTime trigger doesn't fire always reliably after a 
> configured delay.
> The following job triggers should fire after watermark passes the end of the 
> window and then every 5 seconds for late data and the finally at the end of 
> allowed lateness.
> *Expected behaviour*
> Late firing after processing time trigger should fire after 5 seconds since 
> first late records arrive in the pane.
> *Actual behaviour*
> From my testings late triggers works for some keys but not for the other - 
> it's pretty random which keys are affected. The DummySource generates 15 
> distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
> late record. In case late trigger firing is missed it won't fire until the 
> allowed lateness period. 
> *Job code*
> {code:java}
> String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
> Pipeline pipeline = Pipeline.create(options);
> PCollection apply = pipeline.apply(Read.from(new DummySource()))
> 
> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
> .triggering(AfterWatermark.pastEndOfWindow()
> .withLateFirings(
> AfterProcessingTime
> 
> .pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
> .accumulatingFiredPanes()
> .withAllowedLateness(Duration.standardMinutes(2), 
> Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> );
> apply.apply(Count.perElement())
> .apply(ParDo.of(new DoFn, Long>() {
> @ProcessElement
> public void process(ProcessContext context, BoundedWindow window) 
> {
> LOG.info("Count: {}. For window {}, Pane {}", 
> context.element(), window, context.pane());
> }
> }));
> pipeline.run().waitUntilFinish();{code}
>  
> *How can you replicate the issue?*
>  I've created a github repo 
> [https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
> above. Please check out the README file for details how to replicate the 
> issue.
> *What's is causing the issue?*
> I explained the cause in PR.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3863) AfterProcessingTime trigger doesn't fire reliably

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3863?focusedWorklogId=232992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232992
 ]

ASF GitHub Bot logged work on BEAM-3863:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:28
Start Date: 25/Apr/19 17:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #8366:  [BEAM-3863] Ensure 
correct firing of processing time timers
URL: https://github.com/apache/beam/pull/8366#issuecomment-486765577
 
 
   Thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232992)
Time Spent: 2h 20m  (was: 2h 10m)

> AfterProcessingTime trigger doesn't fire reliably
> -
>
> Key: BEAM-3863
> URL: https://issues.apache.org/jira/browse/BEAM-3863
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Pawel Bartoszek
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: triaged
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Issue*
> Beam AfterProcessingTime trigger doesn't fire always reliably after a 
> configured delay.
> The following job triggers should fire after watermark passes the end of the 
> window and then every 5 seconds for late data and the finally at the end of 
> allowed lateness.
> *Expected behaviour*
> Late firing after processing time trigger should fire after 5 seconds since 
> first late records arrive in the pane.
> *Actual behaviour*
> From my testings late triggers works for some keys but not for the other - 
> it's pretty random which keys are affected. The DummySource generates 15 
> distinct keys AA,BB,..., PP. For each key it sends 5 on time records and one 
> late record. In case late trigger firing is missed it won't fire until the 
> allowed lateness period. 
> *Job code*
> {code:java}
> String[] runnerArgs = {"--runner=FlinkRunner", "--parallelism=8"};
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.fromArgs(runnerArgs).as(FlinkPipelineOptions.class);
> Pipeline pipeline = Pipeline.create(options);
> PCollection apply = pipeline.apply(Read.from(new DummySource()))
> 
> .apply(Window.into(FixedWindows.of(Duration.standardSeconds(10)))
> .triggering(AfterWatermark.pastEndOfWindow()
> .withLateFirings(
> AfterProcessingTime
> 
> .pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(5
> .accumulatingFiredPanes()
> .withAllowedLateness(Duration.standardMinutes(2), 
> Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> );
> apply.apply(Count.perElement())
> .apply(ParDo.of(new DoFn, Long>() {
> @ProcessElement
> public void process(ProcessContext context, BoundedWindow window) 
> {
> LOG.info("Count: {}. For window {}, Pane {}", 
> context.element(), window, context.pane());
> }
> }));
> pipeline.run().waitUntilFinish();{code}
>  
> *How can you replicate the issue?*
>  I've created a github repo 
> [https://github.com/pbartoszek/BEAM-3863_late_trigger] with the code shown 
> above. Please check out the README file for details how to replicate the 
> issue.
> *What's is causing the issue?*
> I explained the cause in PR.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7130) convertAvroFieldStrict as public static function could handle more types of value for logical type timestamp-millis

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7130?focusedWorklogId=232989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232989
 ]

ASF GitHub Bot logged work on BEAM-7130:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:18
Start Date: 25/Apr/19 17:18
Worklog Time Spent: 10m 
  Work Description: bmv126 commented on pull request #8376: 
[BEAM-7130]Support Datetime value conversion in convertAvroFieldStrict
URL: https://github.com/apache/beam/pull/8376#discussion_r278650481
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
 ##
 @@ -669,7 +670,11 @@ public static Object convertAvroFieldStrict(
 .fromBytes(byteBuffer.duplicate(), type.type, logicalType);
 return convertDecimal(bigDecimal, fieldType);
   } else if (logicalType instanceof LogicalTypes.TimestampMillis) {
-return convertDateTimeStrict((Long) value, fieldType);
+if (value instanceof DateTime) {
+  return convertDateTimeStrict(((DateTime) value).getMillis(), 
fieldType);
 
 Review comment:
   As the TimestampConversion is set in a static initialization block, the 
expectation of the initial code is to covert long to JodaTime during 
deserialization.  So with that expectation the else part will not be needed in 
this case.
   
   Also how are the other logical-types of Avro handled when converting to Row ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232989)
Time Spent: 10m
Remaining Estimate: 0h

> convertAvroFieldStrict as public static function could handle more types of 
> value for logical type timestamp-millis
> ---
>
> Key: BEAM-7130
> URL: https://issues.apache.org/jira/browse/BEAM-7130
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Rui Wang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://lists.apache.org/thread.html/68322dcf9418b1d1640273f1a58f70874b61b4996d08dd982b29492c@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-7070) JOIN condition should accept field access

2019-04-25 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang closed BEAM-7070.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> JOIN condition should accept field access
> -
>
> Key: BEAM-7070
> URL: https://issues.apache.org/jira/browse/BEAM-7070
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7146) Code generator is accessible from all RelNode

2019-04-25 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7146:
--

 Summary: Code generator is accessible from all RelNode
 Key: BEAM-7146
 URL: https://issues.apache.org/jira/browse/BEAM-7146
 Project: Beam
  Issue Type: New Feature
  Components: dsl-sql
Reporter: Rui Wang


Right now code generation is used in Projection (in BeamCalcRel). However, code 
generation can be used in all places that might need to execute a function. 
Therefore, this JIRA propose to have a layer (called code generator) that could 
be used in other rels: join, aggregation, etc.


  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6695) Latest transform for Python SDK

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=232980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232980
 ]

ASF GitHub Bot logged work on BEAM-6695:


Author: ASF GitHub Bot
Created on: 25/Apr/19 17:00
Start Date: 25/Apr/19 17:00
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on issue #8206: [BEAM-6695] Latest 
PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8206#issuecomment-486755234
 
 
   LGTM, and thank you very much for making these changes!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232980)
Time Spent: 6h 10m  (was: 6h)

> Latest transform for Python SDK
> ---
>
> Key: BEAM-6695
> URL: https://issues.apache.org/jira/browse/BEAM-6695
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Add a PTransform} and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7070) JOIN condition should accept field access

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7070?focusedWorklogId=232973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232973
 ]

ASF GitHub Bot logged work on BEAM-7070:


Author: ASF GitHub Bot
Created on: 25/Apr/19 16:47
Start Date: 25/Apr/19 16:47
Worklog Time Spent: 10m 
  Work Description: akedin commented on pull request #8301: [BEAM-7070] 
JOIN condition should accept field access
URL: https://github.com/apache/beam/pull/8301
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232973)
Time Spent: 5.5h  (was: 5h 20m)

> JOIN condition should accept field access
> -
>
> Key: BEAM-7070
> URL: https://issues.apache.org/jira/browse/BEAM-7070
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232961
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 16:17
Start Date: 25/Apr/19 16:17
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278627111
 
 

 ##
 File path: sdks/python/apache_beam/io/external/kafka.py
 ##
 @@ -118,35 +118,112 @@ def expand(self, pbegin):
 payload.SerializeToString(),
 self.expansion_service))
 
-  @staticmethod
-  def _encode_map(dict_obj):
-kv_list = [(key.encode('utf-8'), val.encode('utf-8'))
-   for key, val in dict_obj.items()]
-coder = IterableCoder(TupleCoder(
-[LengthPrefixCoder(BytesCoder()), LengthPrefixCoder(BytesCoder())]))
-coder_urns = ['beam:coder:iterable:v1',
-  'beam:coder:kv:v1',
-  'beam:coder:bytes:v1',
-  'beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(kv_list))
-
-  @staticmethod
-  def _encode_list(list_obj):
-encoded_list = [val.encode('utf-8') for val in list_obj]
-coder = IterableCoder(LengthPrefixCoder(BytesCoder()))
-coder_urns = ['beam:coder:iterable:v1',
-  'beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(encoded_list))
-
-  @staticmethod
-  def _encode_str(str_obj):
-encoded_str = str_obj.encode('utf-8')
-coder = LengthPrefixCoder(BytesCoder())
-coder_urns = ['beam:coder:bytes:v1']
-return ConfigValue(
-coder_urn=coder_urns,
-payload=coder.encode(encoded_str))
+
+class WriteToKafka(ptransform.PTransform):
+  """
+An external PTransform which writes KV data to a specified Kafka topic.
+If no Kafka Serializer for key/value is provided, then key/value are
+assumed to be byte arrays.
+
+Note: To use this transform, you need to start the Java expansion service.
+Please refer to the portability documentation on how to do that. The
+expansion service address has to be provided when instantiating this
+transform. During pipeline translation this transform will be replaced by
+the Java SDK's KafkaIO.
 
 Review comment:
   Prob. move this information to the top of the module (instead of repeating 
for read and write transforms).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232961)
Time Spent: 12h  (was: 11h 50m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232962
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 16:17
Start Date: 25/Apr/19 16:17
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278624714
 
 

 ##
 File path: sdks/java/container/Dockerfile
 ##
 @@ -22,6 +22,8 @@ MAINTAINER "Apache Beam "
 ADD target/slf4j-api.jar /opt/apache/beam/jars/
 ADD target/slf4j-jdk14.jar /opt/apache/beam/jars/
 ADD target/beam-sdks-java-harness.jar /opt/apache/beam/jars/
+ADD target/beam-sdks-java-io-kafka.jar /opt/apache/beam/jars/
 
 Review comment:
   Please add a TODO to remove these when we support custom environments.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232962)
Time Spent: 12h 10m  (was: 12h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232964
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 16:17
Start Date: 25/Apr/19 16:17
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278624790
 
 

 ##
 File path: sdks/java/container/Dockerfile-java11
 ##
 @@ -22,6 +22,8 @@ MAINTAINER "Apache Beam "
 ADD target/slf4j-api.jar /opt/apache/beam/jars/
 ADD target/slf4j-jdk14.jar /opt/apache/beam/jars/
 ADD target/beam-sdks-java-harness.jar /opt/apache/beam/jars/
+ADD target/beam-sdks-java-io-kafka.jar /opt/apache/beam/jars/
 
 Review comment:
   Ditto (here and below).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232964)
Time Spent: 12h 20m  (was: 12h 10m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232963
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 16:17
Start Date: 25/Apr/19 16:17
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278627923
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/flink_runner_test.py
 ##
 @@ -189,12 +190,28 @@ def test_external_transforms(self):
value_deserializer='org.apache.kafka.'
   'common.serialization.'
   'LongDeserializer',
-   expansion_service=expansion_address))
+   expansion_service=get_expansion_service()))
   self.assertTrue('No resolvable bootstrap urls given in bootstrap.servers'
   in str(ctx.exception),
   'Expected to fail due to invalid bootstrap.servers, but '
   'failed due to:\n%s' % str(ctx.exception))
 
+  # We just test the expansion but do not execute.
 
 Review comment:
   FYI: @ihji is looking into adding a validates runner test suite for 
cross-language transforms that includes Kafka.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232963)
Time Spent: 12h 10m  (was: 12h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232944
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 15:30
Start Date: 25/Apr/19 15:30
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278579554
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
+try {
+  return Class.forName(className);
+} catch (ClassNotFoundException e) {
+  throw new RuntimeException("Could not find Serializer class: " + 
className);
+}
+  }
+
+  private static String utf8String(byte[] bytes) {
 
 Review comment:
   There is the same method for Read transform. Do we have similar function in 
any utils class already? If not, it'd make sense to put it into 
`org.apache.beam.sdk.util.StringUtils` to avoid code duplication.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232944)
Time Spent: 11h 50m  (was: 11h 40m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232914
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 14:40
Start Date: 25/Apr/19 14:40
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278579554
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
+try {
+  return Class.forName(className);
+} catch (ClassNotFoundException e) {
+  throw new RuntimeException("Could not find Serializer class: " + 
className);
+}
+  }
+
+  private static String utf8String(byte[] bytes) {
 
 Review comment:
   We have the same method for Read transform. Do we have similar function in 
any utils class already? If not, it'd make sense to put it into 
`org.apache.beam.sdk.util.StringUtils` to avoid code duplication.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232914)
Time Spent: 11.5h  (was: 11h 20m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232915
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 14:40
Start Date: 25/Apr/19 14:40
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #8322: 
[BEAM-7029] Add KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#discussion_r278581393
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
 ##
 @@ -1325,12 +1325,89 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 abstract Builder toBuilder();
 
 @AutoValue.Builder
-abstract static class Builder {
+abstract static class Builder
+implements ExternalTransformBuilder>, PDone> {
   abstract Builder setTopic(String topic);
 
   abstract Builder setWriteRecordsTransform(WriteRecords 
transform);
 
   abstract Write build();
+
+  @Override
+  public PTransform>, PDone> buildExternal(
+  External.Configuration configuration) {
+String topic = utf8String(configuration.topic);
+setTopic(topic);
+
+Map producerConfig = new HashMap<>();
+for (KV kv : configuration.producerConfig) {
+  String key = utf8String(kv.getKey());
+  String value = utf8String(kv.getValue());
+  producerConfig.put(key, value);
+}
+Class keySerializer = 
resolveClass(utf8String(configuration.keySerializer));
+Class valSerializer = 
resolveClass(utf8String(configuration.valueSerializer));
+
+WriteRecords writeRecords =
+KafkaIO.writeRecords()
+.updateProducerProperties(producerConfig)
+.withKeySerializer(keySerializer)
+.withValueSerializer(valSerializer)
+.withTopic(topic);
+setWriteRecordsTransform(writeRecords);
+
+return build();
+  }
+
+  private static Class resolveClass(String className) {
 
 Review comment:
   Does it makes sense to move into 
`org.apache.beam.sdk.util.SerializableUtils` ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232915)
Time Spent: 11h 40m  (was: 11.5h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=232916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232916
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 25/Apr/19 14:40
Start Date: 25/Apr/19 14:40
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8399: [BEAM-7112] Timer 
race with state cleanup - take two
URL: https://github.com/apache/beam/pull/8399#discussion_r278583803
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
 ##
 @@ -401,7 +402,14 @@ private void setTimer(WindowedValue timerElement, 
TimerInternals.TimerDa
   @SuppressWarnings("ByteBufferBackingArray")
   private void fireCleanupTimers() {
 while (!cleanupTimers.isEmpty()) {
-  InternalTimer timer = 
cleanupTimers.remove();
+  KV> kv = 
cleanupTimers.peek();
+  if (kv.getKey() != null && kv.getKey() == sdkHarnessRunner.remoteBundle) 
{
+// user timer and cleanup may trigger together in finish bundle,
+// defer state cleanup to allow user timers to complete
+return;
 
 Review comment:
   Let's move it to where it belongs, i.e. after the bundle has been finished.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232916)
Time Spent: 5h 40m  (was: 5.5h)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
> Fix For: 2.13.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232911
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 14:36
Start Date: 25/Apr/19 14:36
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#issuecomment-486699358
 
 
   I already discovered this before opening this PR and created 
https://jira.apache.org/jira/browse/BEAM-7084. I think this can be solved 
independently of this PR.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232911)
Time Spent: 11h 20m  (was: 11h 10m)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7029) Support KafkaIO to be configured externally for use with other SDKs

2019-04-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7029?focusedWorklogId=232909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-232909
 ]

ASF GitHub Bot logged work on BEAM-7029:


Author: ASF GitHub Bot
Created on: 25/Apr/19 14:31
Start Date: 25/Apr/19 14:31
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #8322: [BEAM-7029] Add 
KafkaIO.Write as an external transform
URL: https://github.com/apache/beam/pull/8322#issuecomment-486697457
 
 
   We should allow the user to specify the environment for the external 
transform. Would like to be able to use EMBEDDED when using the Flink runner.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 232909)
Time Spent: 11h 10m  (was: 11h)

> Support KafkaIO to be configured externally for use with other SDKs
> ---
>
> Key: BEAM-7029
> URL: https://issues.apache.org/jira/browse/BEAM-7029
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka, runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> As of BEAM-6730, we can externally configure existing transforms from SDKs. 
> We should add more useful transforms then just {{GenerateSequence}}. 
> {{KafkaIO}} is a good candidate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   3   >