[jira] [Work logged] (BEAM-9678) Introduction Kata | Go SDK Code Katas

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9678?focusedWorklogId=422520=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422520
 ]

ASF GitHub Bot logged work on BEAM-9678:


Author: ASF GitHub Bot
Created on: 15/Apr/20 05:10
Start Date: 15/Apr/20 05:10
Worklog Time Spent: 10m 
  Work Description: damondouglas commented on pull request #11340: 
[BEAM-9678] Create Go SDK introduction kata
URL: https://github.com/apache/beam/pull/11340#discussion_r408584291
 
 

 ##
 File path: learning/katas/go/Introduction/Hello Beam/Hello 
Beam/pkg/task/task.go
 ##
 @@ -0,0 +1,24 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package task
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam"
+)
+
+func HelloBeam(s beam.Scope) beam.PCollection {
 
 Review comment:
   May we consider keeping this?  When I consolidated everything into the same 
file, I received the error `flag redefined: runner`.  IntelliJ Edu constrains 
us to having tests in the test folder in order to achieve lesson task 
validation, so we can't do any task validation with tests in any other folder.  
Also, while there doesn't seem to be a consensus around go project structure in 
the community, this repo seems to follows the `./{cmd,pkg}/*` structure.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422520)
Time Spent: 3.5h  (was: 3h 20m)

> Introduction Kata | Go SDK Code Katas
> -
>
> Key: BEAM-9678
> URL: https://issues.apache.org/jira/browse/BEAM-9678
> Project: Beam
>  Issue Type: Sub-task
>  Components: katas, sdk-go
>Reporter: Damon Douglas
>Assignee: Damon Douglas
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> An Introduction kata patterns after 
> [https://github.com/apache/beam/tree/master/learning/katas/java/Introduction] 
> where the take away is an individual's ability to start an Apache Beam 
> pipeline using the Golang SDK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=422476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422476
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 15/Apr/20 01:17
Start Date: 15/Apr/20 01:17
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11423: [BEAM-9642] 
Fix infinite recursion.
URL: https://github.com/apache/beam/pull/11423
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422476)
Time Spent: 5h 20m  (was: 5h 10m)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=422475=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422475
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 15/Apr/20 01:16
Start Date: 15/Apr/20 01:16
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #11423: [BEAM-9642] Fix 
infinite recursion.
URL: https://github.com/apache/beam/pull/11423#issuecomment-613760696
 
 
   Agreed. I did what validation I could, but neither lint or gofmt caught this 
one. I think go vet would've caught it, but it was annoying to use (if I ran it 
on this file specifically, it would complain about not finding the symbols from 
other files in the same package. If I ran it on the entire directory, I'd get a 
million errors about the missing documentation for all the autogenerated code).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422475)
Time Spent: 5h 10m  (was: 5h)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=422474=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422474
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 15/Apr/20 01:10
Start Date: 15/Apr/20 01:10
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #11423: [BEAM-9642] Fix 
infinite recursion.
URL: https://github.com/apache/beam/pull/11423#issuecomment-613759173
 
 
   Thanks! If we could get some kind of github static check to have caught that 
for us that would have been great. Good thing we had a quick import after the 
merge though.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422474)
Time Spent: 5h  (was: 4h 50m)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=422467=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422467
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:53
Start Date: 15/Apr/20 00:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #11423: [BEAM-9642] Fix 
infinite recursion.
URL: https://github.com/apache/beam/pull/11423#issuecomment-613754243
 
 
   R: @lostluck 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422467)
Time Spent: 4h 50m  (was: 4h 40m)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=422466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422466
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:53
Start Date: 15/Apr/20 00:53
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11423: [BEAM-9642] 
Fix infinite recursion.
URL: https://github.com/apache/beam/pull/11423
 
 
   Small fix. I debated bundling this in a larger PR but I was worried I might 
forget.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422459=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422459
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:33
Start Date: 15/Apr/20 00:33
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422459)
Time Spent: 3h 10m  (was: 3h)

> [Go SDK] Empty side inputs causing spurious zero elements
> -
>
> Key: BEAM-9746
> URL: https://issues.apache.org/jira/browse/BEAM-9746
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> A user discovered that empty side inputs would spuriously provide a single 
> zero element.
> The error was narrowed down to the Go SDK's state manager code  copying the 
> stateGetResponse data wasn't checking that the original data source even had 
> any bytes in it, leading it in particular to interpret length prefixed data 
> as having 0 length, which would cause zero value elements to be generated. 
> Notably, this caused empty strings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422453=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422453
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:18
Start Date: 15/Apr/20 00:18
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408509101
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i, buflen := range test.buflens {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422452=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422452
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:18
Start Date: 15/Apr/20 00:18
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408509078
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i, buflen := range test.buflens {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422451=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422451
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:17
Start Date: 15/Apr/20 00:17
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408508964
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##
 @@ -75,21 +75,27 @@
 
 IMPULSE_BUFFER = b'impulse'
 
+# SideInputId is identified by a consumer ParDo + tag.
+SideInputId = Tuple[str, str]
+
+DataSideInput = Dict[SideInputId,
+ Tuple[bytes, beam_runner_api_pb2.FunctionSpec]]
+
 
 class Stage(object):
   """A set of Transforms that can be sent to the worker for processing."""
   def __init__(self,
name,  # type: str
transforms,  # type: List[beam_runner_api_pb2.PTransform]
-   downstream_side_inputs=None,  # type: Optional[FrozenSet[str]]
+   downstream_side_inputs=None,  # type: Optional[Dict[str, 
SideInputId]]
 
 Review comment:
   Hm so this change breaks that, so the memory requirements would be larger. I 
would think that they would not be too bad, since most graphs don't have many 
side inputs going many places. What do you think? I'm willing to find a better 
solution for this, but I wonder if it's worth the extra time.
   
   The reason that this is made into a dict is to contain more information 
about downstream side inputs. specifically, it contains which transforms will 
consume the side inputs. this is used to commit the side inputs to state after 
they are calculated (rather than before they are consumed). This will be 
necessary for streaming, because side inputs will need to be added to state as 
they are computed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422451)
Time Spent: 3h 40m  (was: 3.5h)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422442=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422442
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507669
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##
 @@ -149,15 +155,26 @@ def no_overlap(a, b):
 return (
 not consumer.forced_root and not self in consumer.must_follow and
 not self.is_runner_urn(context) and
-not consumer.is_runner_urn(context) and
-no_overlap(self.downstream_side_inputs, consumer.side_inputs()))
+not consumer.is_runner_urn(context) and no_overlap(
+set(self.downstream_side_inputs.keys()),
+{i
+ for i, _, _ in consumer.side_inputs()}))
+
+  def _fuse_downstream_side_inputs(self, other):
+res = dict(self.downstream_side_inputs)
+for si, other_si_ids in other.downstream_side_inputs.items():
+  if si in res:
+res[si] = union(res[si], other_si_ids)
+  else:
+res[si] = other_si_ids
+return res
 
   def fuse(self, other):
 # type: (Stage) -> Stage
 return Stage(
 "(%s)+(%s)" % (self.name, other.name),
 self.transforms + other.transforms,
-union(self.downstream_side_inputs, other.downstream_side_inputs),
+self._fuse_downstream_side_inputs(other),
 
 Review comment:
   renamed to `_get_fused_downstream_side_inputs` thoughts?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422442)
Time Spent: 2h 40m  (was: 2.5h)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422447=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422447
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507914
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##
 @@ -914,14 +898,16 @@ def process_bundle(self,
  expected_outputs,  # type: DataOutput
  fired_timers,  # type: Mapping[Tuple[str, str], 
PartitionableBuffer]
  expected_output_timers  # type: Dict[str, Dict[str, str]]
+ dry_run=False
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422447)
Time Spent: 3.5h  (was: 3h 20m)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422444=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422444
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507730
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py
 ##
 @@ -367,6 +413,73 @@ def _build_process_bundle_descriptor(self):
 state_api_service_descriptor=self.state_api_service_descriptor(),
 timer_api_service_descriptor=self.data_api_service_descriptor())
 
+  def commit_output_views_to_state(self):
+"""Commit bundle outputs to state to be consumed as side inputs later.
+
+Only the outputs that should be side inputs are committed to state.
+"""
+data_side_input = {}  # type: DataSideInput
+for pcoll, si_ids in self.stage.downstream_side_inputs.items():
+  for (consumer_transform_name, tag), access_pattern in si_ids.items():
+data_side_input[consumer_transform_name, tag] = (
+translations.create_buffer_id(pcoll), access_pattern)
+self.execution_context.commit_side_inputs_to_state(data_side_input)
+
+  def extract_bundle_inputs(self):
+# type: (...) -> Tuple[Dict[str, PartitionableBuffer], DataOutput]
+
+"""Returns maps of transform names to PCollection identifiers.
+
+Also mutates IO stages to point to the data ApiServiceDescriptor.
+
+Returns:
+  A tuple of (data_input, data_output) dictionaries.
+`data_input` is a dictionary mapping (transform_name, output_name) to a
+PCollection buffer; `data_output` is a dictionary mapping
+(transform_name, output_name) to a PCollection ID.
+"""
+data_input = {}  # type: Dict[str, PartitionableBuffer]
+data_output = {}  # type: DataOutput
+# A mapping of {(transform_id, timer_family_id) : buffer_id}
+expected_timer_output = {}  # type: Dict[Tuple(str, str), str]
+for transform in self.stage.transforms:
+  if transform.spec.urn in (bundle_processor.DATA_INPUT_URN,
+bundle_processor.DATA_OUTPUT_URN):
+pcoll_id = transform.spec.payload
+if transform.spec.urn == bundle_processor.DATA_INPUT_URN:
+  coder_id = self.execution_context.data_channel_coders[only_element(
+  transform.outputs.values())]
+  coder = self.execution_context.pipeline_context.coders[
+  self.execution_context.safe_coders.get(coder_id, coder_id)]
+  if pcoll_id == translations.IMPULSE_BUFFER:
+data_input[transform.unique_name] = ListBuffer(
+coder_impl=coder.get_impl())
+data_input[transform.unique_name].append(ENCODED_IMPULSE_VALUE)
+  else:
+if pcoll_id not in self.execution_context.pcoll_buffers:
+  self.execution_context.pcoll_buffers[pcoll_id] = ListBuffer(
+  coder_impl=coder.get_impl())
+data_input[transform.unique_name] = \
+  self.execution_context.pcoll_buffers[pcoll_id]
+elif transform.spec.urn == bundle_processor.DATA_OUTPUT_URN:
+  data_output[transform.unique_name] = pcoll_id
+  coder_id = self.execution_context.data_channel_coders[only_element(
+  transform.inputs.values())]
+else:
+  raise NotImplementedError
+data_spec = beam_fn_api_pb2.RemoteGrpcPort(coder_id=coder_id)
+data_api_service_descriptor = \
+  self.data_api_service_descriptor()
+if data_api_service_descriptor:
+  data_spec.api_service_descriptor.url = (
+  data_api_service_descriptor.url)
+transform.spec.payload = data_spec.SerializeToString()
+  elif transform.spec.urn in translations.PAR_DO_URNS:
+for timer_family_id in payload.timer_family_specs.keys():
+  expected_timer_output[(transform.unique_name, timer_family_id)] = (
+  create_buffer_id(timer_family_id, 'timers'))
+return data_input, data_output, expected_timer_output
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422444)
Time Spent: 3h  (was: 2h 50m)

> Abstract bundle execution logic from stage execution logic
> --
>
>

[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422445=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422445
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507780
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py
 ##
 @@ -367,6 +413,73 @@ def _build_process_bundle_descriptor(self):
 state_api_service_descriptor=self.state_api_service_descriptor(),
 timer_api_service_descriptor=self.data_api_service_descriptor())
 
+  def commit_output_views_to_state(self):
+"""Commit bundle outputs to state to be consumed as side inputs later.
+
+Only the outputs that should be side inputs are committed to state.
+"""
+data_side_input = {}  # type: DataSideInput
+for pcoll, si_ids in self.stage.downstream_side_inputs.items():
+  for (consumer_transform_name, tag), access_pattern in si_ids.items():
+data_side_input[consumer_transform_name, tag] = (
+translations.create_buffer_id(pcoll), access_pattern)
+self.execution_context.commit_side_inputs_to_state(data_side_input)
+
+  def extract_bundle_inputs(self):
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422445)
Time Spent: 3h 10m  (was: 3h)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422446=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422446
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507854
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py
 ##
 @@ -326,8 +327,8 @@ def commit_side_inputs_to_state(
   data_side_input,  # type: DataSideInput
   ):
 # type: (...) -> None
-for (consuming_transform_id, tag), (buffer_id, func_spec) \
-in data_side_input.items():
+for (consuming_transform_id, tag), (buffer_id,
 
 Review comment:
   it did not : ( hehe
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422446)
Time Spent: 3h 20m  (was: 3h 10m)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422443=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422443
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:13
Start Date: 15/Apr/20 00:13
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507693
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py
 ##
 @@ -367,6 +413,73 @@ def _build_process_bundle_descriptor(self):
 state_api_service_descriptor=self.state_api_service_descriptor(),
 timer_api_service_descriptor=self.data_api_service_descriptor())
 
+  def commit_output_views_to_state(self):
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422443)
Time Spent: 2h 50m  (was: 2h 40m)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422439=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422439
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:12
Start Date: 15/Apr/20 00:12
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408507507
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
 
 Review comment:
   Consider: What kind of change to the stateKeyReader implementation would 
change the expected number of reads, based on how stateKeyReader gets its data?
   
   There are two kinds of unit test. What you've described is black box 
testing, where the testing code has no knowledge of internal details of the 
implementation. This is valuable when there are accessible API calls to set the 
unknown internal state of a struct, and in doing so, there are checkable 
expectations of how other API calls will behave as a result. As a rule, this is 
preferable, as it's resilient to implementation changes, but with a broad 
enough API surface, it can be very very expensive since it's simply checking 
the same code paths over and over and over again on different tests. Black box 
testing by necessity ends up having alot of redundancy. We do have black box 
testing in general for Beam Go, since we use the direct runner to validate 
large portions of the exec package in a "pipeline" context. You can find them 
via any of the test files that have their package name to include _test.
   
   There's also white box testing, where the internal details are known to the 
test implementer. These tests are an example of the latter. 
   
   There is no user side API to configure the internal state of 
stateKeyReader.Read(), and in general for the io.Reader interface, which is how 
users (AKA other parts of the beam framework), will be using stateKeyReader. 
   Since we can't configure things with a user side API, we must manipulate 
some other internal state, to set the initial conditions, and then from those 
conditions, check that we understand the code under test.
   
   Further, white box testing is the only reasonable option for such a general 
API like Read(). It gets passed in a buffer, and it's expected to write up to 
len(buffer) bytes to it, and tell you what it did. It doesn't say anything 
about the bytes themselves. The implementations vary dramatically depending on 
the purpose, and only the implementation can check that. The API in this case 
doesn't dictate the tests, otherwise there could be a "io.ReaderTester" that we 
could instead of running these.
   
   So, while numReads in general is inscrutable to the user, it's not 
inscrutable to us, the implementers of this test case, and the code. We have 
our own expectations for the code
   
   In fact, numReads a deterministic number based on the lengths of the backing 
buffers vs the lengths of the reads. Since it's something we can check, and 
know based on the initial conditions, and our understanding of the code, it's 
fine. numReads is a second order expectation based on the initial conditions, 
but it's also one that's very easy to check.
   
   To be very blunt, numReads is checking that the code behaves the way we 
think it behaves.
   
   In this case, had we been testing that the "nil" data buffer case required 1 
read to return EOF, we wouldn't have had the bug in question.
   
   Your underlying concern about a "change state" test is valid though. Tests 
that require a specific implementation in order to pass, can be a nuisance. But 
that doesn't apply so much to second order expectations like numReads.  If the 
number of reads changes when the implementation changes, I certainly would like 
to know about it, because that could be an *unexpected* side effect of the 
change being made. Conversely, if a refactorer believes they can change the 
implementation to reduce the number of calls to Read the test can expect, they 
can simply adjust the number, and in Test Driven Development style, use that to 
validate that the changes they're making accomplish the intended goal.
   
   I'm on the side of I'd rather have tests fail if a dramatic change in 
behavior occurs, and the change author need to adjust them since then it 
verifies that they ran the tests, and that the tests are actually working. 
Having 

[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422441=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422441
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:12
Start Date: 15/Apr/20 00:12
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507610
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##
 @@ -75,21 +75,27 @@
 
 IMPULSE_BUFFER = b'impulse'
 
+# SideInputId is identified by a consumer ParDo + tag.
+SideInputId = Tuple[str, str]
+
+DataSideInput = Dict[SideInputId,
 
 Review comment:
   The value is a tuple with the encoded data. I've moved these to 
translations.py, and updated the comment
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422441)
Time Spent: 2.5h  (was: 2h 20m)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9639) Abstract bundle execution logic from stage execution logic

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9639?focusedWorklogId=422440=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422440
 ]

ASF GitHub Bot logged work on BEAM-9639:


Author: ASF GitHub Bot
Created on: 15/Apr/20 00:12
Start Date: 15/Apr/20 00:12
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11270: 
[BEAM-9639][BEAM-9608] Improvements for FnApiRunner
URL: https://github.com/apache/beam/pull/11270#discussion_r408507558
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py
 ##
 @@ -149,15 +155,26 @@ def no_overlap(a, b):
 return (
 not consumer.forced_root and not self in consumer.must_follow and
 not self.is_runner_urn(context) and
-not consumer.is_runner_urn(context) and
-no_overlap(self.downstream_side_inputs, consumer.side_inputs()))
+not consumer.is_runner_urn(context) and no_overlap(
+set(self.downstream_side_inputs.keys()),
+{i
+ for i, _, _ in consumer.side_inputs()}))
+
+  def _fuse_downstream_side_inputs(self, other):
+res = dict(self.downstream_side_inputs)
+for si, other_si_ids in other.downstream_side_inputs.items():
+  if si in res:
+res[si] = union(res[si], other_si_ids)
 
 Review comment:
   woah this is a bug.
   downstream side input is a dictionary mapping to dictionaries.
   
   Dict[Output Pcollection, Dict[Side input ID, Access pattern]]
   Where Side input ID is Tuple[consumer ptransform, input index]. Added 
appropriate annotations, and fixed the bug.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422440)
Time Spent: 2h 20m  (was: 2h 10m)

> Abstract bundle execution logic from stage execution logic
> --
>
> Key: BEAM-9639
> URL: https://issues.apache.org/jira/browse/BEAM-9639
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The FnApiRunner currently works on a per-stage manner, and does not abstract 
> single-bundle execution much. This work item is to clearly define the code to 
> execute a single bundle.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422436
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:57
Start Date: 14/Apr/20 23:57
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [x] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [ ] Add IT for FhirIO.Read
   - [x] implement scaffolding for test (currently always passes because 
initial PCollection is empty)
   - [ ] Needs convenience method to "read all resource IDs from a FHIR 
store to populate initial PCollection of resource IDs
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422436)
Time Spent: 27h 50m  (was: 27h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422432=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422432
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:56
Start Date: 14/Apr/20 23:56
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [x] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [ ] Add IT for FhirIO.Read
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422432)
Time Spent: 27h 20m  (was: 27h 10m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27h 20m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422435
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:56
Start Date: 14/Apr/20 23:56
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613738430
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422435)
Time Spent: 2h 20m  (was: 2h 10m)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422433=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422433
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:56
Start Date: 14/Apr/20 23:56
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [x] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [x] Add IT for FhirIO.Read
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422433)
Time Spent: 27.5h  (was: 27h 20m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27.5h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422434=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422434
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:56
Start Date: 14/Apr/20 23:56
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [x] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [ ] Add IT for FhirIO.Read
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422434)
Time Spent: 27h 40m  (was: 27.5h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27h 40m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422430=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422430
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:54
Start Date: 14/Apr/20 23:54
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [ ] Add IT for FhirIO.Read
   - [ ] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422430)
Time Spent: 27h  (was: 26h 50m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422431=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422431
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:54
Start Date: 14/Apr/20 23:54
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [ ] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions (improves maintainability)
   - [ ] Add IT for FhirIO.Read
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422431)
Time Spent: 27h 10m  (was: 27h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 27h 10m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422429=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422429
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:53
Start Date: 14/Apr/20 23:53
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613737579
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422429)
Time Spent: 2h 10m  (was: 2h)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9757) Flake in JavaPortabilityApi precommit

2020-04-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9757:
---
Status: Open  (was: Triage Needed)

> Flake in JavaPortabilityApi precommit
> -
>
> Key: BEAM-9757
> URL: https://issues.apache.org/jira/browse/BEAM-9757
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Boyuan Zhang
>Priority: Major
>
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitorTest.detectGCThrashing
> Error Message
> java.lang.AssertionError
> Stacktrace
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitorTest.detectGCThrashing(MemoryMonitorTest.java:93)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Standard Error
> Apr 13, 2020 11:38:54 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor run
> INFO: Memory is used/total/max = 189/1511/1820 MB, GC last/max = 0.00/0.00 %, 
> #pushbacks=0, gc thrashing=false
> Apr 13, 2020 11:38:56 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor dumpHeap
> WARNING: Heap dumped to 
> /tmp/junit4868537985750235077/junit1782833439507703448/heap_dump.hprof
> Apr 13, 2020 11:38:58 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor run
> INFO: Memory is used/total/max = 210/1511/1820 MB, GC last/max = 0.00/0.00 %, 
> #pushbacks=0, gc thrashing=false
> Apr 13, 2020 11:38:58 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor dumpHeap
> WARNING: Heap dumped to 
> /tmp/junit5110608932509920991/junit3762538188841792209/heap_dump.hprof
> Apr 13, 2020 11:38:59 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor run
> INFO: Memory is used/total/max = 231/1511/1820 MB, GC last/max = 0.00/0.00 %, 
> #pushbacks=0, gc thrashing=false
> Apr 13, 2020 11:38:59 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor dumpHeap
> WARNING: Heap dumped to 
> /tmp/junit419632844612311154/junit923609148290982010/heap_dump.hprof
> Apr 13, 2020 11:38:59 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor 
> tryUploadHeapDumpIfItExists
> INFO: Looking for heap dump at 
> /tmp/junit419632844612311154/junit923609148290982010/heap_dump.hprof
> Apr 13, 2020 11:38:59 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor 
> tryUploadHeapDumpIfItExists
> WARNING: Heap dump 
> /tmp/junit419632844612311154/junit923609148290982010/heap_dump.hprof 
> detected, attempting to upload to GCS
> Apr 13, 2020 11:39:00 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor 
> tryUploadHeapDumpIfItExists
> WARNING: Heap dump 
> /tmp/junit419632844612311154/junit923609148290982010/heap_dump.hprof uploaded 
> to 
> /tmp/junit419632844612311154/junit2761624807817345489/heap_dumpa8934b66-834c-4c26-8b05-47c995150ef8.hprof
> Apr 13, 2020 11:39:00 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor 
> tryUploadHeapDumpIfItExists
> INFO: Deleted local heap dump 
> /tmp/junit419632844612311154/junit923609148290982010/heap_dump.hprof
> Apr 13, 2020 11:39:00 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor run
> INFO: Memory is used/total/max = 247/1511/1820 MB, GC last/max = 
> 120.00/120.00 %, #pushbacks=0, gc thrashing=false
> Apr 13, 2020 11:39:00 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor waitForResources
> INFO: Waiting for resources for Test2. Memory is used/total/max = 
> 247/1511/1820 MB, GC last/max = 100.00/120.00 %, #pushbacks=1, gc 
> thrashing=true
> Apr 13, 2020 11:39:00 PM 
> org.apache.beam.runners.dataflow.worker.util.MemoryMonitor waitForResources
> INFO: Resources granted 

[jira] [Work logged] (BEAM-9756) beam_PostCommit_Java_Nexmark (non-Dataflow) failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9756?focusedWorklogId=422422=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422422
 ]

ASF GitHub Bot logged work on BEAM-9756:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:17
Start Date: 14/Apr/20 23:17
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11417: [BEAM-9756] Nexmark: 
only use --region in Dataflow.
URL: https://github.com/apache/beam/pull/11417#issuecomment-613727368
 
 
   Run Dataflow Runner Nexmark Tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422422)
Time Spent: 1h 10m  (was: 1h)

> beam_PostCommit_Java_Nexmark (non-Dataflow) failing
> ---
>
> Key: BEAM-9756
> URL: https://issues.apache.org/jira/browse/BEAM-9756
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing-nexmark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 12:02:24 Exception in thread "main" java.lang.IllegalArgumentException: Class 
> interface org.apache.beam.sdk.nexmark.NexmarkOptions missing a property named 
> 'region'.
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1625)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:115)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:298)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.runAll(Main.java:98)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.main(Main.java:415)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422418
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:10
Start Date: 14/Apr/20 23:10
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11414: [BEAM-5605, 
BEAM-2939] Add support for FnApiDoFnRunner to handle split calls.
URL: https://github.com/apache/beam/pull/11414#discussion_r408487984
 
 

 ##
 File path: 
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java
 ##
 @@ -761,72 +795,111 @@ public void processElementForElementAndRestriction(
   continue;
 }
 
-// Make sure to get the output watermark before we split to ensure 
that the lower bound
-// applies to both the primary and residual.
-KV watermarkAndState =
-currentWatermarkEstimator.getWatermarkAndState();
-SplitResult result = currentTracker.trySplit(0);
+// Attempt to checkpoint the current restriction.
+HandlesSplits.SplitResult splitResult =
 
 Review comment:
   The DoFn returned `ProcessContinuation#resume` and not 
`ProcessContinuation#stop`.
   
   The current restriction within the restriction tracker represents the entire 
restriction so splitting it provides the remainder and should update the 
restriction tracker so that the primary is in a done state so that `checkDone` 
down below can succeed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422418)
Time Spent: 18h 10m  (was: 18h)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 18h 10m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422417=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422417
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:01
Start Date: 14/Apr/20 23:01
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613722634
 
 
   cc: @Ardagan 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422417)
Time Spent: 9h 10m  (was: 9h)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9748) Add Reshuffle.ForSequentiallyGeneratedInput transform

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=422415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422415
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 14/Apr/20 23:00
Start Date: 14/Apr/20 23:00
Worklog Time Spent: 10m 
  Work Description: jkff commented on pull request #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#discussion_r408484775
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java
 ##
 @@ -65,6 +66,16 @@ private Reshuffle() {}
 return new ViaRandomKey<>();
   }
 
+  @Experimental
 
 Review comment:
   Annotation should probably be below the comment.
   
   Also suggest rephrasing a bit, to explain when one should use one or the 
other, and what are the consequences of choosing wrong. I'm not sure that 
"sequentially generated" is clear enough to a casual user. Maybe something like 
this:
   
   ```
   Materializes the input and prepares it to be consumed in a highly parallel 
fashion.
   
   This version is tailored to the case when input was produced in an extremely 
sequential way - typically by a ParDo that emits millions of outputs _per input 
element_, e.g., executing a large database query or a large simulation and 
emitting all of their results.
   
   Internally, this version first materializes the input at a moderate cost 
before reshuffling it internally using viaRandomKey(), making the reshuffling 
itself significantly cheaper in these extreme cases on some runners. Use this 
over viaRandomKey() only if your benchmarks show an improvement.
   ```
   
   And mention this at the class-level documentation of Reshuffle for 
visibility.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422415)
Time Spent: 1h 50m  (was: 1h 40m)

> Add Reshuffle.ForSequentiallyGeneratedInput transform
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Some DoFn based IOs like JdbcIO and RedisIO rely on a different approach to 
> Reparallelize outputs using a combination of a an empty PCollectionView to 
> force materialization and Reshuffle.viaRandomkey to reparallelize a 
> PCollection. This issue extracts this transform and expose it as part of the 
> Reshuffle to avoid repeating the code for transforms (notably IOs) that 
> produce lots of sequentially generated data where and benefit of this 
> alternative approach to perform better reparallelization of its output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9623) Add support for configuring TableProviders in Python SqlTransform

2020-04-14 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-9623:

Summary: Add support for configuring TableProviders in Python SqlTransform  
(was: Add support for TableProviders in Python SqlTransform)

> Add support for configuring TableProviders in Python SqlTransform
> -
>
> Key: BEAM-9623
> URL: https://issues.apache.org/jira/browse/BEAM-9623
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> It should be possible to use e.g. DataCatalogTableProvider and access 
> BigQuery, PubSub, and GCS in queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9699) Add ability to use ZetaSQL in Python SqlTransform

2020-04-14 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-9699:

Status: Open  (was: Triage Needed)

> Add ability to use ZetaSQL in Python SqlTransform
> -
>
> Key: BEAM-9699
> URL: https://issues.apache.org/jira/browse/BEAM-9699
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> This may just work when the [plannerName pipeline 
> option|https://github.com/apache/beam/blob/1e52e4298085eda8e88e1215c7a73d52658b31f1/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptions.java#L29]
>  is exposed to Python



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422414
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:57
Start Date: 14/Apr/20 22:57
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r408483552
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.java
 ##
 @@ -49,13 +50,22 @@ public OffsetRange currentRestriction() {
 
   @Override
   public SplitResult trySplit(double fractionOfRemainder) {
-// TODO(BEAM-8872): Add support for splitting off a fixed amount of work 
for this restriction
-// instead of only supporting checkpointing.
-
-checkState(
-lastClaimedOffset != null, "Can't checkpoint before any offset was 
successfully claimed");
-OffsetRange res = new OffsetRange(lastClaimedOffset + 1, range.getTo());
-this.range = new OffsetRange(range.getFrom(), lastClaimedOffset + 1);
+checkState(lastClaimedOffset != null, "Can't split before any offset was 
successfully claimed");
+// No more split should be performed if checkpoint has happened.
+if (checkpointed) {
+  return null;
+}
+Long splitPos =
+lastClaimedOffset
++ Math.max(1L, (long) ((range.getTo() - lastClaimedOffset) * 
fractionOfRemainder));
+if (splitPos >= range.getTo()) {
+  return null;
+}
+if (fractionOfRemainder == 0.0) {
+  checkpointed = true;
 
 Review comment:
   Just return early since we know there is no more split after checkpointing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422414)
Time Spent: 1h  (was: 50m)

> Add support for splitting at fractions > 0 to 
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only 
> supports checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422413=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422413
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:56
Start Date: 14/Apr/20 22:56
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r408483381
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.java
 ##
 @@ -49,13 +50,22 @@ public OffsetRange currentRestriction() {
 
   @Override
   public SplitResult trySplit(double fractionOfRemainder) {
-// TODO(BEAM-8872): Add support for splitting off a fixed amount of work 
for this restriction
-// instead of only supporting checkpointing.
-
-checkState(
-lastClaimedOffset != null, "Can't checkpoint before any offset was 
successfully claimed");
-OffsetRange res = new OffsetRange(lastClaimedOffset + 1, range.getTo());
-this.range = new OffsetRange(range.getFrom(), lastClaimedOffset + 1);
+checkState(lastClaimedOffset != null, "Can't split before any offset was 
successfully claimed");
 
 Review comment:
   Allowing split before first claiming makes sense to me. Python has already 
allowed that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422413)
Time Spent: 50m  (was: 40m)

> Add support for splitting at fractions > 0 to 
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only 
> supports checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422411
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:55
Start Date: 14/Apr/20 22:55
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408481881
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
 
 Review comment:
   Is the number of reads something that needs to be predictable to whatever 
code uses the reader? To me it seems like one of those implementation details 
that doesn't need to be unit tested because it's invisible to the caller.
   
   The reason this stood out to me is that the `numReads` in all the test cases 
below seem a bit inscrutable, and it would cause these tests to fail if the 
implementation of the `stateKeyReader` was changed slightly, which is just an 
annoyance unless it would actually break code.
   
   On the other hand, if the number of reads _is_ a detail that users should 
know, and correctness does rely on it staying consistent, that seems like it 
should be explicitly documented on `stateKeyReader.Read`.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422411)

> [Go SDK] Empty side inputs causing spurious zero elements
> -
>
> Key: BEAM-9746
> URL: https://issues.apache.org/jira/browse/BEAM-9746
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> A user discovered that empty side inputs would spuriously provide a single 
> zero element.
> The error was narrowed down to the Go SDK's state manager code  copying the 
> stateGetResponse data wasn't checking that the original data source even had 
> any bytes in it, leading it in particular to interpret length prefixed data 
> as having 0 length, which would cause zero value elements to be generated. 
> Notably, this caused empty strings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422410=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422410
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:55
Start Date: 14/Apr/20 22:55
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408477833
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i, buflen := range test.buflens {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422409=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422409
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:55
Start Date: 14/Apr/20 22:55
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408477556
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i, buflen := range test.buflens {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=422408=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422408
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:53
Start Date: 14/Apr/20 22:53
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11414: [BEAM-5605, 
BEAM-2939] Add support for FnApiDoFnRunner to handle split calls.
URL: https://github.com/apache/beam/pull/11414#discussion_r408482142
 
 

 ##
 File path: 
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java
 ##
 @@ -761,72 +795,111 @@ public void processElementForElementAndRestriction(
   continue;
 }
 
-// Make sure to get the output watermark before we split to ensure 
that the lower bound
-// applies to both the primary and residual.
-KV watermarkAndState =
-currentWatermarkEstimator.getWatermarkAndState();
-SplitResult result = currentTracker.trySplit(0);
+// Attempt to checkpoint the current restriction.
+HandlesSplits.SplitResult splitResult =
 
 Review comment:
   Why do we want to checkpoint after processElement finishing?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422408)
Time Spent: 18h  (was: 17h 50m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 18h
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422407
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:53
Start Date: 14/Apr/20 22:53
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613720045
 
 
   It looks like PreCommit failures are just flakes or persistent failures at 
this point.
   
   I think I've addressed all of the PR comments, I just need an answer on two 
things:
   - @robertwb: Is it acceptable to skip sql_test for the default runner for 
now, since it doesn't support  java_artifacts?
   - @ihji: Are the groovy changes ok given that they're outside of the `.each` 
block?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422407)
Time Spent: 9h  (was: 8h 50m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422403=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422403
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:51
Start Date: 14/Apr/20 22:51
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408481474
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +260,166 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   if test.noGet {
+   req := <-ch.requests
+   ch.responses[req.Id] <- 
{
+   Id: req.Id,
+   }
+   return
+   }
+   for i, buflen := range test.buflens {
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   token := []byte(fmt.Sprint(i))
+   if i+1 == len(test.buflens) {
+ 

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422398=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422398
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:48
Start Date: 14/Apr/20 22:48
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408478965
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422399=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422399
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:48
Start Date: 14/Apr/20 22:48
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408479609
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +260,166 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   if test.noGet {
+   req := <-ch.requests
+   ch.responses[req.Id] <- 
{
+   Id: req.Id,
+   }
+   return
+   }
+   for i, buflen := range test.buflens {
+   var buf []byte
+   if buflen >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
buflen)
+   }
+   token := []byte(fmt.Sprint(i))
+   if i+1 == len(test.buflens) {
+

[jira] [Updated] (BEAM-9758) [very low priority] Asterisks in nexmark should be escaped

2020-04-14 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9758:
--
Status: Open  (was: Triage Needed)

> [very low priority] Asterisks in nexmark should be escaped
> --
>
> Key: BEAM-9758
> URL: https://issues.apache.org/jira/browse/BEAM-9758
> Project: Beam
>  Issue Type: Bug
>  Components: testing-nexmark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Trivial
>
> Nexmark tests echo asterisks. These don't print as asterisks, but instead are 
> expanded by the shell to show the files in the current folder, e.g. "echo src 
> RUN NEXMARK IN BATCH MODE USING DATAFLOW RUNNER src". Which is not really 
> much of an issue but I thought it was kind of silly :)
> https://github.com/apache/beam/blob/f950b71bb37366cf23bab43eb46f77c761e2300b/.test-infra/jenkins/NexmarkBuilder.groovy#L77



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9758) [very low priority] Asterisks in nexmark should be escaped

2020-04-14 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9758:
-

 Summary: [very low priority] Asterisks in nexmark should be escaped
 Key: BEAM-9758
 URL: https://issues.apache.org/jira/browse/BEAM-9758
 Project: Beam
  Issue Type: Bug
  Components: testing-nexmark
Reporter: Kyle Weaver
Assignee: Kyle Weaver


Nexmark tests echo asterisks. These don't print as asterisks, but instead are 
expanded by the shell to show the files in the current folder, e.g. "echo src 
RUN NEXMARK IN BATCH MODE USING DATAFLOW RUNNER src". Which is not really much 
of an issue but I thought it was kind of silly :)

https://github.com/apache/beam/blob/f950b71bb37366cf23bab43eb46f77c761e2300b/.test-infra/jenkins/NexmarkBuilder.groovy#L77



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9756) beam_PostCommit_Java_Nexmark (non-Dataflow) failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9756?focusedWorklogId=422397=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422397
 ]

ASF GitHub Bot logged work on BEAM-9756:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:36
Start Date: 14/Apr/20 22:36
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11417: [BEAM-9756] Nexmark: 
only use --region in Dataflow.
URL: https://github.com/apache/beam/pull/11417#issuecomment-613714882
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422397)
Time Spent: 1h  (was: 50m)

> beam_PostCommit_Java_Nexmark (non-Dataflow) failing
> ---
>
> Key: BEAM-9756
> URL: https://issues.apache.org/jira/browse/BEAM-9756
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing-nexmark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> 12:02:24 Exception in thread "main" java.lang.IllegalArgumentException: Class 
> interface org.apache.beam.sdk.nexmark.NexmarkOptions missing a property named 
> 'region'.
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1625)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:115)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:298)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.runAll(Main.java:98)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.main(Main.java:415)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422396=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422396
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:32
Start Date: 14/Apr/20 22:32
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r408473046
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.java
 ##
 @@ -49,13 +50,22 @@ public OffsetRange currentRestriction() {
 
   @Override
   public SplitResult trySplit(double fractionOfRemainder) {
-// TODO(BEAM-8872): Add support for splitting off a fixed amount of work 
for this restriction
-// instead of only supporting checkpointing.
-
-checkState(
-lastClaimedOffset != null, "Can't checkpoint before any offset was 
successfully claimed");
-OffsetRange res = new OffsetRange(lastClaimedOffset + 1, range.getTo());
-this.range = new OffsetRange(range.getFrom(), lastClaimedOffset + 1);
+checkState(lastClaimedOffset != null, "Can't split before any offset was 
successfully claimed");
 
 Review comment:
   We can split before any successfully claimed block by returning `[from, to)` 
and updating the current range to be `[from, from)`
   
   This makes sense in some cases where we want to handoff all the work to 
someone else for the active element while this bundle finishes other processing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422396)
Time Spent: 40m  (was: 0.5h)

> Add support for splitting at fractions > 0 to 
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only 
> supports checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422394
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:32
Start Date: 14/Apr/20 22:32
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r408473046
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.java
 ##
 @@ -49,13 +50,22 @@ public OffsetRange currentRestriction() {
 
   @Override
   public SplitResult trySplit(double fractionOfRemainder) {
-// TODO(BEAM-8872): Add support for splitting off a fixed amount of work 
for this restriction
-// instead of only supporting checkpointing.
-
-checkState(
-lastClaimedOffset != null, "Can't checkpoint before any offset was 
successfully claimed");
-OffsetRange res = new OffsetRange(lastClaimedOffset + 1, range.getTo());
-this.range = new OffsetRange(range.getFrom(), lastClaimedOffset + 1);
+checkState(lastClaimedOffset != null, "Can't split before any offset was 
successfully claimed");
 
 Review comment:
   We can split before any successfully claimed block by returning `[from, to)` 
and updating the current range to be `[from, from)`
   
   This makes sense in some cases where we want to handoff all the work to 
someone else for the active element while this Bundle finishes other processing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422394)
Time Spent: 20m  (was: 10m)

> Add support for splitting at fractions > 0 to 
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only 
> supports checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422395=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422395
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:32
Start Date: 14/Apr/20 22:32
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418#discussion_r408473271
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/OffsetRangeTracker.java
 ##
 @@ -49,13 +50,22 @@ public OffsetRange currentRestriction() {
 
   @Override
   public SplitResult trySplit(double fractionOfRemainder) {
-// TODO(BEAM-8872): Add support for splitting off a fixed amount of work 
for this restriction
-// instead of only supporting checkpointing.
-
-checkState(
-lastClaimedOffset != null, "Can't checkpoint before any offset was 
successfully claimed");
-OffsetRange res = new OffsetRange(lastClaimedOffset + 1, range.getTo());
-this.range = new OffsetRange(range.getFrom(), lastClaimedOffset + 1);
+checkState(lastClaimedOffset != null, "Can't split before any offset was 
successfully claimed");
+// No more split should be performed if checkpoint has happened.
+if (checkpointed) {
+  return null;
+}
+Long splitPos =
+lastClaimedOffset
++ Math.max(1L, (long) ((range.getTo() - lastClaimedOffset) * 
fractionOfRemainder));
+if (splitPos >= range.getTo()) {
+  return null;
+}
+if (fractionOfRemainder == 0.0) {
+  checkpointed = true;
 
 Review comment:
   Why do we need `checkpointed`?
   
   Shouldn't the range restriction change so that `to` becomes `lastClaimed` 
(or `from` if nothing has been claimed)?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422395)
Time Spent: 0.5h  (was: 20m)

> Add support for splitting at fractions > 0 to 
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker
> --
>
> Key: BEAM-8872
> URL: https://issues.apache.org/jira/browse/BEAM-8872
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker only 
> supports checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=422393=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422393
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:30
Start Date: 14/Apr/20 22:30
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11416: 
[BEAM-9136] reduce third_party_dependencies size
URL: https://github.com/apache/beam/pull/11416
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422393)
Time Spent: 22h 10m  (was: 22h)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 22h 10m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=422392=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422392
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:20
Start Date: 14/Apr/20 22:20
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #10857: [BEAM-8889] 
Upgrade guava to 28.0-jre
URL: https://github.com/apache/beam/pull/10857#issuecomment-613709936
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422392)
Remaining Estimate: 138h  (was: 138h 10m)
Time Spent: 30h  (was: 29h 50m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 30h
>  Remaining Estimate: 138h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422388=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422388
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:17
Start Date: 14/Apr/20 22:17
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408468636
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422387=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422387
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:17
Start Date: 14/Apr/20 22:17
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408468620
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
 
 Review comment:
   Great suggestions. Done.
 

This is an automated message from the Apache Git 

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422386=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422386
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:16
Start Date: 14/Apr/20 22:16
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408468603
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+

[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=422383=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422383
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:16
Start Date: 14/Apr/20 22:16
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11420: [BEAM-9577] 
Fix test to create urls from paths which are compatible with Windows.
URL: https://github.com/apache/beam/pull/11420
 
 
   Tested manually on a Windows VM with Python 3.7
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-9577) Update artifact staging and retrieval protocols to be dependency aware.

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9577?focusedWorklogId=422385=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422385
 ]

ASF GitHub Bot logged work on BEAM-9577:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:16
Start Date: 14/Apr/20 22:16
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11420: [BEAM-9577] Fix 
test to create urls from paths which are compatible with Windows.
URL: https://github.com/apache/beam/pull/11420#issuecomment-613708529
 
 
   R: @tvalentyn @robertwb 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422385)
Time Spent: 14.5h  (was: 14h 20m)

> Update artifact staging and retrieval protocols to be dependency aware.
> ---
>
> Key: BEAM-9577
> URL: https://issues.apache.org/jira/browse/BEAM-9577
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422375
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:12
Start Date: 14/Apr/20 22:12
Worklog Time Spent: 10m 
  Work Description: jaketf commented on issue #11339: [BEAM-9468] [WIP] 
Fhir io
URL: https://github.com/apache/beam/pull/11339#issuecomment-611125298
 
 
   TODOs:
   - [x] ValueProvider support
   - [x] Add example usage to javadoc
   - [x] Unit test for FhirIO dead letter handling
   - [ ] Add IT for FhirIO.Read
   - [ ] Migrate ITs to parameterized tests to DRY up ITs against different 
FHIR versions
   - [ ] Benchmark / load test the FhirIO.Import
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422375)
Time Spent: 26h 50m  (was: 26h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9748) Add Reshuffle.ForSequentiallyGeneratedInput transform

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=422374=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422374
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:08
Start Date: 14/Apr/20 22:08
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#issuecomment-613706077
 
 
   Changes addressed and renamed to the suggested name by Eugene. PTAL again 
@lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422374)
Time Spent: 1h 40m  (was: 1.5h)

> Add Reshuffle.ForSequentiallyGeneratedInput transform
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Some DoFn based IOs like JdbcIO and RedisIO rely on a different approach to 
> Reparallelize outputs using a combination of a an empty PCollectionView to 
> force materialization and Reshuffle.viaRandomkey to reparallelize a 
> PCollection. This issue extracts this transform and expose it as part of the 
> Reshuffle to avoid repeating the code for transforms (notably IOs) that 
> produce lots of sequentially generated data where and benefit of this 
> alternative approach to perform better reparallelization of its output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9748) Add Reshuffle.ForSequentiallyGeneratedInput transform

2020-04-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9748:
---
Description: Some DoFn based IOs like JdbcIO and RedisIO rely on a 
different approach to Reparallelize outputs using a combination of a an empty 
PCollectionView to force materialization and Reshuffle.viaRandomkey to 
reparallelize a PCollection. This issue extracts this transform and expose it 
as part of the Reshuffle to avoid repeating the code for transforms (notably 
IOs) that produce lots of sequentially generated data where and benefit of this 
alternative approach to perform better reparallelization of its output.  (was: 
Some DoFn based IOs like JdbcIO and RedisIO rely on the Reparallelize transform,
a combination of a an empty PCollectionView and Reshuffle to force the
materialization and reparallelize a PCollection. The idea of this issue is to
extract this transform and expose it as part of the internal Reshuffle
transform to avoid repeating the code for transforms (notably IOs) that require
to reparallelize its output.)

> Add Reshuffle.ForSequentiallyGeneratedInput transform
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Some DoFn based IOs like JdbcIO and RedisIO rely on a different approach to 
> Reparallelize outputs using a combination of a an empty PCollectionView to 
> force materialization and Reshuffle.viaRandomkey to reparallelize a 
> PCollection. This issue extracts this transform and expose it as part of the 
> Reshuffle to avoid repeating the code for transforms (notably IOs) that 
> produce lots of sequentially generated data where and benefit of this 
> alternative approach to perform better reparallelization of its output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=422372=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422372
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:01
Start Date: 14/Apr/20 22:01
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11409: [BEAM-2939] Update 
unbounded source as SDF wrapper to resume successfully.
URL: https://github.com/apache/beam/pull/11409#issuecomment-613703212
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422372)
Time Spent: 27h 20m  (was: 27h 10m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 27h 20m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9748) Add Reshuffle.ForSequentiallyGeneratedInput transform

2020-04-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9748:
---
Summary: Add Reshuffle.ForSequentiallyGeneratedInput transform  (was: Add 
Reshuffle.forSequentiallyGeneratedInput() transform)

> Add Reshuffle.ForSequentiallyGeneratedInput transform
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Some DoFn based IOs like JdbcIO and RedisIO rely on the Reparallelize 
> transform,
> a combination of a an empty PCollectionView and Reshuffle to force the
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform to avoid repeating the code for transforms (notably IOs) that 
> require
> to reparallelize its output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9748) Add Reshuffle.forSequentiallyGeneratedInput() transform

2020-04-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9748:
---
Summary: Add Reshuffle.forSequentiallyGeneratedInput() transform  (was: 
Move Reparallelize transform to Reshuffle)

> Add Reshuffle.forSequentiallyGeneratedInput() transform
> ---
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Some DoFn based IOs like JdbcIO and RedisIO rely on the Reparallelize 
> transform,
> a combination of a an empty PCollectionView and Reshuffle to force the
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform to avoid repeating the code for transforms (notably IOs) that 
> require
> to reparallelize its output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=422371=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422371
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:00
Start Date: 14/Apr/20 22:00
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11335: [BEAM-9692]: 
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r408461679
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -110,22 +110,27 @@ class DataflowRunner(PipelineRunner):
 
   # Imported here to avoid circular dependencies.
   # TODO: Remove the apache_beam.pipeline dependency in 
CreatePTransformOverride
+  from apache_beam.runners.dataflow.ptransform_overrides import 
CombineValuesPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
CreatePTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
ReadPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
JrhReadPTransformOverride
 
-  _PTRANSFORM_OVERRIDES = []  # type: List[PTransformOverride]
+  # Thesse overrides should be applied before the proto representation of the
+  # graph is created.
+  _PTRANSFORM_OVERRIDES = [
+  CombineValuesPTransformOverride()
 
 Review comment:
   I suppose this is fine; it's just preserving an inconsistency in Dataflow 
vs. everything else. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422371)
Time Spent: 1.5h  (was: 1h 20m)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=422370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422370
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 14/Apr/20 22:00
Start Date: 14/Apr/20 22:00
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11335: [BEAM-9692]: 
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r408461337
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/ptransform_overrides.py
 ##
 @@ -111,3 +111,38 @@ def expand(self, pbegin):
 
 return JrhRead().with_output_types(
 ptransform.get_type_hints().simple_output_type('Read'))
+
+
+class CombineValuesPTransformOverride(PTransformOverride):
+  """A ``PTransformOverride`` for ``CombineValues``.
+
+  The DataflowRunner expects that the CombineValues PTransform acts as a
+  primitive. So this override replaces the CombineValues with a primitive.
+  """
+  def matches(self, applied_ptransform):
+# Imported here to avoid circular dependencies.
+# pylint: disable=wrong-import-order, wrong-import-position
+from apache_beam import CombineValues
+
+if isinstance(applied_ptransform.transform, CombineValues):
+  self.transform = applied_ptransform.transform
+  return True
+return False
+
+  def get_replacement_transform(self, ptransform):
+# Imported here to avoid circular dependencies.
+# pylint: disable=wrong-import-order, wrong-import-position
+from apache_beam import PTransform
+from apache_beam.pvalue import PCollection
+
+# The DataflowRunner still needs access to the CombineValues members to
 
 Review comment:
   I was thinking that run_xxx could also be called for composites. That might, 
however, be a bigger change, so we can go with this approach. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422370)
Time Spent: 1.5h  (was: 1h 20m)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422357=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422357
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:42
Start Date: 14/Apr/20 21:42
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408453789
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
+PTransforms that will generate infinite sequence of elements with some 
real-time
+period and SDF Read operation or similar to read data into side input
+periodically:
 
 1. Use the PeriodicImpulse transform to generate windowed periodic sequence.
 
-a. MAX_TIMESTAMP can be replaced with some closer boundary if you want to 
stop generating elements at some point.
-
 1. Read data using Read operation triggered by arrival of PCollection element.
 
 1. Apply side input.
 
-```python
+```py
 
 Review comment:
   Website sets config for the page to show only python or java snippets. It 
seem to pick the last language provided as default.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422357)
Time Spent: 5h 10m  (was: 5h)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2546) Create InfluxDbIO

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2546?focusedWorklogId=422355=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422355
 ]

ASF GitHub Bot logged work on BEAM-2546:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:40
Start Date: 14/Apr/20 21:40
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11028: BEAM-2546 Beam IO 
for InfluxDB
URL: https://github.com/apache/beam/pull/11028#issuecomment-613695575
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422355)
Time Spent: 16h 20m  (was: 16h 10m)

> Create InfluxDbIO
> -
>
> Key: BEAM-2546
> URL: https://issues.apache.org/jira/browse/BEAM-2546
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Bipin Upadhyaya
>Priority: Major
>  Time Spent: 16h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-14 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083619#comment-17083619
 ] 

Robert Bradshaw commented on BEAM-9738:
---

Yes, the cherry-pick has now been merged into the release branch.

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-14 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw resolved BEAM-9738.
---
Resolution: Fixed

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422354=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422354
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:38
Start Date: 14/Apr/20 21:38
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613694546
 
 
   Run CommunityMetrics PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422354)
Time Spent: 8h 50m  (was: 8h 40m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=422343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422343
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:33
Start Date: 14/Apr/20 21:33
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11416: [BEAM-9136] 
reduce third_party_dependencies size
URL: https://github.com/apache/beam/pull/11416#issuecomment-613692624
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422343)
Time Spent: 22h  (was: 21h 50m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422340=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422340
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:32
Start Date: 14/Apr/20 21:32
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408448451
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
+PTransforms that will generate infinite sequence of elements with some 
real-time
+period and SDF Read operation or similar to read data into side input
+periodically:
 
 1. Use the PeriodicImpulse transform to generate windowed periodic sequence.
 
-a. MAX_TIMESTAMP can be replaced with some closer boundary if you want to 
stop generating elements at some point.
-
 1. Read data using Read operation triggered by arrival of PCollection element.
 
 1. Apply side input.
 
-```python
+```py
 
 Review comment:
   Seems that adding this snippet makes java snippet at L45 disappear from 
served site despite me not changing anything in code above. Do you know if 
there's some trick to make both snippets appear?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422340)
Time Spent: 5h  (was: 4h 50m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422331
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:13
Start Date: 14/Apr/20 21:13
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613684342
 
 
   Run Python2_PVR_Flink PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422331)
Time Spent: 8h 40m  (was: 8.5h)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422330
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:12
Start Date: 14/Apr/20 21:12
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613684230
 
 
   Run CommunityMetrics PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422330)
Time Spent: 8.5h  (was: 8h 20m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8603) Add Python SqlTransform MVP

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8603?focusedWorklogId=422329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422329
 ]

ASF GitHub Bot logged work on BEAM-8603:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:12
Start Date: 14/Apr/20 21:12
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10055: [BEAM-8603] Add 
Python SqlTransform
URL: https://github.com/apache/beam/pull/10055#issuecomment-613684149
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422329)
Time Spent: 8h 20m  (was: 8h 10m)

> Add Python SqlTransform MVP
> ---
>
> Key: BEAM-8603
> URL: https://issues.apache.org/jira/browse/BEAM-8603
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422326
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408433143
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
+PTransforms that will generate infinite sequence of elements with some 
real-time
 
 Review comment:
   Present tense -> "PTransforms that generate"
   
   Missing article -> "generate an infinite sequence..."
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422326)
Time Spent: 4h 50m  (was: 4h 40m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422321
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408430834
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
 
 Review comment:
   What does the term "main input side" mean?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422321)
Time Spent: 4h 20m  (was: 4h 10m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=422320=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422320
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613681105
 
 
   @lukecwik ptal so @jaketf won't have to rebase if changes look fine
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422320)
Time Spent: 26h 40m  (was: 26.5h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26h 40m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422323=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422323
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408431325
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
 
 Review comment:
   It's probably best to replace "To do this" with "To do ABC" - it'll help the 
reader figure out what "this" refers to. Otherwise, if the reader is just 
skimming the page and starts at this paragraph, they have to read the previous 
paragraph to figure out what the pronoun refers to.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422323)
Time Spent: 4.5h  (was: 4h 20m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422325
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408433930
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
+PTransforms that will generate infinite sequence of elements with some 
real-time
+period and SDF Read operation or similar to read data into side input
 
 Review comment:
   I think we should split this sentence into two - it's hard to follow. A good 
place to start a new sentence is at "SDF Read operation"
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422325)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422327=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422327
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408434733
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
+PTransforms that will generate infinite sequence of elements with some 
real-time
+period and SDF Read operation or similar to read data into side input
+periodically:
 
 1. Use the PeriodicImpulse transform to generate windowed periodic sequence.
 
-a. MAX_TIMESTAMP can be replaced with some closer boundary if you want to 
stop generating elements at some point.
-
 1. Read data using Read operation triggered by arrival of PCollection element.
 
 Review comment:
   On a second look, I think it's redundant to say "Read data using a Read 
operation" - maybe we can say something like "Read data when a PCollection 
element arrives"
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422327)
Time Spent: 4h 50m  (was: 4h 40m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422322
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408430781
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
 
 Review comment:
   Missing "the" -> "on the main input"
   
   Present tense -> "is matched to a single..."
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422322)
Time Spent: 4h 20m  (was: 4h 10m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9756) beam_PostCommit_Java_Nexmark (non-Dataflow) failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9756?focusedWorklogId=422319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422319
 ]

ASF GitHub Bot logged work on BEAM-9756:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11417: [BEAM-9756] Nexmark: 
only use --region in Dataflow.
URL: https://github.com/apache/beam/pull/11417#issuecomment-613681012
 
 
   Run Dataflow Runner Nexmark Tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422319)
Time Spent: 50m  (was: 40m)

> beam_PostCommit_Java_Nexmark (non-Dataflow) failing
> ---
>
> Key: BEAM-9756
> URL: https://issues.apache.org/jira/browse/BEAM-9756
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing-nexmark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> 12:02:24 Exception in thread "main" java.lang.IllegalArgumentException: Class 
> interface org.apache.beam.sdk.nexmark.NexmarkOptions missing a property named 
> 'region'.
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1625)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:115)
> 12:02:24  at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:298)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.runAll(Main.java:98)
> 12:02:24  at org.apache.beam.sdk.nexmark.Main.main(Main.java:415)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422324=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422324
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:05
Start Date: 14/Apr/20 21:05
Worklog Time Spent: 10m 
  Work Description: soyrice commented on pull request #11415: [BEAM-9650] 
Cleanup documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#discussion_r408432853
 
 

 ##
 File path: website/src/documentation/patterns/side-inputs.md
 ##
 @@ -50,25 +50,25 @@ For instance, the following code sample uses a `Map` to 
create a `DoFn`. The `Ma
 
 ## Slowly updating side input using windowing
 
-You can read side input pcollection periodically into distinct windows.
-Later, when you apply side input to your main input, windows will be matched 
automatically 1:1.
-This way, you can guarantee side input consistency on the duration of the 
single window.
-
-To do this, you can utilize PeriodicSequence PTransform that will generate 
infinite sequence
-of elements with some real-time period:
+You can read side input data periodically into distinct PCollection windows.
+Later, when you apply the side input to your main input, each main input
+window is automatically matched to a single side input window.
+This guarantees side input consistency on the duration of the single window,
+meaning that each window on main input side will be matched to a single
+version of side input data.
+
+To do this, you can utilize a combination of PeriodicSequence/PeriodicImpulse
 
 Review comment:
   For formatting, it might be clearer to say "a combination of 
PeriodicSequence and PeriodicImpulse PTransforms" instead of 
"PeriodicSequence/PeriodicImpulse PTransforms" - just to make sure there's no 
ambiguity in regards to these being two separate transforms.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422324)
Time Spent: 4h 40m  (was: 4.5h)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9363) BeamSQL windowing as TVF

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9363?focusedWorklogId=422316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422316
 ]

ASF GitHub Bot logged work on BEAM-9363:


Author: ASF GitHub Bot
Created on: 14/Apr/20 21:01
Start Date: 14/Apr/20 21:01
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #10946: [BEAM-9363] TUMBLE 
as TVF
URL: https://github.com/apache/beam/pull/10946#issuecomment-613679461
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422316)
Time Spent: 1h 10m  (was: 1h)

> BeamSQL windowing as TVF
> 
>
> Key: BEAM-9363
> URL: https://issues.apache.org/jira/browse/BEAM-9363
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql, dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  This Jira tracks the implementation for 
> https://s.apache.org/streaming-beam-sql
> TVF is table-valued function, which is a SQL feature that produce a table as 
> function's output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9561) Run pandas tests with Beam Dataframe API

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9561?focusedWorklogId=422308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422308
 ]

ASF GitHub Bot logged work on BEAM-9561:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:50
Start Date: 14/Apr/20 20:50
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11419: [BEAM-9561] 
Add a framework for running pandas doctests with beam dataframes.
URL: https://github.com/apache/beam/pull/11419
 
 
   R: @TheNeuralBit 
   
   Depends on https://github.com/apache/beam/pull/11264
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422295
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:39
Start Date: 14/Apr/20 20:39
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613669573
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422295)
Time Spent: 2h  (was: 1h 50m)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=422291=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422291
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:27
Start Date: 14/Apr/20 20:27
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #11415: [BEAM-9650] Cleanup 
documentation on side inputs patterns
URL: https://github.com/apache/beam/pull/11415#issuecomment-613597087
 
 
   R: @soyrice  @rosetn
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422291)
Time Spent: 4h 10m  (was: 4h)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimentions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8872) Add support for splitting at fractions > 0 to org.apache.beam.sdk.transforms.splittabledofn.OffsetRangeTracker

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8872?focusedWorklogId=422290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422290
 ]

ASF GitHub Bot logged work on BEAM-8872:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:26
Start Date: 14/Apr/20 20:26
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11418: [BEAM-8872] 
Support split at fraction for OffsetRangeTracker
URL: https://github.com/apache/beam/pull/11418
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 

[jira] [Commented] (BEAM-6860) WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"

2020-04-14 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083583#comment-17083583
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-6860:
-

Please file a new Jira and assign to the author if you believe 
[https://github.com/apache/beam/pull/7170] broke this.

> WriteToText crash with "GlobalWindow -> ._IntervalWindowBase"
> -
>
> Key: BEAM-6860
> URL: https://issues.apache.org/jira/browse/BEAM-6860
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
> Environment: macOS, DirectRunner, python 2.7.15 via 
> pyenv/pyenv-virtualenv
>Reporter: Henrik
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>  Labels: newbie
> Fix For: 2.16.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Main error:
> > Cannot convert GlobalWindow to 
> > apache_beam.utils.windowed_value._IntervalWindowBase
> This is very hard for me to debug. Doing a DoPar call before, printing the 
> input, gives me just what I want; so the lines of data to serialise are 
> "alright"; just JSON strings, in fact.
> Stacktrace:
> {code:java}
> Traceback (most recent call last):
>   File "./okr_end_ride.py", line 254, in 
>     run()
>   File "./okr_end_ride.py", line 250, in run
>     run_pipeline(pipeline_options, known_args)
>   File "./okr_end_ride.py", line 198, in run_pipeline
>     | 'write_all' >> WriteToText(known_args.output, 
> file_name_suffix=".txt")
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py",
>  line 426, in __exit__
>     self.run().wait_until_finish()
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py",
>  line 406, in run
>     self._options).run(False)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/pipeline.py",
>  line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py",
>  line 132, in run_pipeline
>     return runner.run_pipeline(pipeline, options)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 275, in run_pipeline
>     default_environment=self._default_environment))
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 278, in run_via_runner_api
>     return self.run_stages(*self.create_stages(pipeline_proto))
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 354, in run_stages
>     stage_context.safe_coders)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 509, in run_stage
>     data_input, data_output)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 1206, in process_bundle
>     result_future = self._controller.control_handler.push(process_bundle)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py",
>  line 821, in push
>     response = self.worker.do_instruction(request)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 265, in do_instruction
>     request.instruction_id)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 281, in process_bundle
>     delayed_applications = bundle_processor.process_bundle(instruction_id)
>   File 
> "/Users/h/.pyenv/versions/2.7.15/envs/log-analytics/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 552, in process_bundle
>     op.finish()
>   File "apache_beam/runners/worker/operations.py", line 549, in 
> apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/worker/operations.py", line 550, in 
> apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/worker/operations.py", line 551, in 
> apache_beam.runners.worker.operations.DoOperation.finish
>   File "apache_beam/runners/common.py", line 758, in 
> apache_beam.runners.common.DoFnRunner.finish
>   File "apache_beam/runners/common.py", line 752, 

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422288=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422288
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:24
Start Date: 14/Apr/20 20:24
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408412639
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422287
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:24
Start Date: 14/Apr/20 20:24
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408403890
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
 
 Review comment:
   rearranging this so that all the initialization for this variable is 
together might make it easier to follow:
   
   ```go
   var buf 

[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422286=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422286
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:24
Start Date: 14/Apr/20 20:24
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408401030
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
+   token := []byte(strconv.Itoa(i))
+   var buf []byte
+   if test.buflens[i] >= 0 {
+   buf = bytes.Repeat([]byte{42}, 
test.buflens[i])
+   }
+   // On the last request response pair, 
send no token.
+   if i+1 == len(test.buflens) {
+   token = nil
+   }
+
+   req := <-ch.requests
+
+   if test.noGet {
+   

[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=422279=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422279
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:10
Start Date: 14/Apr/20 20:10
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11409: [BEAM-2939] Update 
unbounded source as SDF wrapper to resume successfully.
URL: https://github.com/apache/beam/pull/11409#issuecomment-613656085
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422279)
Time Spent: 27h 10m  (was: 27h)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 27h 10m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=422278=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422278
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:07
Start Date: 14/Apr/20 20:07
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11381: [BEAM-8889] add 
gRPC suport in GCS connector (behind an experimental-flag)
URL: https://github.com/apache/beam/pull/11381#issuecomment-613655133
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422278)
Remaining Estimate: 138h 10m  (was: 138h 20m)
Time Spent: 29h 50m  (was: 29h 40m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 29h 50m
>  Remaining Estimate: 138h 10m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422277=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422277
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:07
Start Date: 14/Apr/20 20:07
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11413: [BEAM-9746] 
check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408403428
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
 
 Review comment:
   Good catch!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422277)
Time Spent: 1h 10m  (was: 1h)

> [Go SDK] Empty side inputs causing spurious zero elements
> -
>
> Key: BEAM-9746
> URL: https://issues.apache.org/jira/browse/BEAM-9746
> Project: Beam
>  Issue Type: Improvement
> 

[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422270
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 20:02
Start Date: 14/Apr/20 20:02
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613652725
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422270)
Time Spent: 1h 50m  (was: 1h 40m)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9746) [Go SDK] Empty side inputs causing spurious zero elements

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9746?focusedWorklogId=422267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422267
 ]

ASF GitHub Bot logged work on BEAM-9746:


Author: ASF GitHub Bot
Created on: 14/Apr/20 19:57
Start Date: 14/Apr/20 19:57
Worklog Time Spent: 10m 
  Work Description: thetorpedodog commented on pull request #11413: 
[BEAM-9746] check for 0 length copies from state
URL: https://github.com/apache/beam/pull/11413#discussion_r408397803
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr_test.go
 ##
 @@ -258,6 +261,167 @@ func TestStateChannel(t *testing.T) {
}
 }
 
+// TestStateKeyReader validates ordinary Read cases
+func TestStateKeyReader(t *testing.T) {
+   const readLen = 4
+   tests := []struct {
+   name string
+   buflens  []int // sizes of the buffers received on the state 
channel.
+   numReads int
+   closed   bool // tries to read from closed reader
+   noGetbool // tries to read from nil get response reader
+   }{
+   {
+   name: "emptyData",
+   buflens:  []int{-1},
+   numReads: 1,
+   }, {
+   name: "singleBufferSingleRead",
+   buflens:  []int{readLen},
+   numReads: 2,
+   }, {
+   name: "singleBufferMultipleReads",
+   buflens:  []int{2 * readLen},
+   numReads: 3,
+   }, {
+   name: "singleBufferShortRead",
+   buflens:  []int{readLen - 1},
+   numReads: 2,
+   }, {
+   name: "multiBuffer",
+   buflens:  []int{readLen, readLen},
+   numReads: 3,
+   }, {
+   name: "multiBuffer-short-reads",
+   buflens:  []int{readLen - 1, readLen - 1, readLen - 2},
+   numReads: 4,
+   }, {
+   name: "emptyDataFirst", // Shouldn't happen, but 
not unreasonable to handle.
+   buflens:  []int{-1, readLen, readLen},
+   numReads: 4,
+   }, {
+   name: "emptyDataMid", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1, readLen},
+   numReads: 5,
+   }, {
+   name: "emptyDataLast", // Shouldn't happen, but not 
unreasonable to handle.
+   buflens:  []int{readLen, readLen, -1},
+   numReads: 3,
+   }, {
+   name: "emptyDataLast-short",
+   buflens:  []int{3*readLen - 2, -1},
+   numReads: 4,
+   }, {
+   name: "closed",
+   buflens:  []int{-1, -1},
+   numReads: 1,
+   closed:   true,
+   }, {
+   name: "noGet",
+   buflens:  []int{-1},
+   numReads: 1,
+   noGet:true,
+   },
+   }
+   for _, test := range tests {
+   t.Run(test.name, func(t *testing.T) {
+   ctx, cancelFn := 
context.WithCancel(context.Background())
+   ch := {
+   id:"test",
+   requests:  make(chan *fnpb.StateRequest),
+   responses: make(map[string]chan<- 
*fnpb.StateResponse),
+   cancelFn:  cancelFn,
+   DoneCh:ctx.Done(),
+   }
+
+   // Handle the channel behavior asynchronously.
+   go func() {
+   for i := 0; i < len(test.buflens); i++ {
 
 Review comment:
   i, buflen := range test.buflens ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422267)
Time Spent: 1h  (was: 50m)

> [Go SDK] Empty side inputs causing spurious zero elements
> -
>
> Key: BEAM-9746
> URL: https://issues.apache.org/jira/browse/BEAM-9746
> Project: Beam
>  Issue 

[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422256=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422256
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 19:29
Start Date: 14/Apr/20 19:29
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613637918
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422256)
Time Spent: 1h 40m  (was: 1.5h)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9737) beam_PostCommit_Website_Test failing

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9737?focusedWorklogId=422248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422248
 ]

ASF GitHub Bot logged work on BEAM-9737:


Author: ASF GitHub Bot
Created on: 14/Apr/20 19:22
Start Date: 14/Apr/20 19:22
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11386: [BEAM-9737] Fix website 
postcommit
URL: https://github.com/apache/beam/pull/11386#issuecomment-613634979
 
 
   Run Full Website Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 422248)
Time Spent: 1.5h  (was: 1h 20m)

> beam_PostCommit_Website_Test failing
> 
>
> Key: BEAM-9737
> URL: https://issues.apache.org/jira/browse/BEAM-9737
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, website
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Also failing: beam_PostCommit_Website_Publish (same failure)
> {code}
> > Task :website:buildLocalWebsite
> `/` is not writable.
> Bundler will use `/tmp/bundler/home/unknown' as your home directory 
> temporarily.
> Configuration file: /repo/website/_config.yml
> Configuration file: /repo/website/_config_test.yml
> Configuration file: /tmp/_config_branch_repo.yml
> Source: /repo/website/src
>Destination: generated-local-content
>  Incremental build: enabled
>   Generating... 
> jekyll 3.6.3 | Error:  Permission denied @ dir_s_mkdir - 
> /repo/build/website/generated-local-content/security
> {code}
> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PostCommit_Website_Test/3676/console
> Possible culprit: https://github.com/apache/beam/pull/11232/files



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


  1   2   >