[jira] [Work logged] (BEAM-9650) Add consistent slowly changing side inputs support

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9650?focusedWorklogId=421864&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421864
 ]

ASF GitHub Bot logged work on BEAM-9650:


Author: ASF GitHub Bot
Created on: 14/Apr/20 05:44
Start Date: 14/Apr/20 05:44
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #11182: [BEAM-9650] Add 
PeriodicImpulse Transform and slowly changing side input documentation
URL: https://github.com/apache/beam/pull/11182#issuecomment-613237708
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 421864)
Time Spent: 2h 20m  (was: 2h 10m)

> Add consistent slowly changing side inputs support
> --
>
> Key: BEAM-9650
> URL: https://issues.apache.org/jira/browse/BEAM-9650
> Project: Beam
>  Issue Type: Bug
>  Components: io-ideas
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Add implementation for slowly changing dimensions based on [design 
> doc](https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421815
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407842880
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf_test.go
 ##
 @@ -0,0 +1,408 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package exec
+
+import (
+   "context"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/window"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/typex"
+   "github.com/google/go-cmp/cmp"
+   "testing"
+)
+
+// testTimestamp is a constant used to check that timestamps are retained.
+const testTimestamp = 15
+
+// testWindows is a slice used to check that windows are retained.
+var testWindows = []typex.Window{window.IntervalWindow{Start: 10, End: 20}}
+
+// TestSdfNodes verifies that the various SDF nodes fulfill each of their
+// described contracts, that they each successfully invoke any SDF methods
+// needed, and that they preserve timestamps and windows correctly.
+func TestSdfNodes(t *testing.T) {
+   // Setup. The DoFns created below are defined in sdf_invokers_test.go 
and
+   // have testable behavior to confirm that they got correctly invoked.
+   // Without knowing the expected behavior of these DoFns, the desired 
outputs
+   // in the unit tests below will not make much sense.
+   dfn, err := graph.NewDoFn(&Sdf{}, graph.NumMainInputs(graph.MainSingle))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+   kvdfn, err := graph.NewDoFn(&KvSdf{}, graph.NumMainInputs(graph.MainKv))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+
+   // Validate PairWithRestriction matches its contract and properly 
invokes
+   // SDF method CreateInitialRestriction.
+   t.Run("PairWithRestriction", func(t *testing.T) {
+   tests := []struct {
+   name string
+   fn   *graph.DoFn
+   in   *FullValue
+   want *FullValue
+   }{
+   {
+   name: "SingleElem",
+   fn:   dfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+   Elm: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   Elm2:  Restriction{5},
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   },
+   {
+   name: "KvElem",
+   fn:   kvdfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  2,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+  

[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421820
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:22
Start Date: 14/Apr/20 03:22
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407843164
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf_test.go
 ##
 @@ -0,0 +1,408 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package exec
+
+import (
+   "context"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/window"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/typex"
+   "github.com/google/go-cmp/cmp"
+   "testing"
+)
+
+// testTimestamp is a constant used to check that timestamps are retained.
+const testTimestamp = 15
+
+// testWindows is a slice used to check that windows are retained.
+var testWindows = []typex.Window{window.IntervalWindow{Start: 10, End: 20}}
+
+// TestSdfNodes verifies that the various SDF nodes fulfill each of their
+// described contracts, that they each successfully invoke any SDF methods
+// needed, and that they preserve timestamps and windows correctly.
+func TestSdfNodes(t *testing.T) {
+   // Setup. The DoFns created below are defined in sdf_invokers_test.go 
and
+   // have testable behavior to confirm that they got correctly invoked.
+   // Without knowing the expected behavior of these DoFns, the desired 
outputs
+   // in the unit tests below will not make much sense.
+   dfn, err := graph.NewDoFn(&Sdf{}, graph.NumMainInputs(graph.MainSingle))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+   kvdfn, err := graph.NewDoFn(&KvSdf{}, graph.NumMainInputs(graph.MainKv))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+
+   // Validate PairWithRestriction matches its contract and properly 
invokes
+   // SDF method CreateInitialRestriction.
+   t.Run("PairWithRestriction", func(t *testing.T) {
+   tests := []struct {
+   name string
+   fn   *graph.DoFn
+   in   *FullValue
+   want *FullValue
+   }{
+   {
+   name: "SingleElem",
+   fn:   dfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+   Elm: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   Elm2:  Restriction{5},
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   },
+   {
+   name: "KvElem",
+   fn:   kvdfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  2,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+  

[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421814
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407829916
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf_test.go
 ##
 @@ -0,0 +1,408 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package exec
+
+import (
+   "context"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/window"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/typex"
+   "github.com/google/go-cmp/cmp"
+   "testing"
+)
+
+// testTimestamp is a constant used to check that timestamps are retained.
+const testTimestamp = 15
+
+// testWindows is a slice used to check that windows are retained.
+var testWindows = []typex.Window{window.IntervalWindow{Start: 10, End: 20}}
+
+// TestSdfNodes verifies that the various SDF nodes fulfill each of their
+// described contracts, that they each successfully invoke any SDF methods
+// needed, and that they preserve timestamps and windows correctly.
+func TestSdfNodes(t *testing.T) {
+   // Setup. The DoFns created below are defined in sdf_invokers_test.go 
and
+   // have testable behavior to confirm that they got correctly invoked.
+   // Without knowing the expected behavior of these DoFns, the desired 
outputs
+   // in the unit tests below will not make much sense.
+   dfn, err := graph.NewDoFn(&Sdf{}, graph.NumMainInputs(graph.MainSingle))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+   kvdfn, err := graph.NewDoFn(&KvSdf{}, graph.NumMainInputs(graph.MainKv))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+
+   // Validate PairWithRestriction matches its contract and properly 
invokes
+   // SDF method CreateInitialRestriction.
+   t.Run("PairWithRestriction", func(t *testing.T) {
+   tests := []struct {
+   name string
+   fn   *graph.DoFn
+   in   *FullValue
+   want *FullValue
+   }{
+   {
+   name: "SingleElem",
+   fn:   dfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+   Elm: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   Elm2:  Restriction{5},
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   },
+   {
+   name: "KvElem",
+   fn:   kvdfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  2,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+  

[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421816
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407827372
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/pardo.go
 ##
 @@ -120,11 +120,17 @@ func (n *ParDo) ProcessElement(ctx context.Context, elm *FullValue, values ...Re
    if n.status != Active {
        return errors.Errorf("invalid status for pardo %v: %v, want Active", n.UID, n.status)
    }
+
+   return n.ProcessMainInput(&MainInput{Key: *elm, Values: values})
+}
+
+func (n *ParDo) ProcessMainInput(mainIn *MainInput) error {
 
 Review comment:
   Done.
 



Issue Time Tracking
---

Worklog Id: (was: 421816)
Time Spent: 4h  (was: 3h 50m)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.





[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421818
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407828960
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf_invokers_test.go
 ##
 @@ -273,7 +287,9 @@ func (fn *KvSdf) CreateTracker(rest Restriction) *RTracker {
return &RTracker{rest, 2}
 }
 
-// ProcessElement is a no-op, it's only included to pass validation.
-func (fn *KvSdf) ProcessElement(*RTracker, int, int) int {
-   return 0
+// ProcessElement emits two ints. The first is the first input (key) +
+// RTracker.Rest.Val. The second is the second input (value) + RTracker.Val.
+func (fn *KvSdf) ProcessElement(rt *RTracker, i1 int, i2 int, emit func(int, int)) {
+   emit(i1+rt.Rest.Val, i2+rt.Val)
+   return
 
 Review comment:
   Done.
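
To make the new test behavior concrete, here is a minimal standalone sketch of what the revised KvSdf.ProcessElement computes. The Restriction and RTracker types below are stand-ins inferred from the hunk above (RTracker{rest, 2}, rt.Rest.Val, rt.Val); the real definitions in sdf_invokers_test.go may differ, and processKv is a hypothetical free function used only for illustration.

```go
package main

import "fmt"

// Restriction and RTracker are assumed shapes inferred from the diff;
// they are not copied from the Beam source.
type Restriction struct{ Val int }

type RTracker struct {
	Rest Restriction
	Val  int
}

// processKv mirrors the body of the updated KvSdf.ProcessElement:
// it emits (key + restriction value, value + tracker value).
func processKv(rt *RTracker, key, value int, emit func(int, int)) {
	emit(key+rt.Rest.Val, value+rt.Val)
}

func main() {
	// A tracker built the way CreateTracker does it in the diff: {rest, 2}.
	rt := &RTracker{Rest: Restriction{Val: 5}, Val: 2}
	processKv(rt, 5, 2, func(k, v int) {
		fmt.Println(k, v) // prints "10 4"
	})
}
```

Because the emitted pair depends on both the restriction and the tracker, a test can tell whether the SDF node passed the right tracker into ProcessElement, which a no-op method could not reveal.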
 



Issue Time Tracking
---

Worklog Id: (was: 421818)
Time Spent: 4h 10m  (was: 4h)

> Add SDF execution-time runners
> --
>
> Key: BEAM-9642
> URL: https://issues.apache.org/jira/browse/BEAM-9642
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Adds execution-time SDF runner units to the exec package, and any unit tests 
> + helpers required.
> This is needed to get the expanded SDF URNs to execute in the runner harness.





[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421817
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407842781
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf_test.go
 ##
 @@ -0,0 +1,408 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package exec
+
+import (
+   "context"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/window"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/typex"
+   "github.com/google/go-cmp/cmp"
+   "testing"
+)
+
+// testTimestamp is a constant used to check that timestamps are retained.
+const testTimestamp = 15
+
+// testWindows is a slice used to check that windows are retained.
+var testWindows = []typex.Window{window.IntervalWindow{Start: 10, End: 20}}
+
+// TestSdfNodes verifies that the various SDF nodes fulfill each of their
+// described contracts, that they each successfully invoke any SDF methods
+// needed, and that they preserve timestamps and windows correctly.
+func TestSdfNodes(t *testing.T) {
+   // Setup. The DoFns created below are defined in sdf_invokers_test.go 
and
+   // have testable behavior to confirm that they got correctly invoked.
+   // Without knowing the expected behavior of these DoFns, the desired 
outputs
+   // in the unit tests below will not make much sense.
+   dfn, err := graph.NewDoFn(&Sdf{}, graph.NumMainInputs(graph.MainSingle))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+   kvdfn, err := graph.NewDoFn(&KvSdf{}, graph.NumMainInputs(graph.MainKv))
+   if err != nil {
+   t.Fatalf("invalid function: %v", err)
+   }
+
+   // Validate PairWithRestriction matches its contract and properly 
invokes
+   // SDF method CreateInitialRestriction.
+   t.Run("PairWithRestriction", func(t *testing.T) {
+   tests := []struct {
+   name string
+   fn   *graph.DoFn
+   in   *FullValue
+   want *FullValue
+   }{
+   {
+   name: "SingleElem",
+   fn:   dfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+   Elm: &FullValue{
+   Elm:   5,
+   Elm2:  nil,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   Elm2:  Restriction{5},
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   },
+   {
+   name: "KvElem",
+   fn:   kvdfn,
+   in: &FullValue{
+   Elm:   5,
+   Elm2:  2,
+   Timestamp: testTimestamp,
+   Windows:   testWindows,
+   },
+   want: &FullValue{
+  

[jira] [Work logged] (BEAM-9642) Add SDF execution-time runners

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9642?focusedWorklogId=421813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421813
 ]

ASF GitHub Bot logged work on BEAM-9642:


Author: ASF GitHub Bot
Created on: 14/Apr/20 03:21
Start Date: 14/Apr/20 03:21
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11327: [BEAM-9642] 
Add SDF execution units.
URL: https://github.com/apache/beam/pull/11327#discussion_r407827873
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/exec/sdf.go
 ##
 @@ -0,0 +1,297 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package exec
+
+import (
+   "context"
+   "fmt"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph"
+   "path"
+
+   "github.com/apache/beam/sdks/go/pkg/beam/internal/errors"
+)
+
+// PairWithRestriction is an executor for the expanded SDF step of the same
+// name. This is the first step of an expanded SDF. It pairs each main input
+// element with a restriction via the SDF's associated sdf.RestrictionProvider.
+// This step is followed by SplitAndSizeRestrictions.
+type PairWithRestriction struct {
+   UID UnitID
+   Fn  *graph.DoFn
+   Out []Node
+
+   inv *cirInvoker
+}
+
+// ID returns the UnitID for this unit.
+func (n *PairWithRestriction) ID() UnitID {
+   return n.UID
+}
+
+// Up performs one-time setup for this executor.
+func (n *PairWithRestriction) Up(ctx context.Context) error {
+   fn := (*graph.SplittableDoFn)(n.Fn).CreateInitialRestrictionFn()
+   var err error
+   if n.inv, err = newCreateInitialRestrictionInvoker(fn); err != nil {
+       return errors.WithContextf(err, "PairWithRestriction transform with UID %v", n.ID())
+   }
+   return nil
+}
+
+// StartBundle forwards the bundle start to the downstream node.
+func (n *PairWithRestriction) StartBundle(ctx context.Context, id string, data DataContext) error {
+   return n.Out[0].StartBundle(ctx, id, data)
+}
+
+// ProcessElement expects elm to be the main input to the ParDo. See
+// exec.FullValue for more details on the expected input.
+//
+// ProcessElement creates an initial restriction representing the entire input.
+// The output is in the structure <elem, restriction>, where elem is the main
+// input originally passed in (i.e. the parameter elm). Windows and Timestamp
+// are copied to the outer *FullValue. They still remain within the original
+// element as well, but will no longer be used.
+//
+// Output Diagram:
+//
+//   *FullValue {
+//     Elm: *FullValue (original input)
+//     Elm2: Restriction
+//     Windows
+//     Timestamp
+//   }
+func (n *PairWithRestriction) ProcessElement(ctx context.Context, elm *FullValue, values ...ReStream) error {
+   rest := n.inv.Invoke(elm)
+   output := FullValue{Elm: elm, Elm2: rest, Timestamp: elm.Timestamp, Windows: elm.Windows}
+
+   return n.Out[0].ProcessElement(ctx, &output, values...)
+}
+
+// FinishBundle does some teardown for the end of the bundle.
+func (n *PairWithRestriction) FinishBundle(ctx context.Context) error {
+   n.inv.Reset()
+   return n.Out[0].FinishBundle(ctx)
+}
+
+// Down currently does nothing.
+func (n *PairWithRestriction) Down(ctx context.Context) error {
+   return nil
+}
+
+// String outputs a human-readable description of this transform.
+func (n *PairWithRestriction) String() string {
+   return fmt.Sprintf("SDF.PairWithRestriction[%v] Out:%v", path.Base(n.Fn.Name()), IDs(n.Out...))
+}
+
+// SplitAndSizeRestrictions is an executor for the expanded SDF step of the
+// same name. It is the second step of the expanded SDF, occurring after
+// PairWithRestriction. It performs initial splits on the initial 
restrictions
+// and adds sizing information, producing one or more output elements per input
+// element. This step is followed by ProcessSizedElementsAndRestrictions.
+type SplitAndSizeRestrictions struct {
+   UID UnitID
+   Fn  *graph.DoFn
+   Out []Node
 
 Review comment:
   This was me copying from the old implementation, which had all these nodes 
wrapping ParDos. No, these shouldn't be outputting to mo
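
For readers following the diff, the wrapping that PairWithRestriction.ProcessElement performs can be sketched in isolation. This is a simplified illustration, not the Beam implementation: FullValue and Restriction are reduced stand-ins (windows are plain strings here), and pairWithRestriction and its createRestriction callback are hypothetical names standing in for the node and its CreateInitialRestriction invoker.

```go
package main

import "fmt"

// FullValue is a reduced stand-in for exec.FullValue (assumed shape).
type FullValue struct {
	Elm, Elm2 interface{}
	Timestamp int64
	Windows   []string
}

// Restriction is a stand-in for the test restriction type.
type Restriction struct{ Val int }

// pairWithRestriction nests the original element under Elm, stores the
// initial restriction in Elm2, and copies the timestamp and windows to
// the outer value, as the doc comment in the diff describes.
func pairWithRestriction(elm *FullValue, createRestriction func(*FullValue) Restriction) *FullValue {
	return &FullValue{
		Elm:       elm,
		Elm2:      createRestriction(elm),
		Timestamp: elm.Timestamp,
		Windows:   elm.Windows,
	}
}

func main() {
	in := &FullValue{Elm: 5, Timestamp: 15, Windows: []string{"[10, 20)"}}
	out := pairWithRestriction(in, func(e *FullValue) Restriction {
		// Hypothetical restriction: reuse the element's int value.
		return Restriction{Val: e.Elm.(int)}
	})
	fmt.Println(out.Elm == in, out.Elm2, out.Timestamp, out.Windows)
}
```

The outer value carries its own copy of the timestamp and windows so that downstream nodes never need to reach into the nested element for them.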

[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421807
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 14/Apr/20 02:50
Start Date: 14/Apr/20 02:50
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020.04.1
URL: https://github.com/apache/beam/pull/11410
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 421807)
Time Spent: 1h 20m  (was: 1h 10m)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421804
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 14/Apr/20 02:42
Start Date: 14/Apr/20 02:42
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020.04.1
URL: https://github.com/apache/beam/pull/11410#issuecomment-613195548
 
 
   Run JavaBeamZetaSQL PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421804)
Time Spent: 1h  (was: 50m)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421805
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 14/Apr/20 02:42
Start Date: 14/Apr/20 02:42
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020.04.1
URL: https://github.com/apache/beam/pull/11410#issuecomment-613195581
 
 
   Run SQL PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421805)
Time Spent: 1h 10m  (was: 1h)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-9753) Use cmp in fullvalue_test.go

2020-04-13 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-9753:
-

 Summary: Use cmp in fullvalue_test.go
 Key: BEAM-9753
 URL: https://issues.apache.org/jira/browse/BEAM-9753
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


We could probably update the comparison helpers in 
[fullvalue_test.go|https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/fullvalue_test.go]
 to use cmp options and Transformers instead, which would make things much 
clearer.





[jira] [Updated] (BEAM-9753) Use cmp in fullvalue_test.go

2020-04-13 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-9753:
--
Status: Open  (was: Triage Needed)

> Use cmp in fullvalue_test.go
> 
>
> Key: BEAM-9753
> URL: https://issues.apache.org/jira/browse/BEAM-9753
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>
> We could probably update the comparison helpers in 
> [fullvalue_test.go|https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/fullvalue_test.go]
>  to use cmp options and Transformers instead, which would make things much 
> clearer.





[jira] [Work logged] (BEAM-9250) Improve beam release script based on 2.19.0 release experience

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9250?focusedWorklogId=421789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421789
 ]

ASF GitHub Bot logged work on BEAM-9250:


Author: ASF GitHub Bot
Created on: 14/Apr/20 01:46
Start Date: 14/Apr/20 01:46
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #10776: [BEAM-9250] 
Update building java_doc and py_doc
URL: https://github.com/apache/beam/pull/10776#issuecomment-613181124
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 



Issue Time Tracking
---

Worklog Id: (was: 421789)
Time Spent: 3h  (was: 2h 50m)

> Improve beam release script based on 2.19.0 release experience
> --
>
> Key: BEAM-9250
> URL: https://issues.apache.org/jira/browse/BEAM-9250
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9250) Improve beam release script based on 2.19.0 release experience

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9250?focusedWorklogId=421790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421790
 ]

ASF GitHub Bot logged work on BEAM-9250:


Author: ASF GitHub Bot
Created on: 14/Apr/20 01:46
Start Date: 14/Apr/20 01:46
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #10772: [BEAM-9250] 
Re-structure python release candidate target.
URL: https://github.com/apache/beam/pull/10772#issuecomment-613181131
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 



Issue Time Tracking
---

Worklog Id: (was: 421790)
Time Spent: 3h 10m  (was: 3h)

> Improve beam release script based on 2.19.0 release experience
> --
>
> Key: BEAM-9250
> URL: https://issues.apache.org/jira/browse/BEAM-9250
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=421777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421777
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 14/Apr/20 01:08
Start Date: 14/Apr/20 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11381: [BEAM-8889] add 
gRPC suport in GCS connector (behind an experimental-flag)
URL: https://github.com/apache/beam/pull/11381#issuecomment-613171185
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421777)
Remaining Estimate: 138h 20m  (was: 138.5h)
Time Spent: 29h 40m  (was: 29.5h)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 29h 40m
>  Remaining Estimate: 138h 20m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The 
> current implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data, rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability that eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channels, which is expected to 
> be more flexible because it can easily pick up new changes, e.g. a new 
> channel implementation using a new protocol, without code changes.





[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=421776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421776
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 14/Apr/20 01:07
Start Date: 14/Apr/20 01:07
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11381: [BEAM-8889] add 
gRPC suport in GCS connector (behind an experimental-flag)
URL: https://github.com/apache/beam/pull/11381#issuecomment-613171121
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421776)
Remaining Estimate: 138.5h  (was: 138h 40m)
Time Spent: 29.5h  (was: 29h 20m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 29.5h
>  Remaining Estimate: 138.5h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The 
> current implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data, rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability that eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channels, which is expected to 
> be more flexible because it can easily pick up new changes, e.g. a new 
> channel implementation using a new protocol, without code changes.





[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421775
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 14/Apr/20 01:07
Start Date: 14/Apr/20 01:07
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020.04.1
URL: https://github.com/apache/beam/pull/11410#issuecomment-613170968
 
 
   Run SQL PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421775)
Time Spent: 50m  (was: 40m)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9738?focusedWorklogId=421771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421771
 ]

ASF GitHub Bot logged work on BEAM-9738:


Author: ASF GitHub Bot
Created on: 14/Apr/20 00:57
Start Date: 14/Apr/20 00:57
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11390: [BEAM-9738] Update 
dataflow to setup correct docker environment options.
URL: https://github.com/apache/beam/pull/11390#issuecomment-613168347
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421771)
Time Spent: 2h  (was: 1h 50m)

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-9752) Too many shards in GCS

2020-04-13 Thread Ankur Goenka (Jira)
Ankur Goenka created BEAM-9752:
--

 Summary: Too many shards in GCS
 Key: BEAM-9752
 URL: https://issues.apache.org/jira/browse/BEAM-9752
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, sdk-py-harness
Reporter: Ankur Goenka


We have observed a case where the data was spread very thinly over an 
automatically computed number of shards.

Each shard then waited for its buffer to fill before sending data to GCS, and 
the upload timed out because no data was uploaded during the wait.

Setting an explicit number of shards (1000 in my case) solved this problem, 
possibly because every shard then had enough data to fill the write buffer, 
which avoided the timeout.

 

We can improve the sharding logic so that we don't create too many shards.

Alternatively, we can improve connection handling so that the connection does 
not time out.





[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421757
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 14/Apr/20 00:07
Start Date: 14/Apr/20 00:07
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020.04.1
URL: https://github.com/apache/beam/pull/11410#discussion_r407789393
 
 

 ##
 File path: sdks/java/extensions/sql/zetasql/build.gradle
 ##
 @@ -20,12 +20,18 @@ plugins {
   id 'org.apache.beam.module'
 }
 
+repositories {
 
 Review comment:
   Can't merge with this block.
 



Issue Time Tracking
---

Worklog Id: (was: 421757)
Time Spent: 40m  (was: 0.5h)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9743) TFRecordCodec not attempt to fully read/write

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9743?focusedWorklogId=421754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421754
 ]

ASF GitHub Bot logged work on BEAM-9743:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:54
Start Date: 13/Apr/20 23:54
Worklog Time Spent: 10m 
  Work Description: lukemin89 commented on issue #11397: [BEAM-9743] Fix 
TFRecordCodec to try harder to read/write
URL: https://github.com/apache/beam/pull/11397#issuecomment-613152729
 
 
   R: @lukecwik 
 



Issue Time Tracking
---

Worklog Id: (was: 421754)
Time Spent: 2h  (was: 1h 50m)

> TFRecordCodec not attempt to fully read/write
> -
>
> Key: BEAM-9743
> URL: https://issues.apache.org/jira/browse/BEAM-9743
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-tfrecord, sdk-java-core
>Reporter: Kyoungha Min
>Assignee: Kyoungha Min
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The same issue has been pointed out before and was marked resolved, but 
> parts of it still remain:
> https://issues.apache.org/jira/browse/BEAM-5412?jql=text%20~%20%22tfrecord%22
>  
> Issue # 1: TFRecordCodec only tries once to read the header/footer. This is 
> likely to fail around the end of the channel buffer.  
> Issue # 2: (minor) TFRecordCodec currently does not check how much it 
> writes. 
>  
> This seems to happen only with Zstd compression (or any other picky input 
> stream that refuses to read fully). ZstdInputStream seems very picky about 
> giving out data.
> The affected parts are
> [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L672]
> [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L699]
>  
> The following is less of a problem within Beam itself (as all, or most, of 
> the WritableByteChannels in beam-java-sdk-core are backed by some 
> OutputStream), but it still does not follow the WritableByteChannel 
> specification: 
> [https://github.com/apache/beam/blob/c7911043510a266078a3dc8faef7a1dbe1f598c5/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L720-L727]
>  
> The ReadableByteChannel/WritableByteChannel Javadoc specifies that they are 
> not required to read/write fully, and can refuse to read/write from time to 
> time.
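
The read-side problem described above can be illustrated with a minimal sketch. This is not the TFRecordCodec code or the fix from the pull request; the class name, the readFully helper, the one-byte-per-call "picky" channel and the 12-byte buffer size are all hypothetical, chosen only to show why a single read() call is not enough and how a loop addresses it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

class ChannelReadFullySketch {

  /**
   * Hypothetical helper: calls read() repeatedly until the buffer is full or
   * the channel reports end-of-stream. Returns the number of bytes read.
   * Note: a non-blocking channel that keeps returning 0 would make this spin.
   */
  static int readFully(ReadableByteChannel in, ByteBuffer dst) throws IOException {
    int total = 0;
    while (dst.hasRemaining()) {
      int n = in.read(dst);
      if (n < 0) {
        break; // end of stream before the buffer was filled
      }
      total += n;
    }
    return total;
  }

  /** A deliberately picky channel (illustration only): at most one byte per read(). */
  static ReadableByteChannel pickyChannel(byte[] data) {
    return new ReadableByteChannel() {
      private int pos = 0;
      private boolean open = true;

      @Override
      public int read(ByteBuffer dst) {
        if (pos >= data.length) {
          return -1;
        }
        if (!dst.hasRemaining()) {
          return 0;
        }
        dst.put(data[pos++]);
        return 1;
      }

      @Override
      public boolean isOpen() {
        return open;
      }

      @Override
      public void close() {
        open = false;
      }
    };
  }

  /** Returns {bytes from a single read(), bytes from readFully()} for a 12-byte input. */
  static int[] demo() {
    byte[] data = new byte[12]; // arbitrary illustrative size
    try {
      int single = pickyChannel(data).read(ByteBuffer.allocate(12));
      int looped = readFully(pickyChannel(data), ByteBuffer.allocate(12));
      return new int[] {single, looped};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] r = demo();
    System.out.println("single read: " + r[0] + " byte(s), readFully: " + r[1] + " byte(s)");
  }
}
```

The write side is analogous: a loop of the form `while (src.hasRemaining()) out.write(src);` keeps writing until the buffer is drained instead of assuming one write() call consumed everything.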





[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Andrew Pilloud (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082720#comment-17082720
 ] 

Andrew Pilloud edited comment on BEAM-9709 at 4/13/20, 11:49 PM:
-

That test appears to be asserting the default timezone is not UTC (it has the 
wrong "right" answer for Beam, but right answer for ZetaSQL): 
https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L370


was (Author: apilloud):
That test appears to be asserting the default timezone is not UTC (it has the 
wrong "right" answer): 
https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L370

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}
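
The 8-hour difference in the failure above is what one would see if the date were interpreted at midnight in a UTC-8 zone instead of UTC. The following sketch uses only java.time, not ZetaSQL; the class and method names are hypothetical, and America/Los_Angeles is an assumption (it is consistent with the -07 offset reported for an April date elsewhere in this thread, but the actual default zone is not stated in the issue).

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class TimestampZoneSketch {

  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ssx");

  /** Takes midnight of the given date in the given zone and renders that instant in UTC. */
  static String midnightAsUtc(LocalDate date, ZoneId zone) {
    ZonedDateTime local = date.atStartOfDay(zone);
    return local.withZoneSameInstant(ZoneOffset.UTC).format(FORMAT);
  }

  public static void main(String[] args) {
    LocalDate date = LocalDate.of(2014, 1, 31);
    // Default zone UTC: matches the "Expected" value in the issue.
    System.out.println(midnightAsUtc(date, ZoneId.of("UTC")));
    // prints 2014-01-31 00:00:00+00
    // Assumed default zone America/Los_Angeles (UTC-8 in January): matches "Actual".
    System.out.println(midnightAsUtc(date, ZoneId.of("America/Los_Angeles")));
    // prints 2014-01-31 08:00:00+00
  }
}
```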





[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082745#comment-17082745
 ] 

Yueyang Qiu edited comment on BEAM-9709 at 4/13/20, 11:47 PM:
--

OK. I agree this is a bug in ZetaSQL.

 

I run
{code:java}
String expr = "cast(timestamp('2015-04-01') as string)";
try (PreparedExpression exp = new PreparedExpression(expr)) {
  AnalyzerOptions options = new AnalyzerOptions();
  options.setDefaultTimezone("Asia/Shanghai");
  exp.prepare(options);
  Value value = exp.execute();
  System.out.println(value.getStringValue());
}
{code}
and the result is
{code:java}
2015-04-01 00:00:00-07
{code}
which means setDefaultTimezone() does something wrong while running the 
timestamp() constructor, but it works fine while running cast()

(see the second part of the test: 
[https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L375],
 the first part seems to be useless because the default time zone can be 
anything according to ZetaSQL spec; it does not have to be UTC)


was (Author: robinyqiu):
OK. I agree this is a bug in ZetaSQL.

 

I run
{code:java}
String expr = "cast(timestamp('2015-04-01') as string)";
try (PreparedExpression exp = new PreparedExpression(expr)) {
  AnalyzerOptions options = new AnalyzerOptions();
  options.setDefaultTimezone("Asia/Shanghai");
  exp.prepare(options);
  Value value = exp.execute();
  System.out.println(value.getStringValue());
}
{code}
and the result is
{code:java}
2015-04-01 00:00:00-07
{code}
which means setDefaultTimezone() does something wrong while running the 
timestamp() constructor, but it works fine while running cast()

(see the second part of the test: 
[https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L375])

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=421749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421749
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:47
Start Date: 13/Apr/20 23:47
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #11335: 
[BEAM-9692]: Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r407783064
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -110,22 +110,27 @@ class DataflowRunner(PipelineRunner):
 
   # Imported here to avoid circular dependencies.
   # TODO: Remove the apache_beam.pipeline dependency in 
CreatePTransformOverride
+  from apache_beam.runners.dataflow.ptransform_overrides import 
CombineValuesPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
CreatePTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
ReadPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
JrhReadPTransformOverride
 
-  _PTRANSFORM_OVERRIDES = []  # type: List[PTransformOverride]
+  # Thesse overrides should be applied before the proto representation of the
+  # graph is created.
+  _PTRANSFORM_OVERRIDES = [
+  CombineValuesPTransformOverride()
 
 Review comment:
   This override should place the pipeline object into the same state as if the 
runner had defined an apply_CombineValues. What am I missing? Looking at the 
code, is it because other overrides might also use a CombineValues transform, so 
it might need to be replaced again?
 



Issue Time Tracking
---

Worklog Id: (was: 421749)
Time Spent: 1h 10m  (was: 1h)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082745#comment-17082745
 ] 

Yueyang Qiu edited comment on BEAM-9709 at 4/13/20, 11:46 PM:
--

OK. I agree this is a bug in ZetaSQL.

 

I run
{code:java}
String expr = "cast(timestamp('2015-04-01') as string)";
try (PreparedExpression exp = new PreparedExpression(expr)) {
  AnalyzerOptions options = new AnalyzerOptions();
  options.setDefaultTimezone("Asia/Shanghai");
  exp.prepare(options);
  Value value = exp.execute();
  System.out.println(value.getStringValue());
}
{code}
and the result is
{code:java}
2015-04-01 00:00:00-07
{code}
which means setDefaultTimezone() does something wrong while running the 
timestamp() constructor, but it works fine while running cast()

(see the second part of the test: 
[https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L375])


was (Author: robinyqiu):
OK. This seems to be a bug in ZetaSQL.

 

I run
{code:java}
String expr = "cast(timestamp('2015-04-01') as string)";
try (PreparedExpression exp = new PreparedExpression(expr)) {
  AnalyzerOptions options = new AnalyzerOptions();
  options.setDefaultTimezone("Asia/Shanghai");
  exp.prepare(options);
  Value value = exp.execute();
  System.out.println(value.getStringValue());
}
{code}
and the result is
{code:java}
2015-04-01 00:00:00-07
{code}
which means setDefaultTimezone() does something wrong while running the 
timestamp() constructor, but it works fine while running cast()

(see the second part of the test: 
https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L375)

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=421746&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421746
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:44
Start Date: 13/Apr/20 23:44
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613150119
 
 
   this looks fine to me as long as the dependency changes look fine to 
@lukecwik 
 



Issue Time Tracking
---

Worklog Id: (was: 421746)
Time Spent: 26.5h  (was: 26h 20m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26.5h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=421745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421745
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:43
Start Date: 13/Apr/20 23:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11338: [BEAM-7923] 
Screendiff Integration Tests
URL: https://github.com/apache/beam/pull/11338#issuecomment-613149907
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 421745)
Time Spent: 19h 40m  (was: 19.5h)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  





[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082704#comment-17082704
 ] 

Yueyang Qiu edited comment on BEAM-9709 at 4/13/20, 11:43 PM:
--

(Previous comments are wrong. Writing the right version.)

Thank you Andrew for linking to the other bug. I took that as well. I think 
that one is easier to fix after this one is fixed. 


was (Author: robinyqiu):
(Previous comments are wrong. Writing the right version.)

Thank you Andrew for linking to the other bug. I took that as well. I think 
these 2 issues could be fixed together.

 

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082745#comment-17082745
 ] 

Yueyang Qiu commented on BEAM-9709:
---

OK. This seems to be a bug in ZetaSQL.

 

I run
{code:java}
String expr = "cast(timestamp('2015-04-01') as string)";
try (PreparedExpression exp = new PreparedExpression(expr)) {
  AnalyzerOptions options = new AnalyzerOptions();
  options.setDefaultTimezone("Asia/Shanghai");
  exp.prepare(options);
  Value value = exp.execute();
  System.out.println(value.getStringValue());
}
{code}
and the result is
{code:java}
2015-04-01 00:00:00-07
{code}
which means setDefaultTimezone() does something wrong while running the 
timestamp() constructor, but it works fine while running cast()

(see the second part of the test: 
https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L375)

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Work logged] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9738?focusedWorklogId=421742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421742
 ]

ASF GitHub Bot logged work on BEAM-9738:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:40
Start Date: 13/Apr/20 23:40
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11390: [BEAM-9738] Update 
dataflow to setup correct docker environment options.
URL: https://github.com/apache/beam/pull/11390#issuecomment-613149239
 
 
   This now contains the fixes from the cherry-pick. Letting tests run again. 
 



Issue Time Tracking
---

Worklog Id: (was: 421742)
Time Spent: 1h 50m  (was: 1h 40m)

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421739
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:32
Start Date: 13/Apr/20 23:32
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11406: [BEAM-9748] 
Move Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#discussion_r407778247
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java
 ##
 @@ -65,6 +66,11 @@ private Reshuffle() {}
 return new ViaRandomKey<>();
   }
 
+  @Experimental
+  public static <T> Reparallelize<T> reparallelize() {
 
 Review comment:
   Yes I will improve the javadoc, maybe move some of the details in the impl. 
comment below.
 



Issue Time Tracking
---

Worklog Id: (was: 421739)
Time Spent: 1h 20m  (was: 1h 10m)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421740
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:32
Start Date: 13/Apr/20 23:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#issuecomment-613147215
 
 
   The comments inside Reparallelize explain how this transform differs from 
Reshuffle.viaRandomKey(): it performs dramatically better on Dataflow in case 
the input PCollection is generated highly sequentially, as in the case of 
reading several GB of JDBC results. It almost certainly performs somewhat worse 
if the input PCollection is generated in a well-parallelized way, but I haven't 
measured that; I haven't measured the former case for non-Dataflow runners 
either.
   
   I think it's reasonable to move this to Reshuffle, but rename it to 
something clearer, maybe Reshuffle.forSequentiallyGeneratedInput()?
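
For readers unfamiliar with the idea behind Reshuffle.viaRandomKey(), the following is a toy, Beam-free model of random-key redistribution: each element of a sequentially generated input is paired with a random key and grouped by that key, so that downstream work can be split per key. It is only a sketch under stated assumptions. The class name, the `redistribute` method, and the `numKeys` and `seed` parameters are hypothetical, and it models neither the extra materialization step that Reparallelize performs nor any runner behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Toy model only: this is NOT the Beam API, just the random-key idea in plain Java.
class RandomKeyReshuffleSketch {

    /** Pairs every element with a random key in [0, numKeys) and groups elements by that key. */
    public static <T> Map<Integer, List<T>> redistribute(List<T> sequentialInput, int numKeys, long seed) {
        Random random = new Random(seed); // seeded so the sketch is reproducible
        Map<Integer, List<T>> buckets = new TreeMap<>();
        for (T element : sequentialInput) {
            int key = random.nextInt(numKeys);
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(element);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Stand-in for rows produced one after another by a single sequential reader.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        Map<Integer, List<Integer>> buckets = redistribute(rows, 4, 42L);
        buckets.forEach((key, values) ->
            System.out.println("key " + key + ": " + values.size() + " elements"));
    }
}
```

Every input element ends up in exactly one bucket, so the grouping changes only how the work can be divided, not the contents of the collection.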
 



Issue Time Tracking
---

Worklog Id: (was: 421740)
Time Spent: 1.5h  (was: 1h 20m)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421738
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:31
Start Date: 13/Apr/20 23:31
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11406: [BEAM-9748] 
Move Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#discussion_r407778080
 
 

 ##
 File path: 
sdks/java/io/redis/src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java
 ##
 @@ -309,7 +305,7 @@ public ReadAll withOutputParallelization(boolean 
outputParallelization) {
   .apply(ParDo.of(new ReadFn(connectionConfiguration(), 
batchSize(
   .setCoder(KvCoder.of(StringUtf8Coder.of(), 
StringUtf8Coder.of()));
   if (outputParallelization()) {
-output = output.apply(new Reparallelize());
+output = (PCollection<KV<String, String>>) 
output.apply(Reshuffle.reparallelize());
 
 Review comment:
   Not necessary, I will remove it.
 



Issue Time Tracking
---

Worklog Id: (was: 421738)
Time Spent: 1h 10m  (was: 1h)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421737
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:30
Start Date: 13/Apr/20 23:30
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#issuecomment-613144797
 
 
   > Why do we want to do this over Reshuffle.viaRandomKey(), which should get us 
the output parallelization we want?
   
   I think that's the case; maybe @jkff, who created that code, can confirm.
   There seem to be more details on the implementation choice in the 
original ticket [BEAM-2803](https://issues.apache.org/jira/browse/BEAM-2803).
 



Issue Time Tracking
---

Worklog Id: (was: 421737)
Time Spent: 1h  (was: 50m)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421734
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:24
Start Date: 13/Apr/20 23:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#issuecomment-613144797
 
 
   > Why do we want to do this over Reshuffle.viaRandomKey(), which should get us 
the output parallelization we want?
   
   I think that's the case; maybe @jkff, who created that code, can confirm.
 



Issue Time Tracking
---

Worklog Id: (was: 421734)
Time Spent: 50m  (was: 40m)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9748) Move Reparallelize transform to Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9748?focusedWorklogId=421733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421733
 ]

ASF GitHub Bot logged work on BEAM-9748:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:24
Start Date: 13/Apr/20 23:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11406: [BEAM-9748] Move 
Reparallelize transform to Reshuffle
URL: https://github.com/apache/beam/pull/11406#issuecomment-613144797
 
 
   > Why do we want to do this over Reshuffle.viaRandomKey(), which should get us 
the output parallelization we want?
   I think that's the case; maybe @jkff, who created that code, can confirm.
 



Issue Time Tracking
---

Worklog Id: (was: 421733)
Time Spent: 40m  (was: 0.5h)

> Move Reparallelize transform to Reshuffle
> -
>
> Key: BEAM-9748
> URL: https://issues.apache.org/jira/browse/BEAM-9748
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize
> transform, a combination of an empty PCollectionView and Reshuffle, to force
> materialization and reparallelize a PCollection. The idea of this issue is to
> extract this transform and expose it as part of the internal Reshuffle
> transform, to avoid repeating the code in transforms (notably IOs) that
> need to reparallelize their output.





[jira] [Work logged] (BEAM-9747) Deprecate RedisIO.readAll() and add RedisIO.readKeyPatterns() as a replacement

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9747?focusedWorklogId=421732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421732
 ]

ASF GitHub Bot logged work on BEAM-9747:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:22
Start Date: 13/Apr/20 23:22
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11405: [BEAM-9747] 
Deprecate RedisIO.readAll() and add RedisIO.readKeyPatterns as a replacement
URL: https://github.com/apache/beam/pull/11405#issuecomment-613144141
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421732)
Time Spent: 20m  (was: 10m)

> Deprecate RedisIO.readAll() and add RedisIO.readKeyPatterns() as a replacement
> --
>
> Key: BEAM-9747
> URL: https://issues.apache.org/jira/browse/BEAM-9747
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-redis
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current RedisIO.ReadAll transform does not follow the ReadAll pattern
> introduced by other IOs like HBaseIO, SolrIO and soon CassandraIO. We should
> rename the current ReadAll to ReadKeyPatterns to avoid confusion and to be
> able to provide a consistent ReadAll transform (BEAM-9403).





[jira] [Updated] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9751:
---
Status: Open  (was: Triage Needed)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-9750) Streaming Word Count Example Documents is out of date (Python)

2020-04-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9750:
---
Status: Open  (was: Triage Needed)

> Streaming Word Count Example Documents is out of date (Python)
> --
>
> Key: BEAM-9750
> URL: https://issues.apache.org/jira/browse/BEAM-9750
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, website
>Reporter: Ahmet Altay
>Assignee: Rose Nguyen
>Priority: Major
>
> The Flink runner is listed as "This runner is not yet available for the Python 
> SDK." This is not accurate: the Flink runner supports streaming with Python.
> Link: 
> https://beam.apache.org/get-started/wordcount-example/#streamingwordcount-example
> /cc [~ibzib]





[jira] [Work logged] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9738?focusedWorklogId=421731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421731
 ]

ASF GitHub Bot logged work on BEAM-9738:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:14
Start Date: 13/Apr/20 23:14
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11390: [BEAM-9738] Update 
dataflow to setup correct docker environment options.
URL: https://github.com/apache/beam/pull/11390#issuecomment-613141201
 
 
   @robertwb looks like some test failures are related, could you please take a 
look?
 



Issue Time Tracking
---

Worklog Id: (was: 421731)
Time Spent: 1h 40m  (was: 1.5h)

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9738) Python Dataflow runner omits capabilities.

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9738?focusedWorklogId=421730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421730
 ]

ASF GitHub Bot logged work on BEAM-9738:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:12
Start Date: 13/Apr/20 23:12
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11390: [BEAM-9738] Update 
dataflow to setup correct docker environment options.
URL: https://github.com/apache/beam/pull/11390#issuecomment-613141201
 
 
   @robertwb looks like the test failure is related, could you please take a 
look?
 



Issue Time Tracking
---

Worklog Id: (was: 421730)
Time Spent: 1.5h  (was: 1h 20m)

> Python Dataflow runner omits capabilities.
> --
>
> Key: BEAM-9738
> URL: https://issues.apache.org/jira/browse/BEAM-9738
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.

2020-04-13 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-9745:
-

Assignee: Kyle Weaver  (was: Daniel Oliveira)

Handing it to Kyle who is likely going to be rebuilding the worker in the next 
few days. Feel free to hand it back to me if that doesn't fix it.

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to 
> deserialize Custom DoFns and Custom Coders.
> -
>
> Key: BEAM-9745
> URL: https://issues.apache.org/jira/browse/BEAM-9745
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, java-fn-execution, sdk-java-harness, 
> test-failures
>Reporter: Daniel Oliveira
>Assignee: Kyle Weaver
>Priority: Blocker
>  Labels: currently-failing
> Fix For: 2.21.0
>
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/]
>  * [Gradle Build 
> Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project]
> Initial investigation:
> The bug appears to be popping up on BigQuery tests mostly, but also a 
> BigTable and a Datastore test.
> Here's an example stacktrace of the two errors, showing _only_ the error 
> messages themselves. Source: 
> [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe]
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -191: 
> java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With 
> Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -191: java.lang.IllegalArgumentException: unable to deserialize 
> Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> {noformat}
> Update: Looks like this has been failing as far back as [Apr 
> 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] 
> after a long period where the test was consistently timing out since [Mar 
> 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. 
> So it's hard to narrow down what commit may have caused this. Plus, the test 
> was failing due to a completely different BigQuery failure before anyway, so 
> it seems like this test will need to be completely fixed from scratch, 
> instead of tracking down a specific breaking change.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postc

[jira] [Resolved] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-9562.
---
Resolution: Fixed

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 24h 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082727#comment-17082727
 ] 

Kyle Weaver commented on BEAM-9562:
---

Thanks for the fix, Luke. We cherry-picked it, so I am marking this issue as 
resolved again.

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 24h 20m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-9725) Performance regression in reshuffle

2020-04-13 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver closed BEAM-9725.
-
  Assignee: Ankur Goenka
Resolution: Fixed

> Performance regression in reshuffle 
> ---
>
> Key: BEAM-9725
> URL: https://issues.apache.org/jira/browse/BEAM-9725
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core, sdk-py-harness
>Affects Versions: 2.20.0
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.21.0
>
>
> PR [https://github.com/apache/beam/pull/11066] is causing a performance 
> regression for the Reshuffle transform. 
>  
> cc: [~amaliujia] [~altay]
>  





[jira] [Work logged] (BEAM-8466) Python typehints: pep 484 warn and strict modes

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8466?focusedWorklogId=421728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421728
 ]

ASF GitHub Bot logged work on BEAM-8466:


Author: ASF GitHub Bot
Created on: 13/Apr/20 23:06
Start Date: 13/Apr/20 23:06
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #11240: [BEAM-8466] Make 
strip_iterable more strict
URL: https://github.com/apache/beam/pull/11240#issuecomment-613139531
 
 
   This is now ready to be internally tested
 



Issue Time Tracking
---

Worklog Id: (was: 421728)
Time Spent: 2h  (was: 1h 50m)

> Python typehints: pep 484 warn and strict modes
> ---
>
> Key: BEAM-8466
> URL: https://issues.apache.org/jira/browse/BEAM-8466
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Allow type checking to use PEP 484 type hints in two modes: one that only 
> warns when there are errors, and another that raises exceptions on errors.





[jira] [Resolved] (BEAM-9735) Performance regression in Python Batch pipeline in Reshuffle

2020-04-13 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-9735.
---
Resolution: Fixed

> Performance regression in Python Batch pipeline in Reshuffle
> 
>
> Key: BEAM-9735
> URL: https://issues.apache.org/jira/browse/BEAM-9735
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.

2020-04-13 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082722#comment-17082722
 ] 

Daniel Oliveira commented on BEAM-9745:
---

Did some asking around. Looks like this may be related to 
https://github.com/apache/beam/commit/0cd2fb6633a9d3d9183fc0532336501c3a56406c#diff-ecb570b49f9b4854404be5fbd74b0f22

And it will probably be fixed once the worker is rebuilt.

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to 
> deserialize Custom DoFns and Custom Coders.
> -
>
> Key: BEAM-9745
> URL: https://issues.apache.org/jira/browse/BEAM-9745
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, java-fn-execution, sdk-java-harness, 
> test-failures
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
>  Labels: currently-failing
> Fix For: 2.21.0
>
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/]
>  * [Gradle Build 
> Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project]
> Initial investigation:
> The bug appears to be popping up on BigQuery tests mostly, but also a 
> BigTable and a Datastore test.
> Here's an example stacktrace of the two errors, showing _only_ the error 
> messages themselves. Source: 
> [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe]
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -191: 
> java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With 
> Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -191: java.lang.IllegalArgumentException: unable to deserialize 
> Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> {noformat}
> Update: Looks like this has been failing as far back as [Apr 
> 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] 
> after a long period where the test was consistently timing out since [Mar 
> 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. 
> So it's hard to narrow down what commit may have caused this. Plus, the test 
> was failing due to a completely different BigQuery failure before anyway, so 
> it seems like this test will need to be completely fixed from scratch, 
> instead of tracking down a specific breaking change.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriat

[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Andrew Pilloud (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082720#comment-17082720
 ] 

Andrew Pilloud commented on BEAM-9709:
--

That test appears to be asserting the default timezone is not UTC (it has the 
wrong "right" answer): 
https://github.com/google/zetasql/blob/7d983d3632702f200c8340933160c02f1d94e5a7/javatests/com/google/zetasql/PreparedExpressionTest.java#L370

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Updated] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.

2020-04-13 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9745:
--
Fix Version/s: 2.21.0

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to 
> deserialize Custom DoFns and Custom Coders.
> -
>
> Key: BEAM-9745
> URL: https://issues.apache.org/jira/browse/BEAM-9745
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, java-fn-execution, sdk-java-harness, 
> test-failures
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
>  Labels: currently-failing
> Fix For: 2.21.0
>
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/]
>  * [Gradle Build 
> Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project]
> Initial investigation:
> The bug appears to be popping up on BigQuery tests mostly, but also a 
> BigTable and a Datastore test.
> Here's an example stacktrace of the two errors, showing _only_ the error 
> messages themselves. Source: 
> [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe]
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -191: 
> java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With 
> Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -191: java.lang.IllegalArgumentException: unable to deserialize 
> Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> {noformat}
> Update: Looks like this has been failing as far back as [Apr 
> 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] 
> after a long period where the test was consistently timing out since [Mar 
> 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. 
> So it's hard to narrow down what commit may have caused this. Plus, the test 
> was failing due to a completely different BigQuery failure before anyway, so 
> it seems like this test will need to be completely fixed from scratch, 
> instead of tracking down a specific breaking change.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._





[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=421713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421713
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:37
Start Date: 13/Apr/20 22:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11335: [BEAM-9692]: 
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r407758508
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
 ##
 @@ -566,6 +566,19 @@ def test_get_default_gcp_region_ignores_error(
 result = runner.get_default_gcp_region()
 self.assertIsNone(result)
 
+  def test_combine_values_translation(self):
+runner = DataflowRunner()
+
+with beam.Pipeline(runner=runner,
+   options=PipelineOptions(self.default_properties)) as p:
+  (  # pylint: disable=expression-not-assigned
+  p
+  | beam.Create([('a', [1, 2]), ('b', [3, 4])])
+  | beam.CombineValues(lambda v, _: sum(v)))
+
+job_dict = json.loads(str(runner.job))
+self.assertEqual(job_dict[u'steps'][1][u'kind'], u'CombineValues')
 
 Review comment:
   Asserting that it's the first step seems brittle, maybe just assert that 
there is some step that has this kind?
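
A position-independent version of the assertion suggested above could look like the sketch below. It assumes only that the job description is a dict with a `steps` list whose entries carry a `kind` key, as in the test under review; the helper name `has_step_of_kind` and the sample step kinds are made up for illustration.

```python
def has_step_of_kind(job_dict, kind):
    """Return True if any step in the job description has the given kind."""
    return any(step.get('kind') == kind for step in job_dict.get('steps', []))


# Hypothetical job description; only the 'steps'/'kind' layout mirrors the test.
sample_job = {
    'steps': [
        {'kind': 'ParallelRead'},
        {'kind': 'CombineValues'},
    ]
}

assert has_step_of_kind(sample_job, 'CombineValues')
assert not has_step_of_kind(sample_job, 'GroupByKey')
```

In the test itself this would presumably become something like `self.assertTrue(has_step_of_kind(job_dict, u'CombineValues'))`, so the check no longer depends on where the step lands in the list.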
 



Issue Time Tracking
---

Worklog Id: (was: 421713)
Time Spent: 50m  (was: 40m)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=421714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421714
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:37
Start Date: 13/Apr/20 22:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11335: [BEAM-9692]: 
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r407760055
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/ptransform_overrides.py
 ##
 @@ -111,3 +111,38 @@ def expand(self, pbegin):
 
 return JrhRead().with_output_types(
 ptransform.get_type_hints().simple_output_type('Read'))
+
+
+class CombineValuesPTransformOverride(PTransformOverride):
+  """A ``PTransformOverride`` for ``CombineValues``.
+
+  The DataflowRunner expects that the CombineValues PTransform acts as a
+  primitive. So this override replaces the CombineValues with a primitive.
+  """
+  def matches(self, applied_ptransform):
+# Imported here to avoid circular dependencies.
+# pylint: disable=wrong-import-order, wrong-import-position
+from apache_beam import CombineValues
+
+if isinstance(applied_ptransform.transform, CombineValues):
+  self.transform = applied_ptransform.transform
+  return True
+return False
+
+  def get_replacement_transform(self, ptransform):
+# Imported here to avoid circular dependencies.
+# pylint: disable=wrong-import-order, wrong-import-position
+from apache_beam import PTransform
+from apache_beam.pvalue import PCollection
+
+# The DataflowRunner still needs access to the CombineValues members to
 
 Review comment:
   It would be preferable to simply let it try and find methods for composites as 
well, rather than using PTransformOverrides. This would likely help with the 
GBK one too. 
 



Issue Time Tracking
---

Worklog Id: (was: 421714)
Time Spent: 1h  (was: 50m)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9692) Clean Python DataflowRunner to use portable pipelines

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9692?focusedWorklogId=421712&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421712
 ]

ASF GitHub Bot logged work on BEAM-9692:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:37
Start Date: 13/Apr/20 22:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11335: [BEAM-9692]: 
Make CombineValues portable
URL: https://github.com/apache/beam/pull/11335#discussion_r407757918
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -110,22 +110,27 @@ class DataflowRunner(PipelineRunner):
 
   # Imported here to avoid circular dependencies.
   # TODO: Remove the apache_beam.pipeline dependency in 
CreatePTransformOverride
+  from apache_beam.runners.dataflow.ptransform_overrides import 
CombineValuesPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
CreatePTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
ReadPTransformOverride
   from apache_beam.runners.dataflow.ptransform_overrides import 
JrhReadPTransformOverride
 
-  _PTRANSFORM_OVERRIDES = []  # type: List[PTransformOverride]
+  # These overrides should be applied before the proto representation of the
+  # graph is created.
+  _PTRANSFORM_OVERRIDES = [
+  CombineValuesPTransformOverride()
 
 Review comment:
   Seems this one should happen after too...
 



Issue Time Tracking
---

Worklog Id: (was: 421712)
Time Spent: 40m  (was: 0.5h)

> Clean Python DataflowRunner to use portable pipelines
> -
>
> Key: BEAM-9692
> URL: https://issues.apache.org/jira/browse/BEAM-9692
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082714#comment-17082714
 ] 

Yueyang Qiu commented on BEAM-9709:
---

It seems weird to me because this ZetaSQL unit test is working:

[https://github.com/google/zetasql/blob/master/javatests/com/google/zetasql/PreparedExpressionTest.java#L364]

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082704#comment-17082704
 ] 

Yueyang Qiu edited comment on BEAM-9709 at 4/13/20, 10:33 PM:
--

(Previous comments are wrong. Writing the right version.)

Thank you Andrew for linking to the other bug. I took that as well. I think 
these 2 issues could be fixed together.

 


was (Author: robinyqiu):
OK I dug into this a bit more and found the actual problem.

 

Currently the Beam ZetaSQL engine does not set a default time zone. So in 
BeamZetaSqlCalcRel, when the ZetaSQL evaluator evaluates an expression that 
depends on the default time zone, it chooses its own: "America/Los_Angeles" 
([https://github.com/google/zetasql/blob/master/zetasql/reference_impl/evaluation.cc#L122]).

 

The fix should be a one-line change in BeamZetaSqlCalcRel 
"options.setDefaultTimezone("UTC");" (i.e. Beam ZetaSQL defines its own default 
time zone to be UTC).

 

Thank you Andrew for linking to the other bug. I took that as well. I think 
these 2 issues could be fixed together.

 

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082713#comment-17082713
 ] 

Yueyang Qiu commented on BEAM-9709:
---

Yes you are right. I deleted the previous wrong comment.

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Andrew Pilloud (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082712#comment-17082712
 ] 

Andrew Pilloud commented on BEAM-9709:
--

BeamZetaSqlCalcRel uses the analyzer options from Beam ZetaSQL, which already 
sets the default: 
https://github.com/apache/beam/blob/473790ac4f98405ef20fe9186a5b75b1e0ad5657/sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/SqlAnalyzer.java#L105

That must not be getting plumbed through to evaluation.cc somewhere.

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Comment Edited] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Robert Burke (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082711#comment-17082711
 ] 

Robert Burke edited comment on BEAM-8472 at 4/13/20, 10:26 PM:
---

Just to be clear, the protocol is to check the environment variables, and then 
execute the gcloud command?

Which would be to use [os.Getenv|https://godoc.org/pkg/os#Getenv] with 
"CLOUDSDK_COMPUTE_REGION" and then use the [os/exec 
package|https://godoc.org/pkg/os/exec] to call the gcloud executable?
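
The lookup order described here (environment variable first, then the gcloud executable) can be sketched as follows. The sketch is in Python rather than Go, to match the other code in this thread; a Go version would use `os.Getenv` and `os/exec` as mentioned above. The function name `default_gcp_region`, its injectable parameters, and the exact command `gcloud config get-value compute/region` are assumptions; the issue does not say which gcloud invocation is intended.

```python
import os
import subprocess


def default_gcp_region(environ=None, run=subprocess.run):
    """Return the default region, or None if it cannot be determined.

    `environ` and `run` are injectable so the lookup can be exercised
    without touching the real environment or a real gcloud installation.
    """
    environ = os.environ if environ is None else environ

    # 1. The environment variable takes precedence.
    region = environ.get('CLOUDSDK_COMPUTE_REGION')
    if region:
        return region

    # 2. Otherwise ask the gcloud executable (assumed command, see above).
    try:
        result = run(
            ['gcloud', 'config', 'get-value', 'compute/region'],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True)
    except (OSError, subprocess.CalledProcessError):
        # gcloud is missing or exited with an error.
        return None
    return result.stdout.strip() or None
```

Returning `None` instead of raising leaves it to the caller to decide whether to fall back to a hard-coded region such as us-central1.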


was (Author: lostluck):
Just to be clear, the protocol is to check the environment variables, and then 
execute the gcloud command?

Which would be to use [os.Getenv|https://godoc.corp.google.com/pkg/os#Getenv] 
with "CLOUDSDK_COMPUTE_REGION" and then use the [os/exec 
package|https://godoc.corp.google.com/pkg/os/exec] to call the gcloud 
executable?

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.





[jira] [Commented] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Robert Burke (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082711#comment-17082711
 ] 

Robert Burke commented on BEAM-8472:


Just to be clear, the protocol is to check the environment variables, and then 
execute the gcloud command?

Which would be to use [os.Getenv|https://godoc.corp.google.com/pkg/os#Getenv] 
with "CLOUDSDK_COMPUTE_REGION" and then use the [os/exec 
package|https://godoc.corp.google.com/pkg/os/exec] to call the gcloud 
executable?

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421708
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:23
Start Date: 13/Apr/20 22:23
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020 04 1
URL: https://github.com/apache/beam/pull/11410#issuecomment-613126620
 
 
   Run SQL PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421708)
Time Spent: 0.5h  (was: 20m)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421707
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:23
Start Date: 13/Apr/20 22:23
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020 04 1
URL: https://github.com/apache/beam/pull/11410#discussion_r407754950
 
 

 ##
 File path: sdks/java/extensions/sql/zetasql/build.gradle
 ##
 @@ -20,12 +20,18 @@ plugins {
   id 'org.apache.beam.module'
 }
 
+repositories {
 
 Review comment:
   Will remove this when PR is ready to merge.
 



Issue Time Tracking
---

Worklog Id: (was: 421707)
Time Spent: 20m  (was: 10m)

> Upgrade ZetaSQL to 2020.04.1
> 
>
> Key: BEAM-9751
> URL: https://issues.apache.org/jira/browse/BEAM-9751
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9751?focusedWorklogId=421706&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421706
 ]

ASF GitHub Bot logged work on BEAM-9751:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:22
Start Date: 13/Apr/20 22:22
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #11410: [BEAM-9751] 
Upgrade ZetaSQL java 2020 04 1
URL: https://github.com/apache/beam/pull/11410
 
 
   
   
   
   

[jira] [Commented] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Robert Burke (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082707#comment-17082707
 ] 

Robert Burke commented on BEAM-8472:


Eventually. Dataflow doesn't currently support the Go SDK so this won't be 
prioritized above current work any time soon.

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.





[jira] [Assigned] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Robert Burke (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke reassigned BEAM-8472:
--

Assignee: (was: Kyle Weaver)

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.





[jira] [Created] (BEAM-9751) Upgrade ZetaSQL to 2020.04.1

2020-04-13 Thread Rui Wang (Jira)
Rui Wang created BEAM-9751:
--

 Summary: Upgrade ZetaSQL to 2020.04.1
 Key: BEAM-9751
 URL: https://issues.apache.org/jira/browse/BEAM-9751
 Project: Beam
  Issue Type: Task
  Components: dsl-sql-zetasql
Reporter: Rui Wang
Assignee: Rui Wang








[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082704#comment-17082704
 ] 

Yueyang Qiu commented on BEAM-9709:
---

OK I dug into this a bit more and found the actual problem.

 

Currently the Beam ZetaSQL engine does not set a default time zone. So in 
BeamZetaSqlCalcRel, when the ZetaSQL evaluator evaluates an expression that 
depends on the default time zone, it chooses its own: "America/Los_Angeles" 
([https://github.com/google/zetasql/blob/master/zetasql/reference_impl/evaluation.cc#L122]).

 

The fix should be a one-line change in BeamZetaSqlCalcRel 
"options.setDefaultTimezone("UTC");" (i.e. Beam ZetaSQL defines its own default 
time zone to be UTC).

 

Thank you Andrew for linking to the other bug. I took that as well. I think 
these 2 issues could be fixed together.
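
As a rough illustration of where the 8 hours come from (a minimal sketch in plain Python, not Beam or ZetaSQL code; the helper name `date_to_utc_timestamp` is made up for this example): converting a date to a timestamp means taking midnight in the default time zone, so midnight in "America/Los_Angeles" lands at 08:00 UTC in January, while a UTC default gives 00:00.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs the system tz database


def date_to_utc_timestamp(year, month, day, default_tz):
    """Interpret midnight of the given date in default_tz, then express it in UTC."""
    local_midnight = datetime(year, month, day, tzinfo=ZoneInfo(default_tz))
    return local_midnight.astimezone(timezone.utc)


# Engine default left at America/Los_Angeles (UTC-8 in January).
actual = date_to_utc_timestamp(2014, 1, 31, "America/Los_Angeles")
# Default pinned to UTC, as proposed above.
expected = date_to_utc_timestamp(2014, 1, 31, "UTC")

print(actual.isoformat(sep=" "))
print(expected.isoformat(sep=" "))
```

The two printed values differ by exactly the offset seen in the compliance test.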

 

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Work logged] (BEAM-9496) Add a Dataframe API for Python

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9496?focusedWorklogId=421690&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421690
 ]

ASF GitHub Bot logged work on BEAM-9496:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:07
Start Date: 13/Apr/20 22:07
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11264: [BEAM-9496] 
Add to_dataframe and to_pcollection APIs.
URL: https://github.com/apache/beam/pull/11264#discussion_r407749163
 
 

 ##
 File path: sdks/python/apache_beam/dataframe/convert.py
 ##
 @@ -0,0 +1,71 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+
+import inspect
+
+from apache_beam import pvalue
+from apache_beam.dataframe import expressions
+from apache_beam.dataframe import frame_base
+from apache_beam.dataframe import transforms
+
+
+def to_dataframe(pc):
+  pass
+
+
+# TODO: Or should this be called as_dataframe?
 
 Review comment:
   So far we haven't added any methods to PCollection, but I'm open to the idea 
(though it'd be a big change to the API that should be done holistically, see 
https://lists.apache.org/thread.html/fcb422d61437a634662b24100d4e2d46a940ee766848b699023081d9%40%3Cdev.beam.apache.org%3E
 ). For now, at least, it seems a bit much to add dataframe methods to 
PCollection itself. 
   
   Short of that, can you think of any fluent styles for
   
   ```
   from apache_beam import dataframe as ???
   ...
   pcol = p | "Read from Source" >> beam.io.SomeSchemaSource(foo)
   df = ???
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 421690)
Time Spent: 1h 20m  (was: 1h 10m)

> Add a Dataframe API for Python
> --
>
> Key: BEAM-9496
> URL: https://issues.apache.org/jira/browse/BEAM-9496
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is an umbrella bug for the dataframes work.





[jira] [Work logged] (BEAM-9744) Python performance tests failing

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9744?focusedWorklogId=421687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421687
 ]

ASF GitHub Bot logged work on BEAM-9744:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:05
Start Date: 13/Apr/20 22:05
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11408: [BEAM-9744] 
Remove --region option from SQL tests.
URL: https://github.com/apache/beam/pull/11408
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 421687)
Time Spent: 1h 40m  (was: 1.5h)

> Python performance tests failing
> 
>
> Key: BEAM-9744
> URL: https://issues.apache.org/jira/browse/BEAM-9744
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> beam_PerformanceTests_WordCountIT_Py* failing because --region is missing.





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=421686&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421686
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:04
Start Date: 13/Apr/20 22:04
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613120178
 
 
   Run Java PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421686)
Time Spent: 26h 20m  (was: 26h 10m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=421685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421685
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:04
Start Date: 13/Apr/20 22:04
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613120004
 
 
   Run Java PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 421685)
Time Spent: 26h 10m  (was: 26h)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26h 10m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Assigned] (BEAM-9712) setting default timezone doesn't work

2020-04-13 Thread Yueyang Qiu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yueyang Qiu reassigned BEAM-9712:
-

Assignee: Yueyang Qiu

> setting default timezone doesn't work
> -
>
> Key: BEAM-9712
> URL: https://issues.apache.org/jira/browse/BEAM-9712
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> several failures in shard 14
> (note: fixing the internal tests requires plumbing through the timezone 
> config.)
> {code}
> [name=timestamp_to_string_1]
> select [cast(timestamp "2015-01-28" as string),
> cast(timestamp "2015-01-28 00:00:00" as string),
> cast(timestamp "2015-01-28 00:00:00.0" as string),
> cast(timestamp "2015-01-28 00:00:00.00" as string),
> cast(timestamp "2015-01-28 00:00:00.000" as string),
> cast(timestamp "2015-01-28 00:00:00." as string),
> cast(timestamp "2015-01-28 00:00:00.0" as string),
> cast(timestamp "2015-01-28 00:00:00.00" as string)]
> --
> ARRAY>>[
>   {ARRAY[
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45"
>]}
> ]
> {code}
> {code}
> [default_time_zone=Pacific/Chatham]
> [name=timestamp_to_string_1]
> select [cast(timestamp "2015-01-28" as string),
> cast(timestamp "2015-01-28 00:00:00" as string),
> cast(timestamp "2015-01-28 00:00:00.0" as string),
> cast(timestamp "2015-01-28 00:00:00.00" as string),
> cast(timestamp "2015-01-28 00:00:00.000" as string),
> cast(timestamp "2015-01-28 00:00:00." as string),
> cast(timestamp "2015-01-28 00:00:00.0" as string),
> cast(timestamp "2015-01-28 00:00:00.00" as string)]
> --
> ARRAY>>[
>   {ARRAY[
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45",
>  "2015-01-28 00:00:00+13:45"
>]}
> ]
> {code}
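
For reference, the expected strings in the Pacific/Chatham case can be reproduced with a short Python sketch (illustrative only; `timestamp_literal_to_string` is a hypothetical helper, not part of the test harness or of Beam). It assumes the literal is interpreted as local time in the default time zone and printed with that zone's offset, which is +13:45 for Chatham in January.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+, needs the system tz database


def timestamp_literal_to_string(literal, default_tz):
    """Read a zone-less timestamp literal in default_tz and render it with its offset."""
    local = datetime.fromisoformat(literal).replace(tzinfo=ZoneInfo(default_tz))
    return local.isoformat(sep=" ")


rendered = timestamp_literal_to_string("2015-01-28", "Pacific/Chatham")
print(rendered)
```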





[jira] [Work logged] (BEAM-9735) Performance regression in Python Batch pipeline in Reshuffle

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9735?focusedWorklogId=421683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421683
 ]

ASF GitHub Bot logged work on BEAM-9735:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:03
Start Date: 13/Apr/20 22:03
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11395: [Cherrypick 
11365] [BEAM-9735] Adding Always trigger and using it in Reshuffle
URL: https://github.com/apache/beam/pull/11395
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 421683)
Time Spent: 1.5h  (was: 1h 20m)

> Performance regression in Python Batch pipeline in Reshuffle
> 
>
> Key: BEAM-9735
> URL: https://issues.apache.org/jira/browse/BEAM-9735
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=421682&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421682
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:02
Start Date: 13/Apr/20 22:02
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11407: [BEAM-9562] 
Cherry-pick: Fix output timestamp to be inferred from scheduled time w…
URL: https://github.com/apache/beam/pull/11407
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 421682)
Time Spent: 24h 20m  (was: 24h 10m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 24h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9744) Python performance tests failing

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9744?focusedWorklogId=421680&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421680
 ]

ASF GitHub Bot logged work on BEAM-9744:


Author: ASF GitHub Bot
Created on: 13/Apr/20 22:00
Start Date: 13/Apr/20 22:00
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11408: [BEAM-9744] 
Remove --region option from SQL tests.
URL: https://github.com/apache/beam/pull/11408#discussion_r407746387
 
 

 ##
 File path: sdks/java/extensions/sql/build.gradle
 ##
 @@ -149,15 +149,13 @@ task runPojoExample(type: JavaExec) {
 task integrationTest(type: Test) {
   group = "Verification"
   def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
-  def gcpRegion = project.findProperty('gcpRegion') ?: 'us-central1'
   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
'gs://temp-storage-for-end-to-end-tests/'
 
   // Disable Gradle cache (it should not be used because the IT's won't run).
   outputs.upToDateWhen { false }
 
   def pipelineOptions = [
   "--project=${gcpProject}",
 
 Review comment:
   These tests use other GCP resources like BigQuery and Pub/Sub.
 



Issue Time Tracking
---

Worklog Id: (was: 421680)
Time Spent: 1.5h  (was: 1h 20m)

> Python performance tests failing
> 
>
> Key: BEAM-9744
> URL: https://issues.apache.org/jira/browse/BEAM-9744
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> beam_PerformanceTests_WordCountIT_Py* failing because --region is missing.





[jira] [Work logged] (BEAM-9744) Python performance tests failing

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9744?focusedWorklogId=421679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421679
 ]

ASF GitHub Bot logged work on BEAM-9744:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:59
Start Date: 13/Apr/20 21:59
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11408: [BEAM-9744] 
Remove --region option from SQL tests.
URL: https://github.com/apache/beam/pull/11408#discussion_r407745861
 
 

 ##
 File path: sdks/java/extensions/sql/build.gradle
 ##
 @@ -149,15 +149,13 @@ task runPojoExample(type: JavaExec) {
 task integrationTest(type: Test) {
   group = "Verification"
   def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
-  def gcpRegion = project.findProperty('gcpRegion') ?: 'us-central1'
   def gcsTempRoot = project.findProperty('gcsTempRoot') ?: 
'gs://temp-storage-for-end-to-end-tests/'
 
   // Disable Gradle cache (it should not be used because the IT's won't run).
   outputs.upToDateWhen { false }
 
   def pipelineOptions = [
   "--project=${gcpProject}",
 
 Review comment:
   If they don't run on Dataflow, what are the project and other GCP flags used 
for?
 



Issue Time Tracking
---

Worklog Id: (was: 421679)
Time Spent: 1h 20m  (was: 1h 10m)

> Python performance tests failing
> 
>
> Key: BEAM-9744
> URL: https://issues.apache.org/jira/browse/BEAM-9744
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> beam_PerformanceTests_WordCountIT_Py* failing because --region is missing.





[jira] [Work logged] (BEAM-9496) Add a Dataframe API for Python

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9496?focusedWorklogId=421673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421673
 ]

ASF GitHub Bot logged work on BEAM-9496:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:57
Start Date: 13/Apr/20 21:57
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11264: [BEAM-9496] 
Add to_dataframe and to_pcollection APIs.
URL: https://github.com/apache/beam/pull/11264#discussion_r407745027
 
 

 ##
 File path: sdks/python/apache_beam/dataframe/transforms.py
 ##
 @@ -0,0 +1,255 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+
+import pandas as pd
+
+import apache_beam as beam
+from apache_beam import transforms
+from apache_beam.dataframe import expressions
+from apache_beam.dataframe import frame_base
+from apache_beam.dataframe import frames  # pylint: disable=unused-import
+
+
+class DataframeTransform(transforms.PTransform):
+  """A PTransform for applying function that takes and returns dataframes
+  to one or more PCollections.
+
+  For example, if pcoll is a PCollection of dataframes, one could write::
+
+      pcoll | DataframeTransform(lambda df: df.group_by('key').sum(), proxy=...)
+
+  To pass multiple PCollections, pass a tuple of PCollections which will be
+  passed to the callable as positional arguments, or a dictionary of
+  PCollections, in which case they will be passed as keyword arguments.
+  """
+  def __init__(self, func, proxy):
+    self._func = func
+    self._proxy = proxy
+
+  def expand(self, input_pcolls):
+    def wrap_as_dict(values):
+      if isinstance(values, dict):
+        return values
+      elif isinstance(values, tuple):
+        return dict(enumerate(values))
+      else:
+        return {None: values}
+
+    # TODO: Infer the proxy from the input schema.
+    def proxy(key):
+      if key is None:
+        return self._proxy
+      else:
+        return self._proxy[key]
+
+    # The input can be a dictionary, tuple, or plain PCollection.
+    # Wrap as a dict for homogeneity.
+    # TODO: Possibly inject batching here.
+    input_dict = wrap_as_dict(input_pcolls)
+    placeholders = {
+        key: frame_base.DeferredFrame.wrap(
+            expressions.PlaceholderExpression(proxy(key)))
+        for key in input_dict.keys()
+    }
+
+    # The calling convention of the user-supplied func varies according to the
+    # type of the input.
+    if isinstance(input_pcolls, dict):
+      result_frames = self._func(**placeholders)
+    elif isinstance(input_pcolls, tuple):
+      result_frames = self._func(
+          *(value for _, value in sorted(placeholders.items())))
+    else:
+      result_frames = self._func(placeholders[None])
+
+    # Likewise the output may be a dict, tuple, or raw (deferred) Dataframe.
+    result_dict = wrap_as_dict(result_frames)
+
+    result_pcolls = {
+        placeholders[key]._expr: pcoll
+        for key, pcoll in input_dict.items()
+    } | 'Eval' >> DataframeExpressionsTransform(
+        {key: df._expr
+         for key, df in result_dict.items()})
+
+    # Convert the result back into a set of PCollections.
+    if isinstance(result_frames, dict):
+      return result_pcolls
+    elif isinstance(result_frames, tuple):
+      return tuple((value for _, value in sorted(result_pcolls.items())))
+    else:
+      return result_pcolls[None]
 
 Review comment:
   Yes. Done.
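
   The dict/tuple/single-value calling convention used in `expand` above can be tried out without Beam. The following is only a standalone sketch: `wrap_as_dict` mirrors the helper in the diff, while `call_with_convention` is a hypothetical name introduced here to isolate the dispatch step.

```python
def wrap_as_dict(values):
    """Normalize a dict, a tuple, or a single value into a dict."""
    if isinstance(values, dict):
        return values
    elif isinstance(values, tuple):
        # Positional inputs are keyed by their index.
        return dict(enumerate(values))
    else:
        # A single input is stored under the key None.
        return {None: values}


def call_with_convention(func, inputs):
    """Call func the way its inputs were supplied: keywords, positionals, or one value."""
    wrapped = wrap_as_dict(inputs)
    if isinstance(inputs, dict):
        return func(**wrapped)
    elif isinstance(inputs, tuple):
        # Sorting by index restores the original positional order.
        return func(*(value for _, value in sorted(wrapped.items())))
    else:
        return func(wrapped[None])


print(call_with_convention(lambda a, b: a - b, (5, 3)))
print(call_with_convention(lambda a, b: a - b, {"a": 1, "b": 2}))
print(call_with_convention(lambda x: x * 2, 21))
```

   Keying the single-input case by None keeps the rest of the code uniform, at the cost of one special case when unwrapping the result.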
 



Issue Time Tracking
---

Worklog Id: (was: 421673)
Time Spent: 1h 10m  (was: 1h)

> Add a Dataframe API for Python
> --
>
> Key: BEAM-9496
> URL: https://issues.apache.org/jira/browse/BEAM-9496
> Project: Beam
>  Issue Type: New Feature
>  Compo

[jira] [Updated] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-8472:
--
Component/s: sdk-go

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.





[jira] [Commented] (BEAM-8472) Get default GCP region from gcloud

2020-04-13 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082689#comment-17082689
 ] 

Kyle Weaver commented on BEAM-8472:
---

FYI [~lostluck] [~danoliveira] Would one of you mind taking this issue? It 
should be pretty straightforward to port the Java/Python implementation to Go.
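
For whoever picks this up, the lookup could be sketched roughly as below. This is written in Python for brevity and is not a copy of the existing Java/Python code. The order is an assumption based on the issue description and the linked gcloud precedence page: explicit --region flag, then the CLOUDSDK_COMPUTE_REGION environment variable, then `gcloud config get-value compute/region`, then the current us-central1 fallback. The function names are hypothetical.

```python
import os
import subprocess

FALLBACK_REGION = "us-central1"  # current default when nothing else is set


def gcloud_default_region():
    """Ask the gcloud CLI for its configured compute region; None if unavailable."""
    try:
        result = subprocess.run(
            ["gcloud", "config", "get-value", "compute/region"],
            capture_output=True, text=True, timeout=10, check=False)
    except (OSError, subprocess.SubprocessError):
        # gcloud is not installed, not on PATH, or timed out.
        return None
    value = result.stdout.strip()
    return value or None


def resolve_region(flag_value=None, env=None, ask_gcloud=gcloud_default_region):
    """Return the first region found, checking sources in the assumed order."""
    env = os.environ if env is None else env
    return (flag_value
            or env.get("CLOUDSDK_COMPUTE_REGION")
            or ask_gcloud()
            or FALLBACK_REGION)
```

Passing `ask_gcloud` as a parameter keeps the function testable on machines without the Cloud SDK installed.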

> Get default GCP region from gcloud
> --
>
> Key: BEAM-8472
> URL: https://issues.apache.org/jira/browse/BEAM-8472
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, we default to us-central1 if --region flag is not set. The Google 
> Cloud SDK generally tries to get a default value in this case for 
> convenience, which we should follow. 
> [https://cloud.google.com/compute/docs/gcloud-compute/#order_of_precedence_for_default_properties]
> Update 11/12: this is complete for Python and Java, Go remains.





[jira] [Commented] (BEAM-8253) (Go SDK) Add worker_region and worker_zone options

2020-04-13 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082685#comment-17082685
 ] 

Kyle Weaver commented on BEAM-8253:
---

[~lostluck] [~danoliveira] Could one of you take this over?

> (Go SDK) Add worker_region and worker_zone options
> --
>
> Key: BEAM-8253
> URL: https://issues.apache.org/jira/browse/BEAM-8253
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, sdk-go
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>






[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=421664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421664
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:45
Start Date: 13/Apr/20 21:45
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613113280
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 421664)
Time Spent: 26h  (was: 25h 50m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 26h
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=421663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421663
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:44
Start Date: 13/Apr/20 21:44
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11151: [BEAM-9468]  Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#issuecomment-613112821
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 421663)
Time Spent: 25h 50m  (was: 25h 40m)

> Add Google Cloud Healthcare API IO Connectors
> -
>
> Key: BEAM-9468
> URL: https://issues.apache.org/jira/browse/BEAM-9468
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Jacob Ferriero
>Assignee: Jacob Ferriero
>Priority: Minor
>  Time Spent: 25h 50m
>  Remaining Estimate: 0h
>
> Add IO Transforms for the HL7v2, FHIR and DICOM stores in the [Google Cloud 
> Healthcare API|https://cloud.google.com/healthcare/docs/]
> HL7v2IO
> FHIRIO
> DICOM 





[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Andrew Pilloud (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082680#comment-17082680
 ] 

Andrew Pilloud commented on BEAM-9709:
--

My understanding is that setting the default timezone to something other than 
UTC is covered by BEAM-9712. This test runs with the timezone set to UTC, so it 
should either work or return an unimplemented exception.

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Comment Edited] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082672#comment-17082672
 ] 

Yueyang Qiu edited comment on BEAM-9709 at 4/13/20, 9:35 PM:
-

The cause of this issue is related to setting the default time zone. ZetaSQL 
allows each engine to define its own default time zone. If the default time 
zone is UTC, then `2014-01-31 00:00:00+00` is the expected result; otherwise the 
current result could also be valid. This currently fails in the compliance 
tests because we have not implemented this properly in the harness yet.


was (Author: robinyqiu):
The cause of this issue is related to setting default time zone. ZetaSQL allows 
each engine to define its own default time zone. If the default time zone is 
UTC, then `2014-01-31 00:00:00+00` is the expected result, otherwise the 
current result could also be valid. This fails in compliance tests currently 
because in the harness we have not implemented this properly yet 
([https://cs.corp.google.com/piper///depot/google3/third_party/cloud_dataflow/sql/ExecuteQueryServiceServer.java?type=cs&q=SetDefaultTimeZone+file:%5E//depot/google3/third_party/cloud_dataflow/sql/+package:%5Epiper$&g=0&l=515]).

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}





[jira] [Comment Edited] (BEAM-9710) Got current time instead of timestamp value

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082658#comment-17082658
 ] 

Yueyang Qiu edited comment on BEAM-9710 at 4/13/20, 9:34 PM:
-

This seems to be caused by bad code in the internal compliance test harness.


was (Author: robinyqiu):
This seems to be caused by bad code in the internal compliance test harness. I 
don't think this should be a Dataflow SQL GA blocker.

> Got current time instead of timestamp value
> ---
>
> Key: BEAM-9710
> URL: https://issues.apache.org/jira/browse/BEAM-9710
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> one failure in shard 13
> {code}
> Expected: ARRAY>[{2014-12-01 00:00:00+00}]
>   Actual: ARRAY>[{2020-04-06 
> 00:20:40.052+00}], 
> {code}
> {code}
> [prepare_database]
> CREATE TABLE Table1 AS
> SELECT timestamp '2014-12-01' as timestamp_val
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> ==
> [name=timestamp_type_2]
> SELECT timestamp_val
> FROM Table1
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082675#comment-17082675
 ] 

Kenneth Knowles commented on BEAM-9709:
---

OK, got it.

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082674#comment-17082674
 ] 

Kenneth Knowles commented on BEAM-9709:
---

Just to clarify: the scope of this ticket is the Beam ZetaSQL dialect, while 
those are docs for Google's hosted product based on it.

I believe the Beam ZetaSQL dialect should probably fail this invocation if it 
is not supported, rather than return a result that disagrees with the ZetaSQL 
spec.

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082672#comment-17082672
 ] 

Yueyang Qiu commented on BEAM-9709:
---

The cause of this issue is related to setting the default time zone. ZetaSQL 
allows each engine to define its own default time zone. If the default time 
zone is UTC, then `2014-01-31 00:00:00+00` is the expected result; otherwise 
the current result could also be valid. This currently fails in the compliance 
tests because the harness does not yet implement this properly 
([https://cs.corp.google.com/piper///depot/google3/third_party/cloud_dataflow/sql/ExecuteQueryServiceServer.java?type=cs&q=SetDefaultTimeZone+file:%5E//depot/google3/third_party/cloud_dataflow/sql/+package:%5Epiper$&g=0&l=515]).
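The effect of the default time zone on `timestamp(date '2014-01-31')` can be illustrated with a minimal Python sketch. This is not Beam or ZetaSQL code: `timestamp_from_date` is a hypothetical helper, and the fixed UTC-8 offset is only an assumption chosen because it reproduces the 8 hour difference reported in the issue.

```python
from datetime import date, datetime, time, timedelta, timezone


def timestamp_from_date(d: date, default_tz: timezone) -> datetime:
    """Interpret midnight of `d` in `default_tz`, then express it in UTC."""
    local_midnight = datetime.combine(d, time.min, tzinfo=default_tz)
    return local_midnight.astimezone(timezone.utc)


d = date(2014, 1, 31)

# Engine default time zone is UTC: midnight stays at 00:00 UTC.
utc_result = timestamp_from_date(d, timezone.utc)

# Assumption: a fixed UTC-8 offset stands in for a non-UTC engine default.
offset_result = timestamp_from_date(d, timezone(timedelta(hours=-8)))

print(utc_result.isoformat())     # 2014-01-31T00:00:00+00:00
print(offset_result.isoformat())  # 2014-01-31T08:00:00+00:00
```

Both values are consistent conversions of the same date; they differ only in which zone "midnight" is taken in, which is why the result depends on the engine's configured default.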

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=421655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421655
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:32
Start Date: 13/Apr/20 21:32
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #11409: [BEAM-2939] 
Update unbounded source as SDF wrapper to resume successfully.
URL: https://github.com/apache/beam/pull/11409#discussion_r407734012
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
 ##
 @@ -743,8 +727,6 @@ public boolean tryClaim(UnboundedSourceValue[] position) {
   currentReader.close();
 } catch (IOException closeException) {
   e.addSuppressed(closeException);
-} finally {
 
 Review comment:
   Intended?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 421655)
Time Spent: 26h 50m  (was: 26h 40m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 26h 50m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9709) timezone off by 8 hours

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082670#comment-17082670
 ] 

Yueyang Qiu commented on BEAM-9709:
---

This is not supported because it creates an intermediate DATE type, and 
constructing a TIMESTAMP from a DATE with the TIMESTAMP() function is not 
supported (see 
https://cloud.google.com/dataflow/docs/reference/sql/timestamp_functions#timestamp).

> timezone off by 8 hours
> ---
>
> Key: BEAM-9709
> URL: https://issues.apache.org/jira/browse/BEAM-9709
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> two failures in shard 13, one failure in shard 19
> {code}
> Expected: ARRAY>[{2014-01-31 00:00:00+00}]
>   Actual: ARRAY>[{2014-01-31 08:00:00+00}], 
> {code}
> {code}
> select timestamp(date '2014-01-31')
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=421650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421650
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:24
Start Date: 13/Apr/20 21:24
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11409: [BEAM-2939] Update 
unbounded source as SDF wrapper to resume successfully.
URL: https://github.com/apache/beam/pull/11409#issuecomment-613105161
 
 
   R: @ihji 
   CC: @boyuanzz 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 421650)
Time Spent: 26h 40m  (was: 26.5h)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 26h 40m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=421649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421649
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:23
Start Date: 13/Apr/20 21:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11409: [BEAM-2939] 
Update unbounded source as SDF wrapper to resume successfully.
URL: https://github.com/apache/beam/pull/11409
 
 
   This fixes a bug caused by a difference between the two reader APIs:
   UnboundedReader's start() and advance() return false when there is no data
   right now but there could be data in the future, whereas BoundedReader's
   start() and advance() only return false on completion.
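The distinction described above can be sketched in a few lines of Python. The classes and function below are hypothetical stand-ins written for illustration; they are not Beam's `UnboundedReader` or the SDF wrapper changed in this pull request. The point is only that a `False` from an unbounded-style reader means "nothing available right now", so the caller should plan to resume rather than treat the source as finished.

```python
from collections import deque


class UnboundedLikeReader:
    """Hypothetical reader: advance() returning False means 'no data right now'."""

    def __init__(self):
        self._pending = deque()
        self._current = None

    def offer(self, record):
        # Simulates new data arriving at the source at some later time.
        self._pending.append(record)

    def advance(self):
        if not self._pending:
            return False  # Not finished, just nothing to read at the moment.
        self._current = self._pending.popleft()
        return True

    def current(self):
        return self._current


def read_available(reader):
    """Read everything currently available and signal that reading should resume."""
    records = []
    while reader.advance():
        records.append(reader.current())
    # For an unbounded-style reader, False is never treated as completion.
    return records, "resume"


reader = UnboundedLikeReader()
reader.offer("a")
first = read_available(reader)   # One record, then advance() returns False.
second = read_available(reader)  # Nothing yet, but still not done.
reader.offer("b")
third = read_available(reader)   # Data that arrived later is still read.
```

A driver written for bounded-style readers would stop for good after `first`, and would never see the record offered afterwards.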
   
   Verified on Dataflow with KafkaIO.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/j

[jira] [Updated] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.

2020-04-13 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-9745:
--
Priority: Blocker  (was: Major)

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to 
> deserialize Custom DoFns and Custom Coders.
> -
>
> Key: BEAM-9745
> URL: https://issues.apache.org/jira/browse/BEAM-9745
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, java-fn-execution, sdk-java-harness, 
> test-failures
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/]
>  * [Gradle Build 
> Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project]
> Initial investigation:
> The bug appears mostly in BigQuery tests, but also in one BigTable test and 
> one Datastore test.
> Here's an example stacktrace of the two errors, showing _only_ the error 
> messages themselves. Source: 
> [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe]
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -191: 
> java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With 
> Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -191: java.lang.IllegalArgumentException: unable to deserialize 
> Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for 
> instruction -206: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom 
> Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> {noformat}
> Update: Looks like this has been failing as far back as [Apr 
> 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] 
> after a long period where the test was consistently timing out since [Mar 
> 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. 
> So it's hard to narrow down what commit may have caused this. Plus, the test 
> was failing due to a completely different BigQuery failure before anyway, so 
> it seems like this test will need to be completely fixed from scratch, 
> instead of tracking down a specific breaking change.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9744) Python performance tests failing

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9744?focusedWorklogId=421642&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421642
 ]

ASF GitHub Bot logged work on BEAM-9744:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:17
Start Date: 13/Apr/20 21:17
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11408: [BEAM-9744] 
Remove --region option from SQL tests.
URL: https://github.com/apache/beam/pull/11408
 
 
   These tests don't run on Dataflow, so they don't recognize the region option.
   
   
   

[jira] [Work logged] (BEAM-9744) Python performance tests failing

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9744?focusedWorklogId=421643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421643
 ]

ASF GitHub Bot logged work on BEAM-9744:


Author: ASF GitHub Bot
Created on: 13/Apr/20 21:17
Start Date: 13/Apr/20 21:17
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11408: [BEAM-9744] Remove 
--region option from SQL tests.
URL: https://github.com/apache/beam/pull/11408#issuecomment-613102635
 
 
   Run SQL PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 421643)
Time Spent: 1h 10m  (was: 1h)

> Python performance tests failing
> 
>
> Key: BEAM-9744
> URL: https://issues.apache.org/jira/browse/BEAM-9744
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> beam_PerformanceTests_WordCountIT_Py* failing because --region is missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9710) Got current time instead of timestamp value

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082664#comment-17082664
 ] 

Yueyang Qiu commented on BEAM-9710:
---

I am marking this issue as a blocker of BEAM-9179. I believe this could be 
fixed while we fix the type translation code.

> Got current time instead of timestamp value
> ---
>
> Key: BEAM-9710
> URL: https://issues.apache.org/jira/browse/BEAM-9710
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> one failure in shard 13
> {code}
> Expected: ARRAY>[{2014-12-01 00:00:00+00}]
>   Actual: ARRAY>[{2020-04-06 
> 00:20:40.052+00}], 
> {code}
> {code}
> [prepare_database]
> CREATE TABLE Table1 AS
> SELECT timestamp '2014-12-01' as timestamp_val
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> ==
> [name=timestamp_type_2]
> SELECT timestamp_val
> FROM Table1
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9710) Got current time instead of timestamp value

2020-04-13 Thread Yueyang Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082658#comment-17082658
 ] 

Yueyang Qiu commented on BEAM-9710:
---

This seems to be caused by bad code in the internal compliance test harness. I 
don't think this should be a Dataflow SQL GA blocker.

> Got current time instead of timestamp value
> ---
>
> Key: BEAM-9710
> URL: https://issues.apache.org/jira/browse/BEAM-9710
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Yueyang Qiu
>Priority: Trivial
>  Labels: zetasql-compliance
>
> one failure in shard 13
> {code}
> Expected: ARRAY>[{2014-12-01 00:00:00+00}]
>   Actual: ARRAY>[{2020-04-06 
> 00:20:40.052+00}], 
> {code}
> {code}
> [prepare_database]
> CREATE TABLE Table1 AS
> SELECT timestamp '2014-12-01' as timestamp_val
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> ==
> [name=timestamp_type_2]
> SELECT timestamp_val
> FROM Table1
> --
> ARRAY>[{2014-12-01 00:00:00+00}]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=421626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421626
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:43
Start Date: 13/Apr/20 20:43
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #11407: [BEAM-9562] 
Cherry-pick: Fix output timestamp to be inferred from scheduled time w…
URL: https://github.com/apache/beam/pull/11407#issuecomment-613088556
 
 
   R: @ibzib 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 421626)
Time Spent: 24h 10m  (was: 24h)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 24h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=421625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421625
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:42
Start Date: 13/Apr/20 20:42
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11407: [BEAM-9562] 
Cherry-pick: Fix output timestamp to be inferred from scheduled time w…
URL: https://github.com/apache/beam/pull/11407
 
 
   …hen in the event time domain.
   
   (cherry picked from commit 009578e374523f5acd8d24543ef1ceec30542a95)
   

[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=421624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421624
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:36
Start Date: 13/Apr/20 20:36
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11314: [BEAM-9562] Send Timers 
over Data Channel as Elements
URL: https://github.com/apache/beam/pull/11314#issuecomment-613085672
 
 
   I was actually working on something related to timers in #11362 and was 
surprised to see that the test failed when I opened the PR, since I had run 
the tests locally. I then figured something must have changed on master in the 
meantime. Thanks for following up on this!
 



Issue Time Tracking
---

Worklog Id: (was: 421624)
Time Spent: 23h 50m  (was: 23h 40m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-harness, sdk-py-harness
> Reporter: Boyuan Zhang
> Assignee: Boyuan Zhang
> Priority: Major
> Fix For: 2.21.0
>
> Time Spent: 23h 50m
> Remaining Estimate: 0h
>





