[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=82516&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82516 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 20/Mar/18 22:17
Start Date: 20/Mar/18 22:17
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-374776224

Thanks!

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 82516)
Time Spent: 1.5h (was: 1h 20m)

> Incompatible input encoding running Tornadoes example on dataflow
> -----------------------------------------------------------------
>
> Key: BEAM-3817
> URL: https://issues.apache.org/jira/browse/BEAM-3817
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Braden Bassingthwaite
> Assignee: Henning Rohde
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Trying to run:
> go run tornadoes.go --output=:bbass.tornadoes --project
> --runner dataflow --staging_location=gs://bbass/tornadoes
> --worker_harness_container_image=gcr.io//beam/go
> Found here:
> [https://github.com/apache/beam/blob/master/sdks/go/examples/cookbook/tornadoes/tornadoes.go]
> I can run it locally but I get the error on Dataflow:
> (8fa522c2bb03a769): Workflow failed. Causes: (8fa522c2bb03ab04): Incompatible
> input encoding.
>
> I built the worker_harness_container_image using:
> mvn clean install -DskipTests -Pbuild-containers
> -Ddocker-repository-root=gcr.io//beam
>
> Thanks!
>
> Very excited to start using the golang beam sdk! great work!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=82499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82499 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 20/Mar/18 21:41
Start Date: 20/Mar/18 21:41
Worklog Time Spent: 10m
Work Description: robertwb commented on issue #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-374766650

Looks good, thanks.

Issue Time Tracking
-------------------

Worklog Id: (was: 82499)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=82500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82500 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 20/Mar/18 21:41
Start Date: 20/Mar/18 21:41
Worklog Time Spent: 10m
Work Description: robertwb closed pull request #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/go/pkg/beam/combine.go b/sdks/go/pkg/beam/combine.go
index c0b70fe2623..a9bc098a493 100644
--- a/sdks/go/pkg/beam/combine.go
+++ b/sdks/go/pkg/beam/combine.go
@@ -33,16 +33,11 @@ func CombinePerKey(s Scope, combinefn interface{}, col PCollection) PCollection
 	return Must(TryCombinePerKey(s, combinefn, col))
 }
 
-// addFixedKeyFn forces all elements to a single key.
-func addFixedKeyFn(elm T) (int, T) {
-	return 0, elm
-}
-
 // TryCombine attempts to insert a global Combine transform into the pipeline. It may fail
 // for multiple reasons, notably that the combinefn is not valid or cannot be bound
 // -- due to type mismatch, say -- to the incoming PCollections.
 func TryCombine(s Scope, combinefn interface{}, col PCollection) (PCollection, error) {
-	pre := ParDo(s, addFixedKeyFn, col)
+	pre := AddFixedKey(s, col)
 	post, err := TryCombinePerKey(s, combinefn, pre)
 	if err != nil {
 		return PCollection{}, err
diff --git a/sdks/go/pkg/beam/io/bigqueryio/bigquery.go b/sdks/go/pkg/beam/io/bigqueryio/bigquery.go
index 07e315a8b64..22b5f3c3fcf 100644
--- a/sdks/go/pkg/beam/io/bigqueryio/bigquery.go
+++ b/sdks/go/pkg/beam/io/bigqueryio/bigquery.go
@@ -169,8 +169,11 @@ func Write(s beam.Scope, project, table string, col beam.PCollection) {
 
 	s = s.Scope("bigquery.Write")
 
-	imp := beam.Impulse(s)
-	beam.ParDo0(s, &writeFn{Project: project, Table: qn, Type: beam.EncodedType{T: t}}, imp, beam.SideInput{Input: col})
+	// TODO(BEAM-3860) 3/15/2018: use side input instead of GBK.
+
+	pre := beam.AddFixedKey(s, col)
+	post := beam.GroupByKey(s, pre)
+	beam.ParDo0(s, &writeFn{Project: project, Table: qn, Type: beam.EncodedType{T: t}}, post)
 }
 
 type writeFn struct {
@@ -182,7 +185,7 @@ type writeFn struct {
 	Type beam.EncodedType `json:"type"`
 }
 
-func (f *writeFn) ProcessElement(ctx context.Context, _ []byte, iter func(*beam.X) bool) error {
+func (f *writeFn) ProcessElement(ctx context.Context, _ int, iter func(*beam.X) bool) error {
 	client, err := bigquery.NewClient(ctx, f.Project)
 	if err != nil {
 		return err
diff --git a/sdks/go/pkg/beam/io/textio/textio.go b/sdks/go/pkg/beam/io/textio/textio.go
index b33a7d71ade..926251fb6bc 100644
--- a/sdks/go/pkg/beam/io/textio/textio.go
+++ b/sdks/go/pkg/beam/io/textio/textio.go
@@ -139,15 +139,13 @@ func Write(s beam.Scope, filename string, col beam.PCollection) {
 	// FinishBundle doesn't have the right granularity. We therefore
 	// perform a GBK with a fixed key to get all values in a single invocation.
 
-	pre := beam.ParDo(s, addFixedKey, col)
+	// TODO(BEAM-3860) 3/15/2018: use side input instead of GBK.
+
+	pre := beam.AddFixedKey(s, col)
 	post := beam.GroupByKey(s, pre)
 	beam.ParDo0(s, &writeFileFn{Filename: filename}, post)
 }
 
-func addFixedKey(elm beam.T) (int, beam.T) {
-	return 0, elm
-}
-
 type writeFileFn struct {
 	Filename string `json:"filename"`
 }
diff --git a/sdks/go/pkg/beam/util.go b/sdks/go/pkg/beam/util.go
index f385e708ec1..e730765c61d 100644
--- a/sdks/go/pkg/beam/util.go
+++ b/sdks/go/pkg/beam/util.go
@@ -41,6 +41,15 @@ func Seq(s Scope, col PCollection, dofns ...interface{}) PCollection {
 	return cur
 }
 
+// AddFixedKey adds a fixed key (0) to every element.
+func AddFixedKey(s Scope, col PCollection) PCollection {
+	return ParDo(s, addFixedKeyFn, col)
+}
+
+func addFixedKeyFn(elm T) (int, T) {
+	return 0, elm
+}
+
 // DropKey drops the key for an input PCollection<KV<A,B>>. It returns
 // a PCollection<B>.
 func DropKey(s Scope, col PCollection) PCollection {

Issue Time Tracking
-------------------

Worklog Id: (was: 82500)
Time Spent: 1h 20m (was: 1h 10m)
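
The diff above replaces a side input with a fixed key followed by a GroupByKey, so that the sink's ProcessElement sees every row in a single invocation. A minimal sketch of that idea in plain Go follows. It does not use the Beam SDK: the names keyed, groupByKey, iterOver and writeAll are hypothetical stand-ins invented for illustration, and the in-memory map only approximates what a runner does.

```go
package main

import "fmt"

// keyed is a hypothetical stand-in for a KV<int, string> element.
type keyed struct {
	Key   int
	Value string
}

// addFixedKey pairs every element with key 0, mirroring the idea
// behind addFixedKeyFn in the diff (not the Beam function itself).
func addFixedKey(elems []string) []keyed {
	out := make([]keyed, 0, len(elems))
	for _, e := range elems {
		out = append(out, keyed{Key: 0, Value: e})
	}
	return out
}

// groupByKey is a toy in-memory grouping; it only approximates
// what a GroupByKey does on a real runner.
func groupByKey(in []keyed) map[int][]string {
	groups := make(map[int][]string)
	for _, kv := range in {
		groups[kv.Key] = append(groups[kv.Key], kv.Value)
	}
	return groups
}

// iterOver wraps a slice in an iterator shaped like func(*T) bool,
// the same shape as the iter parameter of ProcessElement in the diff.
func iterOver(values []string) func(*string) bool {
	i := 0
	return func(v *string) bool {
		if i >= len(values) {
			return false
		}
		*v = values[i]
		i++
		return true
	}
}

// writeAll plays the role of the sink: it is called once per key
// and drains the iterator to collect all rows for that key.
func writeAll(_ int, iter func(*string) bool) []string {
	var rows []string
	var v string
	for iter(&v) {
		rows = append(rows, v)
	}
	return rows
}

func main() {
	groups := groupByKey(addFixedKey([]string{"a", "b", "c"}))
	// Every element shares key 0, so there is exactly one group
	// and writeAll runs exactly once with all values.
	for key, values := range groups {
		fmt.Println(len(groups), key, writeAll(key, iterOver(values)))
	}
}
```

Because all elements share one key, the grouped output has a single entry, which is why the key parameter of the sink changes from []byte (the Impulse element) to int in the diff.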
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=82111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-82111 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 20/Mar/18 01:15
Start Date: 20/Mar/18 01:15
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-374438977

@robertwb Any concerns about merging?

Issue Time Tracking
-------------------

Worklog Id: (was: 82111)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=80979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-80979 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 15/Mar/18 20:44
Start Date: 15/Mar/18 20:44
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-373516905

@robertwb Done. PTAL

Issue Time Tracking
-------------------

Worklog Id: (was: 80979)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=79006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-79006 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 09/Mar/18 19:54
Start Date: 09/Mar/18 19:54
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch Go SDK BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-371927694

Thanks! R: @robertwb

Issue Time Tracking
-------------------

Worklog Id: (was: 79006)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=78959&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-78959 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 09/Mar/18 18:01
Start Date: 09/Mar/18 18:01
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-371894353

R: @bbassingthwaite-va

Issue Time Tracking
-------------------

Worklog Id: (was: 78959)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=78956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-78956 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 09/Mar/18 18:00
Start Date: 09/Mar/18 18:00
Worklog Time Spent: 10m
Work Description: herohde commented on issue #4840: [BEAM-3817] Switch BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840#issuecomment-371893893

R: @lostluck

Issue Time Tracking
-------------------

Worklog Id: (was: 78956)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-3817) Incompatible input encoding running Tornadoes example on dataflow
[ https://issues.apache.org/jira/browse/BEAM-3817?focusedWorklogId=78954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-78954 ]

ASF GitHub Bot logged work on BEAM-3817:

Author: ASF GitHub Bot
Created on: 09/Mar/18 17:58
Start Date: 09/Mar/18 17:58
Worklog Time Spent: 10m
Work Description: herohde opened a new pull request #4840: [BEAM-3817] Switch BQ write to not use side input
URL: https://github.com/apache/beam/pull/4840

Otherwise, it does not currently run on non-direct runners. Also
consolidated the use of adding a fixed key for this purpose.

Issue Time Tracking
-------------------

Worklog Id: (was: 78954)
Time Spent: 10m
Remaining Estimate: 0h