lostluck opened a new issue, #26373:
URL: https://github.com/apache/beam/issues/26373

   ### What happened?
   
   @miracvbasaran discovered some data races in the Go SDK when running other 
tests inside Google.
   
   First is on the state of the plan: This should be moved to an atomic check.
   
   
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/plan.go#L41
   
   (code is vendored inside google, but I'll provide links to the github 
equivalents here.
   
   Caught at these lines: 
   
   Write: 
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/plan.go#L146
   
   Read: 
   
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/plan.go#L280
   
   ```
   WARNING: DATA RACE
   Write at 0x00c00c491668 by goroutine 241:
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/exec/exec.(*Plan).Execute()
         third_party/golang/apache_beam/pkg/beam/core/runtime/exec/plan.go:146 
+0xa26
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).handleInstruction()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:408 
+0x1353
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main.func4()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:194 
+0x246
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main.func8()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:213 
+0x66
   
   Previous read at 0x00c00c491668 by goroutine 129:
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/exec/exec.(*Plan).Split()
         third_party/golang/apache_beam/pkg/beam/core/runtime/exec/plan.go:280 
+0xab
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).handleInstruction()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:571 
+0x2ccd
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main.func4()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:194 
+0x246
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:215 
+0x19f3
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/init/init.hook()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/init/init.go:122 
+0x54b
     google3/third_party/golang/apache_beam/pkg/beam/core/runtime/runtime.Init()
         third_party/golang/apache_beam/pkg/beam/core/runtime/init.go:42 +0x228
     google3/third_party/golang/apache_beam/pkg/beam/beam.Init()
         third_party/golang/apache_beam/pkg/beam/forward.go:147 +0x1c6
     google3/pipeline/flume/go/runner/flume.launchSDKHarness()
         pipeline/flume/go/runner/launcher.go:113 +0x1c5
     google3/pipeline/flume/go/runner/flume.runJob.func1()
         pipeline/flume/go/runner/df.go:137 +0x58
   ```
   
   In particular, this one triggers because a split request came in before the 
bundle meaningfully started.
   
   Not caught in open source because no runner other than Dataflow (and it's 
Google internal counterpart, Flume) currently splits.
   
   ----
   
   Next is one caught for two simultaneous bundles constructing their plans, 
because they came in at the same time. In particular
   
   ```
   WARNING: DATA RACE
   Read at 0x00c000628ee8 by goroutine 203:
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.(*Registry).reconcileRegistrations()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:118
 +0xad
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.(*Registry).ToType()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:683
 +0x2f
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.ToType()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:61 
+0xce9
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/graphx.(*CoderUnmarshaller).makeCoder()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/coder.go:367 +0xcc9
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/graphx.(*CoderUnmarshaller).makeCoder()
     ... elided frames ...
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/coder.go:304 +0x25f7
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/graphx.(*CoderUnmarshaller).Coder()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/coder.go:147 +0x12a
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/exec/exec.UnmarshalPlan()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/exec/translate.go:72 +0x3f2
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).getOrCreatePlan()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:342 
+0x2de
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).handleInstruction()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:378 
+0xadc
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main.func4()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:189 
+0x92
     ??()
         -:0 +0x1eab09c1
   
   Previous write at 0x00c000628ee8 by goroutine 200:
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.(*Registry).reconcileRegistrations()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:132
 +0x334
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.(*Registry).ToType()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:683
 +0x2f
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.ToType()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/schema/schema.go:61 
+0xce9
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/graphx.(*CoderUnmarshaller).makeCoder()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/coder.go:367 +0xcc9
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/graphx/graphx.(*CoderUnmarshaller).Coder()
     ... elided frames ...
         
third_party/golang/apache_beam/pkg/beam/core/runtime/exec/translate.go:383 
+0x16e
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/exec/exec.UnmarshalPlan()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/exec/translate.go:86 +0xa3e
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).getOrCreatePlan()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:342 
+0x2de
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.(*control).handleInstruction()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:378 
+0xadc
     
google3/third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.Main.func4()
         
third_party/golang/apache_beam/pkg/beam/core/runtime/harness/harness.go:189 
+0x92
     ??()
   ``` 
   
   This one is in the tension around reading the registry and writing to the 
registry. Seems like a case for a concurrent map around the look ups, but a RW 
lock is more appropriate, since these will be read more often than they're 
written to. I don't love the additional serialization though. Technically, it's 
odd that a reconciliation is triggered so late in the process, since all the 
coders should have been registered at Init time.
   
   ### Issue Priority
   
   Priority: 2 (default / most bugs should be filed as P2)
   
   ### Issue Components
   
   - [ ] Component: Python SDK
   - [ ] Component: Java SDK
   - [X] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to