[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186791
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 18/Jan/19 08:49
Start Date: 18/Jan/19 08:49
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #7550: [BEAM-6407] 
Cherry pick #7537, fixes FileIO.writeDynamic() in the DirectRunner
URL: https://github.com/apache/beam/pull/7550
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 186791)
Time Spent: 2.5h  (was: 2h 20m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fails with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> // Side input holding the output file prefix.
> PCollectionView<String> outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> // File naming function that reads the side input.
> Contextful.Fn<String, FileIO.Write.FileNaming> manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) ->
>          c.sideInput(outputFileName) + shardIndex;
> pipeline.apply(FileIO.<String, String>writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName))));
> {code}
>  
> This does not occur in the Dataflow runner.
> It does not occur if the Contextful.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, but does in 2.9.0 and 2.10.0-SNAPSHOT (as of today).
>  
> The cause appears to be the DirectRunner using TransformOverrides to 
> rewrite FileIO sinks to use runner-determined sharding
> (see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]),
> but I do not know why this started occurring in 2.9.0...
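
The fix tracked in this thread (pull request #7537) reverts a hashCode change made under BEAM-5933. The messages here do not spell out the mechanism, so the following is only a generic, plain-JDK sketch of one way a hashCode change can make a hash-based lookup report an element as missing: two classes that compare equal but compute their hash codes differently. The names `Tagged`, `ViewA`, `ViewB` and `lookupSucceeds` are hypothetical stand-ins and are not Beam classes.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeMismatchDemo {
    // Hypothetical common interface: both view types are identified by a tag.
    interface Tagged {
        String tag();
    }

    // Hypothetical view whose hashCode was "optimized" to avoid the varargs
    // array that Objects.hash allocates.
    static class ViewA implements Tagged {
        private final String tag;
        ViewA(String tag) { this.tag = tag; }
        public String tag() { return tag; }
        @Override public boolean equals(Object o) {
            return o instanceof Tagged && ((Tagged) o).tag().equals(tag);
        }
        @Override public int hashCode() { return tag.hashCode(); }
    }

    // Hypothetical second view type that is equal to ViewA when tags match,
    // but still hashes through Objects.hash, i.e. 31 + tag.hashCode().
    static class ViewB implements Tagged {
        private final String tag;
        ViewB(String tag) { this.tag = tag; }
        public String tag() { return tag; }
        @Override public boolean equals(Object o) {
            return o instanceof Tagged && ((Tagged) o).tag().equals(tag);
        }
        @Override public int hashCode() { return Objects.hash(tag); }
    }

    // Stores a ViewA, then looks it up through an equal ViewB.
    static boolean lookupSucceeds(String tag) {
        Set<Object> written = new HashSet<>();
        written.add(new ViewA(tag));
        return written.contains(new ViewB(tag));
    }

    public static void main(String[] args) {
        System.out.println(new ViewA("t").equals(new ViewB("t"))); // prints "true"
        System.out.println(lookupSucceeds("t"));                   // prints "false"
    }
}
```

The lookup fails even though the two objects are equal, because `HashSet` compares stored hash codes before it calls `equals`. Whether this is what happened in the DirectRunner is an assumption, not something the thread confirms.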



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186450
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 17:35
Start Date: 17/Jan/19 17:35
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #7550: 
[BEAM-6407] Cherry pick #7537, fixes FileIO.writeDynamic() in the DirectRunner
URL: https://github.com/apache/beam/pull/7550
 
 
   
   
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 186450)
Time Spent: 2h 20m  (was: 2h 10m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> --

[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186444
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 17:32
Start Date: 17/Jan/19 17:32
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #7537: 
[BEAM-6407] Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 186444)
Time Spent: 2h 10m  (was: 2h)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186443
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 17:32
Start Date: 17/Jan/19 17:32
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455259310
 
 
   I am going to merge this and then look into the spotless issue. In this 
case, it is likely because the merge of turning off `paddedCell` was concurrent 
with the merge of adding `TestPOJOs.java`.
 



Issue Time Tracking
---

Worklog Id: (was: 186443)
Time Spent: 2h  (was: 1h 50m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186362
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 15:47
Start Date: 17/Jan/19 15:47
Worklog Time Spent: 10m 
  Work Description: nielm commented on issue #7537: [BEAM-6407] Revert 
"BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455219983
 
 
   note that Spotless fails on TestPOJOs.java -- not on any of the files 
modified in this PR.
 



Issue Time Tracking
---

Worklog Id: (was: 186362)
Time Spent: 1h 50m  (was: 1h 40m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186356
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 15:41
Start Date: 17/Jan/19 15:41
Worklog Time Spent: 10m 
  Work Description: nielm commented on issue #7537: [BEAM-6407] Revert 
"BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455217354
 
 
   Fixed test case - now fails reliably when the fix is not applied.
 



Issue Time Tracking
---

Worklog Id: (was: 186356)
Time Spent: 1h 40m  (was: 1.5h)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186257
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 10:53
Start Date: 17/Jan/19 10:53
Worklog Time Spent: 10m 
  Work Description: nielm commented on issue #7537: [BEAM-6407] Revert 
"BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455129486
 
 
   note that the unit test does not repro the problem on master branch... 
   I plan to find out whether that is because the test is invalid, or because 
the problem is fixed on master...
 



Issue Time Tracking
---

Worklog Id: (was: 186257)
Time Spent: 1.5h  (was: 1h 20m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186256
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 10:53
Start Date: 17/Jan/19 10:53
Worklog Time Spent: 10m 
  Work Description: nielm commented on issue #7537: [BEAM-6407] Revert 
"BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455129486
 
 
   note that the unit test does not repro the problem on master branch... 
   I plan to find out whether that is because the test is invalid, or because 
the problem is fixed on master...
 



Issue Time Tracking
---

Worklog Id: (was: 186256)
Time Spent: 1h 20m  (was: 1h 10m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186237
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 09:16
Start Date: 17/Jan/19 09:16
Worklog Time Spent: 10m 
  Work Description: janotav commented on issue #7537: [BEAM-6407] Revert 
"BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455098685
 
 
   I will try to look into it over the weekend. If the release is sooner than 
that, rolling back is definitely the appropriate solution. The allocations 
were frequent, but given their size they are unlikely to have a material effect.
   
   Sorry for the complications, my intentions were good, honestly :-)
 



Issue Time Tracking
---

Worklog Id: (was: 186237)
Time Spent: 1h 10m  (was: 1h)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186133
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 03:52
Start Date: 17/Jan/19 03:52
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455033175
 
 
   run java precommit
 



Issue Time Tracking
---

Worklog Id: (was: 186133)
Time Spent: 50m  (was: 40m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=186134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186134
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 17/Jan/19 03:52
Start Date: 17/Jan/19 03:52
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-455033217
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 186134)
Time Spent: 1h  (was: 50m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185977
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:26
Start Date: 16/Jan/19 20:26
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454927557
 
 
   Note that these are certainly distinct hashes: 
https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#hash-java.lang.Object...-
   
   I'm guessing that the check whether every view is set up right also uses 
Objects.hash. So there might be an easy roll-forward.
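
To make the point about distinct hashes concrete: `Objects.hash(Object...)` is documented to behave as if its arguments were placed in an array and passed to `Arrays.hashCode`, so for a single argument it returns `31 + x.hashCode()` rather than `x.hashCode()`. The sketch below is only an illustration. `SingleFieldHashDemo` and its method names are hypothetical and not taken from Beam, and it assumes the reverted change swapped a single-argument `Objects.hash` call for the field's own hash code, which this thread does not show.

```java
import java.util.Objects;

// Hypothetical demo class; the names here are illustrative and not Beam code.
final class SingleFieldHashDemo {

  // Hash through the varargs helper: allocates an Object[] and delegates to
  // Arrays.hashCode, which yields 31 * 1 + field.hashCode() for one element.
  static int viaObjectsHash(Object field) {
    return Objects.hash(field);
  }

  // Allocation-free alternative: the field's own hash code (0 for null).
  static int viaObjectsHashCode(Object field) {
    return Objects.hashCode(field);
  }

  public static void main(String[] args) {
    String tag = "outputDir";
    System.out.println("Objects.hash:     " + viaObjectsHash(tag));
    System.out.println("Objects.hashCode: " + viaObjectsHashCode(tag));
    System.out.println("equal? " + (viaObjectsHash(tag) == viaObjectsHashCode(tag)));
  }
}
```

Running `main` prints `equal? false`, because the two values can never coincide for the same input. Any check that compares a previously stored hash with a freshly computed one therefore has to use the same formula on both sides.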
 



Issue Time Tracking
---

Worklog Id: (was: 185977)
Time Spent: 40m  (was: 0.5h)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185975
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:23
Start Date: 16/Jan/19 20:23
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454926641
 
 
   Can you add a regression test?
 



Issue Time Tracking
---

Worklog Id: (was: 185975)
Time Spent: 0.5h  (was: 20m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185971
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 20:17
Start Date: 16/Jan/19 20:17
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537#issuecomment-454924351
 
 
   CC @janotav @iemejia 
 



Issue Time Tracking
---

Worklog Id: (was: 185971)
Time Spent: 20m  (was: 10m)

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-01-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?focusedWorklogId=185922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-185922
 ]

ASF GitHub Bot logged work on BEAM-6407:


Author: ASF GitHub Bot
Created on: 16/Jan/19 18:13
Start Date: 16/Jan/19 18:13
Worklog Time Spent: 10m 
  Work Description: nielm commented on pull request #7537: [BEAM-6407] 
Revert "BEAM-5933: avoid memory allocation in hashCode call"
URL: https://github.com/apache/beam/pull/7537
 
 
   Reverts apache/beam#6909, as this appears to cause an obscure error in 
FileIO.writeDynamic
 



Issue Time Tracking
---

Worklog Id: (was: 185922)
Time Spent: 10m
Remaining Estimate: 0h

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>