[jira] [Commented] (BEAM-1053) ApexGroupByKeyOperator serialization issues

2017-04-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963410#comment-15963410
 ] 

ASF GitHub Bot commented on BEAM-1053:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2473


> ApexGroupByKeyOperator serialization issues
> ---
>
> Key: BEAM-1053
> URL: https://issues.apache.org/jira/browse/BEAM-1053
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Affects Versions: 0.3.0-incubating
> Reporter: Sandeep Deshmukh
> Assignee: Thomas Weise
>
> While trying to use the Apex Runner for the wordcount program, the
> ApexGroupByKeyOperator fails with the following exception:
> 2016-11-28 20:54:47,696 INFO com.datatorrent.stram.engine.StreamingContainer: 
> Deploy request: 
> [OperatorDeployInfo[id=9,name=ExtractPayload,type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream14,sourceNodeId=8,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream0,bufferServer=]]],
>  
> OperatorDeployInfo[id=8,name=ReadFromHDFS,type=INPUT,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream14,bufferServer=]]],
>  
> OperatorDeployInfo[id=11,name=Application.CountWords/Count.PerElement/Init/Map,type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream17,sourceNodeId=10,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream18,bufferServer=]]],
>  
> OperatorDeployInfo[id=10,name=Application.CountWords/ParDo(ExtractWords),type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream0,sourceNodeId=9,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream17,bufferServer=]]],
>  
> OperatorDeployInfo[id=13,name=Application.CountWords/Count.PerElement/Count.PerKey/Combine.GroupedValues/AnonymousParDo,type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream12,sourceNodeId=12,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream3,bufferServer=]]],
>  
> OperatorDeployInfo[id=14,name=WriteToHDFS/Window.Into()/ApexRunner.AssignWindowsAndSetStrategy/AssignWindows/AssignWindows,type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream3,sourceNodeId=13,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream10,bufferServer=goofy]]],
>  
> OperatorDeployInfo[id=12,name=Application.CountWords/Count.PerElement/Count.PerKey/GroupByKey,type=GENERIC,checkpoint={583c4b1100b3,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream18,sourceNodeId=11,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream12,bufferServer=
> 2016-11-28 20:54:48,922 ERROR 
> com.datatorrent.stram.engine.StreamingContainer: deploy request failed
> com.esotericsoftware.kryo.KryoException: Class cannot be created (missing 
> no-arg constructor): java.nio.HeapByteBuffer
> Serialization trace:
> activeTimers 
> (org.apache.beam.runners.apex.translation.operators.ApexGroupByKeyOperator)
> at 
> com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228)
> at com.esotericsoftware.kryo.Kryo.newInstantiator(Kryo.java:1049)
> at com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1058)
> at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:547)
> at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:523)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135)
> at 
> com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
> at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
> at 
> com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
> at 
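
The exception message above says that Kryo could not create `java.nio.HeapByteBuffer` because the class has no no-arg constructor. The following minimal sketch uses only the JDK (no Kryo or Beam classes) and checks that claim by reflection. The class name `NoArgConstructorCheck` and its helper method are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Constructor;
import java.nio.ByteBuffer;
import java.util.HashMap;

// Hypothetical helper class, not part of Beam, Apex, or Kryo.
class NoArgConstructorCheck {

    /** Returns true if the class declares a constructor that takes no arguments. */
    static boolean hasNoArgConstructor(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // ByteBuffer.allocate returns the heap-backed implementation class.
        Class<?> heapBufferClass = ByteBuffer.allocate(8).getClass();
        System.out.println(heapBufferClass.getName());               // java.nio.HeapByteBuffer
        System.out.println(hasNoArgConstructor(heapBufferClass));    // false
        System.out.println(hasNoArgConstructor(HashMap.class));      // true
    }
}
```

A serializer that instantiates objects through a no-arg constructor can therefore restore a `HashMap` but not a heap `ByteBuffer` stored inside it, which matches the `activeTimers` entry in the serialization trace.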

[jira] [Commented] (BEAM-1053) ApexGroupByKeyOperator serialization issues

2017-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961931#comment-15961931
 ] 

ASF GitHub Bot commented on BEAM-1053:
--

GitHub user tweise opened a pull request:

https://github.com/apache/beam/pull/2473

[BEAM-1053] ApexGroupByKeyOperator serialization issues

R: @jkff 

This code will probably go away after the SDF work, but the bug is trivial
to fix and has been open for a while, so let's close it out separately. The
same issue was already addressed for stateInternals.
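
The patch is not quoted in this thread, so the following is not the change made in the pull request. It is only a sketch of one common way to avoid this kind of failure, assuming the map held in `activeTimers` is keyed by encoded key bytes: wrap the bytes in a small class that has a no-arg constructor and value-based `equals`/`hashCode`. `BytesKey` and `BytesKeyDemo` are hypothetical names, and the `String` timer values stand in for whatever timer type the operator actually stores.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical map key that holds encoded key bytes in a plain byte[]. */
final class BytesKey {
    // Not final, so a field-based serializer can populate it after construction.
    private byte[] bytes;

    /** No-arg constructor so reflection-based serializers can instantiate the class. */
    BytesKey() {
        this.bytes = new byte[0];
    }

    BytesKey(byte[] bytes) {
        // Defensive copy so later changes to the caller's array do not alter the key.
        this.bytes = bytes.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

class BytesKeyDemo {
    public static void main(String[] args) {
        // Stand-in for the operator's timer map; String replaces the real timer type.
        Map<BytesKey, Set<String>> activeTimers = new HashMap<>();
        activeTimers.computeIfAbsent(new BytesKey(new byte[] {1, 2}), k -> new HashSet<>()).add("timer-a");
        activeTimers.computeIfAbsent(new BytesKey(new byte[] {1, 2}), k -> new HashSet<>()).add("timer-b");
        System.out.println(activeTimers.size()); // 1: both timers share one key
    }
}
```

Other options, such as marking the field transient and rebuilding it on setup, or registering a custom serializer for the field, would also sidestep the missing constructor; which one fits depends on how the operator uses the map after a checkpoint restore.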

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/beam BEAM-1053_ApexGroupByKeySerialization

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2473.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2473


commit adf378568e7c676c3b2780c07caae679ea461f01
Author: Thomas Weise 
Date:   2017-04-08T20:01:01Z

BEAM-1053 ApexGroupByKeyOperator serialization issues





[jira] [Commented] (BEAM-1053) ApexGroupByKeyOperator serialization issues

2017-01-02 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793738#comment-15793738
 ] 

Davor Bonaci commented on BEAM-1053:


Done. Welcome [~tushargosavi]!

> ApexGroupByKeyOperator serialization issues
> ---
>
> Key: BEAM-1053
> URL: https://issues.apache.org/jira/browse/BEAM-1053
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Affects Versions: 0.3.0-incubating
> Reporter: Sandeep Deshmukh
> Assignee: Tushar Gosavi
>

[jira] [Commented] (BEAM-1053) ApexGroupByKeyOperator serialization issues

2017-01-02 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793359#comment-15793359
 ] 

Thomas Weise commented on BEAM-1053:


[~davor] [~dhalp...@google.com] can you please add Tushar and assign the JIRA to him?


> ApexGroupByKeyOperator serialization issues
> ---
>
> Key: BEAM-1053
> URL: https://issues.apache.org/jira/browse/BEAM-1053
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Affects Versions: 0.3.0-incubating
> Reporter: Sandeep Deshmukh
>