[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-05-08 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001087#comment-16001087
 ] 

Kenneth Knowles commented on BEAM-1980:
---

I did not get around to reproducing this, but I believe it is redundant with 
the proposed release process.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: Not applicable
>
>
> I'm running the "beam portability demo" at 
> https://github.com/dhalperi/beam-portability-demo/tree/apex
> Made a very small input file:
> {code}
> gsutil cat gs://apache-beam-demo/data2/small-game.csv | head -n 10 > tiny.csv
> {code}
> Ran the job in embedded mode using an Apex fat-jar from the pom in that 
> branch (and adding in {{slf4j-jdk14.jar}} for debugging info):
> {code}
> java -cp ~/.m2/repository/org/slf4j/slf4j-jdk14/1.7.14/slf4j-jdk14-1.7.14.jar:target/portability-demo-bundled-apex.jar \
>   demo.HourlyTeamScore --runner=ApexRunner \
>   --outputPrefix=gs://clouddfe-dhalperi/output/apex --input=tiny.csv
> {code}
> A good run takes O(25 seconds):
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/Users/dhalperi/.m2/repository/org/slf4j/slf4j-jdk14/1.7.14/slf4j-jdk14-1.7.14.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/Users/dhalperi/beam-portability-demo/target/portability-demo-bundled-apex.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
> log4j:WARN No appenders could be found for logger 
> (org.apache.commons.beanutils.converters.BooleanConverter).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> Apr 14, 2017 1:20:55 PM com.datatorrent.common.util.AsyncFSStorageAgent save
> INFO: using 
> /var/folders/7r/s0gg2qb11jz4g4pcb8n15gkc009j1y/T/chkp8074838277485202831 as 
> the basepath for checkpointing.
> Apr 14, 2017 1:20:56 PM com.datatorrent.bufferserver.storage.DiskStorage <init>
> INFO: using /var/folders/7r/s0gg2qb11jz4g4pcb8n15gkc009j1y/T as the basepath 
> for spooling.
> Apr 14, 2017 1:20:56 PM com.datatorrent.bufferserver.server.Server registered
> INFO: Server started listening at /0:0:0:0:0:0:0:0:61087
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster <init>
> INFO: Buffer server started: localhost:61087
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-0
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-1
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-2
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-3
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-4
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-5
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$UmbilicalProtocolLocalImpl log
> INFO: container-2 msg: [container-2] Entering heartbeat loop..
> Apr 14, 2017 1:20:56 PM 
> com.datatorrent.stram.StramLocalCluster$UmbilicalProtocolLocalImpl log
> INFO: container-1 msg: [container-1] Entering heartbeat loop..
>

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-05-08 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001066#comment-16001066
 ] 

Davor Bonaci commented on BEAM-1980:
---

Has anyone been able to reproduce this in a while, [~kenn], [~thw]?

If not, I'd resolve this and move on.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: 2.0.0
>
>

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-04-30 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990363#comment-15990363
 ] 

Kenneth Knowles commented on BEAM-1980:
---

Let's try to repro this now and close it out, then.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: First stable release
>
>

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-04-14 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969612#comment-15969612
 ] 

Daniel Halperin commented on BEAM-1980:
---

Dumb note: this may be a side effect of the workaround we deployed for 
[BEAM-1981], in which we made the timer data transient, so it is lost on 
checkpoint.
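
To make the "transient and lost on checkpoint" effect concrete, here is a minimal sketch. It is not the Apex runner's code: it uses plain JDK serialization as a stand-in for the checkpoint mechanism, which it does not model, and the names `OperatorState`, `bufferedElements` and `pendingTimers` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class TransientCheckpointSketch {

    // Hypothetical operator state; the names are illustrative, not taken from Beam or Apex.
    static final class OperatorState implements Serializable {
        private static final long serialVersionUID = 1L;

        // Ordinary field: written out and restored by the round trip.
        final List<String> bufferedElements = new ArrayList<>();

        // Transient field: skipped when the state is written, so it is null after restore.
        transient List<Long> pendingTimers = new ArrayList<>();
    }

    // Serializes the state to bytes and reads it back (assumed stand-in for checkpoint + recovery).
    static OperatorState roundTrip(OperatorState state) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (OperatorState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        OperatorState before = new OperatorState();
        before.bufferedElements.add("team-a,10");
        before.pendingTimers.add(1492200000000L);

        OperatorState after = roundTrip(before);
        System.out.println("buffered elements restored: " + after.bufferedElements);
        System.out.println("pending timers restored: " + after.pendingTimers);
    }
}
```

In this sketch the buffered elements come back but the timers do not. If an operator relied on such a timer to fire after recovery, it could wait indefinitely, which would be consistent with the stall described here, though that link is only a hypothesis.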

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: First stable release
>
>

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-04-14 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969608#comment-15969608
 ] 

Daniel Halperin commented on BEAM-1980:
---

Just repro'ed; after 5 minutes the output is at the same place (undeployed 
container [2]) and the output file has not been created. So it seems stuck 
while processing.
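
As a rough first check on whether a stall like this is a lock deadlock or just idle threads, the JVM's own management API can be queried. The following is a minimal sketch, assuming it is invoked inside the same JVM as the embedded job (for example from a watchdog thread); the class name `ThreadStateSummary` is hypothetical and nothing here is part of Beam or Apex.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSummary {

    // Counts the live threads of this JVM by thread state.
    static Map<Thread.State, Integer> countByState(ThreadMXBean threads) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    // Number of threads caught in a cycle waiting on monitors or ownable synchronizers.
    static int deadlockedThreadCount(ThreadMXBean threads) {
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is detected
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("deadlocked threads: " + deadlockedThreadCount(threads));
        System.out.println("threads by state: " + countByState(threads));
    }
}
```

A stall in which every thread is merely sleeping or parked would report zero deadlocked threads here, so a zero only rules out a lock cycle. It does not show that the job is making progress.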

jstack:

{code}
2017-04-14 15:08:34
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.112-b16 mixed mode):

"Attach Listener" #121 daemon prio=9 os_prio=31 tid=0x7f91a9140800 nid=0x9007 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"StorageHelper-2-1" #120 prio=5 os_prio=31 tid=0x7f91a7193000 nid=0x9a03 waiting on condition [0x7dac]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0006c0cd1cf0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"ServerHelper-1-1" #119 prio=5 os_prio=31 tid=0x7f91a5a85800 nid=0x9803 waiting on condition [0x7d9bd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0006c0cde730> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"9/SumTeamScores/GroupByKey:ApexGroupByKeyOperator" #118 prio=5 os_prio=31 tid=0x7f91a70d9800 nid=0x9603 waiting on condition [0x7d8ba000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"10/SumTeamScores/ParDo(WriteWindowedFiles)/ParMultiDo(WriteWindowedFiles):ApexParDoOperator" #117 prio=5 os_prio=31 tid=0x7f91a8811800 nid=0x9403 waiting on condition [0x7d7b7000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"3/SetTimestamps/ParMultiDo(SetTimestamps):ApexParDoOperator" #115 prio=5 os_prio=31 tid=0x7f91a8860800 nid=0x8e03 waiting on condition [0x7d4ae000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"5/SumTeamScores/ParDo(KeyScoreByTeam)/ParMultiDo(KeyScoreByTeam):ApexParDoOperator" #114 prio=5 os_prio=31 tid=0x7f91a8809000 nid=0x8c03 waiting on condition [0x7d3ab000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"4/FixedWindows/Window.Assign:ApexProcessFnOperator" #112 prio=5 os_prio=31 tid=0x7f91a6a66800 nid=0x8a03 waiting on condition [0x7d2a8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"6/SumTeamScores/Combine.perKey(SumInteger)/GroupByKey:ApexGroupByKeyOperator" #111 prio=5 os_prio=31 tid=0x7f91a6a89800 nid=0x8803 waiting on condition [0x00

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-04-14 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969598#comment-15969598
 ] 

Daniel Halperin commented on BEAM-1980:
---

Hmm, that is a good question. I'm honestly not sure, but I will repro and 
tell you.

Certainly, it did not undeploy more containers than the 2 seen in the log.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: First stable release
>
>

[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data

2017-04-14 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969586#comment-15969586
 ] 

Thomas Weise commented on BEAM-1980:
---

Did the stalled run produce the expected output and just not exit, or did it 
get stuck while processing?
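
One way to tell these two cases apart on a rerun is to launch the job with a time limit and then look for the output. The following is a minimal sketch under stated assumptions: the output is checked as a single local file (the run in this issue writes to a `gs://` prefix, which this sketch does not handle), and the class `StallCheck`, its `Outcome` names and the command-line layout are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class StallCheck {

    enum Outcome { EXITED, OUTPUT_BUT_NO_EXIT, NO_OUTPUT_NO_EXIT }

    // Maps the two observations onto the cases asked about above.
    static Outcome classify(boolean exited, boolean outputExists) {
        if (exited) {
            return Outcome.EXITED;
        }
        return outputExists ? Outcome.OUTPUT_BUT_NO_EXIT : Outcome.NO_OUTPUT_NO_EXIT;
    }

    // Runs the command, waits up to the timeout, then checks for the expected output file.
    static Outcome runAndClassify(List<String> command, Path expectedOutput, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        boolean exited = process.waitFor(timeoutSeconds, TimeUnit.SECONDS);
        boolean outputExists = Files.exists(expectedOutput);
        if (!exited) {
            process.destroyForcibly(); // do not leave the stalled process running
        }
        return classify(exited, outputExists);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: StallCheck <expected-output-file> <timeout-seconds> <command...>");
            return;
        }
        Path expectedOutput = Paths.get(args[0]);
        long timeoutSeconds = Long.parseLong(args[1]);
        List<String> command = Arrays.asList(args).subList(2, args.length);
        System.out.println(runAndClassify(command, expectedOutput, timeoutSeconds));
    }
}
```

Given that a good run is reported to take about 25 seconds, a timeout of a few minutes should separate a stall from a slow run. The sketch does not inspect the exit code, so `EXITED` only means the process ended, not that it succeeded.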

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Daniel Halperin
>Assignee: Thomas Weise
> Fix For: First stable release
>
>