[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001087#comment-16001087 ]

Kenneth Knowles commented on BEAM-1980:
---

I did not get to repro'ing, but I believe it is redundant with the proposed release process.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: Not applicable
>
>
> I'm running the "beam portability demo" at https://github.com/dhalperi/beam-portability-demo/tree/apex
> Made a very small input file:
> {code}
> gsutil cat gs://apache-beam-demo/data2/small-game.csv | head -n 10 > tiny.csv
> {code}
> Ran the job in embedded mode using an Apex fat-jar from the pom in that branch (and adding in {{slf4j-jdk14.jar}} for debugging info):
> {code}
> java -cp ~/.m2/repository/org/slf4j/slf4j-jdk14/1.7.14/slf4j-jdk14-1.7.14.jar:target/portability-demo-bundled-apex.jar demo.HourlyTeamScore --runner=ApexRunner --outputPrefix=gs://clouddfe-dhalperi/output/apex --input=tiny.csv
> {code}
> A good run takes O(25 seconds):
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/Users/dhalperi/.m2/repository/org/slf4j/slf4j-jdk14/1.7.14/slf4j-jdk14-1.7.14.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/Users/dhalperi/beam-portability-demo/target/portability-demo-bundled-apex.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
> log4j:WARN No appenders could be found for logger (org.apache.commons.beanutils.converters.BooleanConverter).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Apr 14, 2017 1:20:55 PM com.datatorrent.common.util.AsyncFSStorageAgent save
> INFO: using /var/folders/7r/s0gg2qb11jz4g4pcb8n15gkc009j1y/T/chkp8074838277485202831 as the basepath for checkpointing.
> Apr 14, 2017 1:20:56 PM com.datatorrent.bufferserver.storage.DiskStorage
> INFO: using /var/folders/7r/s0gg2qb11jz4g4pcb8n15gkc009j1y/T as the basepath for spooling.
> Apr 14, 2017 1:20:56 PM com.datatorrent.bufferserver.server.Server registered
> INFO: Server started listening at /0:0:0:0:0:0:0:0:61087
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster
> INFO: Buffer server started: localhost:61087
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-0
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-1
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-2
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-3
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-4
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$LocalStreamingContainerLauncher run
> INFO: Started container container-5
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.Journal write
> WARNING: Journal output stream is null. Skipping write to the WAL.
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$UmbilicalProtocolLocalImpl log
> INFO: container-2 msg: [container-2] Entering heartbeat loop..
> Apr 14, 2017 1:20:56 PM com.datatorrent.stram.StramLocalCluster$UmbilicalProtocolLocalImpl log
> INFO: container-1 msg: [container-1] Entering heartbeat loop..
>
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001066#comment-16001066 ]

Davor Bonaci commented on BEAM-1980:

Has anyone been able to reproduce this in a while, [~kenn], [~thw]? If not, I'd resolve this and move on.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: 2.0.0
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990363#comment-15990363 ]

Kenneth Knowles commented on BEAM-1980:
---

Let's try to repro and close this out, now, then.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: First stable release
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969612#comment-15969612 ]

Daniel Halperin commented on BEAM-1980:
---

Dumb note: this may be a side effect of the workaround we deployed to [BEAM-1981] in which we made the timer data transient and lost on checkpoint.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: First stable release
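
To illustrate what "transient and lost on checkpoint" can mean, here is a minimal sketch. It assumes plain Java serialization stands in for the checkpoint mechanism, which may not match how the Apex runner actually persists operator state. `OperatorState` and `pendingTimers` are hypothetical names, not classes or fields from Beam or Apex.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class TransientCheckpointDemo {

    /** Hypothetical operator state; not a class from Beam or Apex. */
    static final class OperatorState implements Serializable {
        private static final long serialVersionUID = 1L;

        final String key;
        // Marked transient, so Java serialization skips this field.
        transient List<Long> pendingTimers = new ArrayList<>();

        OperatorState(String key) {
            this.key = key;
        }
    }

    /** Simulates checkpoint and restore with a round trip through Java serialization. */
    static OperatorState checkpointAndRestore(OperatorState state)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (OperatorState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        OperatorState before = new OperatorState("team-a");
        before.pendingTimers.add(1492200000000L); // a timer that should fire later

        OperatorState after = checkpointAndRestore(before);
        System.out.println("key restored: " + after.key);
        // The transient field is not written, and field initializers do not run
        // on deserialization, so the timers come back as null.
        System.out.println("timers restored: " + after.pendingTimers);
    }
}
```

If a pipeline relies on such a timer to close a window and emit output, state restored without it could plausibly wait indefinitely, which is the kind of side effect the note above speculates about.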
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969608#comment-15969608 ]

Daniel Halperin commented on BEAM-1980:
---

Just repro'ed; after 5 minutes the output is at the same place (undeployed container [2]) and the output file has not been created. So it seems stuck while processing.

jstack:
{code}
2017-04-14 15:08:34
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.112-b16 mixed mode):

"Attach Listener" #121 daemon prio=9 os_prio=31 tid=0x7f91a9140800 nid=0x9007 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"StorageHelper-2-1" #120 prio=5 os_prio=31 tid=0x7f91a7193000 nid=0x9a03 waiting on condition [0x7dac]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0006c0cd1cf0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

"ServerHelper-1-1" #119 prio=5 os_prio=31 tid=0x7f91a5a85800 nid=0x9803 waiting on condition [0x7d9bd000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0006c0cde730> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

"9/SumTeamScores/GroupByKey:ApexGroupByKeyOperator" #118 prio=5 os_prio=31 tid=0x7f91a70d9800 nid=0x9603 waiting on condition [0x7d8ba000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
        at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"10/SumTeamScores/ParDo(WriteWindowedFiles)/ParMultiDo(WriteWindowedFiles):ApexParDoOperator" #117 prio=5 os_prio=31 tid=0x7f91a8811800 nid=0x9403 waiting on condition [0x7d7b7000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
        at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"3/SetTimestamps/ParMultiDo(SetTimestamps):ApexParDoOperator" #115 prio=5 os_prio=31 tid=0x7f91a8860800 nid=0x8e03 waiting on condition [0x7d4ae000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
        at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"5/SumTeamScores/ParDo(KeyScoreByTeam)/ParMultiDo(KeyScoreByTeam):ApexParDoOperator" #114 prio=5 os_prio=31 tid=0x7f91a8809000 nid=0x8c03 waiting on condition [0x7d3ab000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
        at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"4/FixedWindows/Window.Assign:ApexProcessFnOperator" #112 prio=5 os_prio=31 tid=0x7f91a6a66800 nid=0x8a03 waiting on condition [0x7d2a8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:601)
        at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)

"6/SumTeamScores/Combine.perKey(SumInteger)/GroupByKey:ApexGroupByKeyOperator" #111 prio=5 os_prio=31 tid=0x7f91a6a89800 nid=0x8803 waiting on condition [0x00
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969598#comment-15969598 ]

Daniel Halperin commented on BEAM-1980:
---

Hmm, that is a good question. I'm honestly not sure I know but I will repro and tell you. Certainly, it did not undeploy more containers than the 2 seen in the log.

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: First stable release
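
When reproducing a hang like this, it can help to separate a real lock cycle from threads that are merely idle. The sketch below uses the standard `java.lang.management.ThreadMXBean` API to count threads by state and to ask the JVM for deadlocked threads. It inspects only the JVM it runs in, so using it against the embedded Apex job is an assumption: it would have to be called from inside that process or reached over JMX. The class name `ThreadStateSummary` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSummary {

    /** Counts the live threads of this JVM by Thread.State. */
    static Map<Thread.State, Integer> countByState(ThreadMXBean threads) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked monitor and synchronizer details, states are enough here.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the ids of threads caught in a lock cycle, or an empty array if there are none. */
    static long[] deadlockedThreadIds(ThreadMXBean threads) {
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is detected
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("threads by state: " + countByState(threads));
        System.out.println("deadlocked threads: " + deadlockedThreadIds(threads).length);
    }
}
```

An empty deadlock result combined with many threads in `WAITING` or `TIMED_WAITING` would point toward operators waiting for input or a timer rather than a classic monitor deadlock.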
[jira] [Commented] (BEAM-1980) Seeming deadlock using Apex with relatively small data
[ https://issues.apache.org/jira/browse/BEAM-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969586#comment-15969586 ]

Thomas Weise commented on BEAM-1980:

Did the stalled run produce the expected output and just not exit or did it get stuck while processing?

> Seeming deadlock using Apex with relatively small data
> --
>
> Key: BEAM-1980
> URL: https://issues.apache.org/jira/browse/BEAM-1980
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Daniel Halperin
> Assignee: Thomas Weise
> Fix For: First stable release
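
The two cases in the question above can be told apart mechanically. The following is a rough sketch: it launches a command, waits up to a timeout, and then checks whether an expected output file exists. It assumes the output lands on the local filesystem, which is not the case for the `gs://` prefix used in the report, so the path check would need to be swapped for something that can see that location. `HangTriage`, `Outcome`, and the argument layout are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class HangTriage {

    enum Outcome { FINISHED, EXITED_WITHOUT_OUTPUT, OUTPUT_BUT_NO_EXIT, STUCK_WHILE_PROCESSING }

    /** Maps "did the process exit in time" and "is the output there" to one of four cases. */
    static Outcome classify(boolean exitedInTime, boolean outputExists) {
        if (exitedInTime) {
            return outputExists ? Outcome.FINISHED : Outcome.EXITED_WITHOUT_OUTPUT;
        }
        return outputExists ? Outcome.OUTPUT_BUT_NO_EXIT : Outcome.STUCK_WHILE_PROCESSING;
    }

    /** Runs the command, waits up to the timeout, then checks a local output path. */
    static Outcome run(List<String> command, Path expectedOutput, long timeoutSeconds)
            throws Exception {
        Process process = new ProcessBuilder(command).inheritIO().start();
        boolean exited = process.waitFor(timeoutSeconds, TimeUnit.SECONDS);
        if (!exited) {
            process.destroyForcibly(); // do not leave a hung job behind
        }
        return classify(exited, Files.exists(expectedOutput));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: HangTriage <timeoutSeconds> <expectedOutputPath> <command...>");
            return;
        }
        long timeout = Long.parseLong(args[0]);
        Path output = Paths.get(args[1]);
        List<String> command = Arrays.asList(args).subList(2, args.length);
        System.out.println(run(command, output, timeout));
    }
}
```

With a timeout comfortably above the roughly 25 seconds a good run takes, `OUTPUT_BUT_NO_EXIT` would correspond to "produced the expected output and just did not exit", and `STUCK_WHILE_PROCESSING` to the other case.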