Re: Need help running beam word count example on apex/hdfs
For anyone who faces the same issue: I was not able to make it work with "mvn compile exec:java ...". Instead, I ran it with the "hadoop jar ..." command, which fixed the hang. My best guess is that Maven is picking up an incompatible version of commons-io from the wrong side of the dependency tree.

Regards,
Shashank

On Mon, Jan 8, 2018 at 4:07 PM, Shashank Prabhakara wrote:

> Forgot to mention:
>
> Execution works in embedded mode and counts are created on the local fs. I
> need this to run on hdfs/yarn with --embeddedExecution=false.
>
> Regards,
> Shashank
>
> On Mon, Jan 8, 2018 at 3:06 PM, Shashank Prabhakara
> wrote:
>
>> Hi All,
>>
>> I want to test beam on apex using the word count example provided in the
>> beam repository, but I'm facing some difficulties while executing word
>> count as described in the documentation.
>>
>> I'm running hadoop version 2.8.2 on debian in a multi-node environment.
>> I cloned the beam github repository - master branch and executed:
>>
>> cd examples/java
>> mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount
>> -Dexec.args="--inputFile=/tmp/input/pom.xml --output=/tmp/output/counts
>> --runner=ApexRunner --embeddedExecution=false" -Papex-runner
>>
>> However the driver hangs (waited for > 1hr) after printing the classpath
>> on the console. I have attached the stdout and stacktrace to this (Pls let
>> me know if not visible in ML).
>>
>> Thanks in advance for any help.
>>
>> Regards,
>> Shashank
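One way to test the commons-io guess is to ask the JVM which jar a class was actually loaded from, once under the Maven launch and once under the hadoop launch, and compare the two locations. The following is only a minimal sketch: the class name WhichJar and the method locate are hypothetical helpers, not part of Beam, Apex or Hadoop, and org.apache.commons.io.IOUtils is used only as a representative commons-io class. How to put this class on the same classpath as the pipeline depends on your setup and is not covered in this thread.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Beam, Apex or Hadoop):
// reports where a class was loaded from on the current classpath.
class WhichJar {

    // Returns the jar or directory that provides className,
    // or a placeholder string if that cannot be determined.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Typical for JDK core classes loaded by the bootstrap loader.
                return "(bootstrap or unknown)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumption: IOUtils stands in for "some commons-io class";
        // pass a different fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.commons.io.IOUtils";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the two launch methods report different commons-io jars, that would support the version-conflict explanation. If they report the same jar, the cause is probably elsewhere.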
Re: Need help running beam word count example on apex/hdfs
Forgot to mention:

Execution works in embedded mode and counts are created on the local fs. I need this to run on hdfs/yarn with --embeddedExecution=false.

Regards,
Shashank

On Mon, Jan 8, 2018 at 3:06 PM, Shashank Prabhakara wrote:

> Hi All,
>
> I want to test beam on apex using the word count example provided in the
> beam repository, but I'm facing some difficulties while executing word
> count as described in the documentation.
>
> I'm running hadoop version 2.8.2 on debian in a multi-node environment.
> I cloned the beam github repository - master branch and executed:
>
> cd examples/java
> mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount
> -Dexec.args="--inputFile=/tmp/input/pom.xml --output=/tmp/output/counts
> --runner=ApexRunner --embeddedExecution=false" -Papex-runner
>
> However the driver hangs (waited for > 1hr) after printing the classpath
> on the console. I have attached the stdout and stacktrace to this (Pls let
> me know if not visible in ML).
>
> Thanks in advance for any help.
>
> Regards,
> Shashank
Need help running beam word count example on apex/hdfs
Hi All,

I want to test Beam on Apex using the word count example provided in the Beam repository, but I'm having difficulty executing word count as described in the documentation.

I'm running Hadoop version 2.8.2 on Debian in a multi-node environment. I cloned the master branch of the Beam GitHub repository and executed:

    cd examples/java
    mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
        -Dexec.args="--inputFile=/tmp/input/pom.xml --output=/tmp/output/counts --runner=ApexRunner --embeddedExecution=false" \
        -Papex-runner

However, the driver hangs (I waited for more than 1 hour) after printing the classpath on the console. I have attached the stdout and stack trace to this message (please let me know if they are not visible on the mailing list).

Thanks in advance for any help.

Regards,
Shashank

Attachments: stacktrace.out (binary data), std.out (binary data)