Hey Telles,

To get YARN working with the HTTP file system, you need to follow the instructions on:
http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html

in the "Set Up Http Filesystem for YARN" section. You shouldn't need to compile anything (no Gradle, which is what your stack trace is showing). This setup should be done for all of the NMs, since they will be the ones downloading your job's package (from yarn.package.path).

Cheers,
Chris

On 8/9/14 9:44 PM, "Telles Nobrega" <[email protected]> wrote:

> Hi again, I tried installing the scala libs but the Http problem still
> occurs. I realised that I need to compile incubator-samza on the machines
> that I'm going to run the jobs on, but the compilation fails with this
> huge message:
>
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> # An error report file with more information is saved as:
> # /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> Could not write standard input into: Gradle Worker 13.
> java.io.IOException: Broken pipe
>     at java.io.FileOutputStream.writeBytes(Native Method)
>     at java.io.FileOutputStream.write(FileOutputStream.java:345)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>     at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
>     at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Process 'Gradle Worker 13' finished with non-zero exit value 1
> org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
>     at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
>     at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
>     at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
>     at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>     at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
>     at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
>     at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>     at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
>     at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
>     at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
>     at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
>     at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
>     at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> # An error report file with more information is saved as:
> # /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> Could not write standard input into: Gradle Worker 14.
> java.io.IOException: Broken pipe
>     at java.io.FileOutputStream.writeBytes(Native Method)
>     at java.io.FileOutputStream.write(FileOutputStream.java:345)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>     at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
>     at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Process 'Gradle Worker 14' finished with non-zero exit value 1
> org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
>     at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
>     at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
>     at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
>     at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>     at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
>     at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
>     at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>     at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
>     at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
>     at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
>     at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
>     at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
>     at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.r
>
> Do I need more memory for my machines? Each already has 4GB. I really
> need to have this running. I'm not sure which way is best, http or hdfs.
> Which one do you suggest, and how can I solve my problem for each case?
>
> Thanks in advance and sorry for bothering this much.
>
> On 10 Aug 2014, at 00:20, Telles Nobrega <[email protected]> wrote:
>
>> Hi Chris, now I have the tar file in my RM machine, and the yarn path
>> points to it. I changed the core-site.xml to use HttpFileSystem instead
>> of HDFS. Now it is failing with
>>
>> Application application_1407640485281_0001 failed 2 times due to AM
>> Container for appattempt_1407640485281_0001_000002 exited with
>> exitCode: -1000 due to: java.lang.ClassNotFoundException: Class
>> org.apache.samza.util.hadoop.HttpFileSystem not found
>>
>> I think I can solve this just by installing the scala files from the
>> samza tutorial, can you confirm that?
>>
>> On 09 Aug 2014, at 08:34, Telles Nobrega <[email protected]> wrote:
>>
>>> Hi Chris,
>>>
>>> I think the problem is that I forgot to update the yarn.job.package.
>>> I will try again to see if it works now.
>>>
>>> I have one more question: how can I stop (command line) the jobs
>>> running in my topology? For the experiment that I will run, I need to
>>> run the same job in 4-minute intervals, so I need to kill it, clean
>>> the kafka topics and rerun.
>>>
>>> Thanks in advance.
>>>
>>> On 08 Aug 2014, at 12:41, Chris Riccomini <[email protected]> wrote:
>>>
>>>> Hey Telles,
>>>>
>>>>>> Do I need to have the job folder on each machine in my cluster?
>>>>
>>>> No, you should not need to do this. There are two ways to deploy your
>>>> tarball to the YARN grid. One is to put it in HDFS, and the other is
>>>> to put it on an HTTP server. The link to running a Samza job in a
>>>> multi-node YARN cluster describes how to do both (either HTTP server
>>>> or HDFS).
>>>>
>>>> In both cases, once the tarball is put on the HTTP/HDFS server(s), you
>>>> must update yarn.package.path to point to it. From there, the YARN NM
>>>> should download it for you automatically when you start your job.
>>>>
>>>> * Can you send along a paste of your job config?
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> On 8/8/14 8:04 AM, "Claudio Martins" <[email protected]> wrote:
>>>>
>>>>> Hi Telles, it looks to me that you forgot to update the
>>>>> "yarn.package.path" attribute in your config file for the task.
>>>>>
>>>>> - Claudio Martins
>>>>> Head of Engineering
>>>>> MobileAware USA Inc. / www.mobileaware.com
>>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
>>>>> linkedin: www.linkedin.com/in/martinsclaudio
>>>>>
>>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> this is my first time trying to run a job on a multinode
>>>>>> environment. I have the cluster set up, and I can see in the GUI
>>>>>> that all nodes are working.
>>>>>> Do I need to have the job folder on each machine in my cluster?
>>>>>> - The first time I tried running with the job on the namenode
>>>>>> machine and it failed saying:
>>>>>>
>>>>>> Application application_1407509228798_0001 failed 2 times due to AM
>>>>>> Container for appattempt_1407509228798_0001_000002 exited with
>>>>>> exitCode: -1000 due to: File
>>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
>>>>>> does not exist
>>>>>>
>>>>>> So I copied the folder to each machine in my cluster and got this
>>>>>> error:
>>>>>>
>>>>>> Application application_1407509228798_0002 failed 2 times due to AM
>>>>>> Container for appattempt_1407509228798_0002_000002 exited with
>>>>>> exitCode: -1000 due to: Resource
>>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
>>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000)
>>>>>>
>>>>>> What am I missing?
>>>>>>
>>>>>> p.s.: I followed this tutorial
>>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
>>>>>> and this one
>>>>>> <http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
>>>>>> to set up the cluster.
>>>>>>
>>>>>> Help is much appreciated.
>>>>>>
>>>>>> Thanks in advance.
>>>>>>
>>>>>> --
>>>>>> ------------------------------------------
>>>>>> Telles Mota Vidal Nobrega
>>>>>> M.sc. Candidate at UFCG
>>>>>> B.sc. in Computer Science at UFCG
>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
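[Archive note] For readers landing here with the same ClassNotFoundException: the tutorial Chris links has each NodeManager map the http:// scheme to Samza's HttpFileSystem in core-site.xml. A sketch of the property, assuming the class name from the error above; jar locations vary by install:

```xml
<!-- core-site.xml on every NodeManager: map the http:// scheme to Samza's
     filesystem so YARN can fetch yarn.package.path over HTTP. The jar
     containing this class (and the scala-library jar it depends on) must
     also be on the YARN classpath, which is what the "scala libs" step in
     the tutorial covers. -->
<property>
  <name>fs.http.impl</name>
  <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
</property>
```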

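[Archive note] On the open question of stopping jobs from the command line: a sketch, assuming a Hadoop 2.x `yarn` CLI on the PATH (flags vary slightly across Hadoop versions; the application id is the one from this thread, and the job name in the sample line is made up):

```shell
# Find and kill a running Samza job by application id:
#
#   yarn application -list -appStates RUNNING   # shows ids of running apps
#   yarn application -kill application_1407640485281_0001
#
# (Cleaning the Kafka topics between runs depends on the broker version's
# tooling, so it is left out here.)
#
# Picking the id out of `-list` output can be scripted; the sample line
# below stands in for real CLI output:
sample="application_1407640485281_0001   my-samza-job   Samza:AM   RUNNING"
app_id=$(printf '%s\n' "$sample" | awk '/RUNNING/ {print $1}')
echo "$app_id"
```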