Hi Yan Fang,

I was able to deploy the file to HDFS, and I can see it on all my nodes, but
when I tried running the job I got this error:

Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
    at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
    at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
    at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
    at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
    at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
    at org.apache.samza.job.JobRunner.main(JobRunner.scala)
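
For reference, in Hadoop 2.x this exception usually means that nothing on the client classpath registered a FileSystem implementation for the hdfs scheme: there is neither an fs.hdfs.impl setting nor a jar whose META-INF/services/org.apache.hadoop.fs.FileSystem file lists org.apache.hadoop.hdfs.DistributedFileSystem (normally the hadoop-hdfs jar). Below is a minimal Python sketch, assuming the job's jars all sit in one directory, that lists which jars declare the HDFS implementation. The function name and the directory you pass in are hypothetical; this is a diagnostic aid, not part of Samza or Hadoop.

```python
import zipfile
from pathlib import Path

# Service file that Hadoop 2.x reads (through java.util.ServiceLoader) to
# discover the FileSystem implementations packaged in each jar.
SERVICE_FILE = "META-INF/services/org.apache.hadoop.fs.FileSystem"
HDFS_IMPL = "org.apache.hadoop.hdfs.DistributedFileSystem"


def jars_declaring(lib_dir, impl=HDFS_IMPL):
    """Return the names of jars in lib_dir whose service file lists impl."""
    found = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if SERVICE_FILE not in zf.namelist():
                    continue
                text = zf.read(SERVICE_FILE).decode("utf-8", errors="replace")
        except zipfile.BadZipFile:
            continue  # not a readable jar; skip it
        # Strip comments and blank lines from the service file.
        entries = {line.split("#", 1)[0].strip() for line in text.splitlines()}
        if impl in entries:
            found.append(jar.name)
    return found
```

Calling jars_declaring() on the lib directory of the unpacked job package returns an empty list when no jar there registers the hdfs scheme, which would be consistent with the error above.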


This is my yarn.package.path config:

yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz

Thanks in advance
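
A side note on the yarn.package.path value above: in Hadoop 2.x, 50070 is the default port of the NameNode web UI, while hdfs:// URIs normally use the RPC port configured in fs.defaultFS (often 8020 or 9000). This would not by itself cause the "No FileSystem for scheme" exception, but it may be worth double-checking. The sketch below is a hypothetical sanity check in Python; the function name and the warning texts are illustrative assumptions, not anything Samza provides.

```python
from urllib.parse import urlparse

# Default NameNode web UI ports (HTTP and HTTPS) in Hadoop 2.x. These serve
# the web interface, not the hdfs:// RPC endpoint.
NAMENODE_WEB_PORTS = {50070, 50470}


def check_package_path(path):
    """Return a list of warnings for a yarn.package.path value."""
    warnings = []
    uri = urlparse(path)
    if uri.scheme == "hdfs" and uri.port in NAMENODE_WEB_PORTS:
        warnings.append(
            "port %d is normally the NameNode web UI; hdfs:// URIs usually "
            "use the port from fs.defaultFS" % uri.port
        )
    if uri.scheme == "file":
        # Matches the earlier failure in this thread: a file: path must exist
        # on the local disk of whichever node tries to download it.
        warnings.append("file: paths are resolved on each node's local disk")
    return warnings
```

For the value above, check_package_path() returns a single warning about port 50070; a path that uses the fs.defaultFS port returns an empty list.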





On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <[email protected]> wrote:

> Hi Telles,
>
> In terms of "*I tried pushing the tar file to HDFS but I got an error from
> hadoop saying that it couldn’t find core-site.xml file*.", I guess you set
> the HADOOP_CONF_DIR variable and made it point to ~/.samza/conf. You can
> either 1) make HADOOP_CONF_DIR point to the directory where your conf files
> are, such as /etc/hadoop/conf, or 2) copy the config files to
> ~/.samza/conf. Thank you,
>
> Cheers,
>
> Fang, Yan
> [email protected]
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> [email protected]> wrote:
>
> > Hey Telles,
> >
> > To get YARN working with the HTTP file system, you need to follow the
> > instructions on:
> >
> > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
> >
> > In the "Set Up Http Filesystem for YARN" section.
> >
> > You shouldn't need to compile anything (no Gradle, which is what your
> > stack trace is showing). This setup should be done for all of the NMs,
> > since they will be the ones downloading your job's package (from
> > yarn.package.path).
> >
> > Cheers,
> > Chris
> >
> > On 8/9/14 9:44 PM, "Telles Nobrega" <[email protected]> wrote:
> >
> > >Hi again, I tried installing the Scala libs but the HTTP problem still
> > >occurs. I realised that I need to compile incubator-samza on the machines
> > >where I'm going to run the jobs, but the compilation fails with this huge
> > >message:
> > >
> > >#
> > ># There is insufficient memory for the Java Runtime Environment to continue.
> > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > ># An error report file with more information is saved as:
> > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > >Could not write standard input into: Gradle Worker 13.
> > >java.io.IOException: Broken pipe
> > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> > >#
> > ># There is insufficient memory for the Java Runtime Environment to continue.
> > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > ># An error report file with more information is saved as:
> > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > >Could not write standard input into: Gradle Worker 14.
> > >java.io.IOException: Broken pipe
> > >       at java.io.FileOutputStream.writeBytes(Native Method)
> > >       at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > >       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > >       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > >       at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >       at java.lang.Thread.run(Thread.java:744)
> > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> > >       at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > >       at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > >       at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > >       at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >       at java.lang.reflect.Method.invoke(Method.java:606)
> > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > >       at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > >       at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > >       at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > >       at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > >       at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > >       at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > >       at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > >       at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > >       at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > >       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >       at java.lang.Thread.r
> > >
> > >Do I need more memory on my machines? Each already has 4GB. I really
> > >need to have this running. I'm not sure which way is best, HTTP or HDFS.
> > >Which one do you suggest, and how can I solve my problem in each case?
> > >
> > >Thanks in advance and sorry for bothering you this much.
> > >On 10 Aug 2014, at 00:20, Telles Nobrega <[email protected]> wrote:
> > >
> > >> Hi Chris, now I have the tar file on my RM machine, and the yarn path
> > >>points to it. I changed the core-site.xml to use HttpFileSystem instead
> > >>of HDFS, and now it is failing with:
> > >>
> > >> Application application_1407640485281_0001 failed 2 times due to AM
> > >>Container for appattempt_1407640485281_0001_000002 exited with
> > >>exitCode:-1000 due to: java.lang.ClassNotFoundException: Class
> > >>org.apache.samza.util.hadoop.HttpFileSystem not found
> > >>
> > >> I think I can solve this just by installing the Scala files from the
> > >>Samza tutorial. Can you confirm that?
> > >>
> > >> On 09 Aug 2014, at 08:34, Telles Nobrega <[email protected]> wrote:
> > >>
> > >>> Hi Chris,
> > >>>
> > >>> I think the problem is that I forgot to update yarn.package.path.
> > >>> I will try again to see if it works now.
> > >>>
> > >>> I have one more question: how can I stop (from the command line) the jobs
> > >>>running in my topology? For the experiment that I will run, I need to
> > >>>run the same job at 4-minute intervals, so I need to kill it, clean
> > >>>the Kafka topics and rerun.
> > >>>
> > >>> Thanks in advance.
> > >>>
> > >>> On 08 Aug 2014, at 12:41, Chris Riccomini <[email protected]> wrote:
> > >>>
> > >>>> Hey Telles,
> > >>>>
> > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > >>>>
> > >>>> No, you should not need to do this. There are two ways to deploy your
> > >>>> tarball to the YARN grid. One is to put it in HDFS, and the other is to
> > >>>> put it on an HTTP server. The link to running a Samza job in a multi-node
> > >>>> YARN cluster describes how to do both (either HTTP server or HDFS).
> > >>>>
> > >>>> In both cases, once the tarball is put on the HTTP/HDFS server(s), you
> > >>>> must update yarn.package.path to point to it. From there, the YARN NM
> > >>>> should download it for you automatically when you start your job.
> > >>>>
> > >>>> * Can you send along a paste of your job config?
> > >>>>
> > >>>> Cheers,
> > >>>> Chris
> > >>>>
> > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <[email protected]> wrote:
> > >>>>
> > >>>>> Hi Telles, it looks to me that you forgot to update the
> > >>>>> "yarn.package.path" attribute in your config file for the task.
> > >>>>>
> > >>>>> - Claudio Martins
> > >>>>> Head of Engineering
> > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega <[email protected]> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> this is my first time trying to run a job in a multi-node environment. I
> > >>>>>> have the cluster set up, and I can see in the GUI that all nodes are
> > >>>>>> working.
> > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > >>>>>> - The first time, I tried running with the job on the namenode machine
> > >>>>>> and it failed saying:
> > >>>>>>
> > >>>>>> Application application_1407509228798_0001 failed 2 times due to AM
> > >>>>>> Container for appattempt_1407509228798_0001_000002 exited with exitCode:
> > >>>>>> -1000 due to: File
> > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > >>>>>> does not exist
> > >>>>>>
> > >>>>>> So I copied the folder to each machine in my cluster and got this error:
> > >>>>>>
> > >>>>>> Application application_1407509228798_0002 failed 2 times due to AM
> > >>>>>> Container for appattempt_1407509228798_0002_000002 exited with exitCode:
> > >>>>>> -1000 due to: Resource
> > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > >>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000
> > >>>>>>
> > >>>>>> What am I missing?
> > >>>>>>
> > >>>>>> p.s.: I followed this
> > >>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
> > >>>>>> tutorial and this
> > >>>>>> <http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
> > >>>>>> to set up the cluster.
> > >>>>>>
> > >>>>>> Help is much appreciated.
> > >>>>>>
> > >>>>>> Thanks in advance.
> > >>>>>>
> > >>>>>> --
> > >>>>>> ------------------------------------------
> > >>>>>> Telles Mota Vidal Nobrega
> > >>>>>> M.sc. Candidate at UFCG
> > >>>>>> B.sc. in Computer Science at UFCG
> > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>
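
Yan's suggestion about HADOOP_CONF_DIR in the quoted reply above can be checked before submitting a job. The following is a minimal Python sketch, assuming the client needs core-site.xml, hdfs-site.xml and yarn-site.xml in that directory; the file list and the function name are assumptions for illustration, so adjust them to your setup.

```python
import os
from pathlib import Path

# Assumed set of config files the client needs; not an official list.
EXPECTED = ("core-site.xml", "hdfs-site.xml", "yarn-site.xml")


def missing_conf_files(conf_dir=None, expected=EXPECTED):
    """Return the expected config files that are absent from conf_dir.

    When conf_dir is not given, fall back to the HADOOP_CONF_DIR
    environment variable; if that is unset too, everything is missing.
    """
    conf_dir = conf_dir or os.environ.get("HADOOP_CONF_DIR")
    if not conf_dir:
        return list(expected)
    base = Path(conf_dir).expanduser()
    return [name for name in expected if not (base / name).is_file()]
```

For example, missing_conf_files("~/.samza/conf") would report core-site.xml as missing in the situation Yan describes, and an empty list means every file in the assumed set was found.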



-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG
