Yes, it is like this:
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/ubuntu/hadoop-2.3.0/hdfs/datanode</value>
<description>Comma separated list of paths on the local filesystem of a
DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/ubuntu/hadoop-2.3.0/hdfs/namenode</value>
<description>Path on the local filesystem where the NameNode stores the
namespace and transaction logs persistently.</description>
</property>
</configuration>
I saw some reports that this may be a classpath problem. Does this sound
right to you?
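
If it is, would adding an explicit binding for the hdfs scheme to
core-site.xml work around it? Something like this (just a guess on my part;
the value is the stock HDFS implementation class):

<property>
<name>fs.hdfs.impl</name>
<value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
<description>The FileSystem implementation for the hdfs: scheme.</description>
</property>

Or is it only a matter of getting the hadoop-hdfs jar onto the classpath of
the process that submits the job?
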
On Mon, Aug 11, 2014 at 5:25 PM, Yan Fang <[email protected]> wrote:
> Hi Telles,
>
> It looks correct. Did you put the hdfs-site.xml into your HADOOP_CONF_DIR
> (such as ~/.samza/conf)?
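>
> If not, copying it over should be enough, for example (assuming your
> Hadoop config lives in /etc/hadoop/conf):
>
> cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml ~/.samza/conf/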
>
> Fang, Yan
> [email protected]
> +1 (206) 849-4108
>
>
> On Mon, Aug 11, 2014 at 1:02 PM, Telles Nobrega <[email protected]>
> wrote:
>
> > Hi Yan Fang,
> >
> > I was able to deploy the file to HDFS, and I can see it on all my nodes,
> > but when I tried running I got this error:
> >
> > Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
> >     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >     at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >     at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >     at org.apache.samza.job.JobRunner.run(JobRunner.scala:62)
> >     at org.apache.samza.job.JobRunner$.main(JobRunner.scala:37)
> >     at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >
> >
> > This is my yarn.package.path config:
> >
> > yarn.package.path=hdfs://telles-master-samza:50070/samza-job-package-0.7.0-dist.tar.gz
> >
> > Thanks in advance
> >
> >
> >
> >
> >
> > On Mon, Aug 11, 2014 at 3:00 PM, Yan Fang <[email protected]> wrote:
> >
> > > Hi Telles,
> > >
> > > In terms of "*I tried pushing the tar file to HDFS but I got an error
> > > from hadoop saying that it couldn't find the core-site.xml file*", I
> > > guess you set the HADOOP_CONF_DIR variable and made it point to
> > > ~/.samza/conf. You can either 1) make HADOOP_CONF_DIR point to the
> > > directory where your conf files actually are, such as /etc/hadoop/conf,
> > > or 2) copy the config files to ~/.samza/conf. Thank you,
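> > >
> > > For example, option 1) is just (assuming the standard location of your
> > > Hadoop config):
> > >
> > > export HADOOP_CONF_DIR=/etc/hadoop/conf
> > >
> > > and option 2) is something like:
> > >
> > > cp /etc/hadoop/conf/*-site.xml ~/.samza/conf/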
> > >
> > > Cheers,
> > >
> > > Fang, Yan
> > > [email protected]
> > > +1 (206) 849-4108
> > >
> > >
> > > On Mon, Aug 11, 2014 at 7:40 AM, Chris Riccomini <
> > > [email protected]> wrote:
> > >
> > > > Hey Telles,
> > > >
> > > > To get YARN working with the HTTP file system, you need to follow the
> > > > instructions on
> > > >
> > > > http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
> > > >
> > > > in the "Set Up Http Filesystem for YARN" section.
> > > >
> > > > You shouldn't need to compile anything (no Gradle, which is what your
> > > > stack trace is showing). This setup should be done for all of the NMs,
> > > > since they will be the ones downloading your job's package (from
> > > > yarn.package.path).
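> > > >
> > > > In short, that section has you register the HTTP file system in each
> > > > NM's core-site.xml, roughly like this (see the tutorial for the exact
> > > > steps):
> > > >
> > > > <property>
> > > > <name>fs.http.impl</name>
> > > > <value>org.apache.samza.util.hadoop.HttpFileSystem</value>
> > > > </property>
> > > >
> > > > plus putting the jar that provides HttpFileSystem (and its Scala
> > > > dependency) on the NM's classpath.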
> > > >
> > > > Cheers,
> > > > Chris
> > > >
> > > > On 8/9/14 9:44 PM, "Telles Nobrega" <[email protected]> wrote:
> > > >
> > > > >Hi again, I tried installing the scala libs but the Http problem still
> > > > >occurs. I realised that I need to compile incubator-samza on the machines
> > > > >where I'm going to run the jobs, but the compilation fails with this huge
> > > > >message:
> > > > >
> > > > >#
> > > > ># There is insufficient memory for the Java Runtime Environment to continue.
> > > > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > > > ># An error report file with more information is saved as:
> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2506.log
> > > > >Could not write standard input into: Gradle Worker 13.
> > > > >java.io.IOException: Broken pipe
> > > > >    at java.io.FileOutputStream.writeBytes(Native Method)
> > > > >    at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > > >    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > > >    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > > >    at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > > >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > >    at java.lang.Thread.run(Thread.java:744)
> > > > >Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 13' finished with non-zero exit value 1
> > > > >    at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > > >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > > >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > >    at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > > >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > > >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > > >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > > >    at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > > >    at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > > >    at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > > >    at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > > >    at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > > >    at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > > >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > >    at java.lang.Thread.run(Thread.java:744)
> > > > >OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000070a6c0000, 3946053632, 0) failed; error='Cannot allocate memory' (errno=12)
> > > > >#
> > > > ># There is insufficient memory for the Java Runtime Environment to continue.
> > > > ># Native memory allocation (malloc) failed to allocate 3946053632 bytes for committing reserved memory.
> > > > ># An error report file with more information is saved as:
> > > > ># /home/ubuntu/incubator-samza/samza-kafka/hs_err_pid2518.log
> > > > >Could not write standard input into: Gradle Worker 14.
> > > > >java.io.IOException: Broken pipe
> > > > >    at java.io.FileOutputStream.writeBytes(Native Method)
> > > > >    at java.io.FileOutputStream.write(FileOutputStream.java:345)
> > > > >    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> > > > >    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
> > > > >    at org.gradle.process.internal.streams.ExecOutputHandleRunner.run(ExecOutputHandleRunner.java:53)
> > > > >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > >    at java.lang.Thread.run(Thread.java:744)
> > > > >Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > > >org.gradle.process.internal.ExecException: Process 'Gradle Worker 14' finished with non-zero exit value 1
> > > > >    at org.gradle.process.internal.DefaultExecHandle$ExecResultImpl.assertNormalExitValue(DefaultExecHandle.java:362)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess.onProcessStop(DefaultWorkerProcess.java:89)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess.access$000(DefaultWorkerProcess.java:33)
> > > > >    at org.gradle.process.internal.DefaultWorkerProcess$1.executionFinished(DefaultWorkerProcess.java:55)
> > > > >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > > >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > >    at java.lang.reflect.Method.invoke(Method.java:606)
> > > > >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
> > > > >    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
> > > > >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:81)
> > > > >    at org.gradle.listener.BroadcastDispatch.dispatch(BroadcastDispatch.java:30)
> > > > >    at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
> > > > >    at com.sun.proxy.$Proxy46.executionFinished(Unknown Source)
> > > > >    at org.gradle.process.internal.DefaultExecHandle.setEndStateInfo(DefaultExecHandle.java:212)
> > > > >    at org.gradle.process.internal.DefaultExecHandle.finished(DefaultExecHandle.java:309)
> > > > >    at org.gradle.process.internal.ExecHandleRunner.completed(ExecHandleRunner.java:108)
> > > > >    at org.gradle.process.internal.ExecHandleRunner.run(ExecHandleRunner.java:88)
> > > > >    at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:66)
> > > > >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > >    at java.lang.Thread.r
> > > > >
> > > > >Do I need more memory on my machines? Each already has 4GB. I really
> > > > >need to get this running. I'm not sure which way is best, HTTP or HDFS;
> > > > >which one do you suggest, and how can I solve my problem in each case?
> > > > >
> > > > >Thanks in advance, and sorry for bothering this much.
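> > > > >
> > > > >(Would capping the Gradle JVM heap help here instead of adding RAM? If I
> > > > >read the Gradle docs right, something like this in gradle.properties
> > > > >should do it:
> > > > >
> > > > >org.gradle.jvmargs=-Xmx1g
> > > > >
> > > > >but I have not tried it yet.)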
> > > > >On 10 Aug 2014, at 00:20, Telles Nobrega <[email protected]>
> > > wrote:
> > > > >
> > > > >> Hi Chris, now I have the tar file on my RM machine, and the yarn path
> > > > >> points to it. I changed core-site.xml to use HttpFileSystem instead of
> > > > >> HDFS, and now it is failing with:
> > > > >>
> > > > >> Application application_1407640485281_0001 failed 2 times due to AM
> > > > >> Container for appattempt_1407640485281_0001_000002 exited with
> > > > >> exitCode: -1000 due to: java.lang.ClassNotFoundException: Class
> > > > >> org.apache.samza.util.hadoop.HttpFileSystem not found
> > > > >>
> > > > >> I think I can solve this just by installing the scala files from the
> > > > >> samza tutorial, can you confirm that?
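> > > > >>
> > > > >> (By "installing the scala files" I mean copying the jars from the
> > > > >> tutorial onto every NM, e.g. something like
> > > > >>
> > > > >> cp scala-library-2.10.*.jar samza-yarn_2.10-0.7.0.jar $HADOOP_YARN_HOME/lib/
> > > > >>
> > > > >> -- jar names and target dir from memory, the exact list is in the
> > > > >> tutorial.)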
> > > > >>
> > > > >> On 09 Aug 2014, at 08:34, Telles Nobrega <[email protected]
> >
> > > > >>wrote:
> > > > >>
> > > > >>> Hi Chris,
> > > > >>>
> > > > >>> I think the problem is that I forgot to update yarn.package.path.
> > > > >>> I will try again to see if it works now.
> > > > >>>
> > > > >>> I have one more question: how can I stop (from the command line) the
> > > > >>> jobs running in my topology? For the experiment that I will run, I
> > > > >>> need to run the same job at 4-minute intervals, so I need to kill it,
> > > > >>> clean the Kafka topics, and rerun.
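> > > > >>>
> > > > >>> (Is it just
> > > > >>>
> > > > >>> yarn application -list
> > > > >>> yarn application -kill <application-id>
> > > > >>>
> > > > >>> or is there a Samza-specific way?)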
> > > > >>>
> > > > >>> Thanks in advance.
> > > > >>>
> > > > >>> On 08 Aug 2014, at 12:41, Chris Riccomini
> > > > >>><[email protected]> wrote:
> > > > >>>
> > > > >>>> Hey Telles,
> > > > >>>>
> > > > >>>>>> Do I need to have the job folder on each machine in my cluster?
> > > > >>>>
> > > > >>>> No, you should not need to do this. There are two ways to deploy
> > > > >>>> your tarball to the YARN grid. One is to put it in HDFS, and the
> > > > >>>> other is to put it on an HTTP server. The link to running a Samza
> > > > >>>> job in a multi-node YARN cluster describes how to do both (either
> > > > >>>> HTTP server or HDFS).
> > > > >>>>
> > > > >>>> In both cases, once the tarball is put on the HTTP/HDFS server(s),
> > > > >>>> you must update yarn.package.path to point to it. From there, the
> > > > >>>> YARN NM should download it for you automatically when you start
> > > > >>>> your job.
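> > > > >>>>
> > > > >>>> For the HDFS case, that looks something like (host and port being
> > > > >>>> whatever your fs.defaultFS says, i.e. the NameNode RPC port, not
> > > > >>>> the web UI port):
> > > > >>>>
> > > > >>>> yarn.package.path=hdfs://<namenode-host>:<rpc-port>/path/to/samza-job-package-0.7.0-dist.tar.gz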
> > > > >>>>
> > > > >>>> * Can you send along a paste of your job config?
> > > > >>>>
> > > > >>>> Cheers,
> > > > >>>> Chris
> > > > >>>>
> > > > >>>> On 8/8/14 8:04 AM, "Claudio Martins" <[email protected]>
> > > wrote:
> > > > >>>>
> > > > >>>>> Hi Telles, it looks to me that you forgot to update the
> > > > >>>>> "yarn.package.path"
> > > > >>>>> attribute in your config file for the task.
> > > > >>>>>
> > > > >>>>> - Claudio Martins
> > > > >>>>> Head of Engineering
> > > > >>>>> MobileAware USA Inc. / www.mobileaware.com
> > > > >>>>> office: +1 617 986 5060 / mobile: +1 617 480 5288
> > > > >>>>> linkedin: www.linkedin.com/in/martinsclaudio
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Aug 8, 2014 at 10:55 AM, Telles Nobrega
> > > > >>>>><[email protected]>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Hi,
> > > > >>>>>>
> > > > >>>>>> this is my first time trying to run a job in a multi-node
> > > > >>>>>> environment. I have the cluster set up, and I can see in the GUI
> > > > >>>>>> that all nodes are working. Do I need to have the job folder on
> > > > >>>>>> each machine in my cluster?
> > > > >>>>>> - The first time, I tried running with the job on the namenode
> > > > >>>>>> machine, and it failed saying:
> > > > >>>>>>
> > > > >>>>>> Application application_1407509228798_0001 failed 2 times due to
> > > > >>>>>> AM Container for appattempt_1407509228798_0001_000002 exited with
> > > > >>>>>> exitCode: -1000 due to: File
> > > > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > > > >>>>>> does not exist
> > > > >>>>>>
> > > > >>>>>> So I copied the folder to each machine in my cluster and got this
> > > > >>>>>> error:
> > > > >>>>>>
> > > > >>>>>> Application application_1407509228798_0002 failed 2 times due to
> > > > >>>>>> AM Container for appattempt_1407509228798_0002_000002 exited with
> > > > >>>>>> exitCode: -1000 due to: Resource
> > > > >>>>>> file:/home/ubuntu/alarm-samza/samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz
> > > > >>>>>> changed on src filesystem (expected 1407509168000, was 1407509434000)
> > > > >>>>>>
> > > > >>>>>> What am I missing?
> > > > >>>>>>
> > > > >>>>>> p.s.: I followed this tutorial
> > > > >>>>>> <https://github.com/yahoo/samoa/wiki/Executing-SAMOA-with-Apache-Samza>
> > > > >>>>>> and this one
> > > > >>>>>> <http://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html>
> > > > >>>>>> to set up the cluster.
> > > > >>>>>>
> > > > >>>>>> Help is much appreciated.
> > > > >>>>>>
> > > > >>>>>> Thanks in advance.
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>> ------------------------------------------
> > > > >>>>>> Telles Mota Vidal Nobrega
> > > > >>>>>> M.sc. Candidate at UFCG
> > > > >>>>>> B.sc. in Computer Science at UFCG
> > > > >>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> > > > >>>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > ------------------------------------------
> > Telles Mota Vidal Nobrega
> > M.sc. Candidate at UFCG
> > B.sc. in Computer Science at UFCG
> > Software Engineer at OpenStack Project - HP/LSD-UFCG
> >
>
--
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG