Hey Guys,
Another update. It appears that this change fixes the problem (no
~/.bashrc update required).
$ git diff
diff --git a/samza-shell/src/main/bash/run-class.sh b/samza-shell/src/main/bash/run-class.sh
index 2fa2acf..bb4e0d2 100755
--- a/samza-shell/src/main/bash/run-class.sh
+++ b/samza-shell/src/main/bash/run-class.sh
@@ -33,8 +33,8 @@ if [ ! -d "$base_dir/lib" ]; then
exit 1
fi
-YARN_HOME="${YARN_HOME:-$HOME/.samza}"
-CLASSPATH=$YARN_HOME/conf
+HADOOP_YARN_HOME="${HADOOP_YARN_HOME:-$HOME/.samza}"
+CLASSPATH="${HADOOP_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
for file in $base_dir/lib/*.[jw]ar;
do
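
For context, the new CLASSPATH line relies on bash's ${VAR:-default} expansion:
if HADOOP_CONF_DIR is set it wins, otherwise the classpath falls back to
$HADOOP_YARN_HOME/conf. A minimal sketch of that behavior (the /etc/hadoop/conf
path is just a placeholder):

$ HADOOP_YARN_HOME=/tmp/hadoop-2.3.0
$ echo "${HADOOP_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
/tmp/hadoop-2.3.0/conf
$ HADOOP_CONF_DIR=/etc/hadoop/conf
$ echo "${HADOOP_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
/etc/hadoop/conf
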
I'm going to post a fix on:
https://issues.apache.org/jira/browse/SAMZA-182
Cheers,
Chris
On 3/12/14 7:59 PM, "Chris Riccomini" <[email protected]> wrote:
>Hey Guys,
>
>I was able to reproduce this problem.
>
>I was also able to fix it (without the patch in SAMZA-182). All I needed
>to do was update ~/.bashrc on my NM's box to have:
>
> export YARN_HOME=/tmp/hadoop-2.3.0
>
>It appears that the YARN environment variables are somehow getting lost or
>not forwarded from the NM to the AM. Adding this bashrc setting makes sure
>that the NM gets them.
>
>
>I have a feeling upgrading Samza to YARN 2.3.0 will fix this, but I
>haven't validated yet. I will continue to investigate tomorrow.
>
>Cheers,
>Chris
>
>On 3/12/14 6:43 PM, "Yan Fang" <[email protected]> wrote:
>
>>I guess Sonali is having this problem because the NMs do not read the
>>YARN_HOME variable. That may be because the NM machines do not have
>>YARN_HOME set when the NMs start.
>>
>>Check this https://issues.apache.org/jira/browse/SAMZA-182
>>
>>Thanks,
>>
>>Yan Fang
>>
>>> On Mar 12, 2014, at 6:14 PM, Chris Riccomini <[email protected]>
>>>wrote:
>>>
>>> Hey Sonali,
>>>
>>> I am unfamiliar with the start-yarn.sh script. Looking at:
>>>
>>> https://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh?revision=1370666&view=markup
>>>
>>> What version of YARN are you using?
>>>
>>> Cheers,
>>> Chris
>>>
>>> On 3/12/14 5:56 PM, "[email protected]"
>>> <[email protected]> wrote:
>>>
>>>> Hey Chris,
>>>>
>>>> Yes, I have YARN_HOME set in all the NMs pointing to the right
>>>> directories. I also made sure the yarn-site.xml file has the hostname
>>>>set.
>>>>
>>>> I start YARN using start-yarn.sh on the RM, and that automatically starts
>>>> the NMs on the slave nodes. Is that the right way to do it?
>>>>
>>>> -----Original Message-----
>>>> From: Chris Riccomini [mailto:[email protected]]
>>>> Sent: Wednesday, March 12, 2014 5:52 PM
>>>> To: [email protected]
>>>> Subject: Re: Failed to package using mvn
>>>>
>>>> Hey Sonali,
>>>>
>>>> OK, so we've validated that the NMs are able to connect, which means
>>>>they
>>>> can see the yarn-site.xml.
>>>>
>>>> How are you starting your NMs? Are you running:
>>>>
>>>> export YARN_HOME=/path/to/yarn/home
>>>>
>>>> In the CLI before starting the NM?
>>>>
>>>> For reference, we run:
>>>>
>>>> export YARN_HOME=/path/to/our/yarn-home
>>>> export YARN_CONF_DIR=$YARN_HOME/conf
>>>>
>>>> bin/yarn nodemanager
>>>>
>>>> With YARN_HOME pointing to a directory that has a subdirectory called
>>>> "conf" in it, which has a yarn-site.xml in it:
>>>>
>>>> /path/to/our/yarn-home/conf/yarn-site.xml
>>>>
>>>> This yarn-site.xml has yarn.resourcemanager.hostname set to the IP (or
>>>> hostname) of the resource manager:
>>>>
>>>> <property>
>>>>   <name>yarn.resourcemanager.hostname</name>
>>>>   <value>123.456.789.123</value>
>>>> </property>
>>>>
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> On 3/12/14 5:33 PM, "[email protected]"
>>>> <[email protected]> wrote:
>>>>
>>>>> I see two active nodes (I have 2 NMs running)
>>>>>
>>>>> -----Original Message-----
>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>> Sent: Wednesday, March 12, 2014 5:24 PM
>>>>> To: [email protected]
>>>>> Subject: Re: Failed to package using mvn
>>>>>
>>>>> Hey Sonali,
>>>>>
>>>>> Can you go to your ResourceManager's UI, and tell me how many active
>>>>> nodes you see? This should be under the "active nodes" heading.
>>>>>
>>>>> It sounds like the SamzaAppMaster is not getting the resource manager
>>>>> host/port from the yarn-site.xml. Usually this is due to not
>>>>>exporting
>>>>> YARN_HOME on the NodeManager before starting it.
>>>>>
>>>>> Cheers,
>>>>> Chris
>>>>>
>>>>> On 3/12/14 5:21 PM, "[email protected]"
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Okay so I was able to submit the job:
>>>>>>
>>>>>> In the NodeManager I get this error; specifically, it's trying to
>>>>>> connect to 0.0.0.0:8032 instead of the IP I have specified in the
>>>>>> yarn-site.xml file:
>>>>>>
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got container id:
>>>>>> container_1391637982288_0033_01_000001
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got app attempt id:
>>>>>> appattempt_1391637982288_0033_000001
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got node manager host:
>>>>>> svdpdac001.techlabs.accenture.com
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got node manager port:
>>>>>> 38218
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got node manager http port: 8042
>>>>>> 2014-03-12 17:04:47 SamzaAppMaster$ [INFO] got config:
>>>>>> {task.inputs=wikipedia.#en.wikipedia,wikipedia.#en.wiktionary,wikipedia.#en.wikinews,
>>>>>> systems.wikipedia.host=irc.wikimedia.org,
>>>>>> systems.kafka.producer.batch.num.messages=1,
>>>>>> job.factory.class=org.apache.samza.job.yarn.YarnJobFactory,
>>>>>> systems.wikipedia.port=6667,
>>>>>> systems.kafka.producer.producer.type=sync,
>>>>>> job.name=wikipedia-feed,
>>>>>> systems.kafka.consumer.zookeeper.connect=svdpdac013.techlabs.accenture.com:2181/,
>>>>>> systems.kafka.samza.msg.serde=json,
>>>>>> serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory,
>>>>>> task.class=samza.examples.wikipedia.task.WikipediaFeedStreamTask,
>>>>>> yarn.package.path=hdfs://10.1.174.85:9000/samza-job-package-0.7.0-dist.tar.gz,
>>>>>> systems.wikipedia.samza.factory=samza.examples.wikipedia.system.WikipediaSystemFactory,
>>>>>> systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory,
>>>>>> systems.kafka.producer.metadata.broker.list=svdpdac001.techlabs.accenture.com:6667,svdpdac015.techlabs.accenture.com:6667}
>>>>>> 2014-03-12 17:04:48 ClientHelper [INFO] trying to connect to RM
>>>>>> 0.0.0.0:8032
>>>>>> 2014-03-12 17:04:48 NativeCodeLoader [WARN] Unable to load
>>>>>> native-hadoop library for your platform... using builtin-java
>>>>>>classes
>>>>>> where applicable
>>>>>> 2014-03-12 17:04:48 RMProxy [INFO] Connecting to ResourceManager at
>>>>>> /0.0.0.0:8032
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>>> Sent: Wednesday, March 12, 2014 4:48 PM
>>>>>> To: [email protected]
>>>>>> Subject: Re: Failed to package using mvn
>>>>>>
>>>>>> Hey Sonali,
>>>>>>
>>>>>> You need to specify a valid HDFS uri. Usually something like:
>>>>>>
>>>>>> hdfs://<hdfs name node ip>:<hdfs name node port>/path/to/tgz
>>>>>>
>>>>>> Right now, Hadoop is trying to use the package name as the HDFS host.
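>>>>>>
>>>>>> In Samza config terms that usually means yarn.package.path; something like
>>>>>> (the name node host and port are placeholders for your cluster):
>>>>>>
>>>>>> yarn.package.path=hdfs://<namenode host>:<namenode port>/path/to/samza-job-package-0.7.0-dist.tar.gz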
>>>>>>
>>>>>> Cheers,
>>>>>> Chris
>>>>>>
>>>>>> On 3/12/14 4:45 PM, "[email protected]"
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> I did and I can now see the hadoop-hdfs jar in /deploy/samza/lib
>>>>>>> folder.
>>>>>>>
>>>>>>> I do get a different error now.
>>>>>>>
>>>>>>> I uploaded the samza-job to hdfs and it resides on
>>>>>>> hdfs://samza-job-package-0.7.0-dist.tar.gz
>>>>>>>
>>>>>>> But when I run the job I get this exception:
>>>>>>>
>>>>>>> Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: samza-job-package-0.7.0-dist.tar.gz
>>>>>>>   at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>>>>>>>   at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
>>>>>>>   at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
>>>>>>>   at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
>>>>>>>   at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
>>>>>>>   at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
>>>>>>>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
>>>>>>>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>>>>>>   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>>>>>>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>>>>>>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>>>>>>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>>>>>>   at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>>>>>>   at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>>>>>   at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>>>>>   at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
>>>>>>>   at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
>>>>>>>   at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>>>>>> Caused by: java.net.UnknownHostException: samza-job-package-0.7.0-dist.tar.gz
>>>>>>>   ... 18 more
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Yan Fang [mailto:[email protected]]
>>>>>>> Sent: Wednesday, March 12, 2014 4:20 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: Failed to package using mvn
>>>>>>>
>>>>>>> Hi Sonali,
>>>>>>>
>>>>>>> One tip you may have missed:
>>>>>>>
>>>>>>> If you had already run
>>>>>>>
>>>>>>> tar -xvf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz \
>>>>>>>   -C deploy/samza
>>>>>>>
>>>>>>> before you bundled the hdfs jar into the tar.gz, please also remember to
>>>>>>> put the hdfs jar into deploy/samza/lib.
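>>>>>>>
>>>>>>> (A rough sketch, assuming the hadoop-hdfs jar is available locally; the
>>>>>>> source path is just a placeholder:)
>>>>>>>
>>>>>>> cp /path/to/hadoop-hdfs-2.2.0.jar deploy/samza/lib/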
>>>>>>>
>>>>>>> Let me know if you missed this step.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Fang, Yan
>>>>>>> [email protected]
>>>>>>> +1 (206) 849-4108
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 12, 2014 at 4:10 PM, Chris Riccomini
>>>>>>> <[email protected]>wrote:
>>>>>>>
>>>>>>>> Hey Sonali,
>>>>>>>>
>>>>>>>> Yan has made a step-by-step tutorial for this. Could you confirm
>>>>>>>> that you've followed the instructions, and it's still not working?
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/SAMZA-181
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/12/14 3:12 PM, "[email protected]"
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> So sigh! I had some Kafka issues in-between. That's fixed now.
>>>>>>>>>
>>>>>>>>> As suggested,
>>>>>>>>>
>>>>>>>>> 1. I made sure the hadoop-hdfs-2.2.0.jar is bundled with the samza job
>>>>>>>>> tar.gz.
>>>>>>>>> 2. I added the configuration to implement hdfs in the hdfs-site.xml files
>>>>>>>>> both on the NMs and in the /conf directory for samza.
>>>>>>>>>
>>>>>>>>> I still get the "No FileSystem for scheme: hdfs" error.
>>>>>>>>>
>>>>>>>>> Is there anything else I'm missing?
>>>>>>>>> Thanks,
>>>>>>>>> Sonali
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>>>>>> Sent: Tuesday, March 11, 2014 8:27 PM
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: Re: Failed to package using mvn
>>>>>>>>>
>>>>>>>>> Hey Yan,
>>>>>>>>>
>>>>>>>>> This looks great! I added a few requests to the JIRA, if you have
>>>>>>>> time.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>> On 3/11/14 7:20 PM, "Yan Fang" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> I have opened an issue, SAMZA-181
>>>>>>>>>> (https://issues.apache.org/jira/browse/SAMZA-181), and have also uploaded
>>>>>>>>>> the patch. Let me know if there is something wrong in my tutorial. Thank
>>>>>>>>>> you!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Fang, Yan
>>>>>>>>>> [email protected]
>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 11, 2014 at 10:40 AM,
>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Chris, Yan,
>>>>>>>>>>>
>>>>>>>>>>> Let me try that.
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Chris Riccomini [mailto:[email protected]]
>>>>>>>>>>> Sent: Tuesday, March 11, 2014 10:22 AM
>>>>>>>>>>> To: [email protected]
>>>>>>>>>>> Subject: Re: Failed to package using mvn
>>>>>>>>>>>
>>>>>>>>>>> Hey Yan,
>>>>>>>>>>>
>>>>>>>>>>> Awesome! The location where you can add your .md is here:
>>>>>>>>>>>
>>>>>>>>>>> docs/learn/tutorials/0.7.0/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here's a link to the code tree:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=incubator-samza.git;a=tree;f=docs/learn/tutorials/0.7.0;h=ef117f4066f14a00f50f0f6fca17903130448312;hb=HEAD
>>>>>>>>>>>
>>>>>>>>>>> You can get the code here:
>>>>>>>>>>>
>>>>>>>>>>> git clone
>>>>>>>>>>> http://git-wip-us.apache.org/repos/asf/incubator-samza.git
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Once you write the .md, just throw it up on a JIRA, and one of
>>>>>>>>>>> us can merge it in.
>>>>>>>>>>>
>>>>>>>>>>> Re: hdfs-site.xml, ah ha, that's what I figured. This is good to know.
>>>>>>>>>>> So you just copy your hdfs-site.xml from your NodeManager's conf
>>>>>>>>>>> directory into your local hdfs-site.xml.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>>> On 3/11/14 10:16 AM, "Yan Fang" <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> Sure. I just do not know how/where to contribute this
>>>>>>>>>>>> page...*_*
>>>>>>>>>>>>
>>>>>>>>>>>> Oh, I mean the same thing as you mentioned in the *Cluster
>>>>>>>>>>>> Installation* thread:
>>>>>>>>>>>>
>>>>>>>>>>>> *"2. Get a copy of one of your NM's yarn-site.xml and put it somewhere
>>>>>>>>>>>> on your desktop (I usually use ~/.yarn/conf/yarn-site.xml). Note that
>>>>>>>>>>>> there's a "conf" directory there. This is mandatory."*
>>>>>>>>>>>>
>>>>>>>>>>>> So I just copy the hdfs-site.xml to ~/.yarn/conf/hdfs-site.xml.
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Fang, Yan
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 11, 2014 at 10:10 AM, Chris Riccomini
>>>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Yan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would you be up for contributing a tutorial page that
>>>>>>>>>>>>> describes
>>>>>>>>>>> this?
>>>>>>>>>>>>> This
>>>>>>>>>>>>> is really useful information. Our docs are just simple .md
>>>>>>>>>>>>> files in the main code base.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding step (3), is the hdfs-site.xml put into the conf
>>>>>>>>>>>>> folder for the NM boxes, or on the client side (where
>>>>>>>>>>>>> run-job.sh
>>>>>>>> is run)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/11/14 10:07 AM, "Yan Fang" <[email protected]>
>>>>>>>>>>>>>>wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Sonali,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The way I make Samza run with HDFS is the following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. include the hdfs jar in the Samza job tar.gz.
>>>>>>>>>>>>>> 2. you may also want to make sure the hadoop-common.jar has the same
>>>>>>>>>>>>>> version as your hdfs jar. Otherwise, you may have configuration errors
>>>>>>>>>>>>>> popping up.
>>>>>>>>>>>>>> 3. then put hdfs-site.xml in the conf folder, the same folder as the
>>>>>>>>>>>>>> yarn-site.xml (see the sketch below).
>>>>>>>>>>>>>> 4. all other steps are unchanged.
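>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (A minimal sketch of step 3; both paths are placeholders, and the
>>>>>>>>>>>>>> destination is whatever conf directory already holds your yarn-site.xml:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> cp /path/to/your/hdfs-site.xml /path/to/your/conf/hdfs-site.xml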
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hope this will help. Thank you.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fang, Yan
>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>> +1 (206) 849-4108
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 11, 2014 at 9:25 AM, Chris Riccomini
>>>>>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hey Sonali,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I believe that you need to make sure that the HDFS jar is
>>>>>>>>>>>>>>> in your .tar.gz file, as you've said.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If that doesn't work, you might need to define this
>>>>>>>>>>>>>>> setting in core-site.xml on the machine you're running
>>>>>>>> run-job.sh on:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> <property>
>>>>>>>>>>>>>>>   <name>fs.hdfs.impl</name>
>>>>>>>>>>>>>>>   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>>>>>>>>>>>>>>   <description>The FileSystem for hdfs: uris.</description>
>>>>>>>>>>>>>>> </property>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You might also need to configure your NodeManagers to
>>>>>>>>>>>>>>> have the HDFS
>>>>>>>>>>>>> file
>>>>>>>>>>>>>>> system impl as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've never run Samza with HDFS, so I'm guessing here.
>>>>>>>>>>>>>>> Perhaps someone else on the list has been successful with
>>>>>>>> this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/10/14 3:59 PM, "[email protected]"
>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I fixed this by starting from scratch with gradlew. But now when I run
>>>>>>>>>>>>>>>> my job it throws this error:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
>>>>>>>>>>>>>>>>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
>>>>>>>>>>>>>>>>   at org.apache.samza.job.JobRunner.main(JobRunner.scala)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I looked at the samza job tar.gz and it doesn't have a hadoop-hdfs jar.
>>>>>>>>>>>>>>>> Is that why I get this error?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Sonali
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> From: Parthasarathy, Sonali
>>>>>>>>>>>>>>>> Sent: Monday, March 10, 2014 11:25 AM
>>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>>> Subject: Failed to package using mvn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When I tried to do a mvn clean package of my hello-samza
>>>>>>>>>>>>>>>> project, I
>>>>>>>>>>>>> get
>>>>>>>>>>>>>>>> the following error. Has anyone seen this before?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [ERROR] Failed to execute goal on project samza-wikipedia: Could not
>>>>>>>>>>>>>>>> resolve dependencies for project samza:samza-wikipedia:jar:0.7.0: Could
>>>>>>>>>>>>>>>> not find artifact org.apache.samza:samza-kv_2.10:jar:0.7.0 in
>>>>>>>>>>>>>>>> apache-releases (https://repository.apache.org/content/groups/public)
>>>>>>>>>>>>>>>> -> [Help 1]
>>>>>>>>>>>>>>>> [ERROR]
>>>>>>>>>>>>>>>> [ERROR] To see the full stack trace of the errors, re-run Maven with the
>>>>>>>>>>>>>>>> -e switch.
>>>>>>>>>>>>>>>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>>>>>>>>>>>>>> [ERROR]
>>>>>>>>>>>>>>>> [ERROR] For more information about the errors and possible solutions,
>>>>>>>>>>>>>>>> please read the following articles:
>>>>>>>>>>>>>>>> [ERROR] [Help 1]
>>>>>>>>>>>>>>>> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
>>>>>>>>>>>>>>>> [ERROR]
>>>>>>>>>>>>>>>> [ERROR] After correcting the problems, you can resume the build with the
>>>>>>>>>>>>>>>> command
>>>>>>>>>>>>>>>> [ERROR] mvn <goals> -rf :samza-wikipedia
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Sonali
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sonali Parthasarathy
>>>>>>>>>>>>>>>> R&D Developer, Data Insights Accenture Technology Labs
>>>>>>>>>>>>>>>> 703-341-7432
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>
>