I did, and I can now see the hadoop-hdfs jar in the deploy/samza/lib folder.
I do get a different error now.
I uploaded the Samza job package to HDFS, and it resides at
hdfs://samza-job-package-0.7.0-dist.tar.gz
But when I run the job I get this exception:
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: samza-job-package-0.7.0-dist.tar.gz
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
        at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
        at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
        at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
        at org.apache.samza.job.JobRunner.main(JobRunner.scala)
Caused by: java.net.UnknownHostException: samza-job-package-0.7.0-dist.tar.gz
        ... 18 more
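
(I'm guessing the issue is that the hdfs:// URI above has no NameNode host or
port in it, so Hadoop is treating "samza-job-package-0.7.0-dist.tar.gz" itself
as the hostname. I'm going to retry with a fully-qualified URI along these
lines, where namenode-host:8020 is a placeholder for our NameNode:

hadoop fs -put ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz \
  hdfs://namenode-host:8020/samza-job-package-0.7.0-dist.tar.gz

and then point yarn.package.path in the job properties at that same
hdfs://namenode-host:8020/... location.)
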
-----Original Message-----
From: Yan Fang [mailto:[email protected]]
Sent: Wednesday, March 12, 2014 4:20 PM
To: [email protected]
Subject: Re: Failed to package using mvn
Hi Sonali,
One tip you may have missed:
If you had already run
tar -xvf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz -C deploy/samza
before you bundled the hdfs jar into the tar.gz, please also remember to put
the hdfs jar into deploy/samza/lib (or re-extract the new package).
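
For example, roughly what I do (the Maven repository path below is just where
the hadoop-hdfs jar happens to live on my machine; adjust it to yours):

tar -xvf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz -C deploy/samza
cp ~/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.2.0/hadoop-hdfs-2.2.0.jar deploy/samza/lib/
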
Let me know if you miss this step.
Thanks,
Fang, Yan
[email protected]
+1 (206) 849-4108
On Wed, Mar 12, 2014 at 4:10 PM, Chris Riccomini <[email protected]> wrote:
> Hey Sonali,
>
> Yan has made a step-by-step tutorial for this. Could you confirm that
> you've followed the instructions, and it's still not working?
>
> https://issues.apache.org/jira/browse/SAMZA-181
>
> Cheers,
> Chris
>
> On 3/12/14 3:12 PM, "[email protected]"
> <[email protected]> wrote:
>
> >So sigh! I had some Kafka issues in-between. That's fixed now.
> >
> >As suggested,
> >
> >1. I made sure the hadoop-hdfs-2.2.0.jar is bundled with the samza job tar.gz.
> >2. I added the configuration to implement hdfs to the hdfs-site.xml files,
> >both on the NMs and in the conf directory for Samza.
> >
> >I still get the "No FileSystem for scheme: hdfs" error.
> >
> >Is there anything else I'm missing?
> >Thanks,
> >Sonali
> >
> >
> >-----Original Message-----
> >From: Chris Riccomini [mailto:[email protected]]
> >Sent: Tuesday, March 11, 2014 8:27 PM
> >To: [email protected]
> >Subject: Re: Failed to package using mvn
> >
> >Hey Yan,
> >
> >This looks great! I added a few requests to the JIRA, if you have time.
> >
> >Cheers,
> >Chris
> >
> >On 3/11/14 7:20 PM, "Yan Fang" <[email protected]> wrote:
> >
> >>Hi Chris,
> >>
> >>I have opened an issue, SAMZA-181 <https://issues.apache.org/jira/browse/SAMZA-181>,
> >>and also uploaded the patch. Let me know if there is something wrong in my
> >>tutorial. Thank you!
> >>
> >>Cheers,
> >>
> >>Fang, Yan
> >>[email protected]
> >>+1 (206) 849-4108
> >>
> >>
> >>On Tue, Mar 11, 2014 at 10:40 AM,
> >><[email protected]>wrote:
> >>
> >>> Thanks Chris, Yan,
> >>>
> >>> Let me try that.
> >>>
> >>> -----Original Message-----
> >>> From: Chris Riccomini [mailto:[email protected]]
> >>> Sent: Tuesday, March 11, 2014 10:22 AM
> >>> To: [email protected]
> >>> Subject: Re: Failed to package using mvn
> >>>
> >>> Hey Yan,
> >>>
> >>> Awesome! The location where you can add your .md is here:
> >>>
> >>> docs/learn/tutorials/0.7.0/
> >>>
> >>>
> >>> Here's a link to the code tree:
> >>>
> >>>
> >>>
> >>>https://git-wip-us.apache.org/repos/asf?p=incubator-samza.git;a=tre
> >>>e;f
> >>>=do
> >>>cs
> >>>
> >>>/learn/tutorials/0.7.0;h=ef117f4066f14a00f50f0f6fca17903130448312;h
> >>>b=H
> >>>EAD
> >>>
> >>> You can get the code here:
> >>>
> >>> git clone http://git-wip-us.apache.org/repos/asf/incubator-samza.git
> >>>
> >>>
> >>> Once you write the .md, just throw it up on a JIRA, and one of us
> >>> can merge it in.
> >>>
> >>> Re: hdfs-site.xml, ah ha, that's what I figured. This is good to know.
> >>> So you just copy your hdfs-site.xml from your NodeManager's conf
> >>> directory into your local conf directory as hdfs-site.xml.
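> >>>
> >>> Something like this, assuming the NM keeps its Hadoop conf in
> >>> /etc/hadoop/conf (nm-host and that path are placeholders for your setup):
> >>>
> >>> scp nm-host:/etc/hadoop/conf/hdfs-site.xml ~/.yarn/conf/hdfs-site.xml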
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> On 3/11/14 10:16 AM, "Yan Fang" <[email protected]> wrote:
> >>>
> >>> >Hi Chris,
> >>> >
> >>> >Sure. I just do not know how/where to contribute this page...*_*
> >>> >
> >>> >Oh, I mean the same thing as you mentioned in the *Cluster
> >>> >Installation* thread:
> >>> >
> >>> >*"2. Get a copy of one of your NM's yarn-site.xml and put it somewhere
> >>> >on your desktop (I usually use ~/.yarn/conf/yarn-site.xml). Note that
> >>> >there's a "conf" directory there. This is mandatory."*
> >>> >
> >>> >So I just copy the hdfs-site.xml to ~/.yarn/conf/hdfs-site.xml. Thank you.
> >>> >
> >>> >Cheers,
> >>> >
> >>> >Fang, Yan
> >>> >[email protected]
> >>> >+1 (206) 849-4108
> >>> >
> >>> >
> >>> >On Tue, Mar 11, 2014 at 10:10 AM, Chris Riccomini
> >>> ><[email protected]>wrote:
> >>> >
> >>> >> Hey Yan,
> >>> >>
> >>> >> Would you be up for contributing a tutorial page that describes this?
> >>> >> This is really useful information. Our docs are just simple .md files
> >>> >> in the main code base.
> >>> >>
> >>> >> Regarding step (3), is the hdfs-site.xml put into the conf folder for
> >>> >> the NM boxes, or on the client side (where run-job.sh is run)?
> >>> >>
> >>> >> Cheers,
> >>> >> Chris
> >>> >>
> >>> >> On 3/11/14 10:07 AM, "Yan Fang" <[email protected]> wrote:
> >>> >>
> >>> >> >Hi Sonali,
> >>> >> >
> >>> >> >The way I make Samza run with HDFS is as follows:
> >>> >> >
> >>> >> >1. Include the hdfs jar in the Samza job tar.gz.
> >>> >> >2. You may also want to make sure the hadoop-common.jar has the same
> >>> >> >version as your hdfs jar; otherwise you may get configuration errors
> >>> >> >(see the quick check below).
> >>> >> >3. Then put hdfs-site.xml into the conf folder, the same folder as
> >>> >> >yarn-site.xml.
> >>> >> >4. All other steps are unchanged.
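> >>> >> >
> >>> >> >A quick check for 1 and 2 (this is just how I verify it locally; the
> >>> >> >package path below assumes the hello-samza layout):
> >>> >> >
> >>> >> >tar -tzf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep -E 'hadoop-(hdfs|common)'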
> >>> >> >
> >>> >> >Hope this will help. Thank you.
> >>> >> >
> >>> >> >Cheers,
> >>> >> >
> >>> >> >Fang, Yan
> >>> >> >[email protected]
> >>> >> >+1 (206) 849-4108
> >>> >> >
> >>> >> >
> >>> >> >On Tue, Mar 11, 2014 at 9:25 AM, Chris Riccomini
> >>> >> ><[email protected]>wrote:
> >>> >> >
> >>> >> >> Hey Sonali,
> >>> >> >>
> >>> >> >> I believe that you need to make sure that the HDFS jar is in your
> >>> >> >> .tar.gz file, as you've said.
> >>> >> >>
> >>> >> >> If that doesn't work, you might need to define this setting
> >>> >> >> in core-site.xml on the machine you're running run-job.sh on:
> >>> >> >>
> >>> >> >> <property>
> >>> >> >> <name>fs.hdfs.impl</name>
> >>> >> >> <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
> >>> >> >> <description>The FileSystem for hdfs: uris.</description>
> >>> >> >> </property>
> >>> >> >>
> >>> >> >>
> >>> >> >> You might also need to configure your NodeManagers to have the HDFS
> >>> >> >> file system impl as well.
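> >>> >> >>
> >>> >> >> A quick way to check whether the client box can resolve the hdfs
> >>> >> >> scheme at all (namenode-host:8020 is a placeholder for your
> >>> >> >> NameNode address):
> >>> >> >>
> >>> >> >> hadoop fs -ls hdfs://namenode-host:8020/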
> >>> >> >>
> >>> >> >> I've never run Samza with HDFS, so I'm guessing here.
> >>> >> >>Perhaps someone else on the list has been successful with this?
> >>> >> >>
> >>> >> >> Cheers,
> >>> >> >> Chris
> >>> >> >>
> >>> >> >> On 3/10/14 3:59 PM, "[email protected]"
> >>> >> >> <[email protected]> wrote:
> >>> >> >>
> >>> >> >> >Hello,
> >>> >> >> >
> >>> >> >> >I fixed this by starting from scratch with gradlew. But now when I
> >>> >> >> >run my job it throws this error:
> >>> >> >> >Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
> >>> >> >> >        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
> >>> >> >> >        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
> >>> >> >> >        at org.apache.samza.job.yarn.ClientHelper.submitApplication(ClientHelper.scala:111)
> >>> >> >> >        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:55)
> >>> >> >> >        at org.apache.samza.job.yarn.YarnJob.submit(YarnJob.scala:48)
> >>> >> >> >        at org.apache.samza.job.JobRunner.run(JobRunner.scala:100)
> >>> >> >> >        at org.apache.samza.job.JobRunner$.main(JobRunner.scala:75)
> >>> >> >> >        at org.apache.samza.job.JobRunner.main(JobRunner.scala)
> >>> >> >> >
> >>> >> >> >I looked at the samza job tar.gz and it doesn't have a hadoop-hdfs
> >>> >> >> >jar. Is that why I get this error?
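> >>> >> >> >
> >>> >> >> >If that's the cause, I guess I need to add a hadoop-hdfs dependency
> >>> >> >> >to the job package's pom (samza-job-package/pom.xml in the
> >>> >> >> >hello-samza layout, if I have that right), then rebuild and
> >>> >> >> >re-check the tarball:
> >>> >> >> >
> >>> >> >> >mvn clean package
> >>> >> >> >tar -tzf ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz | grep hadoop-hdfs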
> >>> >> >> >
> >>> >> >> >Thanks,
> >>> >> >> >Sonali
> >>> >> >> >
> >>> >> >> >From: Parthasarathy, Sonali
> >>> >> >> >Sent: Monday, March 10, 2014 11:25 AM
> >>> >> >> >To: [email protected]
> >>> >> >> >Subject: Failed to package using mvn
> >>> >> >> >
> >>> >> >> >Hi,
> >>> >> >> >
> >>> >> >> >When I tried to do a mvn clean package of my hello-samza project,
> >>> >> >> >I get the following error. Has anyone seen this before?
> >>> >> >> >
> >>> >> >> >[ERROR] Failed to execute goal on project samza-wikipedia: Could not
> >>> >> >> >resolve dependencies for project samza:samza-wikipedia:jar:0.7.0:
> >>> >> >> >Could not find artifact org.apache.samza:samza-kv_2.10:jar:0.7.0 in
> >>> >> >> >apache-releases (https://repository.apache.org/content/groups/public)
> >>> >> >> >-> [Help 1]
> >>> >> >> >[ERROR]
> >>> >> >> >[ERROR] To see the full stack trace of the errors, re-run Maven with
> >>> >> >> >the -e switch.
> >>> >> >> >[ERROR] Re-run Maven using the -X switch to enable full debug logging.
> >>> >> >> >[ERROR]
> >>> >> >> >[ERROR] For more information about the errors and possible solutions,
> >>> >> >> >please read the following articles:
> >>> >> >> >[ERROR] [Help 1]
> >>> >> >> >http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> >>> >> >> >[ERROR]
> >>> >> >> >[ERROR] After correcting the problems, you can resume the build with
> >>> >> >> >the command
> >>> >> >> >[ERROR] mvn <goals> -rf :samza-wikipedia
> >>> >> >> >
> >>> >> >> >Thanks,
> >>> >> >> >Sonali
> >>> >> >> >
> >>> >> >> >Sonali Parthasarathy
> >>> >> >> >R&D Developer, Data Insights
> >>> >> >> >Accenture Technology Labs
> >>> >> >> >703-341-7432
> >>> >> >> >
> >>> >> >> >
> >>> >> >>
> >>> >> >>
> >>> >>
> >>> >>
> >>>
> >>>
> >>>
> >>>
> >>>
> >
> >
> >
> >
>
>