Re: ClassNotFoundException: -libjars not working?

2012-02-28 Thread madhu phatak
Hi,
 -libjars doesn't always work. A better way is to create a runnable jar with
all dependencies (if the number of dependencies is small); otherwise you have
to put the jars into Hadoop's lib folder on all machines.

On Wed, Feb 22, 2012 at 8:13 PM, Ioan Eugen Stan stan.ieu...@gmail.com wrote:

 Hello,

 I'm trying to run a map-reduce job and I get ClassNotFoundException, but I
 have the class submitted with -libjars. What's wrong with how I do things?
 Please help.

 I'm running hadoop-0.20.2-cdh3u1, and I have everything on the -libjars
 line. The job is submitted via a Java app like:

  exec /usr/lib/jvm/java-6-sun/bin/java -Dproc_jar -Xmx200m -server
 -Dhadoop.log.dir=/opt/ui/var/log/mailsearch
 -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop
 -Dhadoop.id.str=hbase -Dhadoop.root.logger=INFO,console
 -Dhadoop.policy.file=hadoop-policy.xml -classpath
 '/usr/lib/hadoop/conf:
 /usr/lib/jvm/java-6-sun/lib/tools.jar:
 /usr/lib/hadoop:
 /usr/lib/hadoop/hadoop-core-0.20.2-cdh3u1.jar:
 /usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:
 /usr/lib/hadoop/lib/apache-log4j-extras-1.1.jar:
 /usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:
 /usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:
 /usr/lib/hadoop/lib/commons-cli-1.2.jar:
 /usr/lib/hadoop/lib/commons-codec-1.4.jar:
 /usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:
 /usr/lib/hadoop/lib/commons-el-1.0.jar:
 /usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:
 /usr/lib/hadoop/lib/commons-logging-1.0.4.jar:
 /usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:
 /usr/lib/hadoop/lib/commons-net-1.4.1.jar:
 /usr/lib/hadoop/lib/core-3.1.1.jar:
 /usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u1.jar:
 /usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:
 /usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:
 /usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:
 /usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:
 /usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:
 /usr/lib/hadoop/lib/jcl-over-slf4j-1.6.1.jar:
 /usr/lib/hadoop/lib/jets3t-0.6.1.jar:
 /usr/lib/hadoop/lib/jetty-6.1.26.jar:
 /usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:
 /usr/lib/hadoop/lib/jetty-util-6.1.26.jar:
 /usr/lib/hadoop/lib/jsch-0.1.42.jar:
 /usr/lib/hadoop/lib/junit-4.5.jar:
 /usr/lib/hadoop/lib/kfs-0.2.2.jar:
 /usr/lib/hadoop/lib/log4j-1.2.15.jar:
 /usr/lib/hadoop/lib/mockito-all-1.8.2.jar:
 /usr/lib/hadoop/lib/oro-2.0.8.jar:
 /usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:
 /usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:
 /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:
 /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:
 /usr/lib/hadoop/lib/xmlenc-0.52.jar:
 /usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:
 /usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:
 /usr/share/mailbox-convertor/lib/*:
 /usr/lib/hadoop/contrib/capacity-scheduler/hadoop-capacity-scheduler-0.20.2-cdh3u1.jar:
 /usr/lib/hbase/lib/hadoop-lzo-0.4.13.jar:
 /usr/lib/hbase/hbase.jar:
 /etc/hbase/conf:
 /usr/lib/hbase/lib:
 /usr/lib/zookeeper/zookeeper.jar:
 /usr/lib/hadoop/contrib/capacity-scheduler/hadoop-capacity-scheduler-0.20.2-cdh3u1.jar:
 /usr/lib/hbase/lib/hadoop-lzo-0.4.13.jar:
 /usr/lib/hbase/hbase.jar:
 /etc/hbase/conf:
 /usr/lib/hbase/lib:
 /usr/lib/zookeeper/zookeeper.jar' org.apache.hadoop.util.RunJar
 /usr/share/mailbox-convertor/mailbox-convertor-0.1-SNAPSHOT.jar
 -libjars=/usr/share/mailbox-convertor/lib/antlr-2.7.7.jar,
 /usr/share/mailbox-convertor/lib/aopalliance-1.0.jar,
 /usr/share/mailbox-convertor/lib/asm-3.1.jar,
 /usr/share/mailbox-convertor/lib/backport-util-concurrent-3.1.jar,
 /usr/share/mailbox-convertor/lib/cglib-2.2.jar,
 /usr/share/mailbox-convertor/lib/hadoop-ant-3.0-u1.pom,
 /usr/share/mailbox-convertor/lib/speed4j-0.9.jar,
 /usr/share/mailbox-convertor/lib/jamm-0.2.2.jar,
 /usr/share/mailbox-convertor/lib/uuid-3.2.0.jar,
 /usr/share/mailbox-convertor/lib/high-scale-lib-1.1.1.jar,
 /usr/share/mailbox-convertor/lib/jsr305-1.3.9.jar,
 /usr/share/mailbox-convertor/lib/guava-11.0.1.jar,
 /usr/share/mailbox-convertor/lib/protobuf-java-2.4.0a.jar,
 /usr/share/mailbox-convertor/lib/concurrentlinkedhashmap-lru-1.1.jar,
 /usr/share/mailbox-convertor/lib/json-simple-1.1.jar,
 /usr/share/mailbox-convertor/lib/itext-2.1.5.jar,
 /usr/share/mailbox-convertor/lib/jmxtools-1.2.1.jar,
 /usr/share/mailbox-convertor/lib/jersey-client-1.4.jar,
 /usr/share/mailbox-convertor/lib/jersey-core-1.4.jar,
 /usr/share/mailbox-convertor/lib/jersey-json-1.4.jar,
 /usr/share/mailbox-convertor/lib/jersey-server-1.4.jar,
 /usr/share/mailbox-convertor/lib/jmxri-1.2.1.jar,
 /usr/share/mailbox-convertor/lib/jaxb-impl-2.1.12.jar,
 /usr/share/mailbox-convertor/lib/xstream-1.2.2.jar,
 /usr/share/mailbox-convertor/lib/commons-metrics-1.3.jar,
 /usr/share/mailbox-convertor/lib/commons-monitoring-2.9.1.jar,
 /usr/share/mailbox-convertor/lib/
 

Re: ClassNotFoundException: -libjars not working?

2012-02-28 Thread Ioan Eugen Stan

On 28.02.2012 10:58, madhu phatak wrote:

Hi,
  -libjars doesn't always work. A better way is to create a runnable jar with
all dependencies (if the number of dependencies is small); otherwise you have
to put the jars into Hadoop's lib folder on all machines.



Thanks for the reply Madhu,

I adopted the second solution, as explained in [1]. From what I found
browsing the net, it seems that -libjars is broken in Hadoop version
0.18. I haven't had time to check the code yet. The Hadoop sources
released by Cloudera are packaged a bit oddly, and NetBeans doesn't seem
to play well with that, which really affects my will to try to fix the
problem.


-libjars is a nice feature that permits the use of skinny jars and
would help system admins do better packaging. It also allows better
control over the classpath. Too bad it didn't work.
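
For reference, -libjars takes a comma-separated list of jar files, and the command quoted earlier in this thread also passes a .pom file in that list. The following is a minimal Java sketch of assembling such a value; the class and method names (LibJars, libJarsValue) are hypothetical and not part of Hadoop, and it assumes that only files ending in .jar should be passed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LibJars {
    // Hypothetical helper: keep only *.jar paths and join them with commas,
    // the separator used by the -libjars generic option.
    static String libJarsValue(List<String> paths) {
        List<String> jars = new ArrayList<>();
        for (String p : paths) {
            if (p.endsWith(".jar")) {
                jars.add(p);
            }
        }
        Collections.sort(jars); // fixed order keeps the command reproducible
        return String.join(",", jars);
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
            "/usr/share/mailbox-convertor/lib/cglib-2.2.jar",
            "/usr/share/mailbox-convertor/lib/hadoop-ant-3.0-u1.pom",
            "/usr/share/mailbox-convertor/lib/asm-3.1.jar");
        // The .pom entry is dropped; the two jars remain.
        System.out.println("-libjars " + libJarsValue(found));
    }
}
```

Note that -libjars is only interpreted when the driver passes its arguments through GenericOptionsParser (for example by running via ToolRunner), and generic options go before the application's own arguments.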



[1] 
http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/


Cheers,

--
Ioan Eugen Stan
http://ieugen.blogspot.com


RE: ClassNotFoundException while running quick start guide on Windows.

2011-06-22 Thread Sandy Pratt
Hi Drew,

I don't know if this is actually the issue or not, but the output below makes 
me think you might be passing Cygwin paths into the java.exe launcher.  If 
that's the case, it won't work.  java.exe is pure Windows and doesn't know 
about '/cygdrive/c' for example (it also expects the path separator to be 
semicolon rather than colon).  Every once in a while when I try to use java.exe 
from the Cygwin CLI on my Windows box, I get bitten by this.
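
To make the difference concrete, here is a small Java sketch of the conversion described above: a colon-separated list of /cygdrive paths turned into a semicolon-separated Windows list. The names (CygwinPaths, toWindowsPathList) are made up for illustration, and entries that do not start with /cygdrive/<letter> are left unchanged, which is a simplification. On a real Cygwin install, the cygpath tool (cygpath -wp) does this job.

```java
import java.util.ArrayList;
import java.util.List;

public class CygwinPaths {
    private static final String PREFIX = "/cygdrive/";

    // Convert one /cygdrive/<letter>/... path to <LETTER>:\... form.
    static String toWindowsPath(String p) {
        int driveIdx = PREFIX.length();
        boolean isCygdrive = p.startsWith(PREFIX)
            && p.length() > driveIdx
            && Character.isLetter(p.charAt(driveIdx))
            && (p.length() == driveIdx + 1 || p.charAt(driveIdx + 1) == '/');
        if (!isCygdrive) {
            return p; // simplification: leave other paths untouched
        }
        char drive = Character.toUpperCase(p.charAt(driveIdx));
        String rest = p.substring(driveIdx + 1); // "" or "/dir/file"
        return drive + ":" + (rest.isEmpty() ? "\\" : rest.replace('/', '\\'));
    }

    // Convert a ':'-separated path list to a ';'-separated Windows list.
    static String toWindowsPathList(String list) {
        List<String> out = new ArrayList<>();
        for (String entry : list.split(":")) {
            if (!entry.isEmpty()) {
                out.add(toWindowsPath(entry));
            }
        }
        return String.join(";", out);
    }

    public static void main(String[] args) {
        System.out.println(toWindowsPathList(
            "/cygdrive/c/Program Files/Java/lib:/cygdrive/d/hadoop/conf"));
    }
}
```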

Sandy

 -Original Message-
 From: Drew Gross [mailto:drew.a.gr...@gmail.com]
 Sent: Tuesday, June 21, 2011 21:26
 To: common-user@hadoop.apache.org
 Subject: Re: ClassNotFoundException while running quick start guide on
 Windows.
 
 Thanks Jeff, it was a problem with JAVA_HOME. I have another problem now
 though, I have this:
 
 $JAVA:  /cygdrive/c/Program Files/Java/jdk1.6.0_26/bin/java
 $JAVA_HEAP_MAX:  -Xmx1000m
 $HADOOP_OPTS:  -Dhadoop.log.dir=C:\Users\Drew
 Gross\Documents\Projects\discom\hadoop-0.21.0\logs
 -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=C:\Users\Drew
 Gross\Documents\Projects\discom\hadoop-0.21.0\ -Dhadoop.id.str= -
 Dhadoop.root.logger=INFO,console
  -Djava.library.path=/cygdrive/c/Users/Drew
 Gross/Documents/Projects/discom/hadoop-0.21.0/lib/native/
 -Dhadoop.policy.file=hadoop-policy.xml
 $CLASS:  org.apache.hadoop.util.RunJar
 Exception in thread main java.lang.NoClassDefFoundError:
 Gross\Documents\Projects\discom\hadoop-0/21/0\logs
 Caused by: java.lang.ClassNotFoundException:
 Gross\Documents\Projects\discom\hadoop-0.21.0\logs
         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
         at java.security.AccessController.doPrivileged(Native Method)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class:
 Gross\Documents\Projects\discom\hadoop-0.21.0\logs.  Program will exit.
 
 (This is with some extra debugging info added by me in bin/hadoop)
 
 It looks like the Windows-style file names are causing problems, especially
 the spaces. Has anyone encountered this before and knows how to fix it? I
 tried escaping the spaces and surrounding the file paths with quotes (not at
 the same time), but that didn't help.
 
 Drew
 
 


Re: ClassNotFoundException while running quick start guide on Windows.

2011-06-21 Thread madhu phatak
I think the jar has an issue where the main class can't be read from the
manifest. Try unpacking the jar, check META-INF/MANIFEST.MF to see what the
main class is, and then run as follows:

 bin/hadoop jar hadoop-*-examples.jar <fully qualified main class> grep input output 'dfs[a-z.]+'
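
As an alternative to unpacking the jar by hand, the Main-Class entry can be read with the JDK's java.util.jar classes. This is only a sketch; the class and method names (MainClassLookup, mainClassFromManifestText, mainClassOfJar) and the org.example.Driver value are invented for illustration and say nothing about what the Hadoop examples jar actually declares.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class MainClassLookup {
    // Parse manifest text and return its Main-Class value, or null if absent.
    static String mainClassFromManifestText(String text) {
        try {
            Manifest mf = new Manifest(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return mf.getMainAttributes().getValue("Main-Class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read Main-Class directly from a jar file on disk.
    static String mainClassOfJar(String jarPath) {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest mf = jar.getManifest(); // null when the jar has no manifest
            return mf == null ? null : mf.getMainAttributes().getValue("Main-Class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 1) {
            System.out.println(mainClassOfJar(args[0]));
        } else {
            // Made-up manifest content, used only to demonstrate the parsing.
            System.out.println(mainClassFromManifestText(
                "Manifest-Version: 1.0\nMain-Class: org.example.Driver\n"));
        }
    }
}
```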
On Thu, Jun 16, 2011 at 10:23 AM, Drew Gross drew.a.gr...@gmail.com wrote:

 Hello,

 I'm trying to run the example from the quick start guide on Windows and I
 get this error:

 $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
 Exception in thread main java.lang.NoClassDefFoundError:
 Caused by: java.lang.ClassNotFoundException:
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: .  Program will exit.
 Exception in thread main java.lang.NoClassDefFoundError:
 Gross\Documents\Projects\discom\hadoop-0/21/0\logs
 Caused by: java.lang.ClassNotFoundException:
 Gross\Documents\Projects\discom\hadoop-0.21.0\logs
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class:
 Gross\Documents\Projects\discom\hadoop-0.21.0\logs.  Program will exit.

 Does anyone know what I need to change?

 Thank you.

 From, Drew

 --
 Forget the environment. Print this e-mail immediately. Then burn it.



Re: ClassNotFoundException while running quick start guide on Windows.

2011-06-21 Thread Drew Gross
Thanks Jeff, it was a problem with JAVA_HOME. I have another problem
now though, I have this:

$JAVA:  /cygdrive/c/Program Files/Java/jdk1.6.0_26/bin/java
$JAVA_HEAP_MAX:  -Xmx1000m
$HADOOP_OPTS:  -Dhadoop.log.dir=C:\Users\Drew
Gross\Documents\Projects\discom\hadoop-0.21.0\logs
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=C:\Users\Drew
Gross\Documents\Projects\discom\hadoop-0.21.0\ -Dhadoop.id.str=
-Dhadoop.root.logger=INFO,console
 -Djava.library.path=/cygdrive/c/Users/Drew
Gross/Documents/Projects/discom/hadoop-0.21.0/lib/native/
-Dhadoop.policy.file=hadoop-policy.xml
$CLASS:  org.apache.hadoop.util.RunJar
Exception in thread main java.lang.NoClassDefFoundError:
Gross\Documents\Projects\discom\hadoop-0/21/0\logs
Caused by: java.lang.ClassNotFoundException:
Gross\Documents\Projects\discom\hadoop-0.21.0\logs
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class:
Gross\Documents\Projects\discom\hadoop-0.21.0\logs.  Program will
exit.

(This is with some extra debugging info added by me in bin/hadoop)

It looks like the Windows-style file names are causing problems,
especially the spaces. Has anyone encountered this before and knows
how to fix it? I tried escaping the spaces and surrounding the file paths
with quotes (not at the same time), but that didn't help.
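
One plausible reading of the error, sketched in Java: if the option string is split on whitespace before it reaches the java launcher, the part of the path after "Drew " becomes the first token that is not an option, and the launcher treats that token as the main class. The code below is a deliberately simplified model of that behaviour, not the real launcher logic, and it assumes the options are expanded unquoted; LauncherSplit and firstNonOptionToken are made-up names.

```java
public class LauncherSplit {
    // Simplified model (an assumption): options start with '-', and
    // -classpath / -cp consume the following token as their value.
    static String firstNonOptionToken(String commandLine) {
        String[] tokens = commandLine.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String t = tokens[i];
            if (t.isEmpty()) {
                continue;
            }
            if (t.equals("-classpath") || t.equals("-cp")) {
                i++; // skip the classpath value
            } else if (!t.startsWith("-")) {
                return t; // this is what would be taken as the main class
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String opts = "-Xmx1000m "
            + "-Dhadoop.log.dir=C:\\Users\\Drew Gross\\Documents\\Projects\\discom\\hadoop-0.21.0\\logs "
            + "-Dhadoop.log.file=hadoop.log org.apache.hadoop.util.RunJar";
        // The space in "Drew Gross" splits the -D option in two.
        System.out.println(firstNonOptionToken(opts));
    }
}
```

In this model the token picked up as the main class is the same string that appears in the "Could not find the main class" message above, which is consistent with the space in the user directory being the trigger.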

Drew


On Tue, Jun 21, 2011 at 6:24 AM, madhu phatak phatak@gmail.com wrote:

 I think the jar has an issue where the main class can't be read from the
 manifest. Try unpacking the jar, check META-INF/MANIFEST.MF to see what the
 main class is, and then run as follows:

  bin/hadoop jar hadoop-*-examples.jar <fully qualified main class> grep input output 'dfs[a-z.]+'



--
Forget the environment. Print this e-mail immediately. Then burn it.


Re: ClassNotFoundException

2010-12-31 Thread Harsh J
The answer is in your log output:

10/12/31 10:26:54 WARN mapreduce.JobSubmitter: No job jar file set.
User classes may not be found. See Job or Job#setJar(String).

Alternatively, use Job.setJarByClass(Class class);
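
Job.setJarByClass sets the job jar by finding where the given class came from; in a driver the call would look like job.setJarByClass(WordCount.class). The underlying idea can be illustrated with plain JDK calls. This is a sketch of the concept only, not Hadoop's implementation, and the names WhereLoaded and locationOf are invented for the example.

```java
import java.net.URL;
import java.security.CodeSource;

public class WhereLoaded {
    // Return the location (jar or directory URL) a class was loaded from,
    // or null when the JVM does not record one (e.g. JDK bootstrap classes).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null) {
            return null;
        }
        URL url = src.getLocation();
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // For an application class this is usually its jar or classes directory.
        System.out.println(locationOf(WhereLoaded.class));
        // Core JDK classes have no code source.
        System.out.println(locationOf(String.class));
    }
}
```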

On Fri, Dec 31, 2010 at 3:02 PM, Cavus,M.,Fa. Post Direkt
m.ca...@postdirekt.de wrote:
 I looked in my jar file, but I get a ClassNotFoundException. Why?

-- 
Harsh J
www.harshj.com


RE: ClassNotFoundException

2010-12-29 Thread Cavus,M.,Fa. Post Direkt
Could you give me the command Harsh?

-Original Message-
From: Harsh J [mailto:qwertyman...@gmail.com] 
Sent: Tuesday, December 28, 2010 5:15 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

In your job driving class (WordCount as per that command), have you
specified the jar by calling the Job.setJarByClass() [or on Stable
API, JobConf.setJarByClass()]?

I'm not sure if hadoop.util.RunJar automatically sends the jar across
for distribution to TaskTrackers.

On Tue, Dec 28, 2010 at 8:27 PM, Cavus,M.,Fa. Post Direkt
m.ca...@postdirekt.de wrote:
 Hi,

 I run this command: ./hadoop jar /home/userme/hd.jar
 org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output



-- 
Harsh J
www.harshj.com


Re: ClassNotFoundException

2010-12-28 Thread James Seigel
Run jar -tvf on the jar file and double-check that the class is listed.
It can't be inside an included (nested) jar file.
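
The same check can be scripted. Below is a minimal Java sketch that looks for a class as a top-level entry in a jar listing; a class that only exists inside a nested jar (for example lib/dep.jar) would not match. The names JarEntryCheck, entryNameFor and hasClassEntry are hypothetical.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarEntryCheck {
    // org.postdirekt.hadoop.Map -> org/postdirekt/hadoop/Map.class
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True only if the class is a direct entry of the jar listing.
    static boolean hasClassEntry(Collection<String> entryNames, String className) {
        return entryNames.contains(entryNameFor(className));
    }

    // Usage: java JarEntryCheck <jar file> <fully qualified class name>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarEntryCheck <jar file> <class name>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            List<String> names = jar.stream()
                .map(JarEntry::getName)
                .collect(Collectors.toList());
            System.out.println(hasClassEntry(names, args[1]));
        }
    }
}
```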

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt
m.ca...@postdirekt.de wrote:

 Hi,

 I run this command: ./hadoop jar /home/userme/hd.jar
 org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output

 and get the errors below. Why? I do have org.postdirekt.hadoop.Map in the
 jar file.



 10/12/28 15:28:30 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_0, Status : FAILED
 java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
         at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
         at org.apache.hadoop.mapred.Child.main(Child.java:211)
 Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
         at java.security.AccessController.doPrivileged(Native Method)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
         at sun.m

 10/12/28 15:28:41 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_1, Status : FAILED
 java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
         at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
         at org.apache.hadoop.mapred.Child.main(Child.java:211)
 Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
         at java.security.AccessController.doPrivileged(Native Method)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
         at sun.m

 10/12/28 15:28:53 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_2, Status : FAILED
 java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
         at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
         at org.apache.hadoop.mapred.Child.main(Child.java:211)
 Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
         at java.security.AccessController.doPrivileged(Native Method)
         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
         at sun.m

 10/12/28 15:29:09 INFO mapreduce.Job: Job complete: job_201012281524_0002
 10/12/28 15:29:09 INFO mapreduce.Job: Counters: 7
         Job Counters
                 Data-local map tasks=4
                 Total time spent by all maps waiting after reserving slots (ms)=0
                 Total time spent by all reduces waiting after reserving slots (ms)=0
                 Failed map tasks=1
                 SLOTS_MILLIS_MAPS=45636
                 SLOTS_MILLIS_REDUCES=0
                 Launched map tasks=4




RE: ClassNotFoundException

2010-12-28 Thread Cavus,M.,Fa. Post Direkt
What must I do James?

-Original Message-
From: James Seigel [mailto:ja...@tynt.com] 
Sent: Tuesday, December 28, 2010 4:03 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

Run jar -tvf on the jar file and double-check that the class is listed.
It can't be inside an included (nested) jar file.

Sent from my mobile. Please excuse the typos.


Re: ClassNotFoundException

2010-12-28 Thread Praveen Bathala
Just run this and make sure you really have the class file in the jar:

jar -tvf <jar file> | grep org.postdirekt.hadoop.Map

If you don't get any output, then you don't have the class file in your jar.

+ Praveen

On Dec 28, 2010, at 9:12 AM, Cavus,M.,Fa. Post Direkt wrote:

 What must I do James?
 

RE: ClassNotFoundException

2010-12-28 Thread Cavus,M.,Fa. Post Direkt
Hi Praveen, I get this

2398 Mon Dec 27 16:19:16 CET org/postdirekt/hadoop/Map.class


-Original Message-
From: Praveen Bathala [mailto:pbatha...@gmail.com] 
Sent: Tuesday, December 28, 2010 4:17 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

Just run this and make sure you really have the class file in the jar:

jar -tvf <jar file> | grep org.postdirekt.hadoop.Map

If you don't get any output, then you don't have the class file in your
jar.

+ Praveen
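
The same check can be done from Java with only the standard library. This is
a minimal sketch, not part of Hadoop: the class name JarClassCheck and its
methods are made up for illustration. It looks for the class as a direct
entry of the jar, so a class that sits inside a nested jar would not be
found, which matches the advice elsewhere in this thread.

```java
import java.io.IOException;
import java.util.jar.JarFile;

/** Hypothetical helper: checks whether a jar has a direct entry for a class. */
final class JarClassCheck {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar has a direct entry for the given class. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            System.exit(2);
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
    }
}
```

For the job in this thread it would be run as
java JarClassCheck /home/userme/hd.jar org.postdirekt.hadoop.Map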

On Dec 28, 2010, at 9:12 AM, Cavus,M.,Fa. Post Direkt wrote:

 What must I do James?
 
 -Original Message-
 From: James Seigel [mailto:ja...@tynt.com] 
 Sent: Tuesday, December 28, 2010 4:03 PM
 To: common-user@hadoop.apache.org
 Subject: Re: ClassNotFoundException
 
 Run jar -tvf on the jar file and double-check that the class is listed
 there. It can't be inside an included (nested) jar file.
 
 Sent from my mobile. Please excuse the typos.
 
 On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt
 m.ca...@postdirekt.de wrote:
 
 Hi,
 
 I run this command: ./hadoop jar /home/userme/hd.jar
 org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output
 
 and get the error below. Why? I do have org.postdirekt.hadoop.Map in the
 jar file.
 
 
 
 10/12/28 15:28:30 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_0, Status : FAILED
 java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
   at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
   at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
   at org.apache.hadoop.mapred.Child.main(Child.java:211)
 Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
   at sun.m
 

RE: ClassNotFoundException

2010-12-28 Thread Black, Michael (IS)
I'm using hadoop-0.20.2 and I see these entries for my map/reduce class:
 
com/ngc/asoc/recommend/Predict$Counter.class
com/ngc/asoc/recommend/Predict$R.class
com/ngc/asoc/recommend/Predict$M.class
com/ngc/asoc/recommend/Predict.class

I'm a Java idiot, so I don't know why they appear, but perhaps you have
something similar?
 
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
 









Re: ClassNotFoundException

2010-12-28 Thread Harsh J
In your job driving class (WordCount as per that command), have you
specified the jar by calling the Job.setJarByClass() [or on Stable
API, JobConf.setJarByClass()]?

I'm not sure if hadoop.util.RunJar automatically sends the jar across
for distribution to TaskTrackers.
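
As far as I understand it, setJarByClass() works by finding the jar that the
given class was loaded from and marking that jar as the one to ship with the
job. The sketch below only illustrates that lookup idea with the standard
library; ContainingJar and its methods are hypothetical names, it is not
Hadoop's code, and it skips URL decoding of the path.

```java
import java.net.URL;

/** Hypothetical sketch: find the jar file a class was loaded from. */
final class ContainingJar {

    /** Resource path of a class file, e.g. org/postdirekt/hadoop/Map.class. */
    static String classResource(Class<?> type) {
        return type.getName().replace('.', '/') + ".class";
    }

    /**
     * Extracts the jar file path from the path part of a jar: URL, which
     * looks like file:/path/to/app.jar!/pkg/Name.class.
     */
    static String jarPathOf(String jarUrlPath) {
        String path = jarUrlPath.startsWith("file:")
                ? jarUrlPath.substring("file:".length())
                : jarUrlPath;
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    /** Returns the jar path, or null if the class did not come from a jar. */
    static String find(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            return null; // loaded by the bootstrap class loader
        }
        URL url = loader.getResource(classResource(type));
        if (url == null || !"jar".equals(url.getProtocol())) {
            return null; // e.g. loaded from a plain classes directory
        }
        return jarPathOf(url.getPath());
    }
}
```

If a lookup like this comes back empty for the driver class (for example
when it runs from a classes directory), there is no job jar to send, which
would fit a ClassNotFoundException that only shows up in the tasks.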

On Tue, Dec 28, 2010 at 8:27 PM, Cavus,M.,Fa. Post Direkt
m.ca...@postdirekt.de wrote:
 Hi,

 I process this command: ./hadoop jar /home/userme/hd.jar
 org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output



-- 
Harsh J
www.harshj.com


Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread M B
Sorry, I should have mentioned that I tried that as well and it also gives
an error:

$ p...@hadoop01:~/hadoop_tests$ hadoop jar -libjars ./samplejoin.jar
/opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input
datajoin/output Text 1
org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
Exception in thread "main" java.io.IOException: Error opening job jar: -libjars
        at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
Caused by: java.util.zip.ZipException: error in opening zip file
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:114)
        at java.util.jar.JarFile.<init>(JarFile.java:133)
        at java.util.jar.JarFile.<init>(JarFile.java:70)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

Has something changed or is my environment not set up correctly?  Appreciate
any help.



On Fri, Mar 26, 2010 at 8:23 PM, Ted Yu yuzhih...@gmail.com wrote:

 Then use the syntax given by
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/GenericOptionsParser.html:

 $ bin/hadoop jar -libjars ./samplejoin.jar
 /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
 org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input ...

 On Fri, Mar 26, 2010 at 5:10 PM, M B machac...@gmail.com wrote:

  Sorry, but where exactly do I include the libjars option?  I tried to put it
  where you stated (after the DataJoinJob class), but it just comes back with
  usage information (as if the option is not valid):
  $ p...@hadoop01:~/hadoop_tests$ hadoop jar
  /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
  org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar
  datajoin/input datajoin/output Text 1
  org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
  org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
  org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
  *usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts
  mapper_class reducer_class map_output_value_class output_value_class
  [maxNumOfValuesPerGroup [descriptionOfJob]]]*
 
  It seems like it's not taking the option for some reason, like it's failing
  an argument check in DataJoinJob - does that not use the standard args or
  something?
 
 
  On Fri, Mar 26, 2010 at 4:38 PM, Ted Yu yuzhih...@gmail.com wrote:
 
   DataJoinJob is contained in hadoop-0.20.2-datajoin.jar which is in your
   HADOOP_CLASSPATH
  
   I think you should specify samplejoin.jar using -libjars instead of putting
   it directly after jar command:
   hadoop jar hadoop-0.20.2-datajoin.jar
   org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar
   ... (same as your example)
  
   Cheers
  
   On Fri, Mar 26, 2010 at 3:24 PM, M B machac...@gmail.com wrote:
  
I may be having a setup issue with classpaths, would appreciate some help.

I created a jar with all the Sample* classes in contrib/DataJoin.  Here is
the listing of my samplejoin.jar file:
 zip.vim version v22
 Browsing zipfile /home/hadoop/hadoop_tests/samplejoin.jar
 Select a file with cursor and press ENTER
META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/hadoop/
org/apache/hadoop/contrib/
org/apache/hadoop/contrib/utils/
org/apache/hadoop/contrib/utils/join/
org/apache/hadoop/contrib/utils/join/SampleDataJoinReducer.class
org/apache/hadoop/contrib/utils/join/SampleTaggedMapOutput.class
org/apache/hadoop/contrib/utils/join/SampleDataJoinMapper.class
   
When I go to run this, things start to run, but every Map try errors out
with:
java.lang.RuntimeException: java.lang.ClassNotFoundException:
org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput
   
Here is the command:
hadoop jar ./samplejoin.jar
org.apache.hadoop.contrib.utils.join.DataJoinJob
datajoin/input datajoin/output Text 1
org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
   
This is a new install of 0.20.2.
   
HADOOP_CLASSPATH is set
to: /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
Any help would be appreciated.
   
  
 



RE: ClassNotFoundException with contrib/join example

2010-03-29 Thread Jones, Nick
M B,
I'm not sure about the -libjars argument but 'hadoop jar' is expecting the 
jarfile immediately afterwards: hadoop jar jarFile [mainClass] args...
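
To make the positional rule concrete, here is a toy model of how the words
after 'hadoop jar' are split up. It is only a sketch under assumptions:
HadoopJarArgs is a made-up class, and it assumes the main class is always
given explicitly (the real launcher can also take it from the jar manifest).
It shows why the earlier attempt reported "Error opening job jar: -libjars":
the first word is taken as the jar file, whatever it is.

```java
import java.util.Arrays;

/** Hypothetical model of the positional layout: jarFile mainClass args... */
final class HadoopJarArgs {
    final String jarFile;
    final String mainClass;
    final String[] appArgs;

    private HadoopJarArgs(String jarFile, String mainClass, String[] appArgs) {
        this.jarFile = jarFile;
        this.mainClass = mainClass;
        this.appArgs = appArgs;
    }

    /** Splits the words that follow 'hadoop jar' purely by position. */
    static HadoopJarArgs parse(String... words) {
        if (words.length < 2) {
            throw new IllegalArgumentException("expected: jarFile mainClass args...");
        }
        return new HadoopJarArgs(words[0], words[1],
                Arrays.copyOfRange(words, 2, words.length));
    }

    public static void main(String[] args) {
        HadoopJarArgs attempt = parse(
                "-libjars", "./samplejoin.jar",
                "/opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar",
                "org.apache.hadoop.contrib.utils.join.DataJoinJob");
        // The option name lands in the jar slot, so opening it as a zip fails.
        System.out.println("job jar = " + attempt.jarFile); // prints "job jar = -libjars"
    }
}
```

With the jar first and the class second, anything after that (including
-libjars) is handed to the main class, and it is then up to that class to
understand the option.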

Nick Jones


Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread M B
Right, that was the first option I tried and it fails there as well.

Maybe I need to step back and ask a higher-level question - does anyone have
a full, step-by-step example of using a reduce-side join in an M/R job?
Preferably using the contrib/DataJoin classes, but I'll be happy with
whatever example I can get.

I'd love to see the actual code and then how it's kicked off on the command
line so I can try it on my end as a prototype.  I must be doing something
wrong, but don't know what it is.

Thanks.


Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread Ted Yu
I can run the sample (I created the input files according to
contrib/data_join/src/examples/org/apache/hadoop/contrib/utils/join/README.txt):

[r...@tyu-linux datajoin]# pwd
/opt/ks/hadoop-0.20.2/build/contrib/datajoin
[r...@tyu-linux datajoin]# /opt/ks/hadoop-0.20.2/bin/hadoop jar
hadoop-0.20.2-datajoin-examples.jar
org.apache.hadoop.contrib.utils.join.DataJoinJob input output Text 1
org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
Using TextInputFormat: Text
Using TextOutputFormat: Text
10/03/29 09:01:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/03/29 09:01:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
Job job_local_0001 is submitted
Job job_local_0001 is still running.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
10/03/29 09:01:31 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:31 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:31 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:31 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:31 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:31 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount  6
totalCount  6

10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_00_0' done.
10/03/29 09:01:32 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:32 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:32 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:32 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:32 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:32 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_01_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount  5
totalCount  5

10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_01_0' done.
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.Merger: Merging 2 sorted segments
10/03/29 09:01:32 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 939 bytes
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/03/29 09:01:32 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
10/03/29 09:01:32 INFO datajoin.job: key: A.a11 this.largestNumOfValues: 3
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_r_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.TaskRunner: Task attempt_local_0001_r_00_0 is allowed to commit now
10/03/29 09:01:32 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_00_0' to file:/opt/kindsight/hadoop-0.20.2/build/contrib/datajoin/output
10/03/29 09:01:32 INFO mapred.LocalJobRunner: actuallyCollectedCount  5
collectedCount  7
groupCount  6
  reduce
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_00_0' done.
[r...@tyu-linux datajoin]# date
Mon Mar 29 09:02:37 PDT 2010

It took a minute between the last INFO log and exit of DataJoinJob.
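
A side note on the WARN line in the log ("Use GenericOptionsParser for
parsing the arguments"): it suggests DataJoinJob reads its arguments as they
come, so a -libjars placed after the class name would reach it as ordinary
positional arguments. The sketch below shows, in a much simplified and
hypothetical form, the kind of splitting a generic-options step does before
the application sees its arguments. LibJarsSplitter is a made-up name, it
only knows -libjars, and it only looks at leading options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified stand-in for generic option handling. */
final class LibJarsSplitter {
    final List<String> libJars = new ArrayList<>();
    final List<String> appArgs = new ArrayList<>();

    /** Peels leading "-libjars a.jar,b.jar" pairs off the argument list. */
    static LibJarsSplitter split(String... args) {
        LibJarsSplitter result = new LibJarsSplitter();
        int i = 0;
        // Only leading options are consumed; the first other word ends the scan.
        while (i + 1 < args.length && "-libjars".equals(args[i])) {
            for (String jar : args[i + 1].split(",")) {
                if (!jar.isEmpty()) {
                    result.libJars.add(jar);
                }
            }
            i += 2;
        }
        // Everything left over is what the application itself gets to see.
        result.appArgs.addAll(Arrays.asList(args).subList(i, args.length));
        return result;
    }
}
```

A job that skips a step like this never separates the two lists, so the
extra jars are not registered and the option words count as job arguments.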

Cheers


Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread M B
I don't see hadoop-0.20.2-datajoin-examples.jar in the
build/contrib/datajoin directory.  Is that a jar you created separately?  I
tried creating one, but it still doesn't run (the mappers show the same
error of missing the classes).

had...@hadoop01:/opt/hadoop-0.20.2/build/contrib/datajoin$ ls
classes  examples  test



Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread Ted Yu
Under hadoop-0.20.2/src/contrib/data_join, run
ant jar-examples

You may need to rename the jars
(hadoop-${version}-datajoin-examples.jar):
[r...@tyu-linux datajoin]# ls
classes  examples  hadoop-0.20.2-datajoin-examples.jar
hadoop-0.20.2-datajoin.jar  input  output  test

On Mon, Mar 29, 2010 at 1:59 PM, M B machac...@gmail.com wrote:

 I don't see hadoop-0.20.2-datajoin-examples.jar in the
 build/contrib/datajoin directory.  Is that a jar you created separately?  I
 tried creating one, but it still doesn't run (the mappers show the same
 error of missing the classes).

 had...@hadoop01:/opt/hadoop-0.20.2/build/contrib/datajoin$ ls
 classes  examples  test


 On Mon, Mar 29, 2010 at 9:26 AM, Ted Yu yuzhih...@gmail.com wrote:

  I can run the sample (I created the input files according to
 
 
 contrib/data_join/src/examples/org/apache/hadoop/contrib/utils/join/README.txt):
 
  [r...@tyu-linux datajoin]# pwd
  /opt/ks/hadoop-0.20.2/build/contrib/datajoin
  [r...@tyu-linux datajoin]# /opt/ks/hadoop-0.20.2/bin/hadoop jar
  hadoop-0.20.2-datajoin-examples.jar
  org.apache.hadoop.contrib.utils.join.DataJoinJob input output Text 1
  org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
  org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
  org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
  Using TextInputFormat: Text
  Using TextOutputFormat: Text
  10/03/29 09:01:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with
  processName=JobTracker, sessionId=
  10/03/29 09:01:30 WARN mapred.JobClient: Use GenericOptionsParser for
  parsing the arguments. Applications should implement Tool for the same.
  10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to
 process
  : 2
  Job job_local_0001 is submitted
  Job job_local_0001 is still running.
  10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to
 process
  : 2
  10/03/29 09:01:31 INFO mapred.MapTask: numReduceTasks: 1
  10/03/29 09:01:31 INFO mapred.MapTask: io.sort.mb = 100
  10/03/29 09:01:31 INFO mapred.MapTask: data buffer = 79691776/99614720
  10/03/29 09:01:31 INFO mapred.MapTask: record buffer = 262144/327680
  10/03/29 09:01:31 INFO mapred.MapTask: Starting flush of map output
  10/03/29 09:01:31 INFO mapred.MapTask: Finished spill 0
  10/03/29 09:01:32 INFO mapred.TaskRunner:
  Task:attempt_local_0001_m_00_0
  is done. And is in the process of commiting
  10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount6
  totalCount  6
 
  10/03/29 09:01:32 INFO mapred.TaskRunner: Task
  'attempt_local_0001_m_00_0' done.
  10/03/29 09:01:32 INFO mapred.MapTask: numReduceTasks: 1
  10/03/29 09:01:32 INFO mapred.MapTask: io.sort.mb = 100
  10/03/29 09:01:32 INFO mapred.MapTask: data buffer = 79691776/99614720
  10/03/29 09:01:32 INFO mapred.MapTask: record buffer = 262144/327680
  10/03/29 09:01:32 INFO mapred.MapTask: Starting flush of map output
  10/03/29 09:01:32 INFO mapred.MapTask: Finished spill 0
  10/03/29 09:01:32 INFO mapred.TaskRunner:
  Task:attempt_local_0001_m_01_0
  is done. And is in the process of commiting
  10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount5
  totalCount  5
 
  10/03/29 09:01:32 INFO mapred.TaskRunner: Task
  'attempt_local_0001_m_01_0' done.
  10/03/29 09:01:32 INFO mapred.LocalJobRunner:
  10/03/29 09:01:32 INFO mapred.Merger: Merging 2 sorted segments
  10/03/29 09:01:32 INFO mapred.Merger: Down to the last merge-pass, with 2
  segments left of total size: 939 bytes
  10/03/29 09:01:32 INFO mapred.LocalJobRunner:
  10/03/29 09:01:32 INFO util.NativeCodeLoader: Loaded the native-hadoop
  library
  10/03/29 09:01:32 INFO zlib.ZlibFactory: Successfully loaded & initialized
  native-zlib library
  10/03/29 09:01:32 INFO datajoin.job: key: A.a11 this.largestNumOfValues: 3
  10/03/29 09:01:32 INFO mapred.TaskRunner:
  Task:attempt_local_0001_r_00_0
  is done. And is in the process of commiting
  10/03/29 09:01:32 INFO mapred.LocalJobRunner:
  10/03/29 09:01:32 INFO mapred.TaskRunner: Task
  attempt_local_0001_r_00_0
  is allowed to commit now
  10/03/29 09:01:32 INFO mapred.FileOutputCommitter: Saved output of task
  'attempt_local_0001_r_00_0' to
  file:/opt/kindsight/hadoop-0.20.2/build/contrib/datajoin/output
  10/03/29 09:01:32 INFO mapred.LocalJobRunner: actuallyCollectedCount5
  collectedCount  7
  groupCount  6
reduce
  10/03/29 09:01:32 INFO mapred.TaskRunner: Task
  'attempt_local_0001_r_00_0' done.
  [r...@tyu-linux datajoin]# date
  Mon Mar 29 09:02:37 PDT 2010
 
  It took a minute between the last INFO log and exit of DataJoinJob.
 
  Cheers
 
  On Mon, Mar 29, 2010 at 8:26 AM, M B machac...@gmail.com wrote:
 
   Sorry, I should have mentioned that I tried that as well and it also gives
   an error:
  
   $ p...@hadoop01:~/hadoop_tests$ hadoop jar -libjars ./samplejoin.jar
   /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
   

Re: ClassNotFoundException with contrib/join example

2010-03-29 Thread M B
Ah, thanks, that got it.  Now I'm at the same point you are:
part-0.deflate is there and is not readable.  Seems like I should see
text output, right?
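
If part-0.deflate was written with Hadoop's default compression codec, it
should be a zlib stream rather than plain text, which would explain why it
looks unreadable. A minimal sketch for viewing it, assuming that is the case
and that the file has been copied to the local filesystem. ReadDeflate and
inflate are names made up for this illustration, and UTF-8 text is assumed:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.InflaterInputStream;

public class ReadDeflate {
    /** Inflates a zlib stream fully and returns the decoded text (UTF-8 assumed). */
    static String inflate(InputStream compressed) throws IOException {
        try (InputStream in = new InflaterInputStream(compressed)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            // Copy decompressed bytes until the end of the stream.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ReadDeflate <local .deflate file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(inflate(in));
        }
    }
}
```

If the output is still garbage, the assumption about the codec is probably
wrong for that job's configuration.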

On Mon, Mar 29, 2010 at 2:04 PM, Ted Yu yuzhih...@gmail.com wrote:

 Under hadoop-0.20.2/src/contrib/data_join, run
 ant jar-examples

 You may need to rename the jars
 (hadoop-${version}-datajoin-examples.jar):
 [r...@tyu-linux datajoin]# ls
 classes  examples  hadoop-0.20.2-datajoin-examples.jar
 hadoop-0.20.2-datajoin.jar  input  output  test


Re: ClassNotFoundException with contrib/join example

2010-03-26 Thread Ted Yu
DataJoinJob is contained in hadoop-0.20.2-datajoin.jar, which is in your
HADOOP_CLASSPATH.

I think you should specify samplejoin.jar using -libjars instead of putting
it directly after the jar command:
hadoop jar hadoop-0.20.2-datajoin.jar
org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar
... (same as your example)

Cheers

On Fri, Mar 26, 2010 at 3:24 PM, M B machac...@gmail.com wrote:

 I may be having a setup issue with classpaths, would appreciate some help.

 I created a jar with all the Sample* classes in contrib/DataJoin.  Here is
 the listing of my samplejoin.jar file:
  zip.vim version v22
  Browsing zipfile /home/hadoop/hadoop_tests/samplejoin.jar
  Select a file with cursor and press ENTER
 META-INF/
 META-INF/MANIFEST.MF
 org/
 org/apache/
 org/apache/hadoop/
 org/apache/hadoop/contrib/
 org/apache/hadoop/contrib/utils/
 org/apache/hadoop/contrib/utils/join/
 org/apache/hadoop/contrib/utils/join/SampleDataJoinReducer.class
 org/apache/hadoop/contrib/utils/join/SampleTaggedMapOutput.class
 org/apache/hadoop/contrib/utils/join/SampleDataJoinMapper.class

 When I go to run this, things start to run, but every Map try errors out
 with:
 java.lang.RuntimeException: java.lang.ClassNotFoundException:
 org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput

 Here is the command:
 hadoop jar ./samplejoin.jar
 org.apache.hadoop.contrib.utils.join.DataJoinJob
 datajoin/input datajoin/output Text 1
 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
 org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
 org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text

 This is a new install of 0.20.2.

 HADOOP_CLASSPATH is set
 to: /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
 Any help would be appreciated.
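
The jar listing above can also be checked programmatically, which helps rule
out a packaging mistake before looking at classpaths. A minimal sketch using
only the JDK; JarClassCheck and containsClass are hypothetical names for this
illustration. A "found" result only says the class is in the jar on the
submitting machine, not that the task JVMs can see it:

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {
    /** Returns true if the jar has an entry for the given fully qualified class name. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        // org.example.Foo is stored in a jar as org/example/Foo.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]) ? "found" : "missing");
    }
}
```

For the case in this message it could be run as:
java JarClassCheck ./samplejoin.jar org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput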



Re: ClassNotFoundException with contrib/join example

2010-03-26 Thread Ted Yu
Then use the syntax given by
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/GenericOptionsParser.html:

$ bin/hadoop jar -libjars ./samplejoin.jar
/opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input ...

On Fri, Mar 26, 2010 at 5:10 PM, M B machac...@gmail.com wrote:

 Sorry, but where exactly do I include the libjars option?  I tried to put it
 where you stated (after the DataJoinJob class), but it just comes back with
 usage information (as if the option is not valid):
 $ p...@hadoop01:~/hadoop_tests$ hadoop jar
 /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar
 org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar
 datajoin/input datajoin/output Text 1
 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper
 org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer
 org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
 *usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts
 mapper_class reducer_class map_output_value_class output_value_class
 [maxNumOfValuesPerGroup [descriptionOfJob]]]*

 It seems like it's not taking the option for some reason, like it's failing
 an argument check in DataJoinJob - does that not use the standard args or
 something?
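
One way to picture the question above: a driver that reads its args array
directly treats -libjars as an ordinary positional argument, whereas a driver
that parses generic options first removes it before the job sees the rest.
The following is a toy sketch of that difference only. It is not Hadoop code,
splitGenericOptions is a hypothetical helper, and stripping the option
wherever it appears is a simplification:

```java
import java.util.ArrayList;
import java.util.List;

public class GenericArgsSketch {
    /** Holds the value of -libjars (or null) and the remaining positional arguments. */
    static final class Parsed {
        final String libJars;
        final List<String> positional;

        Parsed(String libJars, List<String> positional) {
            this.libJars = libJars;
            this.positional = positional;
        }
    }

    /** Hypothetical helper: pulls "-libjars <value>" out of the argument list. */
    static Parsed splitGenericOptions(String[] args) {
        String libJars = null;
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                libJars = args[++i]; // consume the option's value as well
            } else {
                rest.add(args[i]);
            }
        }
        return new Parsed(libJars, rest);
    }

    public static void main(String[] args) {
        String[] cmd = {"-libjars", "./samplejoin.jar",
                        "datajoin/input", "datajoin/output", "Text", "1"};
        // A driver that skips option parsing sees "-libjars" as its first argument.
        System.out.println("naive first positional: " + cmd[0]);
        Parsed p = splitGenericOptions(cmd);
        System.out.println("libjars: " + p.libJars);
        System.out.println("positional: " + p.positional);
    }
}
```

Whether DataJoinJob in 0.20.2 actually skips this step is not confirmed here;
the JobClient warning quoted elsewhere in the thread ("Use
GenericOptionsParser for parsing the arguments") points in that direction.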

