Re: Adding mahout math jar to hadoop mapreduce execution

2012-02-01 Thread Daniel Quach
Thanks, this seems to make it work. In fact, I did not even need to specify lib jars in the command line…should I be worried that it doesn't work that way? On Jan 31, 2012, at 4:09 PM, Joey Echeverria wrote: You also need to add the jar to the classpath so it's available in your main. You can

How to convert sequence file into normal text file

2012-02-01 Thread praveenesh kumar
I am running the SimpleKmeansClustering sample code from Mahout in Action. How can I convert a sequence file written using SequenceFile.Writer into a plain HDFS file so that I can read it properly? I know mahout has the seqdumper tool to read it, but I want to create a normal text file rather than a sequence file

reduce no response

2012-02-01 Thread Jinyan Xu
Hi all, I run terasort with 4 reduces and 8 maps; when all the maps finish, the reduces show no response. Debug messages like: 12/02/02 01:42:47 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:9001 from hadoop sending #7126 12/02/02 01:42:47 DEBUG ipc.Client: IPC Client (47) connection to

Re: reduce no response

2012-02-01 Thread Harsh J
Jinyan, I am not sure what your problem here is - the client hanging or the job itself hanging. Could you provide us some more information on what state the job is hung in, or expand on the job client hang? Having a jstack also helps whenever you run into a JVM hang. On Wed, Feb 1, 2012

Re: How to convert sequence file into normal text file

2012-02-01 Thread Harsh J
Praveenesh, The utility hadoop fs -text seqfile, documented at http://hadoop.apache.org/common/docs/current/file_system_shell.html#text can read SequenceFiles and print their string representations out. Will this be of any help? Otherwise, you want to use a SequenceFile.Reader over your given
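To illustrate the second option Harsh mentions, here is a minimal sketch of a SequenceFile.Reader loop that writes each record out as a line of text. It assumes a 0.20/1.0-era Hadoop API; the class name and the choice to print to stdout are placeholders, not anything from the thread:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class SeqFileToText {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path input = new Path(args[0]);  // the sequence file on HDFS
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, input, conf);
        // Instantiate key/value objects of whatever classes the file was written with.
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
        try {
          while (reader.next(key, value)) {
            // Relies on the writables' toString(); redirect stdout to a file,
            // or open an FSDataOutputStream if the text should stay on HDFS.
            System.out.println(key + "\t" + value);
          }
        } finally {
          reader.close();
        }
      }
    }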

Re: How to convert sequence file into normal text file

2012-02-01 Thread praveenesh kumar
Cool, I used the second method. Will look at the other two as well. Thanks, Praveenesh On Wed, Feb 1, 2012 at 4:43 PM, Harsh J ha...@cloudera.com wrote: Praveenesh, The utility hadoop fs -text seqfile, documented at http://hadoop.apache.org/common/docs/current/file_system_shell.html#text can read

Re: Why $HADOOP_PREFIX ?

2012-02-01 Thread Prashant Sharma
I think you have misunderstood something. As far as I know or understand, these variables are set automatically when you run a script; its name is obscure for some strange reason. ;) The warning "$HADOOP_HOME is deprecated" is always there, whether the variable is set or not. Why? Because the hadoop-config is

Re: Adding mahout math jar to hadoop mapreduce execution

2012-02-01 Thread Joey Echeverria
The -libjars feature is needed if you use the classes in your remote code (map and reduce functions). Is it possible you only use it in your main() method? -Joey Sent from my iPhone On Feb 1, 2012, at 3:38, Daniel Quach danqu...@cs.ucla.edu wrote: Thanks, this seems to make it work. In fact,
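As a side note on how -libjars gets picked up at all: it is only parsed when the driver goes through ToolRunner/GenericOptionsParser. Below is a hedged sketch of such a driver; the class and job names are placeholders, and classes used directly in main() still need to be on the local classpath (e.g. via HADOOP_CLASSPATH), which is Joey's point above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Placeholder driver; nothing here is from the original thread.
    public class MyDriver extends Configured implements Tool {
      @Override
      public int run(String[] args) throws Exception {
        // getConf() already reflects any -libjars/-D options parsed by ToolRunner.
        Job job = new Job(getConf(), "my-job");
        job.setJarByClass(MyDriver.class);
        // ... set mapper, reducer, input and output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        // ToolRunner invokes GenericOptionsParser, so an invocation like
        //   hadoop jar myjob.jar MyDriver -libjars mahout-math.jar <in> <out>
        // ships the listed jars to the map and reduce tasks' classpaths.
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
      }
    }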

Re: Why $HADOOP_PREFIX ?

2012-02-01 Thread praveenesh kumar
Interesting and strange, but is there any reason for setting $HADOOP_HOME to $HADOOP_PREFIX in hadoop-conf.sh and then checking in /bin/hadoop.sh whether $HADOOP_HOME is not equal to... I mean, if I comment out the export HADOOP_HOME=${HADOOP_PREFIX} in hadoop-conf.sh, does it make any difference?

how to force small files not to span over multiple nodes?

2012-02-01 Thread Qiming He
Hi all, Is there any way (command) to determine the physical location of a file in HDFS, to see whether it spans multiple nodes? And any way to force a small file not to span two nodes, assuming its size is smaller than the default block size (e.g., 64 MB)? Thanks in advance -Qiming

Re: how to force small files not to span over multiple nodes?

2012-02-01 Thread Harsh J
If the file size is less than a block size, then the file isn't spanning across nodes. Files are split at block-size boundaries, so your file is essentially just one block here. Also see http://search-hadoop.com/m/tGBgk1WFVAO1 for your block location question. You can get the node list of replicas this
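For the "where does my file physically live" part, here is a small sketch using the public FileSystem API; the path argument is a placeholder (the fsck tool, e.g. hadoop fsck <path> -files -blocks -locations, gives similar information from the command line):

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlockLocations {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path path = new Path(args[0]);  // e.g. the small file in question
        FileStatus status = fs.getFileStatus(path);
        // One BlockLocation per block; a file smaller than one block has exactly one.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
          System.out.println("offset=" + block.getOffset()
              + " length=" + block.getLength()
              + " hosts=" + Arrays.toString(block.getHosts()));
        }
      }
    }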

Re: Why $HADOOP_PREFIX ?

2012-02-01 Thread Robert Evans
I think it comes down to a long history of splitting and then remerging the Hadoop project. I could be wrong about a lot of this, so take it with a grain of salt. Hadoop originally was, and on 1.0 still is, a single project: HDFS, mapreduce and common are all compiled together into a single jar

Re: Why $HADOOP_PREFIX ?

2012-02-01 Thread Harsh J
Personal opinion here: for branch-1, I do think the earlier tarball structure was better. I do not see why it had to change for this version at least. It was possibly changed during all the work of adding packaging-related scripts for rpm/deb into Hadoop itself, but the tarball right now is not as

Re: ERROR namenode.NameNode: java.io.IOException: Cannot remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current

2012-02-01 Thread Harsh J
Vijay, [Moving to cdh-u...@cloudera.org | https://groups.google.com/a/cloudera.org/group/cdh-user/topics since this is CDH3 specific] You need to run that command as the 'hdfs' user, since the specific dirs are writable only by the group 'hadoop': $ sudo -u hdfs hadoop namenode -format On Thu,

No class def found for AbstractIterator

2012-02-01 Thread Daniel Quach
I recently changed my reducer to output Mahout VectorWritable instead of Text, and as a result, when I try to execute my map reduce, I run into this error: java.lang.NoClassDefFoundError: com/google/common/collect/AbstractIterator. I've included the mahout-core jar in my Hadoop classpath, so I

Can't achieve load distribution

2012-02-01 Thread Mark Kerzner
Hi, I have a simple MR job, and I want each Mapper to get one line from my input file (which contains further instructions for lengthy processing). Each line is 100 characters long, and I tell Hadoop to read only 100 bytes,

Re: Why $HADOOP_PREFIX ?

2012-02-01 Thread Prashant Sharma
@Harsh, I sometimes get similar thoughts :P. But wonder if there is something can be done. @Bobby, Thanks for elaborating the strange reason. :) @Praveenesh, Yes, you can do away with sourcing of hadoop-config.sh and set all the necessary variables by hand. On Wed, Feb 1, 2012 at 10:38 PM,

Re: Can't achieve load distribution

2012-02-01 Thread Anil Gupta
Do you have enough data to start more than one mapper? If the entire data is less than a block size, then only one mapper will run. Best Regards, Anil On Feb 1, 2012, at 4:21 PM, Mark Kerzner mark.kerz...@shmsoft.com wrote: Hi, I have a simple MR job, and I want each Mapper to get one line from my

Re: Can't achieve load distribution

2012-02-01 Thread Mark Kerzner
Anil, do you mean one block of HDFS, like 64MB? Mark On Wed, Feb 1, 2012 at 7:03 PM, Anil Gupta anilgupt...@gmail.com wrote: Do u have enough data to start more than one mapper? If entire data is less than a block size then only 1 mapper will run. Best Regards, Anil On Feb 1, 2012, at

RE: ERROR namenode.NameNode: java.io.IOException: Cannot remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current

2012-02-01 Thread Uma Maheswara Rao G
Can you try deleting this directory manually? Please check whether another process is already running with this directory configured. Regards, Uma From: Vijayakumar Ramdoss [nellaivi...@gmail.com] Sent: Thursday, February 02, 2012 1:27 AM To:

Re: Failed to set permissions of path

2012-02-01 Thread Radu
If you just want to test Hadoop on Windows, the actual permissions are not that important. I updated RawLocalFileSystem.java so it just assigns some generous value to all files every time, ignoring the actual value in the 'permission' argument. /** * Use the command chmod to set permission.
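A sketch of the same idea, written as a subclass instead of an in-place edit of RawLocalFileSystem.java; it assumes the 0.20/1.0-era setPermission() is the method involved (a guess based on the quoted javadoc), the class name is made up, and it is only an illustration for local Windows testing, not something to use on a real cluster:

    import java.io.File;
    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RawLocalFileSystem;
    import org.apache.hadoop.fs.permission.FsPermission;

    public class LenientLocalFileSystem extends RawLocalFileSystem {
      @Override
      public void setPermission(Path p, FsPermission permission) throws IOException {
        // Ignore the requested 'permission' and grant a generous value to everyone,
        // so strict checks such as "set to 0700" can never fail on Windows.
        File f = pathToFile(p);
        f.setReadable(true, false);
        f.setWritable(true, false);
        f.setExecutable(true, false);
      }
    }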

Re: Can't achieve load distribution

2012-02-01 Thread Anil Gupta
Yes, if your block size is 64 MB. By the way, block size is configurable in Hadoop. Best Regards, Anil On Feb 1, 2012, at 5:06 PM, Mark Kerzner mark.kerz...@shmsoft.com wrote: Anil, do you mean one block of HDFS, like 64MB? Mark On Wed, Feb 1, 2012 at 7:03 PM, Anil Gupta anilgupt...@gmail.com
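If the underlying goal is still one instruction line per mapper, one technique not mentioned in this thread is NLineInputFormat, which splits the input by line count rather than by block size. A hedged sketch using the old (mapred) API, where that format lives in the 0.20/1.0 line; the class name and paths are placeholders and the mapper/reducer are left out:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.NLineInputFormat;

    public class OneLinePerMapperDriver {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(OneLinePerMapperDriver.class);
        conf.setJobName("one-line-per-mapper");
        conf.setInputFormat(NLineInputFormat.class);
        // Each split carries exactly one input line, so each line gets its own map task.
        conf.setInt("mapred.line.input.format.linespermap", 1);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        // conf.setMapperClass(...); conf.setReducerClass(...);
        JobClient.runJob(conf);
      }
    }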

Re: No class def found for AbstractIterator

2012-02-01 Thread Harsh J
You need Google's guava jar, which Mahout seems to add in as a dependency, among your -libjars for the job. On Thu, Feb 2, 2012 at 5:27 AM, Daniel Quach danqu...@cs.ucla.edu wrote: I recently changed my reducer to output mahout VectorWritable instead of Text, and as a result, when I try to
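Alongside listing the guava jar with -libjars, the same effect can be had programmatically; a small sketch using DistributedCache, where the HDFS path to the jar is a placeholder (the jar has to be copied there first):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;

    public class AddGuavaToTaskClasspath {
      public static void addGuava(Configuration conf) throws IOException {
        // Placeholder path; same effect as listing the jar with -libjars:
        // it ends up on the classpath of every map and reduce task.
        DistributedCache.addFileToClassPath(new Path("/libs/guava.jar"), conf);
      }
    }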

Re: Can't achieve load distribution

2012-02-01 Thread Mark Kerzner
Thanks! Mark On Wed, Feb 1, 2012 at 7:44 PM, Anil Gupta anilgupt...@gmail.com wrote: Yes, if ur block size is 64mb. Btw, block size is configurable in Hadoop. Best Regards, Anil On Feb 1, 2012, at 5:06 PM, Mark Kerzner mark.kerz...@shmsoft.com wrote: Anil, do you mean one block of

Re: Reduce copy at 0.00 MB/s

2012-02-01 Thread hadoop hive
Hey, can anyone help me with this? I have increased the reduce slowstart to 0.25 but it still hangs after the copy. Tell me what else I can change to make it work fine. Regards, Vikas Srivastava On Wed, Jan 25, 2012 at 7:45 PM, praveenesh kumar praveen...@gmail.com wrote: Yeah, I am doing

Re: reduce no response

2012-02-01 Thread Idris Ali
Hi Jinyan, To quickly check if it has to do with resolving the IP address, can you check what the hostname of the system is? A quick hack would be to rename the host itself as localhost if you are using Cloudera's pseudo-cluster Hadoop for testing. Thanks, -Idris On Wed, Feb 1, 2012 at 4:24 PM,

RE: Reduce copy at 0.00 MB/s

2012-02-01 Thread sathyavageeswaran
Can somebody help me to unsubscribe? Even after unsubscribing I continue to get mails. -Original Message- From: hadoop hive [mailto:hadooph...@gmail.com] Sent: 02 February 2012 12:29 To: common-user@hadoop.apache.org Subject: Re: Reduce copy at 0.00 MB/s Hey, can anyone help me with