Streaming Job map/reduce not working with scripts on 1.0.3

2013-01-04 Thread Ben Kim
Hi! I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts, like this: bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input /input -output /output/015 -mapper streaming-map.sh -reducer streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file

Re: Streaming Job map/reduce not working with scripts on 1.0.3

2013-01-04 Thread Ben Kim
Never mind, the problem has been fixed. The problem was a trailing {control-v}{control-m} (carriage return) character on the first line, #!/bin/bash (for which I blame my teammate for writing the script in Windows Notepad!!) On Fri, Jan 4, 2013 at 8:09 PM, Ben Kim benkimkim...@gmail.com wrote: Hi ! I'm
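For anyone hitting the same symptom: a script saved with Windows line endings carries a trailing carriage return (^M) on every line, including the shebang, so the kernel looks for an interpreter literally named "/bin/bash\r". A minimal sketch of detecting and stripping the CRs (the file names are illustrative):

```shell
# Simulate a script saved with Windows (CRLF) line endings.
printf '#!/bin/bash\r\necho hello\r\n' > streaming-map.sh

# Show the hidden carriage returns: cat -v renders them as ^M.
cat -v streaming-map.sh

# Strip every CR; dos2unix does the same job where it is installed.
tr -d '\r' < streaming-map.sh > streaming-map.unix.sh
cat -v streaming-map.unix.sh
```

After the fix, the shebang line ends cleanly and the streaming job can exec the script.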

Re: DistCp code

2013-01-04 Thread Dave Beech
org.apache.hadoop.tools.DistCp Cheers, Dave On 4 January 2013 11:47, Kasi Subrahmanyam kasisubbu...@gmail.com wrote: Hi Guys, The FsShell class has the code that runs behind the normal operations that we perform on HDFS. Which class has the code for DistCp?

Re: Lost tasktracker errors

2013-01-04 Thread Robert Evans
This really should be on the user list so I am moving it over there. It is probably something about the OS that is killing it. The only thing that I know of on stock Linux that would do this is the Out of Memory Killer. Do you have swap enabled on these boxes? You should check the OOM killer
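To check whether the OOM killer is responsible, the kernel ring buffer and the swap status are the places to look. A sketch (exact log locations vary by distro):

```shell
# OOM kills are logged to the kernel ring buffer with a standard
# "Out of memory: Kill process ..." message.
dmesg | grep -i 'out of memory' || echo 'no OOM kills in the ring buffer'

# Check whether any swap is configured; an empty table means none.
swapon -s || true
free -m   # the Swap: row shows swap totals
```

If the TaskTracker's PID shows up in an OOM line, lowering the per-task heap sizes or adding swap is the usual remedy.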

Can I change the IP/hostname of the dfs name without data loss?

2013-01-04 Thread Jianhui Zhang
Hadoop version: hadoop-0.20.205.0 My NN machine has 2 IP addresses and 2 hostnames assigned to them respectively. I have configured fs.default.name using one of the hostnames and used the cluster for a while. Now, I may have to move the fs.default.name to the other IP/hostname of the same

Re: more reduce tasks

2013-01-04 Thread Pavel Hančar
Hello, thank you for the answer. Exactly: I want the parallelism but a single final output. What do you mean by another stage? I thought I should set mapred.reduce.tasks large enough and Hadoop would run the reducers in however many rounds was optimal. But it isn't the case. When I tried to

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Glen Mazza
Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha. Glen On 11/30/2012 03:40 PM, Cristian Cira wrote: Dear Glen, try http://blog.cloudera.com/blog/2011/11/building-and-deploying-mr2/ Cristian Cira Graduate Research Assistant Parallel Architecture and System Laboratory(PASL)

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Chen He
0.24 is the development version; 2.0.2 is the release version. 0.23 and above are released as 2.0.x. On Fri, Jan 4, 2013 at 5:45 AM, Glen Mazza gma...@talend.com wrote: Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha. Glen On 11/30/2012 03:40 PM, Cristian Cira wrote:

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Glen Mazza
Thanks for your response. That's pretty vital information--I'm not used to separate development/publication versions. Is the 0.23--2.0.2 renumbering stated anywhere on the Hadoop website or wiki? It's very confusing as people otherwise think there are three separate branches -- 0.2x.y, 1.x,

Possible to run an application jar as a hadoop daemon?

2013-01-04 Thread Krishna Rao
Hi all, I have a java application jar that converts some files and writes directly into HDFS. If I want to run the jar I need to run it using hadoop jar <application jar>, so that it can access HDFS (that is, running java -jar <application jar> results in an HDFS error). Is it possible to run a jar

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Chen He
Hi Glen, I agree with you. There are many versions and it is confusing. If you really want to know, you can check the development and release documents. 0.23's release notes announce which patches are included, as do 2.0.x's. Then you will understand exactly which one is which

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Glen Mazza
OK, looking at the Hadoop branches: http://svn.apache.org/viewvc/hadoop/common/branches/ and tags: http://svn.apache.org/viewvc/hadoop/common/tags/, it's thankfully not that bad. There is no 0.24 in Hadoop, and the Maven pom files for the 2.0.2-alpha branch indeed say 2.0.2-alpha. The

Hello and request some advice.

2013-01-04 Thread Cristian Carranza
Hi all on this list! My name is Cristián Carranza, a statistician and quality consultant who, for the second time, intends to learn Hadoop and Big Data related topics. I'm requesting advice in order to plan my learning. I read the page “Products that include Apache Hadoop or derivative works

Re: Hello and request some advice.

2013-01-04 Thread Nitin Pawar
- Is Ubuntu a good O.S. for running Hadoop? I've tried to learn in the past using Red Hat InfoSphere BigInsights, but I need a free O.S. If you want a free O.S., Ubuntu is good, but if you are familiar with RedHat then you may want to have a look at Scientific Linux (it's free as well). - Is there a

Re: Hello and request some advice.

2013-01-04 Thread Gangadhar Ramini
Hi Nitin, I tried the latest stable Hadoop version on Windows with Cygwin, and I see the following error in the JobTracker logs. Do you have any advice? C:\cygwin\home\garamini\hadoop-1.0.4\logs\history to 0755^M at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)^M at

Re: Hello and request some advice.

2013-01-04 Thread Nitin Pawar
Does your user have permission to read/write the dfs directories you made? Try changing the directory ownership to the user that is running Hadoop. On Fri, Jan 4, 2013 at 11:20 PM, Gangadhar Ramini use.had...@gmail.com wrote: Hi Nitin, I tried latest stable Hadoop version on windows
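The concrete check being suggested, sketched against a local stand-in directory (the real paths are whatever the JobTracker error names; on Cygwin this runs in the Cygwin shell):

```shell
# Create a stand-in for the directory Hadoop complains about and give it
# the 0755 mode the JobTracker tries to set.
DIR=$(mktemp -d)/history
mkdir -p "$DIR"
chmod 755 "$DIR"

# Verify mode and owner; the owner should be the user running the daemons.
ls -ld "$DIR"
stat -c '%a %U' "$DIR"
```

If chmod on the real directory does not stick (as often happens on Cygwin-mounted NTFS), the problem is the filesystem's permission mapping rather than ownership.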

Re: Hello and request some advice.

2013-01-04 Thread Gangadhar Ramini
Yes, the user owns the directory and has the right permissions; still I don't understand what the issue could be. ls -ltr ~/hadoop-1.0.4/logs/history total 0 drwxr-xr-x+ 1 garamini mkgroup 0 Jan 2 22:15 Thanks -Gangadhar On Fri, Jan 4, 2013 at 9:55 AM, Nitin Pawar nitinpawar...@gmail.com wrote:

RE: Hello and request some advice.

2013-01-04 Thread John Lilley
If you like RedHat, consider Centos also; it is a nearly-complete clone of the RHEL distro. John From: Nitin Pawar [mailto:nitinpawar...@gmail.com] Sent: Friday, January 04, 2013 10:46 AM To: user@hadoop.apache.org Subject: Re: Hello and request some advice. - Is Ubuntu a good O.S. for running

Re: Hello and request some advice.

2013-01-04 Thread Gangadhar Ramini
Following is the configuration I put in the config files. core-site.xml: <property> <name>hadoop.tmp.dir</name> <value>/usr/local/hadoop/datastore/hadoop-${user.name}</value> </property> hdfs-site.xml: <property> <name>dfs.name.dir</name> <value>C:/cygwin/dfs/logs</value> </property> <property>

RE: Hello and request some advice.

2013-01-04 Thread Rajeev Yadav
Hi John, which would be the better option between Linux and Windows from a learning perspective for Hadoop? --- On Fri, 4/1/13, John Lilley john.lil...@redpoint.net wrote: From: John Lilley john.lil...@redpoint.net Subject: RE: Hello and request some advice. To: user@hadoop.apache.org

RE: Hello and request some advice.

2013-01-04 Thread John Lilley
I personally find Windows easier to use; however, it is not a supported Hadoop production environment, and I *think* you have to use Cygwin under Windows even for development. Given that, if you want to use a Windows machine and performance is not a consideration, you could spin up a VirtualBox

sporadic failure

2013-01-04 Thread Stan Rosenberg
Hi, Any ideas why a staging directory would suddenly become unavailable after the completion of the map phase but before the start of the reduce phase? We noticed a sporadic failure yesterday wherein all the map tasks completed successfully and all the reduce tasks failed. Upon examining task

Re: Hello and request some advice.

2013-01-04 Thread Glen Mazza
I would say Linux, because in your job you're most likely going to use a *nix-type system instead of Windows for hosting Hadoop, so it's good to gain experience with whatever headaches come along. Further, you're also learning Linux simultaneously, killing two birds with one stone. Glen On

Gridmix version 1.0.4 Error

2013-01-04 Thread Sean Barry
Hi, I am trying to use Gridmix, but I keep getting the error shown below. Does anyone have any suggestions? Thanks in advance. Sean Barry hostname:gridmix seanbarry$ pwd /usr/local/hadoop-1.0.4/contrib/gridmix hostname:gridmix seanbarry$ java -cp

Re: Possible to run an application jar as a hadoop daemon?

2013-01-04 Thread Robert Molina
Hi Krishna, Do you simply want to schedule the job to run at specific times? If so, I believe Oozie may be what you are looking for. Regards, Robert On Fri, Jan 4, 2013 at 6:40 AM, Krishna Rao krishnanj...@gmail.com wrote: Hi al, I have a java application jar that converts some files and

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Robert Evans
It is very long and confusing. Here is my understanding of what happened, even though I was not around for all of it. 0.1 through 0.20 was mostly mainline development. At that point there was a split and 0.20 was forked to add security, 0.20-security, and also to add in append support for HBase

Re: Instructions on how to run Apache Hadoop 2.0.2-alpha?

2013-01-04 Thread Glen Mazza
Wow, thanks for the explanation, very helpful. Glen On 01/04/2013 06:28 PM, Robert Evans wrote: It is very long and confusing. Here is my understanding of what happened even though I was not around for all of it. 0.1 through 0.20 was mostly mainline development. At that point there was a split

Re: sporadic failure

2013-01-04 Thread Harsh J
Hi Stan, I'd check the NN audit logs for the file /user/apache/.staging/job_201211150255_237458/job.xml to see when/who deleted it; perhaps that would give more insight. On Sat, Jan 5, 2013 at 2:32 AM, Stan Rosenberg stan.rosenb...@gmail.com wrote: Hi, Any ideas why a staging directory
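A sketch of the grep being described, assuming the standard HDFS audit-log format (one line per operation, with cmd= and src= fields; the log file path is hypothetical and depends on the log4j configuration):

```shell
# Hypothetical audit log location; adjust to your log4j.properties.
AUDIT_LOG=/var/log/hadoop/hdfs-audit.log

# Find who deleted anything under the job's staging directory.
grep 'cmd=delete' "$AUDIT_LOG" \
  | grep 'job_201211150255_237458' \
  || echo 'no matching delete entries'
```

Each matching line carries a ugi= (user) and ip= field, which identifies the client that issued the delete.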

Re: Possible to run an application jar as a hadoop daemon?

2013-01-04 Thread Harsh J
Hi, On Fri, Jan 4, 2013 at 8:10 PM, Krishna Rao krishnanj...@gmail.com wrote: If I want to run the jar I need to run it using hadoop jar <application jar>, so that it can access HDFS (that is, running java -jar <application jar> results in an HDFS error). The latter is because running a Hadoop

Re: more reduce tasks

2013-01-04 Thread Harsh J
What do you mean by a final reduce? Not all jobs require that the final output be singular, since the reduce phase is designed to work on a per-partition basis (which is also why the files are named part-*). One job consists of only one reduce phase, wherein the reducers all work independently and
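To connect this back to the original question: the standard way to get a single final file without throttling parallelism is to run with many reducers and merge the part-* files afterwards. A sketch (the HDFS paths are illustrative):

```shell
# On a real cluster, one command merges the per-reducer outputs:
#   hadoop fs -getmerge /output/015 merged.txt
#
# getmerge concatenates the part files in name order; the local
# equivalent demonstrates the effect:
OUT=$(mktemp -d)
printf 'apple\t3\n' > "$OUT/part-00000"
printf 'pear\t5\n'  > "$OUT/part-00001"
cat "$OUT"/part-* > "$OUT/merged.txt"
cat "$OUT/merged.txt"
```

Because the parts are concatenated in name order, a total order across the merged file requires a partitioner that orders keys across reducers (e.g. TotalOrderPartitioner) or a small second job with a single reducer.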