Hi!
I'm using hadoop-1.0.3 to run streaming jobs with map/reduce shell scripts
such as this:
bin/hadoop jar ./contrib/streaming/hadoop-streaming-1.0.3.jar -input /input
-output /output/015 -mapper streaming-map.sh -reducer
streaming-reduce.sh -file /home/hadoop/streaming/streaming-map.sh -file
Never mind, the problem has been fixed.
The problem was a trailing {control-v}{control-m} character on the first
line, #!/bin/bash.
(I blame my teammate, who wrote the script in Windows Notepad!)
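For anyone hitting the same thing: the stray carriage returns are easy to reproduce and strip. A minimal sketch (the file name matches the script above; `dos2unix`, if installed, does the same job):

```shell
# Reproduce the symptom: a script saved with Windows (CRLF) line endings.
printf '#!/bin/bash\r\necho hello\r\n' > streaming-map.sh

# The invisible \r makes the kernel look for the interpreter "/bin/bash\r",
# which does not exist, so the streaming task fails to launch.
# Strip the carriage returns in place:
sed -i 's/\r$//' streaming-map.sh

# Confirm no CR bytes remain:
if ! grep -q $'\r' streaming-map.sh; then echo "clean"; fi
```

Streaming launches the mapper via the shebang line, so a single invisible \r there is enough to make every map task fail.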
On Fri, Jan 4, 2013 at 8:09 PM, Ben Kim benkimkim...@gmail.com wrote:
Hi!
I'm
org.apache.hadoop.tools.DistCp
Cheers,
Dave
On 4 January 2013 11:47, Kasi Subrahmanyam kasisubbu...@gmail.com wrote:
Hi Guys,
The FsShell class has the code that runs behind for normal operations that
we perform on HDFS.
Which class has the code for distcp?
This really should be on the user list so I am moving it over there.
It is probably something about the OS that is killing it. The only thing
that I know of on stock Linux that would do this is the Out of Memory
Killer. Do you have swap enabled on these boxes? You should check the
OOM killer
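A quick way to check both things on a stock Linux box (a sketch; reading the kernel log may require root on some systems):

```shell
# SwapTotal of 0 kB means no swap is configured, which makes OOM kills likelier.
grep SwapTotal /proc/meminfo

# The OOM killer writes to the kernel ring buffer when it fires:
dmesg 2>/dev/null | grep -i 'out of memory' || echo "no OOM events found"
```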
Hadoop version: hadoop-0.20.205.0
My NN machine has 2 IP addresses and 2 hostnames assigned to them
respectively. I have configured fs.default.name using one of the
hostnames and used the cluster for a while.
Now, I may have to move the fs.default.name to the other IP/hostname of
the same
Hello,
thank you for the answer. Exactly: I want the parallelism but a single
final output. What do you mean by another stage? I thought I should
set mapred.reduce.tasks large enough and Hadoop would run the reducers in
however many rounds would be optimal. But that isn't the case.
When I tried to
When I tried to
Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha.
Glen
On 11/30/2012 03:40 PM, Cristian Cira wrote:
Dear Glen,
try http://blog.cloudera.com/blog/2011/11/building-and-deploying-mr2/
Cristian Cira
Graduate Research Assistant
Parallel Architecture and System Laboratory(PASL)
0.24 is the development version; 2.0.2 is the release version.
0.23 and above are released as 2.0.x.
On Fri, Jan 4, 2013 at 5:45 AM, Glen Mazza gma...@talend.com wrote:
Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha.
Glen
On 11/30/2012 03:40 PM, Cristian Cira wrote:
Thanks for your response. That's pretty vital information--I'm not used
to separate development/release versions. Is the 0.23-to-2.0.2
renumbering stated anywhere on the Hadoop website or wiki? It's very
confusing, as people otherwise think there are three separate branches --
0.2x.y, 1.x,
Hi all,
I have a Java application jar that converts some files and writes directly
into HDFS.
If I want to run the jar I need to run it using "hadoop jar application.jar",
so that it can access HDFS (that is, running "java -jar application.jar"
results in an HDFS error).
Is it possible to run a jar
Hi Glen
I agree with you. There are many versions, and it is confusing. If you really
want to know, you can check the development and release documents. The 0.23
documents announce which patches are included, as do the 2.0.x documents.
Then you will understand exactly which one is which.
OK, looking at the Hadoop branches:
http://svn.apache.org/viewvc/hadoop/common/branches/ and tags:
http://svn.apache.org/viewvc/hadoop/common/tags/, it's thankfully not
that bad.
There is no 0.24 in Hadoop, and the Maven pom files for the 2.0.2-alpha
branch indeed say 2.0.2-alpha. The
Hi all in this list!
My name is Cristián Carranza, a statistician and quality consultant who, for
the second time, intends to learn Hadoop and Big Data related topics.
I am requesting advice in order to plan my learning.
I read the page “ Products that include Apache Hadoop or derivative works
- Is Ubuntu a good O.S. for running Hadoop? I've tried to learn in the
past using Red Hat InfoSphere BigInsights, but I need a free O.S.
If you want a free O.S., Ubuntu is good, but if you are familiar with RedHat
then you may want to have a look at Scientific Linux (it's free as well).
- Is there a
Hi Nitin,
I tried the latest stable Hadoop version on Windows with Cygwin, and I see the
following error in the JobTracker logs. Do you have any advice?
C:\cygwin\home\garamini\hadoop-1.0.4\logs\history to 0755^M
at
org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)^M
at
Does your user have permissions to read/write on the dfs directories you
made?
Try changing the directory ownership to the user that is running Hadoop.
On Fri, Jan 4, 2013 at 11:20 PM, Gangadhar Ramini use.had...@gmail.com wrote:
Hi Nitin,
I tried latest stable Hadoop version on windows
Yes, the user owns the directory and has the right permissions; I still don't
understand what the issue could be.
ls -ltr ~/hadoop-1.0.4/logs/history
total 0
drwxr-xr-x+ 1 garamini mkgroup 0 Jan 2 22:15
Thanks
-Gangadhar
On Fri, Jan 4, 2013 at 9:55 AM, Nitin Pawar nitinpawar...@gmail.com wrote:
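One thing worth ruling out here is the effective POSIX mode on that directory: the trailing "+" in the ls output means extra Windows ACL entries are present, which can make the mode Hadoop reads back differ from what chmod set. A minimal check (the directory name is illustrative):

```shell
# Recreate a history directory and set the 0755 mode Hadoop expects:
mkdir -p history
chmod 755 history

# Print the octal mode a permission check would read back:
stat -c '%a' history   # prints 755 on a POSIX filesystem
```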
If you like RedHat, consider CentOS as well; it is a nearly complete clone of
the RHEL distro.
John
From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Friday, January 04, 2013 10:46 AM
To: user@hadoop.apache.org
Subject: Re: Hello and request some advice.
- Is Ubuntu a good O.S. for running
Following is the configuration I put in the config files.
core-site.xml:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/datastore/hadoop-${user.name}</value>
</property>

hdfs-site.xml:
<property>
  <name>dfs.name.dir</name>
  <value>C:/cygwin/dfs/logs</value>
</property>
<property>
Hi John, which would be the better option between Linux and Windows from the
perspective of learning Hadoop?
--- On Fri, 4/1/13, John Lilley john.lil...@redpoint.net wrote:
From: John Lilley john.lil...@redpoint.net
Subject: RE: Hello and request some advice.
To: user@hadoop.apache.org
I personally find Windows easier to use, however it is not a supported Hadoop
production environment, and I *think* you have to use Cygwin under Windows even
for development.
Given that, if you want to use a Windows machine and performance is not a
consideration, you could spin up a VirtualBox
Hi,
Any ideas why a staging directory would suddenly become unavailable
after the completion of the map phase but before the start of the
reduce phase? We noticed a sporadic failure yesterday wherein all the
map tasks completed
successfully and all the reduce tasks failed. Upon examining task
I would say Linux, because in your job you're most likely going to use a
*nix-type system instead of Windows for hosting Hadoop, so it's good to
gain experience with whatever headaches come along. Further, you're
also learning Linux simultaneously, killing two birds with one stone.
Glen
On
Hi,
I am trying to use grid mix but I keep getting the error that is shown below.
Does anyone have any suggestions?
Thanks in advance.
Sean Barry
hostname:gridmix seanbarry$ pwd
/usr/local/hadoop-1.0.4/contrib/gridmix
hostname:gridmix seanbarry$ java -cp
Hi Krishna,
Do you simply want to schedule the job to run at specific times? If so, I
believe Oozie may be what you are looking for.
Regards,
Robert
On Fri, Jan 4, 2013 at 6:40 AM, Krishna Rao krishnanj...@gmail.com wrote:
Hi al,
I have a java application jar that converts some files and
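To illustrate the Oozie suggestion above: time-based scheduling is driven by a coordinator definition. This is only a sketch; the app name, start/end times, and HDFS workflow path are placeholders, and the exact schema version depends on your Oozie install:

```xml
<coordinator-app name="convert-files" frequency="${coord:days(1)}"
                 start="2013-01-05T00:00Z" end="2014-01-05T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <action>
    <workflow>
      <!-- placeholder HDFS path to the workflow that runs "hadoop jar" -->
      <app-path>hdfs://namenode:8020/user/hadoop/apps/convert-workflow</app-path>
    </workflow>
  </action>
</coordinator-app>
```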
It is very long and confusing. Here is my understanding of what happened even
though I was not around for all of it.
0.1 through 0.20 was mostly mainline development. At that point there was a
split: 0.20 was forked to add security (0.20-security) and also to add append
support for HBase
Wow. Thanks for the explanation, very helpful.
Glen
On 01/04/2013 06:28 PM, Robert Evans wrote:
It is very long and confusing. Here is my understanding of what
happened even though I was not around for all of it.
0.1 - 0.20 was mostly main line development. At that point there was
a split
Hi Stan,
I'd check the NN audit logs for the file /user/apache/.staging/
job_201211150255_237458/job.xml to see when/who deleted it away, perhaps
that would give more insight.
On Sat, Jan 5, 2013 at 2:32 AM, Stan Rosenberg stan.rosenb...@gmail.com wrote:
Hi,
Any ideas why a staging directory
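To make the audit-log suggestion concrete: NN audit entries are plain log lines, so a grep for the delete is enough. A sketch with a fabricated sample line (the real file is typically hdfs-audit.log under the NN's log directory, and the field layout can vary by version):

```shell
# A sample audit line in the common FSNamesystem.audit format (fabricated):
cat > hdfs-audit.log <<'EOF'
2013-01-04 21:12:03,118 INFO FSNamesystem.audit: ugi=apache ip=/10.0.0.5 cmd=delete src=/user/apache/.staging/job_201211150255_237458/job.xml dst=null perm=null
EOF

# Who/when: filter delete operations touching the missing job.xml.
grep 'cmd=delete' hdfs-audit.log | grep 'job_201211150255_237458'
```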
Hi,
On Fri, Jan 4, 2013 at 8:10 PM, Krishna Rao krishnanj...@gmail.com wrote:
If I want to run the jar I need to run it using "hadoop jar application.jar",
so that it can access HDFS (that is, running "java -jar application.jar"
results in an HDFS error).
The latter is because running a Hadoop
What do you mean by a final reduce? Not all jobs require that the
final output result be singular, since the reducer phase is provided
to work on a per-partition basis (also why the files are named
part-*). One job consists of only one reduce phase, wherein the
reducers all work independently and
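If a single final file really is what's wanted, the usual options are forcing one reducer (-D mapred.reduce.tasks=1) or merging the part-* files after the job with hadoop fs -getmerge. The merge is just an ordered concatenation, simulated locally here (file contents are made up):

```shell
# Local simulation of `hadoop fs -getmerge /output merged.txt`:
# concatenate the per-reducer part-* files, in name order, into one file.
mkdir -p output
printf 'apple\t3\n'  > output/part-00000
printf 'banana\t5\n' > output/part-00001
cat output/part-* > merged.txt
cat merged.txt
```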