What do you mean by a "final reduce"? Not all jobs require that the
final output be singular, since the reduce phase works on a
per-partition basis (which is also why the output files are named
part-*). One job consists of only one reduce phase, in which the
reducers all work independently and
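If a single output file is nonetheless wanted, the per-reducer part files can simply be concatenated after the job finishes. A sketch (the directory contents below are simulated for illustration; against HDFS, `hadoop fs -getmerge <dir> <file>` does the same thing):

```shell
# Simulate a job output directory: each reducer writes its own
# part file (names and contents here are made up for illustration).
mkdir -p output
printf 'apple\t3\n'  > output/part-r-00000
printf 'banana\t5\n' > output/part-r-00001
# Merge into one file; the HDFS equivalent is:
#   hadoop fs -getmerge output merged.txt
cat output/part-* > merged.txt
cat merged.txt
```

This keeps the reduce-side parallelism while still producing one file at the end.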
Hi,
On Fri, Jan 4, 2013 at 8:10 PM, Krishna Rao wrote:
> If I want to run the jar I need to run it using "hadoop jar <jar>", so that it can access HDFS (that is, running "java -jar <jar>" results in an HDFS error).
The latter is because running a Hadoop program requires Hadoop
dependencies and con
Hi Sean,
Two questions: Why are you running this in local mode? Placing a cluster's
config directory on your java -cp will make it go distributed. And, does
that reported output directory really exist? If so, you may want to delete
it before you run GridMix.
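For reference, in Hadoop 1.x the local-versus-distributed choice comes down to a couple of properties read from whatever config directory is on the classpath. A sketch of a distributed setup (host names and ports here are hypothetical):

```xml
<!-- core-site.xml: file:/// means the local filesystem;
     an hdfs:// URI points the client at a real NameNode -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- mapred-site.xml: "local" runs MapReduce in-process;
     host:port submits jobs to a real JobTracker -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```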
On Sat, Jan 5, 2013 at 3:55 AM, Sean
Hi Stan,
I'd check the NN audit logs for the file /user/apache/.staging/
job_201211150255_237458/job.xml to see when and by whom it was deleted;
perhaps that would give more insight.
On Sat, Jan 5, 2013 at 2:32 AM, Stan Rosenberg wrote:
> Hi,
>
> Any ideas why a staging directory would suddenly become
Wow. Thanks for the explanation, very helpful.
Glen
On 01/04/2013 06:28 PM, Robert Evans wrote:
It is very long and confusing. Here is my understanding of what
happened even though I was not around for all of it.
0.1 - 0.20 was mostly main line development. At that point there was
a split
It is very long and confusing. Here is my understanding of what happened even
though I was not around for all of it.
0.1 - 0.20 was mostly main line development. At that point there was a split
and 0.20 was forked to add security, 0.20-security, and also to add in append
support for HBase 0.2
Hi Krishna,
Do you simply want to schedule the job to run at specific times? If so, I
believe Oozie may be what you are looking for.
Regards,
Robert
On Fri, Jan 4, 2013 at 6:40 AM, Krishna Rao wrote:
> Hi all,
>
> I have a java application jar that converts some files and writes directly
> into
Hi,
I am trying to use GridMix but I keep getting the error shown below.
Does anyone have any suggestions?
Thanks in advance.
Sean Barry
hostname:gridmix seanbarry$ pwd
/usr/local/hadoop-1.0.4/contrib/gridmix
hostname:gridmix seanbarry$ java -cp
/usr/local/hadoop-1.0.4/contrib/grid
I would say Linux, because in your job you're most likely going to use a
*nix-type system instead of Windows for hosting Hadoop, so it's good to
gain experience with whatever headaches come along. Further, you're
also learning Linux simultaneously, killing two birds with one stone.
Glen
On 0
Hi,
Any ideas why a staging directory would suddenly become unavailable
after the completion of the map phase but before the start of the
reduce phase? We noticed a sporadic failure yesterday wherein all the
map tasks completed
successfully and all the reduce tasks failed. Upon examining task
tr
Uhm...
Well, you can talk to Microsoft and Hortonworks about Microsoft as a platform.
Depending on the power of your laptop, you could create a VM and run Hadoop in
pseudo-distributed mode there.
You could also get an Amazon Web Services account and build a small cluster via
EMR...
In ter
I personally find Windows easier to use, however it is not a supported Hadoop
production environment, and I *think* you have to use Cygwin under Windows even
for development.
Given that, if you want to use a Windows machine and performance is not a
consideration, you could spin up a VirtualBox V
Hi John, which would be a better option between Linux and Windows from the
perspective of learning Hadoop?
--- On Fri, 4/1/13, John Lilley wrote:
From: John Lilley
Subject: RE: Hello and request some advice.
To: "user@hadoop.apache.org"
Date: Friday, 4 January, 2013, 6:12 PM
If you like
Following is the configuration I put in the config files.
core-site.xml:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/datastore/hadoop-${user.name}</value>
</property>
hdfs-site.xml:
<property>
  <name>dfs.name.dir</name>
  <value>C:/cygwin/dfs/logs</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>C:/cygwin/dfs/data</value>
</property>
Thanks
-Gangadhar
On Fri, Jan 4, 2013 at 10:19 AM, N
If you like RedHat, consider Centos also; it is a nearly-complete clone of the
RHEL distro.
John
From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: Friday, January 04, 2013 10:46 AM
To: user@hadoop.apache.org
Subject: Re: Hello and request some advice.
- Is Ubuntu a good O.S. for running
Yes, the user owns the directory and has the right permissions; I still
don't understand what the issue could be.
ls -ltr ~/hadoop-1.0.4/logs/history
total 0
drwxr-xr-x+ 1 garamini mkgroup 0 Jan 2 22:15
Thanks
-Gangadhar
On Fri, Jan 4, 2013 at 9:55 AM, Nitin Pawar wrote:
> Does your user have
Does your user have permissions to read/write on the DFS directories you
made?
Try changing the directory ownership to the user that is running Hadoop.
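The suggested fix, sketched on a scratch directory so it can run anywhere (the real path in this thread is ~/hadoop-1.0.4/logs/history, and chown would need appropriate privileges):

```shell
# On the real system this would be, as a suitably privileged user:
#   chown -R garamini ~/hadoop-1.0.4/logs
#   chmod -R 755 ~/hadoop-1.0.4/logs/history
# Demonstrated on a throwaway directory:
mkdir -p /tmp/hadoop-logs-demo/history
chmod 755 /tmp/hadoop-logs-demo/history
stat -c '%a %U %n' /tmp/hadoop-logs-demo/history
```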
On Fri, Jan 4, 2013 at 11:20 PM, Gangadhar Ramini wrote:
> Hi Nitin,
>
>I tried latest stable Hadoop version on windows with cygwin, I se
Hi Nitin,
I tried the latest stable Hadoop version on Windows with Cygwin, and I
see the following error in the JobTracker logs. Do you have any advice?
C:\cygwin\home\garamini\hadoop-1.0.4\logs\history to 0755
at
org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.ap
For the basics, all you need is a Java IDE. Hadoop MapReduce can run in
local filesystem mode without any kind of HDFS backing.
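For what it's worth, "local filesystem mode" is just the Hadoop 1.x defaults; with no overrides in core-site.xml or mapred-site.xml, MapReduce runs in a single JVM against the local filesystem:

```xml
<!-- Hadoop 1.x defaults (shown for reference; normally you simply
     leave these unset to get local mode) -->
<property>
  <name>fs.default.name</name>
  <value>file:///</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
</property>
```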
On Fri, Jan 4, 2013 at 12:45 PM, Nitin Pawar wrote:
> - Is Ubuntu a good O.S. for running Hadoop? I’ve tried to learn in the
> past using Red Hat & Infosphere Bigin
- Is Ubuntu a good O.S. for running Hadoop? I’ve tried to learn in the
past using Red Hat & InfoSphere BigInsights, but I need a free O.S.
If you want a free O.S., Ubuntu is good, but if you are familiar with Red Hat
then you may want to have a look at Scientific Linux (it's free as well).
- Is there a
Hi all in this list!
My name is Cristián Carranza, a statistician and quality consultant who, for
the second time, intends to learn Hadoop and Big Data related topics.
I am requesting advice in order to plan my learning.
I read the page “ Products that include Apache Hadoop or derivative works a
OK, looking at the Hadoop branches:
http://svn.apache.org/viewvc/hadoop/common/branches/ and tags:
http://svn.apache.org/viewvc/hadoop/common/tags/, it's thankfully not
that bad.
There is no 0.24 in Hadoop, and the Maven pom files for the 2.0.2-alpha
branch indeed say 2.0.2-alpha. The Cloudera
Hi Glen
I agree with you. There are many versions, and it is confusing. If you really
want to know, you can check the development and release documents. The 0.23
release notes state how many patches are included, as do those for 2.0.x.
Then you will understand exactly which one is which.
John, of the two programs below, one is from the Definitive Guide chapter 4 with
slight mods and the other is in-house but similar to Hadoop in Action chapter 3.
package sequencefileprocessor;
// cc SequenceFileReadDemo Reading a SequenceFile
import java.io.IOException;
import java.net.URI;
import or
Hi Bejoy,
ah yes that is exactly the mistake I was making, I had
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
instead of
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
On Fri, Jan 4, 2013 at 4:04 PM, wrote:
> **
> Hi Peter
>
> Did you ensure that using
Hi Peter
Did you ensure that you are using SequenceFileOutputFormat from the right
package? Based on the API you are using, mapred or mapreduce, you need to use
the OutputFormat from the corresponding package.
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-Original Message-
Fr
Hi all,
I have a java application jar that converts some files and writes directly
into hdfs.
If I want to run the jar I need to run it using "hadoop jar <jar>", so that it can access HDFS (that is, running "java -jar <jar>" results in an HDFS error).
Is it possible to run a jar as a Hadoop daemon?
Cheers,
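One common approach, not Hadoop-specific, is to launch the "hadoop jar" invocation in the background under nohup. A sketch (the jar and class names are hypothetical; a stand-in command is used here so the snippet runs anywhere):

```shell
# With Hadoop installed this would be something like:
#   nohup hadoop jar converter.jar com.example.Converter \
#       > converter.log 2>&1 &
# Demonstrated with a stand-in command:
nohup sh -c 'echo converting files' > converter.log 2>&1 &
echo $! > converter.pid    # save the PID so the daemon can be stopped later
wait                       # a real daemon script would not wait
cat converter.log
```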
Thanks for your response. That's pretty vital information--I'm not used
to separate development/publication versions. Is the 0.23-->2.0.2
renumbering stated anywhere on the Hadoop website or wiki? It's very
confusing as people otherwise think there are three separate branches --
0.2x.y, 1.x,
0.24 is the development version; 2.0.2 is the released version.
0.23 or above is published as 2.0.x.
On Fri, Jan 4, 2013 at 5:45 AM, Glen Mazza wrote:
> Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha.
>
> Glen
>
>
> On 11/30/2012 03:40 PM, Cristian Cira wrote:
>
>> Dear Glen
Actually, those instructions are for Hadoop 0.24, not 2.0.2-alpha.
Glen
On 11/30/2012 03:40 PM, Cristian Cira wrote:
Dear Glen,
try http://blog.cloudera.com/blog/2011/11/building-and-deploying-mr2/
Cristian Cira
Graduate Research Assistant
Parallel Architecture and System Laboratory(PASL)
She
Hello,
thank you for the answer. Exactly: I want the parallelism but a single
final output. What do you mean by "another stage"? I thought I should
set mapred.reduce.tasks large enough and Hadoop would run the reducers in
however many rounds would be optimal. But that isn't the case.
When I tried to r