Thanks Brian. Killing an orphan process on one of the nodes resolved the
issue (I did not capture a stack trace).
Zeev
On Wed, Sep 16, 2009 at 5:47 AM, Brian Bockelman wrote:
> Hey Zeev,
>
> This is caused by a misbehaving client stuck in an infinite loop. When you
> restart the NN, the client
Hi all,
RapLeaf is hosting a Cascading meetup on September 24th. More details
at:
http://blog.rapleaf.com/dev/?p=196
and
http://upcoming.yahoo.com/event/4421260
Hope to see you there!
chris
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
That would be my problem. Thanks. I was passing it Java 6.
-Original Message-
From: Matt Massie [mailto:m...@cloudera.com]
Sent: Monday, September 21, 2009 2:23 PM
To: common-user@hadoop.apache.org
Cc: core-u...@hadoop.apache.org
Subject: Re: forrest configuration
What are you passing in for java5.home?
Forrest requires Java 5 to validate the sitemap. It will fail if you try to
use Java 6.
-Matt
On Mon, Sep 21, 2009 at 1:20 PM, Andy Sautins
wrote:
>
> I can't seem to find documentation on how to configure forrest to build.
> When trying to build hdfs
I can't seem to find documentation on how to configure forrest to build.
When trying to build hdfs/site/build.xml I get the following error:
[exec] validate-sitemap:
[exec]
/home/user/apache-forrest-0.8/main/webapp/resources/schema/relaxng/sitemap-v06.rng:72:31:
error: datatype li
Don't forget that the records are sorted going into the reducer. This is
often overlooked by new users who are just using pipes on the command line
to test their Perl mappers and reducers without sorted data.
For my Perl streaming applications, I perform all of my operations on my
values, like
Referring to the Hadoop 0.20.1 API.
SequenceFileAsBinaryOutputFormat requires a JobConf, but JobConf is deprecated.
Is there another OutputFormat I should be using?
Bill
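One option, offered only as a sketch and not something from this thread: the old
org.apache.hadoop.mapred API is deprecated in 0.20.1 but still ships and works, so
SequenceFileAsBinaryOutputFormat can still be driven through a JobConf. The class
name and paths below are placeholders of mine, and the mapper/reducer setup is omitted.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileAsBinaryOutputFormat;

public class BinaryOutputSketch {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(BinaryOutputSketch.class);
    conf.setJobName("binary-output-sketch");
    // SequenceFileAsBinaryOutputFormat writes keys and values as raw bytes
    conf.setOutputFormat(SequenceFileAsBinaryOutputFormat.class);
    conf.setOutputKeyClass(BytesWritable.class);
    conf.setOutputValueClass(BytesWritable.class);
    // conf.setMapperClass(...) / conf.setReducerClass(...) would go here
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}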
I think the default chunk size you are referring to is about 64 MB.
This was chosen to be roughly the size of a single sequential read off a disk.
I for one am a big Perl fan, but I am not happy about 64 MB of text
being read into a Perl hash. Hashes trade memory for speed.
So my verdict is to ret
Hi all,
I have a streaming job running on ~300 GB of ASCII data in 3 large
files, where the mapper and reducer are Perl scripts. The mapper does
trivial data cleanup, and the reducer builds a hash and then iterates over
this hash, writing output. The hash key is the first field in the data,
i.e. the same as the s
I didn't see anything about this in the archive, so perhaps I'm doing
something wrong, but I have run into a problem creating a job with the
.20 release without using the deprecated JobConf class.
The mapreduce.JobContext class is the replacement for the deprecated
mapred.JobContext, but it co
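For what it's worth, the usual pattern on the new API in 0.20 is to build the job
through org.apache.hadoop.mapreduce.Job rather than a JobConf. A sketch only; the
word-count mapper and reducer below are just the stock example, not anything from
this message.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // splits each line into tokens and emits (token, 1)
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // sums the counts for each token
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");   // new API: no JobConf involved
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}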
On 9/20/09 10:19 PM, "Jeff Zhang" wrote:
> But it's weird that it shows I cannot stop the cluster. Has anyone
> encountered this problem before?
>
> Any ideas? This is the message I get when I run the command bin/stop-all.sh
>
> no jobtracker to stop
All the time. For us, the result was the $USER e
Brian (and others):
Great info...thanks!
> I would suggest looking at Cloudera's blog posting about the "small
> files problem":
>
> http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/
Good link... many thanks.
> The simplest thing you could do is to use the Hadoop ARchive
On Mon, Sep 21, 2009 at 7:50 AM, Edward Capriolo wrote:
>
>
> >Storing the only copy of the NN data into NFS would make the NFS server an
> > SPOF, and you still need to solve the problems of
>
> @Steve correct. It is hair splitting but Stas asked if there was an
> approach that did not use DRBD.
Apologies, I should've checked the source first ... I see that keys have
to be WritableComparable, and hence I'll have to implement that
interface in my custom class.
Lajos
Lajos wrote:
Hi all,
I seem to have a problem using ArrayWritable (of Texts) as a key in my
MR jobs. I want my Mapper
On Mon, Sep 21, 2009 at 2:57 AM, Steve Loughran wrote:
> Jeff Zhang wrote:
>
>> My cluster has been running for several months.
>>
>
> Nice.
>
> Is this a bug in Hadoop? I think Hadoop is supposed to run for a long time.
>>
>
> I'm doing work in HDFS-326 on making it easier to start/stop the various
>
Hypertable (www.hypertable.org) is an open source C++ implementation of
Bigtable which runs on top of HDFS. Binary packages (RPM, debian, dmg) for
Hypertable are now available and can be downloaded here:
http://package.hypertable.org/
Updated documentation, with a "Getting Started" guide, can be
Hi Lajos,
ArrayWritable does not implement WritableComparable, so it can't currently
be used as a mapper output key - those keys have to be sorted during the
shuffle, and thus the type must be WritableComparable.
-Todd
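As an illustration (my own sketch, not part of Todd's reply): a small subclass can
carry the Text values and implement WritableComparable itself, which makes it legal
as a map output key. The class name and the element-wise ordering are arbitrary
choices of mine.

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;

public class TextArrayWritable extends ArrayWritable
    implements WritableComparable<TextArrayWritable> {

  public TextArrayWritable() {
    super(Text.class);               // no-arg constructor required by Hadoop
  }

  public TextArrayWritable(Text[] values) {
    super(Text.class, values);
  }

  public int compareTo(TextArrayWritable other) {
    Writable[] a = this.get();
    Writable[] b = other.get();
    // compare element by element; on a tie the shorter array sorts first
    for (int i = 0; i < Math.min(a.length, b.length); i++) {
      int cmp = ((Text) a[i]).compareTo((Text) b[i]);
      if (cmp != 0) {
        return cmp;
      }
    }
    return a.length - b.length;
  }

  public int hashCode() {
    // content-based hash so HashPartitioner sends equal keys to the same reducer
    int h = 1;
    for (Writable w : get()) {
      h = 31 * h + w.hashCode();
    }
    return h;
  }

  public boolean equals(Object o) {
    return o instanceof TextArrayWritable && compareTo((TextArrayWritable) o) == 0;
  }
}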
On Mon, Sep 21, 2009 at 8:53 AM, Lajos wrote:
> Hi all,
>
> I seem to have
Hi all,
I seem to have a problem using ArrayWritable (of Texts) as a key in my
MR jobs. I want my Mapper output key to be ArrayWritable, and both input
& output keys in my Reducer the same.
I've tried this with both mapred and mapreduce versions (I'm using
0.20.0 here).
I also tried extend
On Mon, Sep 21, 2009 at 6:03 AM, Steve Loughran wrote:
> Edward Capriolo wrote:
>
>>
>> Just for reference, Linux HA and some other tools deal with split-brain
>> decisions by requiring a quorum. A quorum involves having a
>> third party or having more than 50% of the nodes agree.
>>
>> An iss
Hi.
Just wanted to share my thoughts on this:
So far DRBD looks like a good enough solution. My only problem is that it
requires me to operate dedicated machines (physical or virtual) for the
Hadoop NameNode in an active/passive configuration.
I'm interested in HADOOP-4539 mostly because it woul
Edward Capriolo wrote:
Just for reference, Linux HA and some other tools deal with split-brain
decisions by requiring a quorum. A quorum involves having a
third party or having more than 50% of the nodes agree.
An issue with Linux-HA and Hadoop is that Linux-HA is only
supported/tested on
Jeff Zhang wrote:
My cluster has been running for several months.
Nice.
Is this a bug in Hadoop? I think Hadoop is supposed to run for a long time.
I'm doing work in HDFS-326 on making it easier to start/stop the various
Hadoop services; once the lifecycle stuff is in I'll worry more about
the r
It's not precisely a bug in anything - rather, it's a Hadoop default
configuration that is rather peculiar. The process ID (pid) is kept in
a pid file. The default location of that file is set in
hadoop-default.xml, and should be overridden in hadoop-site.xml. The
problem is that the default loca
It has nothing to do with Hadoop; it has to do with tmpwatch.
Kill the processes nicely and you won't lose any data.
Cheers,
Anthony
On Mon, Sep 21, 2009 at 1:39 AM, Chandraprakash Bhagtani
wrote:
> No, you won't lose data, as you are only killing the process, which you can
> restart later.
>
> On
No, you won't lose data, as you are only killing the process, which you can
restart later.
On Mon, Sep 21, 2009 at 12:15 PM, Jeff Zhang wrote:
> My cluster has been running for several months.
>
> Is this a bug in Hadoop? I think Hadoop is supposed to run for a long time.
>
> And will I lose data if I manu