Endless loop returning in map

2007-11-19 Thread jonathan doklovic
Hi, I have a map task that uses a regex to parse data. Sometimes the regex matcher throws an unavoidable StackOverflowError. I'm catching these errors and want to skip the map/reduce entirely when encountered. If I don't do anything in my catch, the job runs and shows a few failed tasks. If I p
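
A minimal sketch of the catch-and-skip idea described in this message, using only java.util.regex. The names SafeRegex and tryMatches are hypothetical and do not come from the thread or from Hadoop, and the pattern "(a|b)*" is only an assumed stand-in for a regex that recurses deeply on long input. Whether the large input actually overflows depends on the JVM and its stack size, so the sketch handles both outcomes.

```java
import java.util.Optional;
import java.util.regex.Pattern;

final class SafeRegex {
    // Hypothetical helper: returns the match result, or Optional.empty()
    // if the regex engine overflowed the stack on this input.
    static Optional<Boolean> tryMatches(Pattern pattern, CharSequence input) {
        try {
            return Optional.of(pattern.matcher(input).matches());
        } catch (StackOverflowError e) {
            // The stack has unwound by the time control reaches this handler,
            // so the caller can simply skip the record and carry on.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Assumed example pattern; alternation inside a repeated group
        // can recurse once per input character.
        Pattern pattern = Pattern.compile("(a|b)*");

        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 1_000_000; i++) {
            big.append(i % 2 == 0 ? 'a' : 'b');
        }

        for (CharSequence record : new CharSequence[] {"abba", big}) {
            Optional<Boolean> result = tryMatches(pattern, record);
            if (result.isPresent()) {
                System.out.println("length " + record.length() + ": matched=" + result.get());
            } else {
                System.out.println("length " + record.length() + ": skipped after StackOverflowError");
            }
        }
    }
}
```

Inside a mapper, the equivalent of "skipped" would presumably be to return from map() without emitting anything for that record, so the task itself is not marked as failed; that mapping onto Hadoop is an assumption, not something the truncated message confirms.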

performance test tips?

2007-11-16 Thread jonathan doklovic
Hi, We've finally got our hadoop cluster up, some data to crunch and a map/reduce job. After running a few configurations, I'm not sure about our performance and would like to get some advice. We have a 20 node EC2 cluster. We have 750MB of data. Currently our job seems to be doing 1%/min on

Re: ClassNotFound just started with custom mapper

2007-11-13 Thread jonathan doklovic
ens. > > - Aaron > > > jonathan doklovic wrote: > > Hi, > > > > I've been testing a map/reduce on a local hadoop cluster on my machine. > > It was working fine yesterday, and now it keeps throwing > > ClassNotFoundExceptions if I run it as a java app via ecli

ClassNotFound just started with custom mapper

2007-11-13 Thread jonathan doklovic
Hi, I've been testing a map/reduce on a local hadoop cluster on my machine. It was working fine yesterday, and now it keeps throwing ClassNotFoundExceptions if I run it as a java app via eclipse. If I choose "run on hadoop" it seems to run ok, but I don't see any messages in the console from my S

Accessing EC2 / local client

2007-11-12 Thread jonathan doklovic
Hi, I just got an EC2 cluster up and running, and the first thing I need to do is loop through a file on my local system and create a mapfile on the hadoop cluster from it. I tried doing this with a local java client, but when I call close() on my MapFile.Writer, I get the following messages over

Starting remote jobs

2007-11-09 Thread jonathan doklovic
Hi, This may be a silly question, but is there a way via java to start a job remotely? What I mean is, I write a map/reduce (with the jobclient), jar it up, and deploy it to my cluster. I know I can log into the master node and start the job on the command line, but I want to kick it off from ja

Re: Hadoop/HBase setup

2007-11-05 Thread jonathan doklovic
g/~cutting/hadoop-0.15.0-candidate-1/ -- or go > get a recent nightly build from here: > http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/. > > St.Ack > > > jonathan doklovic wrote: > > Hi, > > > > I'm trying to evaluate hadoop/hbase for

Re: Hadoop/HBase setup

2007-11-02 Thread jonathan doklovic
couldn't. I have seen > >> this previously when hosts were confused on how to reach each other -- is > >> there a bogus entry in an /etc/hosts? > >> > >> But it looks like you are trying the hbase from the hadoop 0.14.x > >> branch. IMO, you'

Re: Hadoop/HBase setup

2007-11-02 Thread jonathan doklovic
0 candidate -- > http://people.apache.org/~cutting/hadoop-0.15.0-candidate-1/ -- or go > get a recent nightly build from here: > http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/. > > St.Ack > > > jonathan doklovic wrote: > > Hi, > > > > I

Ant build: touch in init breaks build

2007-11-02 Thread jonathan doklovic
Hi, I just downloaded 0.15.0 and realized it has the same problem as 0.14.0... If I try to run: ant clean jar compile-contrib ant complains with: Specify at least one source--a file or resource collection. then the build fails. This is due to the following being present in the init target of b

Hadoop/HBase setup

2007-11-01 Thread jonathan doklovic
Hi, I'm trying to evaluate hadoop/hbase for a project I'm on that requires filtering massive amounts of RSS data. I've been trying to follow the simple tutorials, but I can't seem to get anything to work. So far, I've formatted hadoop storage, that went fine. Then I start hadoop: [EMAIL PROTECT