Re: Cluster Tuning

2011-07-11 Thread Juan P.
. -Joey On Fri, Jul 8, 2011 at 11:25 AM, Juan P. gordoslo...@gmail.com wrote: Here's another thought. I realized that the reduce operation in my map/reduce jobs is a flash. But it goes really slow until the mappers end. Is there a way to configure the cluster to make the reduce wait

Re: Cluster Tuning

2011-07-11 Thread Juan P.
BTW: Here's the Job Output https://spreadsheets.google.com/spreadsheet/ccc?key=0Av5N1j_JvusDdDdaTG51OE1FOUptZHg5M1Zxc0FZbHchl=en_US On Mon, Jul 11, 2011 at 1:28 PM, Juan P. gordoslo...@gmail.com wrote: Hi guys! Here's my mapred-site.xml I've tweaked a few properties but still it's taking

Re: Cluster Tuning

2011-07-11 Thread Juan P.
:28 AM, Juan P. wrote: <property> <name>mapred.child.java.opts</name> <value>-Xmx400m</value> </property> Single core machines with 600MB of RAM. 2x400m = 800m just for the heap of the map and reduce phases, not counting the other memory that the JVM will need
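The arithmetic in that reply can be checked with a small sketch. It assumes one map slot and one reduce slot per TaskTracker (which is what the "2x400m" figure implies). The class and method names are made up for illustration and are not part of Hadoop.

```java
public class HeapBudget {
    /** Worst-case combined heap (MB) if every task slot runs a child JVM at its full -Xmx. */
    public static int worstCaseHeapMb(int mapSlots, int reduceSlots, int childXmxMb) {
        return (mapSlots + reduceSlots) * childXmxMb;
    }

    /** True when the child heaps alone already exceed the machine's physical RAM. */
    public static boolean overcommitted(int ramMb, int mapSlots, int reduceSlots, int childXmxMb) {
        return worstCaseHeapMb(mapSlots, reduceSlots, childXmxMb) > ramMb;
    }

    public static void main(String[] args) {
        int ramMb = 600;       // from the thread: 600MB of RAM per box
        int childXmxMb = 400;  // from mapred.child.java.opts = -Xmx400m
        int mapSlots = 1;      // assumption: one map slot per node
        int reduceSlots = 1;   // assumption: one reduce slot per node

        System.out.println("Worst-case child heap (MB): "
                + worstCaseHeapMb(mapSlots, reduceSlots, childXmxMb));
        System.out.println("Exceeds physical RAM: "
                + overcommitted(ramMb, mapSlots, reduceSlots, childXmxMb));
    }
}
```

This only counts the child task heaps. The TaskTracker and DataNode daemons and each JVM's non-heap memory come on top, so the real footprint would be higher than the figure computed here.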

Re: Cluster Tuning

2011-07-08 Thread Juan P.
, Esteban. -- Get Hadoop! http://www.cloudera.com/downloads/ On Thu, Jul 7, 2011 at 1:29 PM, Juan P. gordoslo...@gmail.com wrote: Hi guys! I'd like some help fine tuning my cluster. I currently have 20 boxes exactly alike. Single core machines with 600MB of RAM

Re: Cluster Tuning

2011-07-08 Thread Juan P.
Thanks! Pony On Fri, Jul 8, 2011 at 11:41 AM, Juan P. gordoslo...@gmail.com wrote: Hey guys, Thanks all of you for your help. Joey, I tweaked my MapReduce to serialize/deserialize only essential values and added a combiner and that helped a lot. Previously I had a domain object which
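For readers who haven't used a combiner: the idea is to pre-aggregate map output on the map side so that fewer records get serialized and shuffled to the reducers. The following is a plain-Java sketch of that idea only, with no Hadoop classes. The sample keys and all names are hypothetical, and it assumes the reduce function is associative and commutative (a sum here).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombinerSketch {
    /** Collapses many (key, count) records into one (key, partialSum) record per key. */
    public static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> record : mapOutput) {
            // Add this record's value to the running partial sum for its key.
            combined.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Hypothetical map output: one record per occurrence of a key.
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String key : new String[] {"a", "b", "a", "a", "b"}) {
            mapOutput.add(new AbstractMap.SimpleEntry<>(key, 1));
        }

        Map<String, Integer> combined = combine(mapOutput);
        System.out.println(mapOutput.size() + " records before combining, "
                + combined.size() + " after: " + combined);
    }
}
```

The reducer still has to sum the partial sums arriving from different mappers, which is why the operation needs to be safe to apply more than once.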

Cluster Tuning

2011-07-07 Thread Juan P.
Hi guys! I'd like some help fine tuning my cluster. I currently have 20 boxes exactly alike. Single core machines with 600MB of RAM. No chance of upgrading the hardware. My cluster is made out of 1 NameNode/JobTracker box and 19 DataNode/TaskTracker boxes. All my config is default except I've

Setting names for nodes

2011-07-04 Thread Juan P.
Hi guys, Is there a way to set human readable names for my nodes? I've configured an Amazon cluster, and currently when browsing the NameNode Web Console in the list of nodes I get part of the Amazon public DNS URL which isn't very helpful when trying to figure out which node I'm looking at. So I

Re: Performance Tunning

2011-06-27 Thread Juan P.
-default.html http://hadoop.apache.org/common/docs/r0.20.2/hdfs-default.html http://hadoop.apache.org/common/docs/r0.20.2/core-default.html HTH, Matt -Original Message- From: Juan P. [mailto:gordoslo...@gmail.com] Sent: Monday, June 27, 2011 2:50 PM To: common-user@hadoop.apache.org

Re: Performance Tunning

2011-06-27 Thread Juan P.
mapred.JobClient: map 0% reduce 0% Any thoughts? Thanks for your help guys! On Mon, Jun 27, 2011 at 7:33 PM, Juan P. gordoslo...@gmail.com wrote: Matt, Thanks for your help! I think I get it now, but this part is a bit confusing: so: tasktracker/datanode and 6 slots left. How you break

Re: Starting JobTracker Locally but binding to remote Address

2011-06-01 Thread Juan P.
on all of the hosts listed in masters. In your case, you'll want to run start-dfs.sh on slave3 and start-mapred.sh on slave2. -Joey On Tue, May 31, 2011 at 5:07 PM, Juan P. gordoslo...@gmail.com wrote: Hi Guys, I recently configured my cluster to have 2 VMs. I configured 1 machine

Starting JobTracker Locally but binding to remote Address

2011-05-31 Thread Juan P.
Hi Guys, I recently configured my cluster to have 2 VMs. I configured 1 machine (slave3) to be the namenode and another to be the jobtracker (slave2). They both work as datanode/tasktracker as well. Both configs have the following contents in their masters and slaves file: *slave2* *slave3* Both

Re: Comparing

2011-05-26 Thread Juan P.
using bytes itself (Across different types), which can end up being faster when used in jobs. Hope this clears up your confusion. On Tue, May 24, 2011 at 2:06 AM, Juan P. gordoslo...@gmail.com wrote: Hi guys, I wanted to get your help with a couple of questions which came up while
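The point about comparing on the bytes themselves can be illustrated without Hadoop. A minimal sketch, assuming keys are serialized as UTF-8 strings: the serialized forms are compared lexicographically as unsigned bytes, so no key object has to be deserialized during the sort. The class and method names are hypothetical, and Hadoop's own raw comparators are not shown here.

```java
import java.nio.charset.StandardCharsets;

public class RawCompareSketch {
    /** Lexicographic comparison of two serialized keys, treating each byte as unsigned. */
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff; // mask to read the byte as 0..255
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] left = "apple".getBytes(StandardCharsets.UTF_8);
        byte[] right = "apricot".getBytes(StandardCharsets.UTF_8);

        // Only the sign of the result matters to a sort.
        System.out.println("sign of compare: " + Integer.signum(compareBytes(left, right)));
    }
}
```

Whether byte order matches the order you want depends on how the key is serialized, so this trick only applies to encodings where the two agree.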

Re: Comparing

2011-05-25 Thread Juan P.
Hi guys! Any thoughts on this? Should I have sent my queries to a different distribution list? Thanks! Pony On Mon, May 23, 2011 at 5:36 PM, Juan P. gordoslo...@gmail.com wrote: Hi guys, I wanted to get your help with a couple of questions which came up while looking at the Hadoop Comparator

Comparing

2011-05-23 Thread Juan P.
Hi guys, I wanted to get your help with a couple of questions which came up while looking at the Hadoop Comparator/Comparable architecture. As I see it before each reducer operates on each key, a sorting algorithm is applied to them. *Why does Hadoop need to do that?* If I implement my own class

Why is JUnit a compile scope dependency?

2011-04-29 Thread Juan P.
I was putting together a maven project and imported hadoop-core as a dependency and noticed that among the jars it brought with it was JUnit 4.5. Shouldn't it be a test scope dependency? It also happens with JUnit 3.8.1 for the commons-httpclient-3.0.1 dependency it pulls down from the repo.

Re: Why is JUnit a compile scope dependency?

2011-04-29 Thread Juan P.
/HADOOP Thanks, Cos On Fri, Apr 29, 2011 at 07:03, Juan P. gordoslo...@gmail.com wrote: I was putting together a maven project and imported hadoop-core as a dependency and noticed that among the jars it brought with it was JUnit 4.5. Shouldn't it be a test scope dependency? It also

Should waitForCompletion throw so many exceptions?

2011-04-29 Thread Juan P.
Is it just me or is it weird that org.apache.hadoop.mapreduce.Job#waitForCompletion(boolean verbose) throws exceptions like ClassNotFoundException? It seems like it's breaking encapsulation by throwing IOException, ClassNotFoundException and InterruptedException. Has this been discussed? Thanks,

Stable Release

2011-04-28 Thread Juan P.
Hi guys, I wanted to know exactly which was the latest stable release of Hadoop. In the site it says it's release 0.20.2, but 0.21.0 is also available and in the repository there's already a branch for release 0.22.0. Is it possible that the current development branch is 0.22, the stable is 0.21