Re: Poor scalability with map reduce application

2011-06-22 Thread Harsh J
Alberto, I can assure you that fiddling with the default replication factor can't be the solution here. Most of us running clusters of 3+ nodes still use the default replication factor of 3, and it hardly introduces any performance lag. As long as your Hadoop cluster network is not shared with other network applications, you
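
Before tuning replication at all, it may help to confirm what the cluster is actually using; a minimal sketch with the stock fsck tool (run from any node with a client configuration):

  # report the configured default and the observed average replication
  hadoop fsck / | grep -i replication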

Re: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread Jack Ye
I used 0.20.203.0 and can't access the DFS Locations. The following is the error: failure to login, internal error: "map/reduce location status updater", org/codehaus/jackson/map/JsonMappingException. Yaozhen Pan wrote: >Hi, > >Our hadoop version was built on 0.20-append with a few patches. >However,
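
The missing class in that trace is Jackson's JsonMappingException. One workaround that has been reported for the 0.20.203.0 plugin is to bundle the Jackson jars into the plugin itself; a rough sketch, assuming the plugin jar has been unpacked into a plugin/ working directory (jar names and paths are illustrative):

  # copy the Jackson jars shipped with Hadoop into the plugin's lib/ directory
  cp $HADOOP_HOME/lib/jackson-core-asl-*.jar $HADOOP_HOME/lib/jackson-mapper-asl-*.jar plugin/lib/
  # then add those jars to Bundle-ClassPath in plugin/META-INF/MANIFEST.MF and re-zip the plugin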

Re: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread Yaozhen Pan
Hi, Our hadoop version was built on 0.20-append with a few patches. However, I didn't see big differences in the eclipse-plugin. Yaozhen On Thu, Jun 23, 2011 at 11:29 AM, 叶达峰 (Jack Ye) wrote: > do you use hadoop 0.20.203.0? > I also have a problem with this plugin. > > Yaozhen Pan wrote: > > >Hi, > >

Re: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread Jack Ye
Do you use hadoop 0.20.203.0? I also have a problem with this plugin. Yaozhen Pan wrote: >Hi, > >I am using Eclipse Helios Service Release 2. >I encountered a similar problem (the map/reduce perspective failed to load) when >upgrading the eclipse plugin from 0.20.2 to the 0.20.3-append version. > >I compared the

Re: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread Yaozhen Pan
Hi, I am using Eclipse Helios Service Release 2. I encountered a similar problem (the map/reduce perspective failed to load) when upgrading the eclipse plugin from 0.20.2 to the 0.20.3-append version. I compared the source code of the eclipse plugin and found only a few differences. I tried to revert the differen

Re: Poor scalability with map reduce application

2011-06-22 Thread Alberto Andreotti
Hi guys, I suspected that the problem was due to overhead introduced by the filesystem, so I tried to set the "dfs.replication.max" property to different values. First, I tried with 2, and I got a message saying that I was requesting a value of 3, which was bigger than the limit. So I couldn't do
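
For what it's worth, dfs.replication.max only caps the value a client may request; the per-file factor comes from the dfs.replication property in hdfs-site.xml, and existing files can be changed from the shell. A minimal sketch, with an illustrative path:

  # lower the replication of existing data to 2 and wait until it takes effect
  hadoop fs -setrep -R -w 2 /user/alberto/input
  # files written afterwards pick up whatever dfs.replication is set to at write time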

Re: OutOfMemoryError: GC overhead limit exceeded

2011-06-22 Thread hadoopman
I've run into similar problems in my hive jobs and will look at the 'mapred.child.ulimit' option. One thing we've found is that when loading data with INSERT OVERWRITE into our Hive tables, we've needed to include a 'CLUSTER BY' or 'DISTRIBUTE BY' option. Generally that's fixed our memory issu
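
A rough sketch of both knobs together, with hypothetical table and column names (mapred.child.ulimit is a virtual-memory cap expressed in kilobytes):

  hive -e "
    SET mapred.child.ulimit=2097152;        -- roughly a 2 GB cap per child task, value in KB
    INSERT OVERWRITE TABLE target_table
    SELECT col_a, col_b FROM source_table
    DISTRIBUTE BY col_a;                    -- spread rows across reducers instead of piling onto one
  "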

Re: Hadoop Eclipse plugin 0.20.203.0 doesn't work

2011-06-22 Thread Jack Ye
Can anyone help me? 叶达峰 (Jack Ye) wrote: >Hi, > > I am new to Hadoop. Today I spent the whole night trying to set up a > development environment for Hadoop. I encountered several problems; the first was > that eclipse couldn't load the plugin. I changed to another version and this > problem was resolved.

Re: Problem debugging MapReduce job under Windows

2011-06-22 Thread Sal
I had the same issue. I installed the previous stable version of Hadoop (0.20.2), and it worked fine. I hope this helps. -Sal

Re: Any reason Hadoop logs can't be directed to a separate filesystem?

2011-06-22 Thread jagaran das
Hi, Can I limit log file retention? I want to keep files for the last 15 days only. Regards, Jagaran From: Jack Craig To: "common-user@hadoop.apache.org" Sent: Wed, 22 June, 2011 2:00:23 PM Subject: Re: Any reason Hadoop logs can't be directed to a separate f
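
As far as I know there is no single retention knob in the stock setup; you can either tighten the log4j appender settings or simply prune old rolled files. A low-tech sketch of the latter, suitable for a daily cron entry (the log path is illustrative):

  # delete daemon log files that have not been touched in 15 days
  find /var/hadoop-logs -type f -name 'hadoop-*.log.*' -mtime +15 -print -delete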

Re: Any reason Hadoop logs can't be directed to a separate filesystem?

2011-06-22 Thread Jack Craig
Thanks to both respondents. Note I've not tried this redirection yet, as I have only production grids available. Our grids are growing and, with them, log volume. Until now the log location has been on the same filesystem as the grid data, so running out of space due to log bloat is a growing problem. From yo

Re: Any reason Hadoop logs can't be directed to a separate filesystem?

2011-06-22 Thread Harsh J
Jack, I believe the location can definitely be set to any desired path. Could you tell us the issues you face when you change it? P.s. The env var is used to set the config property hadoop.log.dir internally. So as long as you use the regular scripts (bin/ or init.d/ ones) to start daemons, it wo
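
Concretely, uncommenting and pointing that variable at a dedicated mount in conf/hadoop-env.sh is all the regular start scripts need; a minimal sketch (the mount point is illustrative):

  # conf/hadoop-env.sh -- send daemon logs to their own filesystem
  export HADOOP_LOG_DIR=/mnt/hadoop-logs
  # restart the daemons (e.g. bin/stop-all.sh && bin/start-all.sh) so they pick it up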

Re: Any reason Hadoop logs can't be directed to a separate filesystem?

2011-06-22 Thread Madhu Ramanna
Looks like you missed the '#' at the beginning of that line; the export is commented out by default. Feel free to set HADOOP_LOG_DIR in that script or elsewhere. On 6/22/11 1:02 PM, "Jack Craig" wrote: >Hi Folks, > >In the hadoop-env.sh, we find, ... > ># Where log files are stored. $HADOOP_HOME/logs by default. ># export HADOOP_LOG_DIR=${HADOOP_

Any reason Hadoop logs can't be directed to a separate filesystem?

2011-06-22 Thread Jack Craig
Hi Folks, In hadoop-env.sh we find, ... # Where log files are stored. $HADOOP_HOME/logs by default. # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs Is there any reason this location could not be a separate filesystem on the name node? Thx, jackc... Jack C

Re: Automatic Configuration of Hadoop Clusters

2011-06-22 Thread jagaran das
Puppetize. From: gokul To: common-user@hadoop.apache.org Sent: Wed, 22 June, 2011 8:38:13 AM Subject: Automatic Configuration of Hadoop Clusters Dear all, for benchmarking purposes we would like to adjust configurations as well as flexibly adding/removing machine

RE: ClassNotFoundException while running quick start guide on Windows.

2011-06-22 Thread Sandy Pratt
Hi Drew, I don't know if this is actually the issue or not, but the output below makes me think you might be passing Cygwin paths into the java.exe launcher. If that's the case, it won't work. java.exe is pure Windows and doesn't know about '/cygdrive/c', for example (it also expects the path
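
If that is the cause, cygpath converts a Cygwin path into the Windows form java.exe understands; a quick sketch (path is illustrative):

  # translate a Cygwin-style path into its Windows equivalent
  cygpath -w /cygdrive/c/hadoop-0.20.2/conf
  # prints: C:\hadoop-0.20.2\conf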

Re: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread praveenesh kumar
I am doing that, but it's not working. If I replace the hadoop-core jar inside the hadoop plugin jar, I am not able to see the map/reduce perspective at all. Guys, any help?! Thanks, Praveenesh On Wed, Jun 22, 2011 at 12:34 PM, Devaraj K wrote: > Every time hadoop builds, it also builds the hado

Re: Automatic Configuration of Hadoop Clusters

2011-06-22 Thread Nathan Milford
http://www.opscode.com/chef/ http://trac.mcs.anl.gov/projects/bcfg2 http://cfengine.com/ http://www.puppetlabs.com/ I use chef personally, but the others are just as good and all are tuned towards different philosophies in configuration management. - n On

Automatic Configuration of Hadoop Clusters

2011-06-22 Thread gokul
Dear all, for benchmarking purposes we would like to adjust configurations as well as flexibly add/remove machines from our Hadoop clusters. Is there any framework that allows this in an easy manner, without having to manually distribute the changed configuration files? We are considering writing
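
Short of a full configuration-management tool, a plain rsync loop over the slaves file covers the "push the changed config everywhere" part; a minimal sketch, assuming Hadoop is installed at the same path on every node:

  # push the local conf/ directory to every node listed in conf/slaves
  for host in $(cat "$HADOOP_HOME/conf/slaves"); do
    rsync -av --delete "$HADOOP_HOME/conf/" "$host:$HADOOP_HOME/conf/"
  done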

Hadoop Version 0.21 Node decommision not working

2011-06-22 Thread Kumar, Amit H.
Hi all, We would like to decommission nodes so we can run jobs with fewer map and reduce slots, and then scale back up. In this process we want to decommission the nodes and then bring them back. Does decommissioning work? We have defined dfs.hosts.exclude with a path to the file that has the list of nodes to
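
For reference, the usual sequence once dfs.hosts.exclude points at an exclude file; a minimal sketch with an illustrative hostname and file path:

  # add the node to the exclude file and tell the namenode to re-read it
  echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude
  hadoop dfsadmin -refreshNodes
  # the namenode re-replicates that node's blocks and then marks it Decommissioned;
  # to bring it back, remove the entry from the file and run -refreshNodes again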

Backup and upgrade practices?

2011-06-22 Thread Mark Kerzner
Hi, I am planning a small Hadoop cluster, but looking ahead, are there cheap options for backing up the data? If I later want to upgrade the hardware, do I make a complete copy, or do I upgrade one node at a time? Thank you, Mark
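
One common low-cost option is a second, smaller cluster (or any HDFS-reachable target) plus periodic distcp runs; a minimal sketch with illustrative hostnames and paths:

  # copy the important directories to a backup cluster in parallel map tasks
  hadoop distcp hdfs://prod-nn:8020/user/important hdfs://backup-nn:8020/user/important

For the hardware question, nodes are usually cycled one at a time (decommission, swap, rejoin) rather than copying the whole cluster.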

RE: Hadoop eclipse plugin stopped working after replacing hadoop-0.20.2 jar files with hadoop-0.20-append jar files

2011-06-22 Thread Devaraj K
Every time hadoop builds, it also builds the hadoop eclipse plug-in using the latest hadoop core jar. In your case the eclipse plug-in contains one version of the jar while the cluster is running another version. That's why it gives the version mismatch error. Just replace the hadoop-core jar
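
A rough sketch of the swap itself, with illustrative jar names (the exact versions depend on your build):

  # unpack the plugin, drop in the hadoop-core jar matching the running cluster, repackage
  mkdir plugin-work && cd plugin-work
  unzip ../hadoop-eclipse-plugin-0.20.203.0.jar
  cp "$HADOOP_HOME/hadoop-core-0.20.203.0.jar" lib/
  # if the jar name changed, update the lib/ entry on Bundle-ClassPath in META-INF/MANIFEST.MF
  zip -r ../hadoop-eclipse-plugin-0.20.203.0-rebuilt.jar .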