Re: Doubt regarding Binary Compatibility / Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-14 Thread John Meagher
Also, Source Compatibility means ONLY a recompile is needed; no code changes should be required. On Mon, Apr 14, 2014 at 10:37 AM, John Meagher john.meag...@gmail.com wrote: Source Compatibility = you need to recompile and use the new version as part of the compilation Binary Compatibility

Re: re-replication after data node failure

2014-03-26 Thread John Meagher
The balancer is not what handles adding extra replicas in the case of a node failure, but it looks like the balancer bandwidth setting is the way to throttle. See: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201301.mbox/%3c50f870c1.5010...@getjar.com%3E On Wed, Mar 26, 2014 at

Re: empty file

2013-12-11 Thread John Meagher
Is something still writing to it? ... Total files: 0 (Files currently being written: 1) Total blocks (validated): 0 (Total open file blocks (not validated): 1) On Wed, Dec 11, 2013 at 2:37 PM, Adam Kawa kawa.a...@gmail.com wrote: I have never seen something like that. Can you read that

Re: Question on BytesWritable

2013-10-01 Thread John Meagher
https://issues.apache.org/jira/browse/HADOOP-6298 On Tue, Oct 1, 2013 at 12:39 AM, Chandra Mohan, Ananda Vel Murugan ananda.muru...@honeywell.com wrote: Hi, I am using Hadoop 1.0.2. I have written a map reduce job. I have a requirement to process the whole file without splitting. So I have

Re: Concatenate multiple sequence files into 1 big sequence file

2013-09-10 Thread John Meagher
Here's a great tool for exactly what you're looking for https://github.com/edwardcapriolo/filecrush On Tue, Sep 10, 2013 at 11:07 AM, Jerry Lam chiling...@gmail.com wrote: Hi Hadoop users, I have been trying to concatenate multiple sequence files into one. Since the total size of the sequence

Re: Automating Hadoop installation

2013-08-16 Thread John Meagher
That sounds like what Bigtop is doing, at least covering the Linux distros. http://bigtop.apache.org/ On Fri, Aug 16, 2013 at 11:23 AM, Andrew Pennebaker apenneba...@42six.com wrote: I think it would make Hadoop installation easier if we released standardized packages. What if Ubuntu users

Re: UNSUBSCRIBE IS A SEPARATE LIST

2012-08-08 Thread John Meagher
The email header looks right: List-Unsubscribe: mailto:user-unsubscr...@hadoop.apache.org On Wed, Aug 8, 2012 at 10:35 AM, Bertrand Dechoux decho...@gmail.com wrote: Wasn't there a merge of mailing lists not long ago? That's maybe due to an error with the new mailing lists? Who was

Re: rack awareness and safemode

2012-03-22 Thread John Meagher
I just don't want to fall into the situation where HDFS sits in safemode for hours and users can't use Hadoop and start yelping. Let's hear from others. Thanks Patai On 3/20/12 1:27 PM, John Meagher john.meag...@gmail.com wrote: Here's the script I used (all sorts of caveats about

Re: rack awareness and safemode

2012-03-20 Thread John Meagher
Unless something has changed recently, it won't automatically relocate the blocks. When I did something similar, I had a script that walked through the whole set of files that were mis-replicated and increased the replication factor, then dropped it back down. This triggered relocation of blocks to

Re: How to find out whether a node is Overloaded from Cpu utilization ?

2012-01-18 Thread John Meagher
The problem I've run into more than memory is having the system CPU time get out of control. My guess is that the threshold for what is considered overloaded is going to be dependent on your system setup, what you're running on it, and what bounds your jobs. On Tue, Jan 17, 2012 at 22:06,

Re: Fixing Mis-replicated blocks

2011-10-21 Thread John Meagher
Of course, the little script only works if the replication factor is 3 on all the files. If it's a variable amount, you should use the Java API to get the existing factor, then increase by one and then decrease. Jeff On Thu, Oct 20, 2011 at 8:44 AM, John Meagher john.meag...@gmail.com wrote

Fixing Mis-replicated blocks

2011-10-20 Thread John Meagher
After a hardware move with an unfortunately mis-configured rack awareness script, our Hadoop cluster has a large number of mis-replicated blocks. After about a week, things haven't gotten better on their own. Is there a good way to trigger the name node to fix the mis-replicated blocks? Here's what I'm

Re: How do I diagnose IO bounded errors using the framework counters?

2011-10-05 Thread John Meagher
The counter names are created dynamically in mapred.Task:

/**
 * Counters to measure the usage of the different file systems.
 * Always return the String array with two elements. First one is the name of
 * BYTES_READ counter and second one is of the BYTES_WRITTEN counter.
 */
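A minimal sketch of the dynamic-naming pattern that Javadoc describes (this models mapred.Task's behavior; the class name here is illustrative, not the actual Hadoop source):

```java
// Illustrative sketch: per-filesystem counter names are built at
// runtime from the filesystem URI scheme, rather than being a fixed
// enum, which is why they don't appear as static counter constants.
public class FileSystemCounterNames {
    /**
     * Always returns two elements: the BYTES_READ counter name first,
     * then the BYTES_WRITTEN counter name.
     */
    public static String[] getCounterNames(String uriScheme) {
        String scheme = uriScheme.toUpperCase();
        return new String[] { scheme + "_BYTES_READ", scheme + "_BYTES_WRITTEN" };
    }

    public static void main(String[] args) {
        String[] names = getCounterNames("hdfs");
        System.out.println(names[0] + ", " + names[1]);
        // prints: HDFS_BYTES_READ, HDFS_BYTES_WRITTEN
    }
}
```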

Re: Problem in Map/Reduce

2008-08-29 Thread John Meagher
Did you override the equals and hashCode methods? These are the methods usually used in a map to determine equality for put/get operations. The comparator is probably only used for sorting, not equality checks. On Fri, Aug 29, 2008 at 2:55 AM, P.ILAYARAJA [EMAIL PROTECTED] wrote: Hello: I
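The fix suggested above can be sketched as follows (TextKey is a hypothetical key class for illustration; the point is that equals() and hashCode() together drive hash-map lookups, while a Comparator or compareTo only drives sort order):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a custom type used as a map key must override both
// equals() and hashCode() consistently, or put/get lookups with
// distinct-but-equal instances will silently fail.
public class KeyExample {
    public static final class TextKey {
        private final String value;
        public TextKey(String value) { this.value = value; }

        @Override public boolean equals(Object o) {
            return o instanceof TextKey && ((TextKey) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    public static void main(String[] args) {
        Map<TextKey, Integer> counts = new HashMap<>();
        counts.put(new TextKey("apple"), 3);
        // This lookup uses a different instance than the one stored;
        // it succeeds only because equals() and hashCode() match.
        System.out.println(counts.get(new TextKey("apple")));
        // prints: 3
    }
}
```

Without the hashCode() override, the two TextKey("apple") instances would land in different hash buckets and the get() above would return null.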