Job cleanup

2013-04-13 Thread Robert Dyer
What does the job cleanup task do? My understanding was that it just cleaned up any intermediate/temporary files and moved the reducer output to the output directory. Does it do more? One of my jobs runs, all maps and reduces finish, but then the job cleanup task never finishes. Instead it gets killed

Re: Job cleanup

2013-04-17 Thread Robert Dyer
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobContext.html#getProgressible%28%29 On Sat, Apr 13, 2013 at 2:35 PM, Robert Dyer wrote: > What does the job cleanup task do? My understanding was it just cleaned > up any intermediate/temporary files and moved the reducer output to the
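
The link above is to JobContext.getProgressible(): if a committer does long work during cleanup, it should report progress so the framework's inactivity timeout (mapred.task.timeout) does not kill the cleanup task. A minimal sketch against the old mapred API, assuming Hadoop 1.x; the class name and the chunked-work comment are hypothetical:

```java
import java.io.IOException;
import org.apache.hadoop.mapred.FileOutputCommitter;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.util.Progressable;

// Hypothetical committer that heartbeats during a slow commitJob so the
// cleanup task is not killed for inactivity (mapred.task.timeout).
public class ProgressReportingCommitter extends FileOutputCommitter {
  @Override
  public void commitJob(JobContext context) throws IOException {
    Progressable progress = context.getProgressible();
    // ...do any extra slow cleanup in chunks, calling progress() between them...
    progress.progress(); // heartbeat to the TaskTracker
    super.commitJob(context);
  }
}
```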

Re: Query on Cost estimates on Hadoop and Java

2013-04-25 Thread Robert Dyer
It isn't GPL. OpenJDK[1] is GPLv2 with a Classpath Exception[2] (which is important). Read more here: http://programmers.stackexchange.com/questions/52534/can-we-use-java-for-commercial-use Also note that Hadoop[3] is licensed under Apache v2[4]. [1] http://openjdk.java.net/legal/ [2] http://op

Re: About configuring cluster setup

2013-05-14 Thread Robert Dyer
You can; however, note that unless you also run a TaskTracker on that node (a bad idea), any blocks replicated to this node won't be available as local input to MapReduce jobs, and you lower the odds of data locality on those blocks. On Tue, May 14, 2013 at 2:01 AM, Ramya S wrote: >

HDFS edit log NPE

2013-06-03 Thread Robert Dyer
I recently upgraded from 1.0.4 to 1.1.2. Now, however, my HDFS won't start up. There appears to be something wrong in the edits file. Obviously I can roll back to a previous checkpoint; however, it appears checkpointing has been failing for some time and my last checkpoint is over a month old. Is

Re: Hadoop upgrade

2013-08-09 Thread Robert Dyer
Actually, 1.2.1 is out (and marked stable). I see no reason not to upgrade. http://hadoop.apache.org/docs/r1.2.1/releasenotes.html As far as performance goes, when I upgraded our cluster from 1.0.4 to 1.1.2, our small jobs (that took about 1 min each) were taking about 20-30s less time. So there

Job status shows 0's for counters

2013-09-03 Thread Robert Dyer
I just noticed the job status for MR jobs tends to show 0's in the Map and Reduce columns but actually shows the totals correctly. I am not sure exactly when this started happening, but this cluster was upgraded from Hadoop 1.0.4 to 1.1.2 and now to 1.2.1. It definitely worked fine on 1.0.4, but

Re: Job status shows 0's for counters

2013-09-03 Thread Robert Dyer
https://issues.apache.org/jira/browse/MAPREDUCE-5376) and attached a > patch. > But it is not fixed by the current release. > > Thanks, > Shinichi > > (2013/09/03 11:20), Robert Dyer wrote: > > I just noticed the job status for MR jobs tends to show 0's in the Map > > and Reduce columns but

Hadoop 2.2.0 MR tasks failing

2013-10-21 Thread Robert Dyer
I recently set up a 2.2.0 test cluster. For some reason, all of my MR jobs are failing. The maps and reduces all run to completion, without any errors. Yet the app is marked failed and there is no final output. Any ideas? Application Type: MAPREDUCE State: FINISHED FinalStatus: FAILED Diagnostics:

Re: Hadoop 2.2.0 MR tasks failing

2013-10-21 Thread Robert Dyer
896 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1382415258498_0001_m_14_0' done. On Tue, Oct 22, 2013 at 12:16 AM, Arun C Murthy wrote: > If you follow the links on the web-ui to the logs of the map/reduce tasks, > what do you see there? > > Arun > > On Oct 21, 2

Re: Hadoop 2.2.0 MR tasks failing

2013-11-01 Thread Robert Dyer
So does anyone have any ideas how to track this down? Is it perhaps an exception somewhere in an output committer that is being swallowed and not showing up in the logs? On Tue, Oct 22, 2013 at 2:19 AM, Robert Dyer wrote: > The logs for the maps and reduces show nothing useful. There are
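
If that suspicion is right, one way to surface such an exception is to subclass the committer and log around super.commitJob(). A sketch against the Hadoop 2.x mapreduce API (class name hypothetical; you would wire it in through a custom OutputFormat's getOutputCommitter()):

```java
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

// Hypothetical committer wrapper that logs any failure in commitJob,
// to surface exceptions that otherwise never reach the task logs.
public class LoggingCommitter extends FileOutputCommitter {
  private static final Log LOG = LogFactory.getLog(LoggingCommitter.class);

  public LoggingCommitter(Path outputPath, TaskAttemptContext context)
      throws IOException {
    super(outputPath, context);
  }

  @Override
  public void commitJob(JobContext context) throws IOException {
    try {
      super.commitJob(context);
    } catch (IOException e) {
      LOG.error("commitJob failed", e); // make the swallowed exception visible
      throw e;
    } catch (RuntimeException e) {
      LOG.error("commitJob failed", e);
      throw e;
    }
  }
}
```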

Re: Any reference for upgrade hadoop from 1.x to 2.2

2013-11-22 Thread Robert Dyer
Thanks Sandy! These seem helpful! "MapReduce cluster configuration options have been split into YARN configuration options, which go in yarn-site.xml; and MapReduce configuration options, which go in mapred-site.xml. Many have been given new names to reflect the shift. ... *We’ll follow up with a
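
For reference, a sketch of where the most common settings land after the split (shown programmatically here; in practice they live in the named files, and the ResourceManager host is a hypothetical value):

```java
import org.apache.hadoop.conf.Configuration;

public class Mr2ConfSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // mapred-site.xml: MapReduce now runs as a YARN application.
    conf.set("mapreduce.framework.name", "yarn");
    // yarn-site.xml: shuffle moves from the TaskTracker to a NodeManager aux-service.
    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
    // yarn-site.xml: the ResourceManager replaces the JobTracker's scheduling role.
    conf.set("yarn.resourcemanager.hostname", "rm.example.com"); // hypothetical host
    System.out.println(conf.get("mapreduce.framework.name"));
  }
}
```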

Uncompressed size of Sequence files

2013-11-23 Thread Robert Dyer
Is there an easy way to get the uncompressed size of a sequence file that is block compressed? I am using the Snappy compressor. I realize I can obviously just decompress them to temporary files to get the size, but I would assume there is an easier way. Perhaps an existing tool that my search didn't turn up
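
Lacking a dedicated tool, one workaround is a small reader that sums the serialized record sizes without writing temporary files. A sketch with a hypothetical class name (it measures key/value payload only, not sync markers or file headers):

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Sketch: read every record and sum the serialized (uncompressed) sizes.
public class SeqFileUncompressedSize {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);
    SequenceFile.Reader reader =
        new SequenceFile.Reader(FileSystem.get(conf), path, conf);
    Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable val = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
    DataOutputBuffer buf = new DataOutputBuffer();
    long bytes = 0;
    while (reader.next(key, val)) {
      buf.reset();
      key.write(buf);  // serialize uncompressed into the buffer
      val.write(buf);
      bytes += buf.getLength();
    }
    reader.close();
    System.out.println("Uncompressed payload: " + bytes + " bytes");
  }
}
```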

Re: Uncompressed size of Sequence files

2013-11-27 Thread Robert Dyer
916) On Sat, Nov 23, 2013 at 3:14 PM, Robert Dyer wrote: > Is there an easy way to get the uncompressed size of a sequence file that > is block compressed? I am using the Snappy compressor. > > I realize I can obviously just decompress them to temporary files to get > the size, but

Namenode won't start, has NPE

2012-08-07 Thread Robert Dyer
I recently restarted my small cluster (1 namenode, 1 job tracker, 1 secondary nn, 6 compute/data nodes). However the namenode refuses to start up due to an NPE. After googling I saw some suggestions of doing a printf "\xff\xff\xff\xee\xff" > edits, however this did not fix the problem. Any ideas?

Updating SequenceFiles?

2012-08-22 Thread Robert Dyer
I am currently using a SequenceFile as input to my MR job (on Hadoop 1.0.3). This works great, as my input is just a bunch of binary blobs. However it seems SequenceFile is only intended to append new data and never update existing entries. Is that correct? If so, would I be better off moving to
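
SequenceFile writers are indeed append-only; there is no in-place update API. A common workaround (a sketch, with hypothetical key/value types and an update hook) is to rewrite the file, substituting new values as you copy:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Sketch: "update" a SequenceFile by copying it, replacing values on the way.
public class SeqFileRewrite {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path in = new Path(args[0]), out = new Path(args[1]);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, in, conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, out, Text.class, BytesWritable.class,
        SequenceFile.CompressionType.BLOCK);
    Text key = new Text();
    BytesWritable val = new BytesWritable();
    while (reader.next(key, val)) {
      writer.append(key, maybeUpdate(key, val)); // copy, swapping in updates
    }
    reader.close();
    writer.close();
  }

  // Hypothetical hook: return the new blob for keys that changed.
  static BytesWritable maybeUpdate(Text key, BytesWritable old) {
    return old; // placeholder
  }
}
```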

HBase and MapReduce data locality

2012-08-28 Thread Robert Dyer
I have been reading up on HBase and my understanding is that the physical files on the HDFS are split first by region and then by column family. Thus each column family has its own physical file (on a per-region basis). If I run a MapReduce task that uses HBase as input, wouldn't this imply

Re: HBase and MapReduce data locality

2012-08-28 Thread Robert Dyer
one replica of the blocks on > the datanode the same machine as the client (i.e. the regionserver from hdfs > point of view). > > N. > > > > On Wed, Aug 29, 2012 at 6:20 AM, Robert Dyer wrote: >> >> I have been reading up on HBase and my understanding is that the >

Re: HBase and MapReduce data locality

2012-08-29 Thread Robert Dyer
're right" :-). > It's documented here: > http://hbase.apache.org/book.html#regions.arch.locality > > On Wed, Aug 29, 2012 at 8:06 AM, Robert Dyer wrote: >> >> Ok but does that imply that only 1 of your compute nodes is promised >> to have all of the data

Re: New Learner for Haddop and Hive

2012-09-07 Thread Robert Dyer
-- Robert Dyer rd...@iastate.edu

Re: How to split a sequence file

2012-09-11 Thread Robert Dyer
If the file is pre-sorted, why not just make multiple sequence files - 1 for each split? Then you don't have to compute InputSplits because the physical files are already split. On Tue, Sep 11, 2012 at 11:00 PM, Harsh J wrote: > Hey Jason, > > Is the file pre-sorted? You could override the Outpu
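
A sketch of that approach, assuming hypothetical key/value types and a records-per-file count: cut the pre-sorted file into contiguous chunks, so each output file is one split and stays sorted:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Sketch: cut one pre-sorted SequenceFile into contiguous chunks of
// `perFile` records each; the resulting files are the splits.
public class SplitSeqFile {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path src = new Path(args[0]);
    long perFile = Long.parseLong(args[1]); // records per output file

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, src, conf);
    LongWritable key = new LongWritable(); // hypothetical key type
    Text val = new Text();                 // hypothetical value type

    int part = 0;
    long inFile = 0;
    SequenceFile.Writer writer = open(fs, conf, src, part);
    while (reader.next(key, val)) {
      if (inFile == perFile) {             // start the next chunk
        writer.close();
        writer = open(fs, conf, src, ++part);
        inFile = 0;
      }
      writer.append(key, val);
      inFile++;
    }
    writer.close();
    reader.close();
  }

  static SequenceFile.Writer open(FileSystem fs, Configuration conf,
                                  Path src, int part) throws IOException {
    return SequenceFile.createWriter(fs, conf,
        new Path(src + "-split-" + part), LongWritable.class, Text.class);
  }
}
```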

Re: Reg LZO compression

2012-10-16 Thread Robert Dyer
Hi Manoj, If the data is the same for both tests and the number of mappers is fewer, then each mapper has more (uncompressed) data to process. Thus each mapper should take longer and overall execution time should increase. As a simple example: if your data is 128MB uncompressed it may use 2 mappers

Strange machine behavior

2012-12-08 Thread Robert Dyer
Has anyone experienced a TaskTracker/DataNode behaving like the attached image? This was during a MR job (which runs often). Note the extremely high System CPU time. Upon investigating I saw that out of 64GB RAM the system had allocated almost 45GB to cache! I did a sudo sh -c "sync ; echo 3 > /proc/sys/vm/drop_caches"

Re: Strange machine behavior

2012-12-08 Thread Robert Dyer
launch your > MR job again. > Can you share your logs in pastebin? > > > On Sat 08 Dec 2012 07:09:02 PM CST, Robert Dyer wrote: > >> Has anyone experienced a TaskTracker/DataNode behaving like the >> attached image? >> >> This was during a MR job (which

Re: Strange machine behavior

2012-12-10 Thread Robert Dyer
se and > slowdown. drop_caches doesn't have any impact on correctness; it won't > cause data loss (by dropping a dirty page or whatever). I've had sync > calls take 10 minutes to complete, so the unnecessary impact can be > significant. > > -andy > > On Sat, Dec

Re: Strange machine behavior

2012-12-10 Thread Robert Dyer
unnecessary impact can be > significant. > > -andy > > On Sat, Dec 8, 2012 at 4:09 PM, Robert Dyer wrote: > > Has anyone experienced a TaskTracker/DataNode behaving like the attached > > image? > > > > This was during a MR job (which runs often). Note the extr

Re: Strange machine behavior

2012-12-10 Thread Robert Dyer
ly want paged > out in favor of a larger filesystem cache. > > There is also a vm parameter that controls the minimum size of the free > chain, might want to increase that a bit. > > Also, look into hosting your JVM heap on huge pages, they can't be paged out

Re: why not hadoop backup name node data to local disk daily or hourly?

2012-12-26 Thread Robert Dyer
I actually have this exact same error. After running my namenode for a while (with an SNN), it gets to a point where the SNN starts crashing, and if I try to restart the NN I will get this problem. I typically wind up having to go with a much older copy of the image and edits files in order to get it

Re: more reduce tasks

2013-01-03 Thread Robert Dyer
You could create a CustomOutputCommitter and in the commitJob() method simply read in the part-* files and write them out into a single aggregated file. This requires making a CustomOutputFormat class that uses the CustomOutputCommitter and then setting that via job.setOutputFormatClass(CustomOutputFormat.class), as sketched below.
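
A sketch of that committer (new mapreduce API; the names are the hypothetical ones from the post, and it assumes the merged output fits comfortably through one stream):

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

// Hypothetical committer: after the normal commit, concatenate all part-*
// files in the job output directory into one aggregated file.
public class CustomOutputCommitter extends FileOutputCommitter {
  private final Path outputDir;

  public CustomOutputCommitter(Path outputPath, TaskAttemptContext context)
      throws IOException {
    super(outputPath, context);
    this.outputDir = outputPath;
  }

  @Override
  public void commitJob(JobContext context) throws IOException {
    super.commitJob(context); // first let the task outputs move into outputDir
    FileSystem fs = outputDir.getFileSystem(context.getConfiguration());
    FSDataOutputStream merged = fs.create(new Path(outputDir, "aggregated"));
    FileStatus[] parts = fs.globStatus(new Path(outputDir, "part-*"));
    if (parts != null) {
      for (FileStatus part : parts) {
        FSDataInputStream in = fs.open(part.getPath());
        IOUtils.copyBytes(in, merged, context.getConfiguration(), false);
        in.close();
        fs.delete(part.getPath(), false); // keep only the aggregate
      }
    }
    merged.close();
  }
}
```

The same idea works in the old mapred API; only the committer and format base classes differ.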

Re: Specific HDFS tasks where is passwordless SSH is necessary

2013-02-05 Thread Robert Dyer
-- Jay Vyas http://jayunit100.blogspot.com -- Robert Dyer rd...@iastate.edu

Re: Namenode failures

2013-02-16 Thread Robert Dyer
Forgot to mention: Hadoop 1.0.4 On Sat, Feb 16, 2013 at 2:38 PM, Robert Dyer wrote: > I am at a bit of wits end here. Every single time I restart the namenode, > I get this crash: > > 2013-02-16 14:32:42,616 INFO org.apache.hadoop.hdfs.server.common.Storage: > Image file of size

Re: Namenode failures

2013-02-17 Thread Robert Dyer
Is there an easy way to monitor (other than a script grep'ing the logs) the checkpoints to see when this happens? On Sat, Feb 16, 2013 at 2:39 PM, Robert Dyer wrote: > Forgot to mention: Hadoop 1.0.4 > > > On Sat, Feb 16, 2013 at 2:38 PM, Robert Dyer wrote: > >> I am at a bit
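
One low-tech option is to alert when the checkpoint image's modification time gets too old. A sketch, assuming the 1.x layout where the secondary writes its checkpoint to fs.checkpoint.dir/current/fsimage (the path and the two-hour threshold are supplied by you):

```java
import java.io.File;

// Sketch: warn if the SNN's checkpoint image hasn't been rewritten recently.
public class CheckpointAgeCheck {
  public static void main(String[] args) {
    File fsimage = new File(args[0]); // e.g. <fs.checkpoint.dir>/current/fsimage
    long ageMillis = System.currentTimeMillis() - fsimage.lastModified();
    long maxAge = 2 * 60 * 60 * 1000L; // alert threshold: 2 hours (hypothetical)
    if (!fsimage.exists() || ageMillis > maxAge) {
      System.err.println("Checkpoint is stale or missing: age " + ageMillis + " ms");
      System.exit(1); // nonzero exit so cron/nagios can alert
    }
  }
}
```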

Re: Namenode failures

2013-02-17 Thread Robert Dyer
ose > the fsimage file. > > Warm Regards, > Tariq > https://mtariq.jux.com/ > cloudfront.blogspot.com > > > On Mon, Feb 18, 2013 at 3:31 AM, Robert Dyer wrote: > >> It just happened again. This was after a fresh format of HDFS/HBase and >> I am attempting

Re: Namenode failures

2013-02-17 Thread Robert Dyer
/hdfs'. > Warm Regards, > Tariq > https://mtariq.jux.com/ > cloudfront.blogspot.com > > > On Mon, Feb 18, 2013 at 3:31 AM, Robert Dyer wrote: > >> It just happened again. This was after a fresh format of HDFS/HBase and >> I am attempting to re-import the (backed up)

Re: Namenode failures

2013-02-17 Thread Robert Dyer
> P.s. How exactly are you shutting down the NN each time? A kill -9 or > a regular SIGTERM shutdown? > I shut down the NN with 'bin/stop-dfs.sh'. > On Mon, Feb 18, 2013 at 4:31 AM, Robert Dyer wrote: > > On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq > wrote:

Slow MR time and high network utilization with all local data

2013-02-24 Thread Robert Dyer
I have a small 6 node dev cluster. I use a 1GB SequenceFile as input to a MapReduce job, using a custom split size of 10MB (to increase the number of maps). Each map call will read random entries out of a shared MapFile (that is around 50GB). I set replication to 6 on both of these files, so all nodes have local copies of every block.

Re: Slow MR time and high network utilization with all local data

2013-02-24 Thread Robert Dyer
done over a local socket as > well, and may appear in network traffic observing tools too (but do > not mean they are over the network). > > On Mon, Feb 25, 2013 at 2:35 AM, Robert Dyer wrote: > > I have a small 6 node dev cluster. I use a 1GB SequenceFile as input to > a > &

Re: Slow MR time and high network utilization with all local data

2013-02-25 Thread Robert Dyer
t the short circuit. Now I see no network utilization for this job and it runs *much* faster (13 mins instead of 2+ hours)! Problem solved! :-) Thanks Harsh! On Mon, Feb 25, 2013 at 1:41 AM, Robert Dyer wrote: > I am using Ganglia. > > Note I have short circuit reads enabled (I
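
For reference, short-circuit reads on the 1.x line are commonly enabled with the two hdfs-site.xml keys below (shown programmatically here; a sketch, and the 'hadoop' user is a hypothetical value):

```java
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitConfSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // hdfs-site.xml: let the DFS client bypass the DataNode and read
    // local block files directly.
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // hdfs-site.xml (DataNode side): user(s) allowed direct local access.
    conf.set("dfs.block.local-path-access.user", "hadoop"); // hypothetical user
    System.out.println(conf.getBoolean("dfs.client.read.shortcircuit", false));
  }
}
```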

Re: Get dynamic values in a user defined class from reducer.

2013-12-18 Thread Robert Dyer
Generally speaking, static fields are not useful in Hadoop. The issue you are seeing is that the reducer is running in a separate JVM (possibly on a different node!) and thus the static value you are reading inside of Mid is actually a separate instantiation of that class and field. If you have an
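
The standard alternative is to pass the value through the job Configuration, which is shipped to every task JVM, instead of through a static field. A sketch with hypothetical key and reducer names:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch: read a per-job value from the Configuration instead of a static
// field, so every task JVM (on any node) sees the same value.
public class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
  private long threshold;

  @Override
  protected void setup(Context context) {
    // set on the client with: job.getConfiguration().setLong("my.threshold", 42);
    threshold = context.getConfiguration().getLong("my.threshold", 0L);
  }

  @Override
  protected void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    long sum = 0;
    for (LongWritable v : values) sum += v.get();
    if (sum >= threshold) {
      context.write(key, new LongWritable(sum));
    }
  }
}
```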