Re: TestDFSIO failure
Hi Matt,

On Jun 20, 2011, at 1:46pm, GOEKE, MATTHEW (AG/1000) wrote:

Has anyone else run into issues using output compression (in our case LZO) on TestDFSIO, where it fails to read the metrics file? I just assumed that it would use the correct decompression codec after it finishes, but it always returns with a 'File not found' exception.

Yes, I've run into the same issue on 0.20.2 and CDH3u0. I don't see any Jira issue that covers this problem, so unless I hear otherwise I'll file one. The problem is that the post-job code doesn't handle getting the <path>.deflate or <path>.lzo (for you) file from HDFS and then decompressing it.

Is there a simple way around this without spending the time to recompile a cluster/codec specific version?

You can use:

    hadoop fs -text <path reported in exception>.lzo

This will dump out the file, which looks like:

    f:rate   171455.11
    f:sqrate 2981174.8
    l:size   1048576
    l:tasks  10
    l:time   590537

If you take f:rate/1000/l:tasks, that should give you the average MB/sec. E.g. for the example above, that would be 171455/1000/10 = ~17 MB/sec.

-- Ken

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr
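To spell out that rule of thumb, here's a small Python sketch that parses the dumped metrics and computes the average throughput. The field names match the example dump above; the parsing layout is an assumption for illustration, not TestDFSIO's documented output contract.

```python
# Parse the key/value pairs dumped from the TestDFSIO metrics file
# (as printed by `hadoop fs -text`) and apply the f:rate/1000/l:tasks
# rule of thumb from the message above to get average MB/sec.
def average_mb_per_sec(metrics_text):
    fields = {}
    for line in metrics_text.strip().splitlines():
        key, value = line.split()
        fields[key] = float(value)
    # f:rate is the rate summed across tasks, so divide by 1000 and
    # by the task count to get per-cluster average MB/sec.
    return fields["f:rate"] / 1000 / fields["l:tasks"]

metrics = """\
f:rate 171455.11
f:sqrate 2981174.8
l:size 1048576
l:tasks 10
l:time 590537
"""
print(round(average_mb_per_sec(metrics), 1))  # -> 17.1
```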
Hadoop Training - Santa Clara, Pasadena (LA)
Hi all,

Two quick notes about Hadoop training from Scale Unlimited:

1. The Santa Clara class is almost full - 3 or 4 spots left. This is for Nov 10th - 12th - see http://www.eventbrite.com/event/905623745

2. The location/dates for Pasadena (LA area) are now set. It's Dec 8th - 10th at the Pasadena Convention Center - see http://www.eventbrite.com/event/1001693091

As I'd previously mentioned, we've added an extra day - the core boot camp is still two days, with an optional third day for more in-depth/hands-on experience writing real code to solve real problems. With this third day you'll get one-on-one help to better assimilate the concepts covered during the first two days.

As always, feel free to ping me if you've got any questions.

Thanks!

-- Ken

--
Ken Krugler
kkrug...@scaleunlimited.com
Hadoop Training - Boston, Seattle, LA and San Jose
Hi all,

I've just posted the upcoming Hadoop training schedule for Scale Unlimited. See http://bit.ly/su-courses for an updated course description, and http://bit.ly/su-events for dates/locations.

As the subject says, classes will be held in new cities - Boston, Seattle, LA and San Jose - so for people and companies outside of the Bay Area and NYC, this should make it easier.

The other news is that we've added an extra day - the core boot camp is still two days, with an optional third day for more in-depth/hands-on experience writing real code to solve real problems. With this third day you'll get one-on-one help to better assimilate the concepts covered during the first two days.

As always, feel free to ping me if you've got any questions.

Thanks!

-- Ken

Ken Krugler
kkrug...@scaleunlimited.com
Re: Hadoop Training
Hi Mark - thanks for the kind words.

For those starting out with Hadoop, there are 10 spots left for this coming Thursday & Friday (July 22nd & 23rd). See http://bit.ly/hadoop-bootcamp for details, and http://bit.ly/bootcamp-outline for an outline.

Thanks,

-- Ken

On Jul 9, 2010, at 8:10am, Mark Kerzner wrote:

Awesome course - I took the historic first one, and benefited a lot. Great that Ken is going to teach it.

Mark

On Fri, Jul 9, 2010 at 9:31 AM, Ken Krugler kkrug...@scaleunlimited.com wrote:

Hi all,

A quick note that I'll be the instructor for the next Hadoop Bootcamp training course from Scale Unlimited. It's a two-day class on July 22nd and 23rd, which covers the usual high (and low) points of Hadoop. Plus bonus material on using Hadoop with machine learning, generating search indexes, and data processing workflows with Cascading.

See http://www.scaleunlimited.com/courses/hadoop-bootcamp-santaclara for more details, or ping me if you've got specific questions.

Thanks!

-- Ken

Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g
Hadoop Training
Hi all,

A quick note that I'll be the instructor for the next Hadoop Bootcamp training course from Scale Unlimited. It's a two-day class on July 22nd and 23rd, which covers the usual high (and low) points of Hadoop. Plus bonus material on using Hadoop with machine learning, generating search indexes, and data processing workflows with Cascading.

See http://www.scaleunlimited.com/courses/hadoop-bootcamp-santaclara for more details, or ping me if you've got specific questions.

Thanks!

-- Ken

Ken Krugler
kkrug...@scaleunlimited.com
FAQ for New to Hadoop
Hi all,

I recently hosted an Intro to Hadoop session at the BigDataCamp unconference last week. I later wrote down questions from the audience that seemed useful to other Hadoop beginners, and then compared this to the Hadoop project FAQ at http://wiki.apache.org/hadoop/FAQ

There was overlap, but not as much as I expected - the Hadoop FAQ has more "how do I do X" versus "can I do X" or "why should I do X".

I posted these questions to http://www.scaleunlimited.com/blog/intro-to-hadoop-at-bigdatacamp , and would appreciate any input - e.g. questions you think should be there, or answers you think aren't very clear (though mea culpa in advance - I jotted these down quickly, so I realize they're pretty rough).

Thanks,

-- Ken

Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g
Re: LeaseExpiredException Exception
Hi Jason,

Thanks for the info - it's good to hear from somebody else who's run into this :)

I tried again with a bigger box for the master, and wound up with the same results. I guess the framework could be killing it - but no idea why. This happens during a very simple "write out the results" phase, so there's very high I/O but not much computation, and nothing should be hung.

Any particular configuration values you had to tweak? I'm running this in Elastic MapReduce (EMR), so most settings are whatever they provide by default. I override a few things in my JobConf, but (for example) anything related to the HDFS/MR framework will be locked in by the time my job is executing.

Thanks!

-- Ken

On Dec 8, 2009, at 9:34am, Jason Venner wrote:

Is it possible that this is occurring in a task that is being killed by the framework? Sometimes there is a little lag between the time the tracker 'kills a task' and the task fully dies, so you could be getting into a situation like that, where the task is in the process of dying but the last write is still in progress.

I see this situation happen when the task tracker machine is heavily loaded. In one case there was a 15 minute lag between the timestamp in the tracker for killing task XYZ, and the task actually going away. It took me a while to work this out, as I had to merge the tracker and task logs by time to actually see the pattern. The host machines were under very heavy I/O pressure, and may have been paging also. The code and configuration issues that triggered this have been resolved, so I don't see it anymore.

On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler kkrugler_li...@transpac.com wrote:

Hi all,

In searching the mail/web archives, I see occasional questions from people (like me) who run into the LeaseExpiredException (in my case, on 0.18.3 while running a 50 server cluster in EMR).
Unfortunately I don't see any responses, other than Dennis Kubes saying that he thought some work had been done in this area of Hadoop a while back. And that was in 2007, so it hopefully doesn't apply to my situation.

I see these LeaseExpiredException errors showing up in the logs around the same time as IOException errors, e.g.:

java.io.IOException: Stream closed.
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
        at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
        at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)

This issue seemed related, but would have been fixed in the 0.18.3 release: http://issues.apache.org/jira/browse/HADOOP-3760

I saw a similar HBase issue - https://issues.apache.org/jira/browse/HBASE-529 - but they fixed it by retrying the failure case.

These exceptions occur during write storms, where lots of files are being written out.
Though "lots" is relative, e.g. 10-20M. It's repeatable, in that it fails on the same step of a series of chained MR jobs.

Is it possible I need to be running a bigger box for my namenode server? Any other ideas?

Thanks,

-- Ken

On May 25, 2009, at 7:37am, Stas Oskin wrote:

Hi.

I have a process that writes to a file on DFS from time to time, using OutputStream. After some time of writing, I start getting the exception below, and the write fails. The DFSClient retries several times, and then fails.

Copying the file from local disk to DFS via CopyLocalFile() works fine.

Can anyone advise on the matter? I'm using Hadoop 0.18.3.

Thanks in advance.

09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File does not exist. Holder DFSClient_-951664265 does not have any open files
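For what it's worth, the HBASE-529 fix mentioned in this thread amounted to retrying the failed write. That idea can be sketched generically; everything below (the function name, attempt count, backoff values, and the use of IOError) is a hypothetical illustration, not code from Hadoop or HBase.

```python
import time

# Generic retry-with-backoff wrapper, in the spirit of the HBASE-529
# workaround: if a write fails, wait and try again rather than giving
# up on the first LeaseExpiredException-style error.
def write_with_retries(writer, attempts=3, initial_delay=1.0):
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return writer()  # writer is any callable that performs the write
        except IOError:
            if attempt == attempts:
                raise  # out of retries; surface the original failure
            time.sleep(delay)
            delay *= 2  # exponential backoff between attempts
```

A caller would wrap its flaky write in a zero-argument callable, e.g. `write_with_retries(lambda: out.write(data))`. This only papers over transient lease races; a write that fails deterministically will still fail after the last attempt.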