Re: TestDFSIO failure

2011-09-01 Thread Ken Krugler
Hi Matt,

On Jun 20, 2011, at 1:46pm, GOEKE, MATTHEW (AG/1000) wrote:

 Has anyone else run into issues using output compression (in our case LZO) on TestDFSIO, where it fails to read the metrics file? I just assumed that it would use the correct decompression codec after it finishes, but it always returns with a 'File not found' exception.

Yes, I've run into the same issue on 0.20.2 and CDH3u0.

I don't see any Jira issue that covers this problem, so unless I hear otherwise 
I'll file one.

The problem is that the post-job code doesn't handle fetching the path.deflate (or, in your case, path.lzo) metrics file from HDFS and then decompressing it.
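
A fix would need to look up the codec from the file extension before reading. Here's an untested sketch against the 0.20 API, using CompressionCodecFactory to pick the codec, and assuming the LZO codec is registered via io.compression.codecs (TestDFSIO's actual method names differ):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

// Untested sketch: read the TestDFSIO metrics file, decompressing with
// whatever codec matches the extension (.deflate, .lzo, ...).
public class ReadMetricsFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path metrics = new Path(args[0]); // the path reported in the exception
    FileSystem fs = metrics.getFileSystem(conf);
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(metrics);
    InputStream in = fs.open(metrics);
    if (codec != null) {
      in = codec.createInputStream(in); // wrap with the matching decompressor
    }
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line); // f:rate, f:sqrate, l:size, l:tasks, l:time
    }
    reader.close();
  }
}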

 Is there a simple way around this without spending the time to recompile a cluster/codec-specific version?


You can use hadoop fs -text <the path reported in the exception>.lzo

This will dump out the file, which looks like:

f:rate  171455.11
f:sqrate  2981174.8
l:size  1048576
l:tasks 10
l:time  590537

If you take f:rate/1000/l:tasks, that should give you the average MB/sec.

For the example above, that would be 171455/1000/10 ≈ 17 MB/sec.

-- Ken

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr





Hadoop Training - Santa Clara, Pasadena (LA)

2010-11-02 Thread Ken Krugler

Hi all,

Two quick notes about Hadoop training from Scale Unlimited:

1. The Santa Clara class is almost full - 3 or 4 spots left. This is  
for Nov 10th - 12th - see http://www.eventbrite.com/event/905623745


2. The location/dates for Pasadena (LA area) are now set. It's Dec 8th  
- 10th at the Pasadena Convention Center - see http://www.eventbrite.com/event/1001693091


As I'd previously mentioned, we've added an extra day - the core boot camp is still two days, with an optional third day for more in-depth/hands-on experience writing real code to solve real problems. With this third day you'll get one-on-one help to better assimilate the concepts covered during the first two days.


As always, feel free to ping me if you've got any questions.

Thanks!

-- Ken

--
Ken Krugler
kkrug...@scaleunlimited.com



Hadoop Training - Boston, Seattle, LA and San Jose

2010-08-14 Thread Ken Krugler

Hi all,

I've just posted the upcoming Hadoop training schedule for Scale  
Unlimited. See http://bit.ly/su-courses for an updated course  
description, and http://bit.ly/su-events for dates/locations.


As the subject says, classes will be held in new cities - Boston,  
Seattle, LA and San Jose - so for people and companies outside of the  
Bay area and NYC, this should make it easier.


The other news is that we've added an extra day - the core boot camp is still two days, with an optional third day for more in-depth/hands-on experience writing real code to solve real problems. With this third day you'll get one-on-one help to better assimilate the concepts covered during the first two days.


As always, feel free to ping me if you've got any questions.

Thanks!

-- Ken


Ken Krugler
kkrug...@scaleunlimited.com



Re: Hadoop Training

2010-07-16 Thread Ken Krugler

Hi Mark - thanks for the kind words.

For those starting out with Hadoop, there are 10 spots left this coming Thursday & Friday (July 22nd & 23rd).


See http://bit.ly/hadoop-bootcamp for details, and http://bit.ly/bootcamp-outline 
 for an outline.


Thanks,

-- Ken


On Jul 9, 2010, at 8:10am, Mark Kerzner wrote:

Awesome course - I took the historic first one, and benefited a lot. Great that Ken is going to teach it.

Mark

On Fri, Jul 9, 2010 at 9:31 AM, Ken Krugler kkrug...@scaleunlimited.com 
wrote:



Hi all,

A quick note that I'll be the instructor for the next Hadoop Bootcamp
training course from Scale Unlimited.

It's a two day class on July 22nd and 23rd, which covers the usual high (and low) points of Hadoop.

Plus bonus material on using Hadoop with machine learning, generating
search indexes, and data processing workflows with Cascading.

See http://www.scaleunlimited.com/courses/hadoop-bootcamp-santaclara for more details, or ping me if you've got specific questions.

Thanks!

-- Ken



Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Hadoop Training

2010-07-09 Thread Ken Krugler

Hi all,

A quick note that I'll be the instructor for the next Hadoop Bootcamp  
training course from Scale Unlimited.


It's a two day class on July 22nd and 23rd, which covers the usual  
high (and low) points of Hadoop.


Plus bonus material on using Hadoop with machine learning, generating  
search indexes, and data processing workflows with Cascading.


See http://www.scaleunlimited.com/courses/hadoop-bootcamp-santaclara  
for more details, or ping me if you've got specific questions.


Thanks!

-- Ken


Ken Krugler
kkrug...@scaleunlimited.com



FAQ for New to Hadoop

2010-07-08 Thread Ken Krugler

Hi all,

I hosted an Intro to Hadoop session at the BigDataCamp unconference last week. Afterwards I wrote down the questions from the audience that seemed useful to other Hadoop beginners, and then compared these to the Hadoop project FAQ at http://wiki.apache.org/hadoop/FAQ


There was overlap, but not as much as I expected - the Hadoop FAQ has more "how do I do X" versus "can I do X" or "why should I do X".


I posted these questions to http://www.scaleunlimited.com/blog/intro-to-hadoop-at-bigdatacamp and would appreciate any input - e.g. questions you think should be there, or answers you think aren't very clear (though mea culpa in advance: I jotted these down quickly, so I realize they're pretty rough).


Thanks,

-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g



Re: LeaseExpiredException Exception

2009-12-08 Thread Ken Krugler

Hi Jason,

Thanks for the info - it's good to hear from somebody else who's run  
into this :)


I tried again with a bigger box for the master, and wound up with the  
same results.


I guess the framework could be killing it - but no idea why. This happens during a very simple "write out the results" phase, so there's very high I/O but not much computation, and nothing should be hung.


Any particular configuration values you had to tweak? I'm running this in Elastic MapReduce (EMR), so most settings are whatever they provide by default. I override a few things in my JobConf, but (for example) anything related to the HDFS/MR framework will be locked & loaded by the time my job is executing.


Thanks!

-- Ken

On Dec 8, 2009, at 9:34am, Jason Venner wrote:

Is it possible that this is occurring in a task that is being killed by the framework?

Sometimes there is a little lag between the time the tracker kills a task and the time the task fully dies; you could be getting into a situation where the task is in the process of dying but the last write is still in progress.

I see this happen when the task tracker machine is heavily loaded. In one case there was a 15 minute lag between the timestamp in the tracker for killing task XYZ and the task actually going away.

It took me a while to work this out, as I had to merge the tracker and task logs by time to actually see the pattern.

The host machines were under very heavy I/O pressure, and may have been paging also. The code and configuration issues that triggered this have been resolved, so I don't see it anymore.
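
For anyone who needs to do the same kind of log merge: if every line starts with an ISO-style timestamp, a crude sort over the concatenated files gets you most of the way. A minimal sketch (assuming timestamped lines; continuation lines such as stack traces would first need to be folded into their parent line):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Crude sketch: pool all lines from the tracker and task logs, then sort by
// the leading timestamp so events from both logs interleave chronologically.
// Assumes ISO-style timestamps at the start of every line, so a plain
// lexicographic sort is also a chronological one.
public class MergeLogsByTime {
  public static void main(String[] args) throws IOException {
    List<String> lines = new ArrayList<String>();
    for (String logFile : args) {
      BufferedReader reader = new BufferedReader(new FileReader(logFile));
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
      reader.close();
    }
    Collections.sort(lines);
    for (String line : lines) {
      System.out.println(line);
    }
  }
}

Usage would be something like: java MergeLogsByTime tracker.log task.log > merged.log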

On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler kkrugler_li...@transpac.com 
wrote:



Hi all,

In searching the mail/web archives, I occasionally see questions from people (like me) who run into the LeaseExpiredException (in my case, on 0.18.3 while running a 50-server cluster in EMR).

Unfortunately I don't see any responses, other than Dennis Kubes saying that he thought some work had been done in this area of Hadoop a while back. And that was in 2007, so it hopefully doesn't apply to my situation.


I see these LeaseExpiredException errors showing up in the logs around the same time as IOException errors, e.g.:

java.io.IOException: Stream closed.
  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
  at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
  at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
  at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
  at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
  at java.io.DataOutputStream.write(DataOutputStream.java:90)
  at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
  at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
  at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
  at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
  at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
  at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)


This issue seemed related, but would have been fixed in the 0.18.3 release: http://issues.apache.org/jira/browse/HADOOP-3760

I saw a similar HBase issue - https://issues.apache.org/jira/browse/HBASE-529 - but they fixed it by retrying the failure case.
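
For reference, the HBASE-529 approach boils down to a bounded retry loop around the failing call. A rough sketch of that idea (my paraphrase, not their actual patch):

import java.io.IOException;

// Sketch of the retry-on-failure idea from HBASE-529 (my paraphrase, not
// the actual patch): retry the failing HDFS call a few times with a short
// sleep, instead of giving up on the first IOException.
public class RetryingWrite {
  private static final int MAX_RETRIES = 3;
  private static final long RETRY_SLEEP_MS = 1000;

  // Hypothetical callback wrapping whatever write/close call is failing.
  public interface WriteOp {
    void run() throws IOException;
  }

  public static void runWithRetries(WriteOp op) throws IOException {
    IOException lastFailure = null;
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      try {
        op.run();
        return; // succeeded
      } catch (IOException e) {
        lastFailure = e;
        try {
          Thread.sleep(RETRY_SLEEP_MS);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    throw lastFailure; // all retries exhausted
  }
}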

These exceptions occur during write storms, where lots of files are being written out. Though "lots" is relative, e.g. 10-20M.

It's repeatable, in that it fails on the same step of a series of chained MR jobs.

Is it possible I need to be running a bigger box for my namenode server?

Any other ideas?

Thanks,

-- Ken


On May 25, 2009, at 7:37am, Stas Oskin wrote:

Hi.


I have a process that writes to a file on DFS from time to time, using an OutputStream. After some time of writing, I start getting the exception below, and the write fails. The DFSClient retries several times, and then fails.

Copying the file from local disk to DFS via CopyLocalFile() works fine.


Can anyone advise on the matter?

I'm using Hadoop 0.18.3.

Thanks in advance.


09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File does not exist.
Holder DFSClient_-951664265 does not have any open files