hadoop 0.20 migration for JobClient?

2010-08-20 Thread Steve Hoffman
I'm migrating some code from the Hadoop 0.18 APIs to the 0.20 APIs. Moving from implementing the Mapper/Reducer interfaces in the mapred package to extending the Mapper/Reducer classes in the mapreduce package is pretty straightforward. It appears that Job replaces JobClient/JobConf/etc. and you simply call submit() to d

Re: Writable questions

2010-08-31 Thread Steve Hoffman
That is the default 'toString()' output of any Java object. If you want your custom Writable to print something different, you have to override the toString() method. Steve On Tue, Aug 31, 2010 at 11:58 AM, Mark wrote: > I have a question regarding outputting Writable objects. I thought all > W
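
A minimal sketch of the toString() override described in the reply above. The class name PageHit and its fields are hypothetical, chosen only for illustration. In a real job the class would declare "implements org.apache.hadoop.io.Writable"; that clause is left out here so the example compiles with only the JDK, but the write/readFields signatures match that interface.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical value type, for illustration only. In a real job this class
// would declare "implements org.apache.hadoop.io.Writable".
class PageHit {
    private String url = "";
    private int count;

    // Writables need a no-arg constructor so the framework can instantiate them.
    PageHit() {}

    PageHit(String url, int count) {
        this.url = url;
        this.count = count;
    }

    // Same signature as Writable.write: serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeInt(count);
    }

    // Same signature as Writable.readFields: read the fields back in that order.
    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();
        count = in.readInt();
    }

    // Without this override, printing falls back to Object.toString(),
    // which yields ClassName@hexHashCode.
    @Override
    public String toString() {
        return url + "\t" + count;
    }
}

class PageHitDemo {
    public static void main(String[] args) {
        // Default form: java.lang.Object@ followed by a hash code.
        System.out.println(new Object());
        // Overridden form: the two fields separated by a tab.
        System.out.println(new PageHit("/index.html", 3));
    }
}
```

The tab separator is an arbitrary choice for this sketch; any readable format works, since toString() only affects how the object is printed and not how it is serialized.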

Re: Writable questions

2010-09-02 Thread Steve Hoffman
You could use the standard List.toString() method, which does a nice job of printing something like this: [A1, A2, A3], assuming the objects contained in the list implement toString() to something you'd want to see. Use in conjunction with java.util.Arrays.asList() and the ArrayWritable.toStrings()
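
A short sketch of the Arrays.asList() plus List.toString() combination suggested above, using only the JDK. The String[] here is a stand-in for the array that ArrayWritable.toStrings() would return, and the class and method names are hypothetical. Note that the JDK's list format uses square brackets with a comma and a space between elements.

```java
import java.util.Arrays;
import java.util.List;

class ListToStringDemo {
    // Wraps the array in a fixed-size List view and relies on the
    // List.toString() implementation inherited from AbstractCollection.
    static String format(String[] values) {
        List<String> asList = Arrays.asList(values);
        return asList.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the String[] returned by ArrayWritable.toStrings().
        String[] values = {"A1", "A2", "A3"};
        System.out.println(format(values)); // prints [A1, A2, A3]
    }
}
```

Arrays.asList() does not copy the array, so this adds little overhead beyond building the output string.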

Re: Real-time log processing in Hadoop

2010-09-16 Thread Steve Hoffman
I gave a talk on this recently at the Chicago Hadoop Users Group. Our volume is high enough that the "chunks" of data get written into HDFS with a delay of a few minutes. This is close enough to "real time" for our initial needs, but we are quickly evaluating Flume, etc. as a collection mechanism to replac

Re: Appending to existing files in HDFS

2010-09-17 Thread Steve Hoffman
This is a "feature" of HDFS. Files are immutable. You have to create a new file. The file you are writing to isn't available in HDFS until you close it. Usually you'll have something buffering pieces and writing to HDFS. Then you can roll those smaller files into larger chunks using a nightly map