Re: Upload, then decompress archive on HDFS?

2011-08-04 Thread Harsh J
Keith, the 'hadoop fs -text' tool decompresses a file given to it if needed and able, but you could also run a distributed MapReduce job that converts from compressed to decompressed; that would be much faster. On Fri, Aug 5, 2011 at 4:58 AM, Keith Wiley wrote: > Instead of "hd fs -put" hu
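As a rough illustration of the single-process route (not the distributed job suggested above), the sketch below decompresses a gzip stream chunk by chunk, so the uncompressed data never has to fit in memory. The function name `decompress_stream` and the chunk size are hypothetical choices for this example, and it only covers a single gzipped file, not a tar or zip archive holding many files.

```python
import gzip
import shutil


def decompress_stream(src, dst, chunk_size=64 * 1024):
    """Copy gzip-compressed bytes from src to dst as plain bytes.

    src and dst are binary file-like objects (hypothetical helper,
    not part of Hadoop). Data is copied in chunks of chunk_size bytes.
    """
    with gzip.GzipFile(fileobj=src, mode="rb") as unzipped:
        shutil.copyfileobj(unzipped, dst, chunk_size)
```

If such a helper were wired to standard input and output, it could sit between `hadoop fs -cat` on the compressed file and `hadoop fs -put -` (which reads from stdin) on the destination path. That still funnels all data through one client machine, which is why a MapReduce job would likely be faster for large inputs.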

Re:Re:Re:Re:Re: one quesiton in the book of "hadoop:definitive guide 2 edition"

2011-08-04 Thread Daniel,Wu
Hi John, Another finding: if I remove the loop over values (remove for (NullWritable iw:values)), then the result is the MAX temperature for each year, and the original test I did returned the MIN temperature for each year. The book also mentioned the value is mutable; I think the key might also
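The behaviour described here is consistent with the framework reusing one object and overwriting it on each step of the iteration. The following is only a plain-Python analogy of that effect, not Hadoop code: the `Record` class, `reusing_iterator`, and the sample temperatures are all made up for illustration, with rows assumed to arrive sorted from highest to lowest temperature within a year.

```python
import copy
from dataclasses import dataclass


@dataclass
class Record:
    year: int
    temperature: int


def reusing_iterator(rows):
    """Yield one shared Record, overwritten for every row (mimics object reuse)."""
    shared = Record(0, 0)
    for year, temperature in rows:
        shared.year = year
        shared.temperature = temperature
        yield shared


# Hypothetical data for one year, highest temperature first.
rows = [(1950, 22), (1950, 11), (1950, -11)]

# Reading before iterating further sees the first (maximum) record.
first_seen = next(reusing_iterator(rows)).temperature

# Keeping references without copying: every element is the same object,
# so all of them show the last (minimum) record once the loop has finished.
aliased = list(reusing_iterator(rows))

# Copying each record while iterating preserves the individual values.
copied = [copy.copy(r) for r in reusing_iterator(rows)]
```

In this analogy, reading the object before the loop gives the maximum and reading it after the loop gives the minimum, which matches the MAX versus MIN observation in the message. Copying is the usual way to keep a value beyond the current iteration step.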

Upload, then decompress archive on HDFS?

2011-08-04 Thread Keith Wiley
Instead of "hd fs -put" hundreds of files of X megs, I want to do it once on a gzipped (or zipped) archive, one file, much smaller total megs. Then I want to decompress the archive on HDFS. I can't figure out what "hd fs" type command would do such a thing. Thanks.

Hadoop Streaming Combiner Problem

2011-08-04 Thread Premal Shah
According to the hadoop streaming docs, there is an inbuilt Aggregate Java class which can work both as a mapper and a reducer. Here is the command: *shell> had
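For context, the streaming documentation's Aggregate example has the mapper prefix each output key with the name of an aggregator function such as `LongValueSum`, and the job is run with `-reducer aggregate`. The sketch below shows a word-count style mapper in that shape; the function name `aggregate_mapper` is hypothetical, and it does not reproduce the command or the combiner setup from the original message, which is cut off here.

```python
def aggregate_mapper(lines, out):
    """Emit one 'LongValueSum:<word>\\t1' record per word.

    lines is an iterable of text lines and out is a writable text stream.
    The 'LongValueSum:' prefix asks the aggregate reducer to sum the values.
    """
    for line in lines:
        for word in line.split():
            out.write("LongValueSum:%s\t1\n" % word)
```

In an actual streaming job the function would be called with `sys.stdin` and `sys.stdout`; taking the streams as parameters just makes the mapper easy to check locally before submitting it.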

Re: YCSB Benchmarking for HBase

2011-08-04 Thread M. C. Srivas
Lohit did some work on making YCSB run on a bunch of machines in a coordinated manner. Plus fixed some limits in how many zk connections/threads can run inside one process. See http://github.com/lohitvijayarenu/YCSB I believe that code also has a data-verification option to ensure that a *get* re

Re:Re:Re:Re: one quesiton in the book of "hadoop:definitive guide 2 edition"

2011-08-04 Thread John Armstrong
On Thu, 4 Aug 2011 14:07:12 +0800 (CST), "Daniel,Wu" wrote: > I am using the new API (the release is from Cloudera). We can see from the > output, for each call of reduce function, 100 records were processed, but > as the reduce is defined as > reduce(IntPair key, Iterable values, Context context),

Re: Kill Task Programmatically

2011-08-04 Thread Harsh J
Hello, Adding to Aleksandr's suggestion, you could also lower the timeout if a throw condition can't be determined. On Thu, Aug 4, 2011 at 5:10 AM, Aleksandr Elbakyan wrote: > Hello, > > You can just throw a runtime exception. In that case it will fail :) > > Regards, > Aleksandr > > --- On Wed,