Re: HDFS - millions of files in one directory?

2009-01-26 Thread Andy Liu
SequenceFile supports transparent block-level compression out of the box, so you don't have to compress data in your code. Most of the time, compression not only saves disk space but also improves performance because there's less data to write. Andy On Mon, Jan 26, 2009 at 12:35 PM, Mark Kerzner wrote:
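To illustrate why compressing a block of records together tends to save more space than compressing each record separately, here is a minimal sketch using only the JDK's `java.util.zip`. It does not use the Hadoop SequenceFile API; the class name, method names, and sample records are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration only: plain JDK DEFLATE, not Hadoop's SequenceFile.
public class BlockCompressionSketch {

    // Compress one byte array with DEFLATE and return the compressed bytes.
    static byte[] deflate(byte[] input) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(input);
        }
        return buffer.toByteArray();
    }

    // Record-level: every record is compressed on its own, so redundancy
    // shared between records cannot be exploited.
    static int recordLevelSize(List<String> records) throws IOException {
        int total = 0;
        for (String record : records) {
            total += deflate(record.getBytes(StandardCharsets.UTF_8)).length;
        }
        return total;
    }

    // Block-level: records are concatenated and compressed as one unit.
    static int blockLevelSize(List<String> records) throws IOException {
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        for (String record : records) {
            block.write(record.getBytes(StandardCharsets.UTF_8));
        }
        return deflate(block.toByteArray()).length;
    }

    // Made-up records that share a lot of structure, as log-like data often does.
    static List<String> sampleRecords(int n) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            records.add("user=" + i + ",status=OK,path=/data/part-" + i + "\n");
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        List<String> records = sampleRecords(1000);
        System.out.println("record-level bytes: " + recordLevelSize(records));
        System.out.println("block-level bytes:  " + blockLevelSize(records));
    }
}
```

In Hadoop itself the choice is made when the SequenceFile writer is created (compression type NONE, RECORD, or BLOCK), so application code does not need to do any of this by hand.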

Number of records in a MapFile

2009-01-27 Thread Andy Liu
Is there a way to programmatically get the number of records in a MapFile without doing a complete scan?

Cannot run program "chmod": error=12, Not enough space

2009-01-28 Thread Andy Liu
I'm running Hadoop 0.19.0 on Solaris (SunOS 5.10 on x86) and many jobs are failing with this exception: Error initializing attempt_200901281655_0004_m_25_0: java.io.IOException: Cannot run program "chmod": error=12, Not enough space at java.lang.ProcessBuilder.start(ProcessBuilder.java

Re: Cannot run program "chmod": error=12, Not enough space

2009-01-29 Thread Andy Liu
Thanks, that was it. On Thu, Jan 29, 2009 at 9:22 AM, Steve Loughran wrote: > Andy Liu wrote: > >> I'm running Hadoop 0.19.0 on Solaris (SunOS 5.10 on x86) and many jobs are >> failing with this exception: >> > > This isn't disk space, this is a RAM/swap probl

Total number of records processed in mapper

2009-04-14 Thread Andy Liu
Is there a way for all the reducers to have access to the total number of records that were processed in the Map phase? For example, I'm trying to perform a simple document frequency calculation. During the map phase, I emit pairs for every unique word in every document. During the reduce phase,
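The calculation described above can be sketched in plain Java as an in-memory stand-in for the two phases. This is not Hadoop code and does not answer how reducers would obtain the total in a real job. The message is cut off before the reduce step is described, so dividing each word's document count by the total number of documents is an assumption, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory sketch of a document frequency calculation.
public class DocumentFrequencySketch {

    // "Map" and grouping step: count each unique word once per document.
    static Map<String, Integer> documentCounts(List<String> documents) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String doc : documents) {
            Set<String> unique =
                new HashSet<>(Arrays.asList(doc.toLowerCase().split("\\s+")));
            for (String word : unique) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // "Reduce" step (assumed): this is where the total number of records
    // processed in the map phase is needed.
    static double documentFrequency(Map<String, Integer> counts,
                                    String word, long totalDocuments) {
        return counts.getOrDefault(word, 0) / (double) totalDocuments;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
            "hadoop stores data",
            "hadoop runs jobs",
            "jobs read data",
            "hadoop hadoop");
        Map<String, Integer> counts = documentCounts(docs);
        for (String word : counts.keySet()) {
            System.out.println(word + " "
                + documentFrequency(counts, word, docs.size()));
        }
    }
}
```

The sketch passes the total in as a plain argument. In a distributed job that value is only known once every mapper has finished, which is the difficulty the question raises.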

Re: fyi: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks

2009-04-15 Thread Andy Liu
Not sure if comparing Hadoop to databases is an apples-to-apples comparison. Hadoop is a complete job execution framework, which collocates the data with the computation. I suppose DBMS-X and Vertica do that to a certain extent, by way of SQL, but you're restricted to that. If you want to say

Re: Make money from Hadoop ?

2009-05-08 Thread Andy Liu
http://www.cloudera.com/ On Fri, May 8, 2009 at 9:43 AM, PORTO aLET wrote: > Hi All, > > Just wondering if anybody has any idea about making money from using > hadoop? > i.e. found a company that provides DFS/MapReduce service ? or something > like > that? > Or maybe something else? >