Re: How to merge several SequenceFile into one?

2011-05-11 Thread jason
An M/R job with a single reducer would do the job. This way you can use the distributed sort and merge/combine/dedupe key/values as you wish. On 5/11/11, 丛林 wrote: > Hi all, > > There are lots of SequenceFiles in HDFS; how can I merge them into one > SequenceFile? > > Thanks for your suggestion. > > -L
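
[Editor's sketch] The effect of routing everything through a single reducer can be illustrated in plain Java, without the Hadoop API: all key/value pairs from the separate inputs end up in one sorted stream, and duplicate keys can be collapsed on the way. The class name MergeSketch, the String key/value types, and the "keep the first value seen" dedupe policy are assumptions made for illustration only; in a real job the sort happens in the shuffle and the reducer sees one key with all of its values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MergeSketch {
    // Merge several key/value inputs into one map sorted by key.
    // Assumed dedupe policy: the first value seen for a key wins.
    static TreeMap<String, String> merge(List<Map<String, String>> inputs) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> input : inputs) {
            for (Map.Entry<String, String> e : input.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two stand-ins for separate input files; key "a" appears in both.
        List<Map<String, String>> inputs = new ArrayList<>();
        inputs.add(Map.of("b", "2", "a", "1"));
        inputs.add(Map.of("c", "3", "a", "dup"));
        // Keys come out sorted, and "a" keeps its first value.
        System.out.println(merge(inputs));
    }
}
```

A different policy (keep the last value, or concatenate all values for a key) would only change the body of the inner loop.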

Re: How to create a SequenceFile more faster?

2011-05-11 Thread Harsh J
Are you doing this as a MapReduce job or is it a simple linear program? MapReduce could be much faster (Combined-files input format, with a few Reducers for merging if you need that as well). On Thu, May 12, 2011 at 5:18 AM, 丛林 wrote: > Hi, all. > > I want to write lots of little files (32GB) to

How to merge several SequenceFile into one?

2011-05-11 Thread 丛林
Hi all, There are lots of SequenceFiles in HDFS; how can I merge them into one SequenceFile? Thanks for your suggestion. -Lin

How to create a SequenceFile more faster?

2011-05-11 Thread 丛林
Hi all, I want to write lots of small files (32GB) to HDFS as an org.apache.hadoop.io.SequenceFile, but it is currently too slow: it takes about 8 hours to create this SequenceFile (6.7GB). How can I create this SequenceFile faster? Thanks for your suggestions. -Best Wishes, -Lin

Re: Question About Passing Properties to a Mapper

2011-05-11 Thread Geoffry Roberts
All, Thanks, Harsh, for your response. I have, however, solved the problem and shall now share the solution. The upshot is that the property in question, when I did the get(), was in fact not null. It contained the full XML document. I thought it was either null or a zero-length string because I was logging the

Re: Question About Passing Properties to a Mapper

2011-05-11 Thread Harsh J
Hello Geoffry, On Wed, May 11, 2011 at 8:40 PM, Geoffry Roberts wrote: > All, > > I am attempting to pass a string value from my driver to each one of my > mappers and it is not working. I can set the value, but when I read it back > it returns null. The value is not null when I set() it and I

Question About Passing Properties to a Mapper

2011-05-11 Thread Geoffry Roberts
All, I am attempting to pass a string value from my driver to each one of my mappers and it is not working. I can set the value, but when I read it back it returns null. The value is not null when I set() it, and I am using the correct key when I attempt to get() it. This should be a simple, str

Map 100%, Reduce 99.99%?

2011-05-11 Thread Evert Lammerts
Hi all, Why do we every now and then see a job remaining in Running state with no more Mappers or Reducers running, while the reduce progress tells us it's 99.99% done? Might this be due to a stranded process? Cheers, Evert

Re: What is the property for setting the number of tolerated failure task in one job

2011-05-11 Thread Jeff Zhang
Thanks. On Tue, May 10, 2011 at 11:48 PM, Amar Kamat wrote: > The property to set the max number of task failures a job can tolerate is > ‘mapred.max.map.failures.percent’ in the old API and > ‘mapreduce.map.failures.maxpercent’ in the new API. This determines the > job failure. > Amar > > > >
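
[Editor's sketch] The arithmetic behind a percentage-based failure tolerance can be written out in plain Java. This is an illustration of the idea only, not Hadoop's own code: the class FailureTolerance and the method exceedsTolerance are hypothetical names, and the strict "greater than" comparison is an assumption about how the threshold is applied.

```java
class FailureTolerance {
    // Hypothetical helper: returns true when the share of failed tasks
    // is strictly greater than the tolerated percentage.
    // Integer cross-multiplication avoids floating-point rounding.
    static boolean exceedsTolerance(int failedTasks, int totalTasks, int maxPercent) {
        return (long) failedTasks * 100 > (long) maxPercent * totalTasks;
    }

    public static void main(String[] args) {
        // With a tolerance of 5 percent over 100 map tasks:
        System.out.println(exceedsTolerance(5, 100, 5)); // exactly at the limit
        System.out.println(exceedsTolerance(6, 100, 5)); // one task over the limit
    }
}
```

Under these assumptions a job with 100 map tasks and a 5 percent tolerance would survive 5 failed tasks and be failed by the 6th.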