Re: Appending and seeking files while writing

2010-06-13 Thread Vidur Goyal
Append is supported in Hadoop 0.20. > Hi. > > I think this really depends on the append functionality, any idea whether > it > supports such behaviour now? > > Regards. > > On Fri, Jun 11, 2010 at 10:41 AM, hadooprcoks > wrote: > >> Stas, >> >> I also believe that there should be a seek interfa

Hadoop 0.20.2 looking *inside* a file in the input path for files?

2010-06-13 Thread suckerfish
Hello, I am a newbie to Hadoop, following the http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html WordCount tutorial but trying to update it to use the mapreduce classes instead of mapred. However, I am getting the following error: 10/06/13 18:24:50 INFO mapred.JobClient: Task Id :

Re: Hadoop 0.20.2 looking *inside* a file in the input path for files?

2010-06-13 Thread Ted Yu
See https://issues.apache.org/jira/browse/MAPREDUCE-1734 BTW, it's easier for other people to reproduce your scenario if you post your code. On Sun, Jun 13, 2010 at 3:35 AM, suckerfish wrote: > > Hello, I am a newbie to hadoop, following the > http://hadoop.apache.org/common/docs/r0.20.1/mapred

Re: Appending and seeking files while writing

2010-06-13 Thread Todd Lipcon
On Sun, Jun 13, 2010 at 12:46 AM, Vidur Goyal wrote: > Append is supported in hadoop 0.20. Append will be supported in the 0.20-append branch, which is still in progress. It is NOT supported in vanilla 0.20. You can turn on the config option but it is dangerous and highly discouraged for real

problem setting up development environment for hadoop

2010-06-13 Thread Vidur Goyal
Hello All, I have been trying to set up a development environment for HDFS using this link http://wiki.apache.org/hadoop/EclipseEnvironment , but the project gives an error after the build is completed. It does not contain certain files. Please help! vidur

Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Kevin Tse
Hi, For each key, there might be millions of values (LongWritable), but I only want to emit the top 20 of these values, sorted in descending order. So is it possible to sort these values before they enter the reduce phase? Thank you in advance! Kevin

Re: Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Alex Kozlov
Hi Kevin, This is a very common technique. Look for secondary sort in Tom White's Hadoop: The Definitive Guide (Chapter 6). You'll most likely have to write your own Partitioner and WritableComparator. -- Alex K On Sun, Jun 13, 2010 at 7:16 PM, Kevin Tse wrote: > Hi, > For each key, there might be millions of values
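For readers unfamiliar with the secondary sort idea mentioned above, here is a minimal sketch in plain Python. It is not Hadoop code and not taken from the book: it only simulates the three steps (partition on the natural key, sort on a composite of key and value, group on the natural key). The function names, the `num_reducers` parameter, and the sample records are all hypothetical, and it assumes numeric values.

```python
from itertools import groupby, islice
from operator import itemgetter


def partition(natural_key, num_reducers):
    # Partition on the natural key only, so that every value
    # for a given key lands in the same (simulated) reducer.
    return hash(natural_key) % num_reducers


def secondary_sort_top_n(records, n=20, num_reducers=2):
    # records: iterable of (key, numeric_value) pairs.
    buckets = [[] for _ in range(num_reducers)]
    for key, value in records:
        buckets[partition(key, num_reducers)].append((key, value))

    result = {}
    for bucket in buckets:
        # Composite sort: natural key ascending, value descending.
        # In Hadoop this ordering would come from the sort comparator.
        bucket.sort(key=lambda kv: (kv[0], -kv[1]))
        # Group on the natural key only; values now arrive pre-sorted,
        # so the "reducer" just takes the first n of each group.
        for key, group in groupby(bucket, key=itemgetter(0)):
            result[key] = [value for _, value in islice(group, n)]
    return result


if __name__ == "__main__":
    sample = [("a", 3), ("b", 10), ("a", 7), ("a", 5), ("b", 1)]
    print(secondary_sort_top_n(sample, n=2))
```

The point of the technique is that the reducer never has to buffer or sort millions of values itself; it reads the first 20 from an already ordered stream and stops.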

Re: Problems with HOD and HDFS

2010-06-13 Thread David Milne
Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? Thank you, Dave On Thu, Jun 10, 2010 at 2:56 PM, David Milne wrote: > Hi there, > > I am trying to get Hadoop on Demand u

Re: Problems with HOD and HDFS

2010-06-13 Thread Jeff Hammerbacher
Hey Dave, I can't speak for the folks at Yahoo!, but from watching the JIRA, I don't think HOD is actively used or developed anywhere these days. You're attempting to use a mostly deprecated project, and hence not receiving any support on the mailing list. Thanks, Jeff On Sun, Jun 13, 2010 at 7:

Re: Is it possible to sort values before they are sent to the reduce function?

2010-06-13 Thread Kevin Tse
Hi Alex, I was reading Tom's book, but I had not reached chapter 6 yet. I just read it; it is really helpful. Thank you for mentioning it, and thanks also go to Tom. Kevin On Mon, Jun 14, 2010 at 10:22 AM, Alex Kozlov wrote: > Hi Kevin, This is a very common technique. Look for secondary

Re: Caching in HDFS C API Client

2010-06-13 Thread Arun C Murthy
I'd bet on the Linux file-cache. Assuming you wrote the file with the default replication factor of 3, there is one replica of the local filesystem which you are reading... Try writing multiple GBs of data and randomly reading large files to blow your file-cache? Arun On Jun 11, 2010, at

Re: Problems with HOD and HDFS

2010-06-13 Thread David Milne
Ok, thanks Jeff. This is pretty surprising though. I would have thought many people would be in my position, where they have to use Hadoop on a general purpose cluster, and need it to play nice with a resource manager? What do other people do in this position, if they don't use HOD? Deprecated nor

Re: Problems with HOD and HDFS

2010-06-13 Thread Vinod KV
On Monday 14 June 2010 08:03 AM, David Milne wrote: Anybody? I am completely stuck here. I have no idea who else I can ask or where I can go for more information. Is there somewhere specific where I should be asking about HOD? Thank you, Dave In the ringmaster logs, you should see which no

Re: Problems with HOD and HDFS

2010-06-13 Thread Vinod KV
On Monday 14 June 2010 09:51 AM, David Milne wrote: Ok, thanks Jeff. This is pretty surprising though. I would have thought many people would be in my position, where they have to use Hadoop on a general purpose cluster, and need it to play nice with a resource manager? What do other people do i