Re: HoD and locality of TaskTrackers to data (on DataNodes)

2008-03-23 Thread Hemanth Yamijala
Jiaqi, Hi, I have a question about using HoD and the locality of the assigned TaskTrackers to the data. Suppose I have a long-running HDFS installation with TaskTrackers/JobTracker nodes dynamically allocated by HoD, and I uploaded my data to HDFS prior to running my job/allocating nodes using

Re: One Simple Question About Hadoop DFS

2008-03-23 Thread Amar Kamat
On Sun, 23 Mar 2008, Chaman Singh Verma wrote: > Hello, > > I am exploring Hadoop and MapReduce and I have one very simple question. > > I have a 500GB dataset on my local disk and I have written both Map-Reduce > functions. Now how should I start? > > 1. I copy the data from local disk to DFS. I

Re: More MapReduce Examples

2008-03-23 Thread Cam Bazz
Yes, I am looking for example apps too. I have gone through the standard examples of counting web logs, etc., but all of those are really sophomoric, and I am looking for more complex and use-oriented examples. Best. On Sun, Mar 23, 2008 at 8:57 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote: > On Mar

HoD and locality of TaskTrackers to data (on DataNodes)

2008-03-23 Thread Jiaqi Tan
Hi, I have a question about using HoD and the locality of the assigned TaskTrackers to the data. Suppose I have a long-running HDFS installation with TaskTrackers/JobTracker nodes dynamically allocated by HoD, and I uploaded my data to HDFS prior to running my job/allocating nodes using "dfs -put

One Simple Question About Hadoop DFS

2008-03-23 Thread Chaman Singh Verma
Hello, I am exploring Hadoop and MapReduce and I have one very simple question. I have a 500GB dataset on my local disk and I have written both Map-Reduce functions. Now how should I start? 1. I copy the data from local disk to DFS. I have configured DFS with 100 machines. I hope that it will

Re: Query regarding hadoop

2008-03-23 Thread Owen O'Malley
On Mar 22, 2008, at 1:13 PM, Prerna Manaktala wrote: but how should I run bin/hadoop using this ant? I am working on Amazon EC2 and wanted to run bin/hadoop -ec2 run. How should I proceed? I would actually suggest starting a single node cluster to gain familiarity first. From the hadoop