Reporter - incrCounter

2009-07-08 Thread Sagar Naik
public void incrCounter(Enum key, long amount) I am using a multi-threaded Mapper. I'm not sure, but should this be synchronized? I log every call to incrCounter. The counts from the log statements are fine, but the Reporter counters do not match. Am I missing something? -Sagar
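On the question above: the thread does not establish whether Reporter.incrCounter is thread-safe. One way to sidestep the question is to accumulate counts in a structure that is known to be thread-safe and report the totals from a single thread. The following is only a minimal sketch in plain Java, with no Hadoop dependency; the class `SafeCounters`, the enum `MyCounter`, and the method `runDemo` are hypothetical names invented for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SafeCounters {
    // Hypothetical stand-in for the enum key passed to incrCounter.
    enum MyCounter { RECORDS }

    // One AtomicLong per counter key; both the map and the values are thread-safe.
    private final ConcurrentHashMap<Enum<?>, AtomicLong> counts = new ConcurrentHashMap<>();

    // Safe to call from any number of worker threads without extra locking.
    public void incr(Enum<?> key, long amount) {
        counts.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(amount);
    }

    // Returns the accumulated total for a key, or 0 if it was never incremented.
    public long get(Enum<?> key) {
        AtomicLong value = counts.get(key);
        return value == null ? 0L : value.get();
    }

    // Starts `threads` workers that each add 1 to the same counter
    // `incrementsPerThread` times, then returns the final total.
    public static long runDemo(int threads, int incrementsPerThread) throws InterruptedException {
        SafeCounters counters = new SafeCounters();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counters.incr(MyCounter.RECORDS, 1);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return counters.get(MyCounter.RECORDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 threads x 10000 increments; no updates are lost.
        System.out.println(runDemo(8, 10000)); // prints 80000
    }
}
```

In a multi-threaded Mapper, the accumulated total could then be passed to incrCounter(key, amount) once, from a single thread, after the workers finish. Whether that is necessary depends on the Reporter implementation in use.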

Can I run two MapReduce Jobs concurrently?

2009-07-08 Thread zsongbo
Hi all, Can I run two MapReduce jobs concurrently? Schubert

Can mapper use MultipleOutputs ??

2009-07-08 Thread Johnson Chen
Hi, I want to output data from each Mapper, so I wrote the following code. public static class Map extends MapReduceBase implements Mapper { private MultipleOutputs mos ; public void configure(JobConf job) { mos = new MultipleOutputs(job);

RE: Sorting data sets

2009-07-08 Thread Patterson, Josh
Ted, I had to run that through my head for a minute since the sortkey changed some base assumptions for me (and the window scanning can take on various forms, with various added constraints), but I believe pulling the data right off the .next() loop (even mixed together, but in order) would

Re: Forcing Many Map Nodes

2009-07-08 Thread Aaron Kimball
If you look into FileInputFormat, you'll see that there's a call to FileSystem.getFileBlockLocations() (line 222) which finds the addresses of the nodes holding the blocks to be mapped. Each FileSplit generated in that same getSplits() method contains the list of locations where this split should

Custom input help/debug help

2009-07-08 Thread Matthew B.
Hello, I'm starting to use Hadoop for something I'm working on. I'm on a windows machine (xp) and I cannot consider changing to any other OS. I'm using eclipse with the hadoop plug in to develop, and I have cygwin fully installed and working; I am using hadoop-0.20.0. I tried to develop a

Announcement: Cloudera Hadoop Training in Los Angeles

2009-07-08 Thread Christophe Bisciglia
Hadoop Fans, several of you have asked us to come to LA. We've heard you. We'll be teaming up with our friends at Fox Interactive Media to offer Hadoop Training at what might just be the coolest venue *ever* for technical training: The Fox Lot - where they make the movies... Attendees will have

Re: Extracting data from HDFS and displaying stats to a webpage

2009-07-08 Thread Amr Awadallah
Usman, HDFS is a distributed grid file system, as opposed to a live serving database. A simple analogy is the difference between the Linux file system and MySQL running on top of Linux to enable it to be a database. Furthermore, HDFS is optimized for throughput as opposed to latency, hence

RE: how to use hadoop in real life?

2009-07-08 Thread Shravan Mahankali
Thanks for the information Ted. Regards, Shravan Kumar. M Catalytic Software Ltd. [SEI-CMMI Level 5 Company]

Re: Extracting data from HDFS and displaying stats to a webpage

2009-07-08 Thread Ted Dunning
On Wed, Jul 8, 2009 at 7:46 PM, Christophe Bisciglia christo...@cloudera.com wrote: Hey Usman, your second approach is on the right track. You don't want to have your end users interacting directly with HDFS. The latency is too high, and it wasn't designed for this. This definitely used to

Few Queries..!!!

2009-07-08 Thread Sugandha Naolekar
Hello! I have a 7-node Hadoop cluster. As of now, I am able to transfer (dump) data into HDFS from a remote node (not part of the Hadoop cluster), and through the web UI I am able to download the same. - But if I need to restrict that web UI to a few users only, what am I supposed to do? - Also, if

Re: permission denied on additional binaries

2009-07-08 Thread jason hadoop
Just out of curiosity, what happens when you run your script by hand? On Wed, Jul 8, 2009 at 8:09 AM, Rares Vernica rvern...@gmail.com wrote: On Tue, Jul 7, 2009 at 10:26 PM, jason hadoop jason.had...@gmail.com wrote: The mapper has no control at the point where your mymapper.sh script is