What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread W.P. McNeill
I've got a directory with a bunch of MapReduce data in it. I want to know how many Key, Value pairs it contains. I could write a mapper-only process that takes Writable, Writable pairs as input and updates a counter, but it seems like this utility should already exist. Does it, or do I have

Re: What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread Joey Echeverria
What format is the input data in? At first glance, I would run an identity mapper and use a NullOutputFormat so you don't get any data written. The built-in counters already count the number of key, value pairs read in by the mappers. -Joey
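The mapper-only counting idea discussed in this thread could be sketched roughly as follows for Hadoop Streaming. This is a minimal illustration, not something posted in the thread: it assumes the records reach the mapper as one line of text per key, value pair, which the thread does not confirm for this data. The counter group `RecordCount`, the counter name `PAIRS`, and the function `count_records` are hypothetical names chosen for the example. The mapper writes nothing to stdout, which plays the same role as discarding output, and reports its count through the `reporter:counter:<group>,<counter>,<amount>` lines that Streaming reads from stderr.

```python
"""Sketch of a Hadoop Streaming mapper that counts records and emits no output."""
import sys

# Hypothetical counter group and name, chosen only for this illustration.
COUNTER_GROUP = "RecordCount"
COUNTER_NAME = "PAIRS"


def count_records(lines, status=sys.stderr, batch=1000):
    """Count input records and report them as a Streaming counter.

    Counter updates are written to `status` in batches of `batch` records,
    so that a large input does not produce one stderr line per record.
    Returns the total number of records seen by this mapper.
    """
    total = 0
    pending = 0
    for _ in lines:
        total += 1
        pending += 1
        if pending == batch:
            status.write(f"reporter:counter:{COUNTER_GROUP},{COUNTER_NAME},{pending}\n")
            pending = 0
    if pending:
        # Flush whatever is left over after the last full batch.
        status.write(f"reporter:counter:{COUNTER_GROUP},{COUNTER_NAME},{pending}\n")
    return total


# As a Streaming mapper script, the entry point would be:
#     count_records(sys.stdin)
```

Run as the mapper of a job with zero reducers, the per-mapper amounts would be summed by the framework into a single job-level counter. The job's built-in map input record counter should report the same number without any custom code, so the explicit counter here mainly shows the mechanism.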

Re: What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread James Seigel
The cheapest way would be to check the counters as you write them in the first place and keep a running score. :)

Re: What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread W.P. McNeill
The keys are Text and the values are large custom data structures serialized with Avro. I also have counters for the job that generates these files that give me this information but sometimes...Well, it's a long story. Suffice it to say that it's nice to have a post-hoc method too. :-) The

Re: What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread Joey Echeverria
Are you storing the data in sequence files? -Joey

Re: What's the easiest way to count the number of Key, Value pairs in a directory?

2011-05-20 Thread W.P. McNeill
No.