On Thu, Apr 28, 2011 at 10:06 PM, Nigel Daley nda...@mac.com wrote:
As announced last week, I'm planning to do this at 2pm PDT tomorrow
(Friday) April 29.
Suresh, when do you plan to commit HDFS-1052? That should be done first.
Owen or Todd, did you want to follow Paul's advice:
If you're
One of the assumptions MapReduce makes, I think, is that the size of a map's
output is smaller than its input, although many applications produce output
the same size as their input, such as sort, merge, etc.
For benchmarking purposes, I am looking for some non-trivial, real-life
applications which create
Another case is augmenting data. This is sometimes done outside of MR
in an ETL flow, but can be done as an MR job. Doing this uses Hadoop
to handle the scaling issues, but it isn't really what MR is
intended for.
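
To make the size point concrete, here is a rough sketch of a map-only
augmentation step in plain Python (not Hadoop code). The lookup table, the
record layout, and the names LOOKUP, augment_map, and run_map are all
assumptions made up for illustration; the only point is that every output
record is its input record plus an extra field, so map output ends up larger
than map input.

```python
# Hypothetical lookup table; in a real job this might be loaded from a side file.
LOOKUP = {"10.0.0.1": "office", "10.0.0.2": "datacenter"}


def augment_map(line):
    """Map-only step: return the input line with one extra field appended."""
    key = line.split(" ", 1)[0]           # first token is the lookup key
    extra = LOOKUP.get(key, "unknown")    # fall back when the key is not known
    return line + "\t" + extra


def run_map(lines):
    """Apply the map step to every line and report input and output sizes."""
    out = [augment_map(line) for line in lines]
    in_size = sum(len(line) for line in lines)
    out_size = sum(len(line) for line in out)
    return out, in_size, out_size


if __name__ == "__main__":
    records = ["10.0.0.1 GET /index.html", "10.0.0.9 GET /about"]
    augmented, in_size, out_size = run_map(records)
    for record in augmented:
        print(record)
    print("input chars:", in_size, "output chars:", out_size)
```

Since there is no reduce step and nothing is filtered out, the output grows
with the input by the size of the added field on every record.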
A real example of this is:
* Input: standard apache weblog
*
On Apr 28, 2011, at 11:24 PM, Todd Lipcon wrote:
Wasn't sure how to go about doing that. I guess we need to talk to infra
about it? Do you know how we might clone the SVN repos themselves to test
with?
It looks like there are svn dumps at http://svn-master.apache.org/dump/ from 2
April