Hi all,

I'd like to select N random records from a large amount of data using
Hadoop, and I wonder how I can achieve this. Currently my idea is to let
each mapper task select N / mapper_number records. Does anyone have
experience with this?
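
To make the idea concrete, here is a rough sketch in plain Python (not
actual Hadoop code) of what I have in mind. Each simulated mapper keeps
N / mapper_number records from its own split using reservoir sampling, so
it never needs to know the split size in advance. The names
reservoir_sample and sample_per_mapper and the list-of-splits layout are
made up for illustration. I am also assuming the splits are roughly equal
in size; if they are not, the combined result would not be a uniform
sample.

```python
import random


def reservoir_sample(records, k, rng):
    """Keep a uniform random sample of up to k items from an iterable."""
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace a kept record with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


def sample_per_mapper(splits, n, seed=None):
    """Simulate each mapper selecting n // mapper_number records from its split."""
    rng = random.Random(seed)
    per_mapper = n // len(splits)  # any remainder of n is dropped in this sketch
    result = []
    for split in splits:
        result.extend(reservoir_sample(split, per_mapper, rng))
    return result


if __name__ == "__main__":
    # Four hypothetical input splits of 100 records each.
    splits = [range(0, 100), range(100, 200), range(200, 300), range(300, 400)]
    print(sample_per_mapper(splits, 20, seed=1))
```

In a real job each mapper would run only the reservoir part over its own
input split and emit the kept records once the split has been read.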


-- 
Best Regards

Jeff Zhang
