Hi,

Have you checked Hive? It seems to fit your needs perfectly: it lets you run SQL-like aggregation queries over log data stored in Hadoop.

Thanks and Regards,
Sonal
www.meghsoft.com
http://in.linkedin.com/in/sonalgoyal

On Sun, Aug 1, 2010 at 1:40 AM, Bright D L <brigh...@gmail.com> wrote:

> Hi all,
> I am doing a simple project to analyze HTTP proxy server logs with a
> Hadoop MapReduce approach (in Java). The log file contains logs for a week,
> or sometimes more than that.
> I have the following requirements:
> 1) Find the top 50 bandwidth consumers (IPs) for each day
> 2) Find the hour of the day where there is maximum bandwidth
> utilization
> Please help me out with some directions. Sample code is highly
> appreciated.
> Thank you all,
> Bright
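
For reference, the two aggregations asked for in the quoted message can be sketched in plain Java, without Hadoop, roughly as below. This is only an illustration under stated assumptions: the log format (ISO timestamp, client IP and byte count separated by whitespace), the class name ProxyLogStats and all method names are hypothetical, and requirement 2 is read here as "hour of day summed over all days". In a real MapReduce job the per-key summing would happen in the reducer, with the mapper emitting (day + IP, bytes) or (hour, bytes) pairs.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ProxyLogStats {

    // Sums bytes per (day, IP). Assumed line format (hypothetical):
    // "2010-07-31T13:05:00 10.0.0.1 2048"
    static Map<LocalDate, Map<String, Long>> bytesPerDayAndIp(List<String> lines) {
        Map<LocalDate, Map<String, Long>> totals = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 3) {
                continue; // skip lines that do not match the assumed format
            }
            LocalDate day = LocalDateTime.parse(f[0]).toLocalDate();
            totals.computeIfAbsent(day, d -> new HashMap<>())
                  .merge(f[1], Long.parseLong(f[2]), Long::sum);
        }
        return totals;
    }

    // Returns up to n IPs with the highest byte totals, largest first.
    // Ties are broken by IP string so the result is deterministic.
    static List<String> topConsumers(Map<String, Long> perIp, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(perIp.entrySet());
        entries.sort((a, b) -> {
            int byBytes = Long.compare(b.getValue(), a.getValue());
            return byBytes != 0 ? byBytes : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    // Returns the hour of day (0-23) with the most bytes, summed over all days.
    static int busiestHour(List<String> lines) {
        long[] bytesPerHour = new long[24];
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 3) {
                continue;
            }
            int hour = LocalDateTime.parse(f[0]).getHour();
            bytesPerHour[hour] += Long.parseLong(f[2]);
        }
        int best = 0;
        for (int h = 1; h < 24; h++) {
            if (bytesPerHour[h] > bytesPerHour[best]) {
                best = h;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed format.
        List<String> lines = Arrays.asList(
            "2010-07-31T13:05:00 10.0.0.1 2048",
            "2010-07-31T13:40:00 10.0.0.2 4096",
            "2010-07-31T09:00:00 10.0.0.1 1024",
            "2010-08-01T13:10:00 10.0.0.3 512");
        bytesPerDayAndIp(lines).forEach((day, perIp) ->
            System.out.println(day + " top consumers: " + topConsumers(perIp, 50)));
        System.out.println("Busiest hour: " + busiestHour(lines));
    }
}
```

The same grouping maps directly onto a Hive query (GROUP BY day and IP, then sort by total bytes), which is why Hive is worth a look before writing the job by hand.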