Hi all,

I am working on a simple project to analyze HTTP proxy server logs with Hadoop MapReduce (in Java). The log file covers a week, sometimes more.

I have the following requirements:

1) Find the top 50 bandwidth consumers (IPs) for each day
2) Find the hour of the day with the maximum bandwidth utilization

Please point me in the right direction. Sample code would be highly appreciated.

Thank you all,
Bright
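
To make the two requirements concrete, here is a rough plain-Java sketch of the aggregation logic, written without the Hadoop API so it can be run on its own. The log format is an assumption, since the post does not give one: whitespace-separated fields with epoch seconds in field 0, the client IP in field 2, and the byte count in field 4 (similar to a Squid-style access log). The class and method names (ProxyLogStats, parse, bytesPerDayAndIp, topConsumers, busiestHour) are hypothetical, days and hours are taken in UTC, and the busiest hour is computed across the whole file rather than per day.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the two aggregations; no Hadoop classes are used.
class ProxyLogStats {

    // "Map" step. ASSUMED format: field 0 = epoch seconds (fraction allowed),
    // field 2 = client IP, field 4 = bytes transferred.
    // Returns {day, hour, ip, bytes}, or null for a malformed line.
    static String[] parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 5) return null;
        try {
            long epoch = (long) Double.parseDouble(f[0]);
            long bytes = Long.parseLong(f[4]);
            ZonedDateTime t = Instant.ofEpochSecond(epoch).atZone(ZoneOffset.UTC);
            return new String[] {
                t.toLocalDate().toString(), String.valueOf(t.getHour()), f[2], String.valueOf(bytes)
            };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // "Reduce" step for requirement 1: sum bytes per (day, IP).
    static Map<String, Map<String, Long>> bytesPerDayAndIp(List<String> lines) {
        Map<String, Map<String, Long>> totals = new TreeMap<>();
        for (String line : lines) {
            String[] r = parse(line);
            if (r == null) continue; // skip malformed lines
            totals.computeIfAbsent(r[0], d -> new HashMap<>())
                  .merge(r[2], Long.parseLong(r[3]), Long::sum);
        }
        return totals;
    }

    // Picks the n IPs with the largest byte totals, largest first.
    static List<String> topConsumers(Map<String, Long> bytesByIp, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(bytesByIp.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    // Requirement 2: sum bytes per hour of day (0-23) and return the busiest hour.
    static int busiestHour(List<String> lines) {
        long[] bytesByHour = new long[24];
        for (String line : lines) {
            String[] r = parse(line);
            if (r != null) bytesByHour[Integer.parseInt(r[1])] += Long.parseLong(r[3]);
        }
        int best = 0;
        for (int h = 1; h < 24; h++) {
            if (bytesByHour[h] > bytesByHour[best]) best = h;
        }
        return best;
    }
}
```

In a real Hadoop job, the parse step would presumably live in a Mapper that emits a (day, IP) key with the byte count as the value, the summing would move into a Reducer, and the top-50 selection per day would run as a second pass over the summed output. Calling topConsumers(bytesPerDayAndIp(lines).get(day), 50) gives the list for one day in this sketch.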
- mapreduce for proxy log file analysis Bright D L
- Re: mapreduce for proxy log file analysis Sonal Goyal