Hi Manoj,

You can load each day's logs into its own directory in HDFS and process only that day's data. Store the daily results in HDFS, HBase, a database, etc. Then, every day, run the processing on the new logs, get that day's results, and merge them with the previously aggregated results to date.
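To make the merge step concrete, here is a minimal sketch of the idea (outside Hadoop; the function names and sample counts are hypothetical, just illustrating daily counts being folded into a running aggregate):

```python
from collections import Counter
from heapq import nlargest

def merge_daily(running, daily_counts):
    """Fold one day's term counts into the running aggregate."""
    running.update(daily_counts)
    return running

def top_n(running, n):
    """Top-n search terms from the aggregate so far."""
    return nlargest(n, running.items(), key=lambda kv: kv[1])

running = Counter()
# Day 1: process only that day's directory, then merge its counts.
merge_daily(running, {"hadoop": 10, "hdfs": 4})
# Day 2: only the new day's logs are parsed, then merged.
merge_daily(running, {"hadoop": 3, "hbase": 7})
print(top_n(running, 2))  # -> [('hadoop', 13), ('hbase', 7)]
```

Each day you only parse the new directory; the stored aggregate carries everything from earlier days.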
Regards,
Bejoy KS
Sent from handheld, please excuse typos.

-----Original Message-----
From: Manoj Babu <manoj...@gmail.com>
Date: Sun, 9 Sep 2012 21:28:54
To: <mapreduce-user@hadoop.apache.org>
Reply-To: mapreduce-user@hadoop.apache.org
Subject: Reg: parsing all files & file append

Hi All,

I have two questions; any information on them would be helpful.

1. I am using Hadoop to analyze logs and find top-n search term metrics. Whenever a new log file is added to HDFS, we rerun the job to compute the metrics. We receive log files daily, and every day all of the log files are parsed again to get the latest metrics. Is there any way to avoid this?

2. Is file append production stable?

Cheers!
Manoj.