I want to analyze the traffic patterns and statistics of a distributed application. I am thinking of having the application write its events as log entries into HDFS, so that I can later run a Map/Reduce job to analyze them in parallel. Is this a good approach?
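To make the analysis step concrete, here is a rough local sketch in plain Python of the kind of map and reduce logic I have in mind. It does not use any Hadoop API. The tab-separated log format and the names `map_event`, `reduce_events`, and `analyze` are only assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical log format (an assumption, not anything HDFS defines):
#   <epoch_seconds>\t<source_node>\t<dest_node>\t<bytes_sent>


def map_event(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit ("src->dst", bytes_sent) for one log line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return  # skip malformed lines
    _, src, dst, raw_bytes = fields
    try:
        nbytes = int(raw_bytes)
    except ValueError:
        return  # skip lines whose byte count is not an integer
    yield f"{src}->{dst}", nbytes


def reduce_events(pairs: Iterable[Tuple[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Reduce step: per link, total the event count and the bytes sent."""
    counts: Dict[str, int] = defaultdict(int)
    volume: Dict[str, int] = defaultdict(int)
    for key, nbytes in pairs:
        counts[key] += 1
        volume[key] += nbytes
    return {key: (counts[key], volume[key]) for key in counts}


def analyze(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = (pair for line in lines for pair in map_event(line))
    return reduce_events(pairs)
```

In a real job the framework would do the grouping by key between the two steps; here `reduce_events` does it in memory, so this only shows the shape of the computation.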
For this approach, does HDFS support concurrent writes (appends) to a single file? Also, is the write API thread-safe?

Rgds,
Ricky