Re: HDFS as a logfile ??

2009-04-14 Thread Ariel Rabkin
Everything gets dumped into the same files. We don't assume anything at all about the format of the input data; it gets dumped into Hadoop sequence files, tagged with some metadata to say what machine and app it came from, and where it was in the original stream. There is a slight penalty from th

RE: HDFS as a logfile ??

2009-04-13 Thread Ricky Ho
l.com] Sent: Monday, April 13, 2009 7:38 AM To: core-user@hadoop.apache.org Subject: Re: HDFS as a logfile ?? Chukwa is a Hadoop subproject aiming to do something similar, though particularly for the case of Hadoop logs. You may find it useful. Hadoop unfortunately does not support concurrent a

Re: HDFS as a logfile ??

2009-04-13 Thread Ariel Rabkin
Chukwa is a Hadoop subproject aiming to do something similar, though particularly for the case of Hadoop logs. You may find it useful. Hadoop unfortunately does not support concurrent appends. As a result, the Chukwa project found itself creating a whole new daemon, the Chukwa collector, precisel

Re: HDFS as a logfile ??

2009-04-09 Thread Brian Bockelman
Also, Chukwa (a project already in Hadoop contrib) is designed to do something similar with Hadoop directly: http://wiki.apache.org/hadoop/Chukwa I think some of the examples even mention Apache logs. Haven't used it personally, but it looks nice. Brian On Apr 9, 2009, at 11:14 PM, Alex

Re: HDFS as a logfile ??

2009-04-09 Thread Alex Loddengaard
This is a great idea and a common application, Ricky. Scribe is probably useful for you as well: < http://images.google.com/imgres?imgurl=http://farm3.static.flickr.com/2211/2197670659_b42810b8ba.jpg&imgrefurl=http://www.flickr.com/photos/niallkenne

HDFS as a logfile ??

2009-04-09 Thread Ricky Ho
I want to analyze the traffic patterns and statistics of a distributed application. I am thinking of having the application write its events as log entries into HDFS, so that I can later use a Map/Reduce job to do the analysis in parallel. Is this a good approach? In this case, does HDFS sup
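
Editor's sketch of the analysis step described in the question above: a minimal, in-memory illustration of the map and reduce phases over log entries. It does not use Hadoop or HDFS. The tab-separated `timestamp, host, event_type` layout and the names `map_event`, `reduce_counts`, and `count_events` are assumptions made for illustration, not anything specified in the thread.

```python
from collections import defaultdict


def map_event(line):
    """Map step: emit (event_type, 1) for one log line.

    Assumes a hypothetical layout of three tab-separated fields:
    timestamp, host, event_type. Malformed lines emit nothing.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return []
    _timestamp, _host, event_type = fields
    return [(event_type, 1)]


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_events(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_event(line))
    return reduce_counts(pairs)
```

In a real Map/Reduce job the framework would distribute the map calls across input splits and group the pairs by key before reducing; here both steps run in a single process only to show the shape of the computation.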