1. HadoopLog should be removed. There is a legacy parser for the Hadoop 0.18 job history format that requires this table. The code should probably be updated to parse the new Hadoop 0.20 format.
2. The data written to HBase is defined by the demux parsers. The parsers are located in src/java/org/apache/hadoop/chukwa/extraction/demux/processor/mapper/*.java and src/java/org/apache/hadoop/chukwa/extraction/demux/processor/reducer/*.java. The HBase path uses the same parsers that the demux MapReduce job executes, for backward compatibility with Chukwa 0.4. It is definitely possible to write your own parser to extract features from /var/log/messages and visualize them through HICC. As long as you have written a demux parser and defined the HBase schema, the data should show up in HICC.

3. HBaseWriter acts as a mini-demux process and outputs demuxed key/value pairs. If the data structure is predetermined and the analytics only require semi-structured data, HBaseWriter is good enough for near-real-time monitoring. The use case for SeqFileWriter is writing raw, unstructured data into HDFS. Hence, if your use case is to preserve data whose structure is unknown, use SeqFileWriter. It creates archive files on HDFS, which can then be processed by the demux (ETL) process as a secondary pipeline.

4. The Chukwa agent keeps track of the file offset and writes a checkpoint file. If the agent is restarted, it resumes operation from the last checkpoint. I am not sure if this was what you saw.

Regards,
Eric

On 6/3/11 12:46 AM, "DKN" <[email protected]> wrote:

A few corrections to the above post.

1. Data is written (a lot of it, actually!) to the Hadoop table in HBase. However, the HadoopLog table is empty.

2. and 3. I was reading this archive today: http://www.mail-archive.com/[email protected]/msg00078.html and got some more insights. However, if Eric can comment on the data storage strategy between HDFS and HBase tables, it will be useful.

4. I restarted the entire cluster and now I don't see this problem. In other words, I do see what is configured in the initial_adaptors file running. I will put this on the back burner for now.
Thanks and regards,
DKN
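
For readers following point 4 of Eric's reply, the general idea of resuming from a saved file offset can be sketched as below. This is only an illustration of the technique, not Chukwa's agent code: the function name `read_new_lines`, the JSON checkpoint layout, and the choice to restart from offset 0 when the file has shrunk are all assumptions made for the example.

```python
import json
import os
import tempfile


def read_new_lines(log_path, checkpoint_path):
    """Return complete lines appended to log_path since the last checkpoint.

    Hypothetical helper: the checkpoint is a small JSON file mapping
    each log path to the byte offset already consumed.
    """
    state = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    offset = state.get(log_path, 0)

    # Assumption: if the file is now shorter than the saved offset
    # (for example after rotation), start again from the beginning.
    if offset > os.path.getsize(log_path):
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()

    # Consume only complete lines; a partial trailing line stays
    # unread until its newline arrives.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()

    # Write the new offset to a temporary file and rename it, so a
    # crash cannot leave a half-written checkpoint behind.
    state[log_path] = offset + end
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, checkpoint_path)
    return lines
```

A process that is restarted and calls `read_new_lines` again with the same checkpoint path picks up at the saved offset, so already-consumed lines are not returned a second time.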
