Format of Kafka storage on disk

2014-01-03 Thread Subbu Srinivasan
Is there any place where I can learn about the internal structure of
the log file where Kafka stores its data? A topic has a .index and a .log
file.

I want to read the entire log file and parse the contents out.

Thanks
Subbu


Re: Format of Kafka storage on disk

2014-01-03 Thread Joe Stein
The DumpLogSegments tool should do that for you:
https://github.com/apache/kafka/blob/0.8/core/src/main/scala/kafka/tools/DumpLogSegments.scala

bin/kafka-run-class.sh kafka.tools.DumpLogSegments

Option                              Description
------                              -----------
--deep-iteration                    if set, uses deep instead of shallow
                                      iteration
--files <file1, file2, ...>         REQUIRED: The comma separated list of
                                      data and index log files to be dumped
--max-message-size <Integer: size>  Size of largest message. (default:
                                      5242880)
--print-data-log                    if set, printing the messages content
                                      when dumping data logs
--verify-index-only                 if set, just verify the index log
                                      without printing its content
Or use the code as an entry point for whatever you want to do :)
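
If you would rather parse a .log segment yourself, the following is a minimal Python sketch, not the tool's own code. It assumes the 0.8 on-disk layout (magic byte 0): each entry is an 8-byte offset and a 4-byte message size, followed by a message made of a 4-byte CRC32, a 1-byte magic, a 1-byte attributes field, a length-prefixed key and a length-prefixed value, all big-endian, with a length of -1 meaning null. The names LogEntry, read_log_entries and _read_bytes are made up for illustration. It only does shallow iteration: for a compressed message the value holds a nested message set, which this sketch does not unpack, and it does not guard against corrupt key or value lengths.

```python
import struct
import zlib
from typing import Iterator, NamedTuple, Optional, Tuple


class LogEntry(NamedTuple):
    offset: int
    magic: int
    attributes: int
    key: Optional[bytes]
    value: Optional[bytes]
    crc_ok: bool


def _read_bytes(buf: bytes, pos: int) -> Tuple[Optional[bytes], int]:
    """Read a 4-byte big-endian length, then that many bytes (-1 means null)."""
    (length,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if length < 0:
        return None, pos
    return buf[pos:pos + length], pos + length


def read_log_entries(data: bytes) -> Iterator[LogEntry]:
    """Shallow-iterate over the raw bytes of one .log segment (assumed 0.8 layout)."""
    pos = 0
    while pos + 12 <= len(data):
        # Entry header: 8-byte offset, 4-byte message size.
        offset, size = struct.unpack_from(">qi", data, pos)
        pos += 12
        # 14 bytes is the smallest possible message: CRC, magic, attributes,
        # key length and value length with no payload.
        if size < 14 or pos + size > len(data):
            break  # truncated or corrupt tail; stop rather than guess
        message = data[pos:pos + size]
        pos += size

        # Message header: 4-byte CRC, 1-byte magic, 1-byte attributes.
        crc, magic, attributes = struct.unpack_from(">Ibb", message, 0)
        # The CRC covers everything after the CRC field itself.
        crc_ok = (zlib.crc32(message[4:]) & 0xFFFFFFFF) == crc
        key, cursor = _read_bytes(message, 6)
        value, _ = _read_bytes(message, cursor)
        yield LogEntry(offset, magic, attributes, key, value, crc_ok)
```

To try it, read a segment file in binary mode and pass the bytes to read_log_entries. For large segments you would want to read incrementally instead of loading the whole file into memory.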


/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/


On Fri, Jan 3, 2014 at 5:10 PM, Subbu Srinivasan ssriniva...@gmail.com wrote:

> Is there any place where I can learn about the internal structure of
> the log file where Kafka stores its data? A topic has a .index and a .log
> file.
>
> I want to read the entire log file and parse the contents out.
>
> Thanks
> Subbu