Hi, I am learning Hadoop. We have some specially formatted text files for input, so we need to write a custom InputFormat, probably based on FileInputFormat. Does FileInputFormat respect record boundaries (every line or maybe every other line)? I am reading the source code (1.0.0). For example, in LineRecordReader, is the "in" field (an InputStream) passed to LineReader(in, ..) a stream over the full HDFS file (of many blocks), or just the actual local file of one block? The books I have read give very little detail about this. Can any expert point me to a reference on it, or tell me which part of the source code I should concentrate on? Thanks.
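To make the boundary question concrete, here is a minimal plain-Java sketch of the convention that LineRecordReader appears to follow in 1.0.0, as far as I can tell from the source: the stream is opened on the whole file and positioned at the split's start, a reader whose split does not begin at offset 0 backs up one byte and discards everything through the next newline, and every reader keeps reading whole lines as long as the line begins before the split's end. This is only an illustration under those assumptions, not the Hadoop code. The class SplitLineSketch and the methods readSplit and skipLine are hypothetical names, a byte array stands in for the HDFS file, and only "\n" is treated as a line terminator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a byte[] stands in for the full HDFS file.
class SplitLineSketch {

    // Returns the index just past the next '\n' at or after pos,
    // or file.length if there is no further newline.
    static int skipLine(byte[] file, int pos) {
        while (pos < file.length && file[pos] != '\n') {
            pos++;
        }
        return Math.min(pos + 1, file.length);
    }

    // Reads the lines "owned" by the split [start, start + length).
    static List<String> readSplit(byte[] file, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // Back up one byte and discard through the next newline.
            // If the byte before start is '\n', nothing real is lost and
            // the line beginning exactly at start belongs to this split.
            pos = skipLine(file, start - 1);
        }
        List<String> records = new ArrayList<>();
        // A line is read if it BEGINS before end, even if it finishes
        // past end (that is, inside the next split's byte range).
        while (pos < end && pos < file.length) {
            int eol = pos;
            while (eol < file.length && file[eol] != '\n') {
                eol++;
            }
            records.add(new String(file, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] file = "alpha\nbravo\ncharlie\ndelta\n".getBytes(StandardCharsets.UTF_8);
        // Three splits of at most 10 bytes each over a 26-byte file.
        for (int start = 0; start < file.length; start += 10) {
            int length = Math.min(10, file.length - start);
            System.out.println("split@" + start + " " + readSplit(file, start, length));
        }
    }
}
```

With this convention each line is returned by exactly one split, even when a split boundary falls in the middle of a line. A format whose records span two lines would presumably need its own rule for deciding which split owns a record that straddles a boundary.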
Zhu, Guojun Modeling Sr Graduate 571-3824370 guojun_...@freddiemac.com Financial Engineering Freddie Mac