Hi,

I am learning Hadoop.  We have some specially formatted text files for input, 
so we need to write a custom InputFormat, probably based on 
FileInputFormat.  Does FileInputFormat respect record boundaries 
(every line, or maybe every other line)?  I am reading the source code 
(1.0.0).  For example, in LineRecordReader, is the "in" field (an InputStream) 
of LineReader(in,..) the full HDFS file (of many blocks) or just the 
real local file of one block?  The books I have read give very little detail 
about it.  Can any expert point me to a reference, or tell me 
which part of the source code I should concentrate on?  Thanks. 

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_...@freddiemac.com
Financial Engineering
Freddie Mac
