Hi, My use case involves reading from HDFS and emitting each record as a separate tuple. A record can be either fixed-length or separator-based (for example, newline-delimited). The expected output is a byte[] for each record.
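To make the two record types concrete, here is a minimal sketch in plain Java of splitting a byte buffer into byte[] records. It does not use any Apex API; the class name RecordSplitter and its methods are hypothetical and only illustrate the intended output. It also assumes the whole input is in one buffer, so it ignores records that span block boundaries, which a real reader would have to handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library: illustrates the two record modes.
final class RecordSplitter {

    enum Mode { FIXED_LENGTH, SEPARATOR_BASED }

    // Cuts the buffer into chunks of recordLength bytes.
    // A shorter trailing chunk is emitted as-is (an assumption of this sketch).
    static List<byte[]> fixedLength(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        List<byte[]> records = new ArrayList<>();
        for (int start = 0; start < data.length; start += recordLength) {
            int end = Math.min(start + recordLength, data.length);
            records.add(Arrays.copyOfRange(data, start, end));
        }
        return records;
    }

    // Cuts the buffer at each occurrence of a single-byte separator.
    // The separator itself is not included in the emitted records.
    static List<byte[]> separatorBased(byte[] data, byte separator) {
        List<byte[]> records = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == separator) {
                records.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        // Emit trailing bytes that are not terminated by a separator.
        if (start < data.length) {
            records.add(Arrays.copyOfRange(data, start, data.length));
        }
        return records;
    }

    // Dispatches on the configured mode, one byte[] per record.
    static List<byte[]> split(Mode mode, byte[] data, int recordLength, byte separator) {
        return mode == Mode.FIXED_LENGTH
            ? fixedLength(data, recordLength)
            : separatorBased(data, separator);
    }
}
```

For example, RecordSplitter.split(RecordSplitter.Mode.SEPARATOR_BASED, bytes, 0, (byte) '\n') would return one byte[] per line of the input.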
I am planning to solve this as follows:
- A new operator that extends BlockReader.
- It will have a configuration option to select the mode: FIXED_LENGTH or SEPARATOR_BASED.
- It will use the appropriate ReaderContext based on the mode.

The reason for a separate operator, rather than reusing BlockReader directly, is that its output port signature differs from BlockReader's. This new operator can be used in conjunction with FileSplitter.

Any feedback?

~ Yogi
