+1

Regards,
Mohit

On Thu, Apr 28, 2016 at 4:29 PM, Yogi Devendra <[email protected]> wrote:

> Hi,
>
> My use case involves reading from HDFS and emitting each record as a separate
> tuple. A record can be either a fixed-length record or a separator-based record
> (such as newline). The expected output is a byte[] for each record.
>
> I am planning to solve this as follows:
> - New operator which extends BlockReader.
> - It will have a configuration option to select the mode: FIXED_LENGTH or
>   SEPARATOR_BASED.
> - Use the appropriate ReaderContext based on the mode.
>
> The reason for having a separate operator rather than reusing BlockReader is
> that the output port signature is different from BlockReader's. This new
> operator can be used in conjunction with FileSplitter.
>
> Any feedback?
>
> ~ Yogi
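
For reference, the two modes described above could behave roughly as in the
sketch below. This is only a standalone illustration, not the BlockReader or
ReaderContext API: RecordSplitter, Mode, and split are hypothetical names, and
the sketch assumes the whole block is already in memory. It does not handle
records that span block boundaries, which the real operator would need its
reader context to deal with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any library): splits one in-memory block
// into records and returns each record as its own byte[].
final class RecordSplitter {

    enum Mode { FIXED_LENGTH, SEPARATOR_BASED }

    // recordLength is used only in FIXED_LENGTH mode;
    // separator is used only in SEPARATOR_BASED mode.
    static List<byte[]> split(byte[] block, Mode mode, int recordLength, byte separator) {
        List<byte[]> records = new ArrayList<>();
        if (mode == Mode.FIXED_LENGTH) {
            if (recordLength <= 0) {
                throw new IllegalArgumentException("recordLength must be positive");
            }
            // Cut the block every recordLength bytes; a trailing partial
            // record, if any, is emitted shorter than recordLength.
            for (int start = 0; start < block.length; start += recordLength) {
                int end = Math.min(start + recordLength, block.length);
                records.add(Arrays.copyOfRange(block, start, end));
            }
        } else {
            // Cut at each separator byte; the separator itself is dropped.
            int start = 0;
            for (int i = 0; i < block.length; i++) {
                if (block[i] == separator) {
                    records.add(Arrays.copyOfRange(block, start, i));
                    start = i + 1;
                }
            }
            // Emit any bytes left after the last separator.
            if (start < block.length) {
                records.add(Arrays.copyOfRange(block, start, block.length));
            }
        }
        return records;
    }
}
```

Calling split(data, RecordSplitter.Mode.SEPARATOR_BASED, 0, (byte) '\n') would
yield one byte[] per line; in the proposed operator each of those arrays would
be emitted as a separate tuple on the output port.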
