Yes, we seek to 0, read the header, and then seek back to the split offset.

On Aug 1, 2013 11:16 PM, "Lior Schachter" <lior...@gmail.com> wrote:
> Hi Harsh,
> So for each split you first read the header of the file directly from HDFS?
>
> Thanks,
> Lior
>
> On Thu, Aug 1, 2013 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> We read it from the top of the file at start (just the schema bytes)
>> and then initialize the reader.
>>
>> On Thu, Aug 1, 2013 at 8:32 PM, Lior Schachter <lior...@gmail.com> wrote:
>> > Hi all,
>> >
>> > When writing an Avro schema to a data file, what is the expected
>> > behavior if the file is used as M/R input? How do the second/third/...
>> > splits get the schema (the schema is always written to the first split)?
>> >
>> > Thanks,
>> > Lior
>>
>> --
>> Harsh J
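
For anyone finding this thread later, here is a rough sketch of the "seek to 0, read the header, seek back to the split offset" pattern described at the top. It is only an illustration under stated assumptions: the file layout below (a `TOY1` magic, a length-prefixed JSON schema, a 16-byte sync marker, then length-prefixed blocks each followed by the marker) is a made-up toy format and not Avro's actual on-disk encoding, and `write_container`, `read_header`, and `read_split` are hypothetical helper names. In the Avro Java library the corresponding steps are handled by `DataFileReader`, whose `sync(long)` and `pastSync(long)` methods take care of aligning a reader to a split.

```python
import io
import json
import struct

MAGIC = b"TOY1"   # hypothetical magic for this toy format (not Avro's)
SYNC_SIZE = 16    # length of the sync marker in bytes


def write_container(schema, blocks, sync):
    """Build a toy file: header (magic, schema, sync), then blocks.

    Each block is a 4-byte big-endian length, the payload, and the sync marker.
    """
    schema_bytes = json.dumps(schema).encode("utf-8")
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(struct.pack(">I", len(schema_bytes)))
    out.write(schema_bytes)
    out.write(sync)
    for payload in blocks:
        out.write(struct.pack(">I", len(payload)))
        out.write(payload)
        out.write(sync)
    return out.getvalue()


def read_header(f):
    """Seek to offset 0 and read the schema and sync marker from the header."""
    f.seek(0)
    if f.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a toy container file")
    (schema_len,) = struct.unpack(">I", f.read(4))
    schema = json.loads(f.read(schema_len).decode("utf-8"))
    sync = f.read(SYNC_SIZE)
    return schema, sync


def read_split(f, start, end):
    """Return (schema, payloads) for the split covering bytes [start, end).

    A block belongs to the split that contains the first byte of the sync
    marker preceding it, so every block is read by exactly one split.
    """
    # Step 1: every split, not just the first, reads the header at offset 0.
    schema, sync = read_header(f)

    # Step 2: seek back to the split offset and find the next sync marker.
    f.seek(start)
    data = f.read()  # toy simplification: read the rest of the file into memory
    idx = data.find(sync)

    payloads = []
    while idx != -1:
        if start + idx >= end:
            break  # this marker, and the block after it, belong to a later split
        block_start = idx + SYNC_SIZE
        if block_start + 4 > len(data):
            break  # final marker in the file, no block follows it
        (length,) = struct.unpack(">I", data[block_start:block_start + 4])
        payload_start = block_start + 4
        payloads.append(data[payload_start:payload_start + length])
        idx = payload_start + length  # the next marker must start right here
        if data[idx:idx + SYNC_SIZE] != sync:
            raise ValueError("missing sync marker after block")
    return schema, payloads
```

The point of the sync marker is that a split boundary can fall anywhere, including in the middle of a block. Each reader skips forward to the first marker at or after its start offset, so a block that straddles a boundary is read by the split in which its preceding marker begins. Because the header itself ends with the marker, the first split needs no special case.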