Avro skips over fields that were present in the writer's schema but are no longer present in the reader's schema. Skipping is substantially faster than reading for most types. For types whose encoded size is fixed or given by a length prefix, like string, bytes, fixed, double and float, the file pointer can simply be incremented past skipped values. For skipped structures like records, maps and arrays, no memory is allocated and no stores are made. Avro data files are not in a columnar format, however, so the I/O and decompression of skipped fields are generally not avoided.
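
To make the pointer-increment idea concrete, here is a rough sketch in plain Python. It is not Avro's actual decoder. It hand-rolls the binary encoding of a four-field record (a long is a zigzag varint, a string is a length prefix plus UTF-8 bytes, a double is 8 bytes) and then reads it back as if the reader's schema kept only two fields. The field names and the helpers (encode_record, read_projected, skip_string, skip_double) are made up for illustration.

```python
import io
import struct


def write_long(buf, n):
    """Write n as an Avro long: zigzag, then base-128 variable length."""
    z = (n << 1) ^ (n >> 63)
    while z & ~0x7F:
        buf.write(bytes([(z & 0x7F) | 0x80]))
        z >>= 7
    buf.write(bytes([z]))


def read_long(buf):
    """Read an Avro long from the current position."""
    z, shift = 0, 0
    while True:
        b = buf.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)


def write_string(buf, s):
    data = s.encode("utf-8")
    write_long(buf, len(data))
    buf.write(data)


def read_string(buf):
    return buf.read(read_long(buf)).decode("utf-8")


def skip_string(buf):
    # Only the length prefix is decoded; the payload is never copied.
    buf.seek(read_long(buf), io.SEEK_CUR)


def skip_double(buf):
    # A double is always 8 bytes, so just move the pointer.
    buf.seek(8, io.SEEK_CUR)


def encode_record(name, payload, score, ident):
    """Hypothetical writer schema: name, payload, score, ident."""
    buf = io.BytesIO()
    write_string(buf, name)
    write_string(buf, payload)
    buf.write(struct.pack("<d", score))
    write_long(buf, ident)
    return buf.getvalue()


def read_projected(data):
    """Hypothetical reader schema keeps only name and ident."""
    buf = io.BytesIO(data)
    name = read_string(buf)
    skip_string(buf)   # payload: dropped from the reader's schema
    skip_double(buf)   # score: dropped from the reader's schema
    ident = read_long(buf)
    return name, ident


record = encode_record("alice", "x" * 1000, 3.5, 42)
print(read_projected(record))
```

Note that the whole record is already sitting in the buffer before anything is skipped, which mirrors the last point above: skipping saves decoding work and allocation, not the read of the block itself.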
Doug

On Wed, Dec 17, 2014 at 7:53 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
> I have a data that is persisted in Avro format. Each record has a certain
> schema and it contains 10 fields while it is persisted.
>
> When I read the same record(s) from other process, i also specify a schema
> with a subset of fields (5).
>
> Will only 5 columns be read from disk?
> or
> Will all the columns be read but 5 are later discarded?
> or
> Are all the columns read but only five are accessible since the schema used
> to read contain only five columns?
>
> Please suggest.
>
> Regards,
> Deepak
>