using avro schemas to select columns (abusing versioning?)

2012-01-23 Thread Koert Kuipers
we are working on a very sparse table with, say, 500 columns, where batch uploads typically contain only a subset of the columns (say 100), and we run multiple map-reduce queries on subsets of the columns (typically fewer than 50 columns go into a single map-reduce job). my question is the

Re: using avro schemas to select columns (abusing versioning?)

2012-01-23 Thread Doug Cutting
On 01/23/2012 02:18 PM, Koert Kuipers wrote:
> is this considered abuse of avro's versioning capabilities?

Not at all. Using a subset of the fields in Avro is called projection and can provide significant performance improvements.

Doug
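
To make the idea of projection concrete, here is a minimal sketch in plain Python. It does not call any Avro library: dicts stand in for schemas and records, and the field names (`id`, `col_a`, `col_b`, `col_c`, `col_z`) and the `project` helper are hypothetical, chosen only for illustration. It mimics the resolution rule that a reader schema listing a subset of the writer's fields receives only those fields, and that a reader field missing from the written data falls back to its declared default.

```python
# Illustration only: plain dicts stand in for Avro schemas and records.
# No Avro library is used; all names below are hypothetical.

# Writer schema: the wide record as it was written (4 columns here, 500 in the thread).
WRITER_SCHEMA = {
    "type": "record",
    "name": "Row",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "col_a", "type": ["null", "string"], "default": None},
        {"name": "col_b", "type": ["null", "double"], "default": None},
        {"name": "col_c", "type": ["null", "long"], "default": None},
    ],
}

# Reader schema: only the columns one job needs, plus one column
# (col_z) that the writer never wrote, so its default applies.
READER_SCHEMA = {
    "type": "record",
    "name": "Row",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "col_b", "type": ["null", "double"], "default": None},
        {"name": "col_z", "type": ["null", "string"], "default": None},
    ],
}


def project(record, writer_schema, reader_schema):
    """Return the record as seen through reader_schema.

    Fields present only in the writer schema are dropped; fields present
    only in the reader schema take their default, or raise if none is declared.
    """
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    projected = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            projected[name] = record[name]
        elif "default" in field:
            projected[name] = field["default"]
        else:
            raise ValueError(f"reader field {name!r} has no writer value and no default")
    return projected


if __name__ == "__main__":
    row = {"id": 1, "col_a": "x", "col_b": 2.5, "col_c": None}
    print(project(row, WRITER_SCHEMA, READER_SCHEMA))
```

In a real Avro reader the projection is applied while decoding, so unneeded fields are skipped rather than materialized and then discarded as in this sketch.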