We are working on a very sparse table with, say, 500 columns, where we do
batch uploads that typically contain only a subset of the columns (say
100), and we run multiple map-reduce queries on subsets of the columns
(typically fewer than 50 columns go into a single map-reduce job).
my question is the
On 01/23/2012 02:18 PM, Koert Kuipers wrote:
Is this considered abuse of Avro's versioning capabilities?
Not at all. Using a subset of the fields in Avro is called projection
and can provide significant performance improvements.
Doug
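
For illustration only, here is a rough sketch of what projection looks like at the schema level. It uses only Python's standard library and builds a reader schema that keeps a subset of the writer schema's fields. The record name `Row`, the namespace `example`, the column names `col0` ... `col499`, and the helpers `make_writer_schema` and `project_schema` are all hypothetical, and treating every column as a nullable string is an assumption made to model a sparse table.

```python
import json


def make_writer_schema(num_columns):
    """Build a hypothetical wide record schema; every column is a nullable string."""
    fields = [
        {"name": f"col{i}", "type": ["null", "string"], "default": None}
        for i in range(num_columns)
    ]
    return {"type": "record", "name": "Row", "namespace": "example", "fields": fields}


def project_schema(writer_schema, wanted):
    """Return a reader schema that keeps only the fields named in `wanted`."""
    wanted_set = set(wanted)
    known = {field["name"] for field in writer_schema["fields"]}
    missing = wanted_set - known
    if missing:
        raise KeyError(f"fields not in writer schema: {sorted(missing)}")
    reader_schema = dict(writer_schema)  # same record name and namespace
    reader_schema["fields"] = [
        field for field in writer_schema["fields"] if field["name"] in wanted_set
    ]
    return reader_schema


# Full schema the data was written with (500 columns).
writer_schema = make_writer_schema(500)

# Projection for one job that only needs the first 50 columns.
reader_schema = project_schema(writer_schema, [f"col{i}" for i in range(50)])

# Serialized form that could be handed to an Avro reader as the reader schema.
reader_schema_json = json.dumps(reader_schema)
```

In the Java API, a projected schema like this would typically be passed as the reader schema, for example via `GenericDatumReader(writerSchema, readerSchema)`, so that fields absent from the reader schema are skipped while decoding.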