I am using Spark 2.0.1 and the Databricks Avro library 3.0.1. I am running this on the latest AWS EMR release.
On Mon, Nov 14, 2016 at 3:06 PM, Jörn Franke <jornfra...@gmail.com> wrote:
> Spark version? Are you using Tungsten?
>
> On 14 Nov 2016, at 10:05, Prithish <prith...@gmail.com> wrote:
>
> Can someone please explain why this happens?
>
> When I read a 600 KB Avro file and cache it in memory (using cacheTable), it shows up as 11 MB in the Storage tab of the Spark UI. I have tried this with different file sizes, and the in-memory size is always proportionate. I thought Spark compresses data when using cacheTable.
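For context on the size difference: Avro files are block-compressed on disk (typically deflate or snappy), but when Spark loads them the rows are decompressed and re-encoded into its in-memory columnar cache format. That cache applies only lightweight per-column encodings, controlled by `spark.sql.inMemoryColumnarStorage.compressed` (true by default), so the in-memory footprint can be much larger than the file. A minimal sketch of how to check this, assuming spark-shell 2.0.1 with the spark-avro package (the file path is a placeholder):

```scala
// Launch with the Databricks Avro package (assumed coordinates for 3.0.1):
//   spark-shell --packages com.databricks:spark-avro_2.11:3.0.1

// Columnar-cache compression is on by default; set it explicitly to be sure:
spark.conf.set("spark.sql.inMemoryColumnarStorage.compressed", "true")

// Read the Avro file and cache it as a table:
val df = spark.read.format("com.databricks.spark.avro")
  .load("/path/to/file.avro")  // placeholder path
df.createOrReplaceTempView("t")
spark.catalog.cacheTable("t")

// The cache is lazy: run an action to materialize it, then check the
// Storage tab in the Spark UI for the in-memory size.
spark.table("t").count()
```

If the columnar cache is still too large, an alternative worth trying is caching in serialized form via `df.persist(StorageLevel.MEMORY_ONLY_SER)`, which trades CPU for a smaller footprint.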