Re: AVRO File size when caching in-memory

Prithish Tue, 15 Nov 2016 20:45:07 -0800

Anyone?

On Tue, Nov 15, 2016 at 10:45 AM, Prithish <prith...@gmail.com> wrote:


> I am using Spark 2.0.1 and the Databricks Avro library 3.0.1. I am running
> this on the latest AWS EMR release.
>
> On Mon, Nov 14, 2016 at 3:06 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Which Spark version are you on? Are you using Tungsten?
>>
>> > On 14 Nov 2016, at 10:05, Prithish <prith...@gmail.com> wrote:
>> >
>> > Can someone please explain why this happens?
>> >
>> > When I read a 600 KB Avro file and cache it in memory (using
>> > cacheTable), it shows up as 11 MB (Storage tab in the Spark UI). I have
>> > tried this with different file sizes, and the in-memory size is always
>> > proportional. I thought Spark compresses data when using cacheTable.
>>
>
>
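
For scale, the two sizes quoted in the original question imply an expansion of a bit under 19x. The short sketch below only does that arithmetic. It assumes "kb" and "mb" mean binary units (1024-based), which the thread does not state, and the variable names are illustrative rather than anything from Spark.

```python
# Sizes as reported in the original question.
# Assumption: "kb"/"mb" are binary units (KiB/MiB); the thread does not say.
avro_file_bytes = 600 * 1024        # 600 KB Avro file on disk
cached_bytes = 11 * 1024 * 1024     # 11 MB shown in the Spark UI Storage tab

# Ratio of cached in-memory size to on-disk file size.
expansion = cached_bytes / avro_file_bytes
print(f"in-memory size is about {expansion:.1f}x the file size")
```

With decimal units instead (1000-based) the ratio comes out slightly lower, so "roughly 18x to 19x" is the safest reading of the reported numbers.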
