RE: SequenceFile and object reuse

2015-11-19 Thread jeff saremi
Sandy, Ryan, Andrew, thanks very much. I think I now understand it better. Jeff From: ryan.blake.willi...@gmail.com Date: Thu, 19 Nov 2015 06:00:30 + Subject: Re: SequenceFile and object reuse To: sandy.r...@cloudera.com; jeffsar...@hotmail.com CC: user@spark.apache.org Hey Jeff, in addition

Re: SequenceFile and object reuse

2015-11-18 Thread Ryan Williams
Hey Jeff, in addition to what Sandy said, there are two more reasons that this might not be as bad as it seems; I may be incorrect in my understanding though. First, the "additional step" you're referring to is not likely to be adding any overhead; the "extra map" is really just materializing the

Re: SequenceFile and object reuse

2015-11-18 Thread Sandy Ryza
Hi Jeff, Many access patterns simply take the result of hadoopFile and use it to create some other object, and thus have no need for each input record to refer to a different object. In those cases, the current API is more performant than an alternative that would create an object for each
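
To illustrate the point about object reuse, here is a minimal sketch in plain Python. It does not use Spark or Hadoop; the Text class and the read_records function are hypothetical stand-ins for a mutable Writable and for a reader that hands back the same object for every record. It only shows why converting each record to a new value right away is safe, while holding on to the returned references is not.

```python
class Text:
    """Hypothetical stand-in for a mutable record object (not the Hadoop class)."""

    def __init__(self):
        self.value = ""

    def set(self, value):
        self.value = value


def read_records(lines):
    """Yield the SAME Text instance for every record, overwriting it each time."""
    reused = Text()
    for line in lines:
        reused.set(line)
        yield reused


lines = ["a", "b", "c"]

# Safe pattern: each record is turned into a new, independent value
# before the reader overwrites the shared object.
copied = [rec.value.upper() for rec in read_records(lines)]

# Unsafe pattern: keeping the references stores three pointers to one object,
# so every entry shows whatever was written last.
held = list(read_records(lines))
held_values = [rec.value for rec in held]

print(copied)       # ['A', 'B', 'C']
print(held_values)  # ['c', 'c', 'c']
```

In the first pattern no per-record object needs to be allocated by the reader, which is the case where reuse pays off. The second pattern is the one that requires an explicit copy of each record before it is stored.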