Thanks! Our protobuf objects are fairly complex, so even an O(N) traversal takes a lot of time.
On Mon, Feb 26, 2018 at 6:33 PM, 叶先进 <advance...@gmail.com> wrote:
> Hi Xin Liu,
>
> Could you provide a concrete use case if possible (code to reproduce the
> protobuf object, and comparisons between protobuf and normal objects)?
>
> I contributed a bit to SizeEstimator years ago, and to my understanding,
> the time complexity should be O(N), where N is the number of fields
> referenced recursively.
>
> We should definitely investigate this case if it indeed takes a lot of
> time on protobuf objects.
>
>
> On 27 Feb 2018, at 8:47 AM, Xin Liu <xin.e....@gmail.com> wrote:
>
> Hi folks,
>
> We have a situation where shuffled data is protobuf based, and
> SizeEstimator is taking a lot of time.
>
> We have tried overriding SizeEstimator to return a constant value, which
> speeds things up a lot.
>
> My question: what are the side effects of disabling SizeEstimator? Is it
> just that Spark does memory reallocation, or are there more severe
> consequences?
>
> Thanks!