Vova,

Why do we need to write zeros and nulls in the first place? What's the value of having them in the byte array?

-Val

On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov <[email protected]> wrote:

> Valya,
>
> Currently a null value is written as one byte, while a zero value of long
> type is written as 9 bytes. I want to improve that and write zeros as one
> byte as well.
>
> As for var-length encoding, I am strongly against it. It saves IO and
> memory at the cost of CPU. If we encode numbers this way we will
> slow down SQL (which is already not very fast, to be honest), because
> instead of a single memory read we will have to perform multiple
> reads and then apply some mechanics to restore the original value. We
> already have this problem with Strings: Java stores them as UTF-16, but we
> encode them as UTF-8. As a result, every read of a string field in SQL
> incurs decoding overhead.
>
> Vladimir.
>
> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko <
> [email protected]> wrote:
>
>> Cross-posting this to dev list.
>>
>> Vladimir,
>>
>> To be honest, I don't see much difference between null values for objects
>> and zero values for primitives. From the BinaryObject semantics standpoint,
>> both are default values for the corresponding types. These values will be
>> returned from the BinaryObject.field() method regardless of whether we
>> actually save them in the byte array or not. Having said that, why don't we
>> just skip them during write?
>>
>> Your optimization will still be useful though, because there are often a
>> lot of ints and longs that are not zeros, but are still small and can fit
>> in 1-2 bytes. We already added such compaction in direct message
>> marshaling and it reduced overall traffic by around 30%.
>>
>> -Val
>>
>>
>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am not very concerned with null fields overhead, because usually it
>>> won't be significant. However, there is a problem with zeros. A user
>>> object might have lots of int/long zeros, this is not uncommon. And each
>>> zero will consume 4-8 additional bytes. We will probably implement a
>>> special optimization which will write such fields in a special compact
>>> format.
>>>
>>> Vladimir.
>>>
>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Yes, null values consume memory. I believe this can be optimized, but I
>>>> haven't seen issues with this so far. Unless you have hundreds of fields
>>>> most of which are nulls (a very rare case), the overhead is minimal.
>>>>
>>>> -Val
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
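
To make the size-versus-CPU trade-off discussed in this thread concrete, here is a minimal sketch of a variable-length encoding for long values. It is an illustration only, not Ignite's binary format and not the compaction used in direct message marshaling; the class name VarLongSketch and the choice of ZigZag plus 7-bit groups are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration of var-length long encoding; not Ignite code. */
final class VarLongSketch {
    /** Encodes a long into 1 to 10 bytes: 7 payload bits per byte, high bit set while more bytes follow. */
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // ZigZag step: maps small negative numbers to small unsigned ones (-1 -> 1, 1 -> 2, ...).
        long v = (value << 1) ^ (value >> 63);

        // Emit the low 7 bits at a time until the remainder fits in a single byte.
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);

        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). Note the per-byte loop: this is the extra CPU work. */
    static long decode(byte[] bytes) {
        long v = 0;
        int shift = 0;

        for (byte b : bytes) {
            v |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                break; // Last byte of the value.
            shift += 7;
        }

        // Undo the ZigZag step.
        return (v >>> 1) ^ -(v & 1);
    }
}
```

In this sketch, 0 encodes to one byte and 300 to two bytes, while the worst case (for example Long.MIN_VALUE) takes ten bytes. The loop in decode() shows the kind of multiple reads plus reconstruction that Vladimir objects to, compared with a single fixed-width memory read.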
