Valya,

Do you have any ideas on how to implement this? We write field offsets in the footer. If a field is not written, what should be used for its offset?
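To make the question concrete, here is a minimal sketch of one possible answer: keep one footer slot per schema field and store a sentinel offset for a field that was skipped. Everything in it is an assumption made for illustration (the class name, the `ABSENT` sentinel, the toy layout of a body of longs followed by 4-byte offsets); it is not the actual binary format.

```java
import java.nio.ByteBuffer;

/** Toy layout, NOT the real binary format: [non-zero long values][footer: one int offset per field]. */
public class SparseFooterSketch {
    /** Hypothetical sentinel stored in the footer for a field that was not written. */
    static final int ABSENT = -1;

    /** Skips zero values in the body but keeps a footer slot for every field. */
    static byte[] write(long[] values) {
        int written = 0;
        for (long v : values)
            if (v != 0)
                written++;

        ByteBuffer buf = ByteBuffer.allocate(written * Long.BYTES + values.length * Integer.BYTES);
        int[] offsets = new int[values.length];

        for (int i = 0; i < values.length; i++) {
            if (values[i] == 0) {
                offsets[i] = ABSENT;          // default value: nothing goes into the body
            } else {
                offsets[i] = buf.position();  // remember where the value starts
                buf.putLong(values[i]);
            }
        }

        for (int off : offsets)
            buf.putInt(off);                  // footer: same number of slots for every object

        return buf.array();
    }

    /** Reads field idx of an object with fieldCnt fields; an ABSENT slot yields the default 0. */
    static long read(byte[] data, int fieldCnt, int idx) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int footerStart = data.length - fieldCnt * Integer.BYTES;
        int off = buf.getInt(footerStart + idx * Integer.BYTES);
        return off == ABSENT ? 0L : buf.getLong(off);
    }

    public static void main(String[] args) {
        byte[] data = write(new long[] {0L, 42L, 0L});
        System.out.println(data.length);       // 8 body bytes + 12 footer bytes
        System.out.println(read(data, 3, 0));  // skipped field, default returned
        System.out.println(read(data, 3, 1));  // written field
    }
}
```

With this variant the footer has the same shape for every object of the type, so the schema stays the same regardless of values. The price is that a skipped field still costs one footer slot.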
On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:

> Vladimir,
>
> These are good points, but I'm not suggesting to change the schema. If one
> writes five fields, the schema should have five fields in any case,
> regardless of values. I only suggest changing the internal representation
> of the object and not saving fields with default values in the byte array,
> as we don't really need them there.
>
> -Val
>
> On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
>> Valya,
>>
>> I have several concerns:
>> 1) Correctness: hasField() will not work properly. But we can probably
>> fix that by adding this info to the schema.
>> 2) Performance: we have lots of optimizations which depend on either a
>> "stable" object schema or a low number of schemas. We will effectively
>> turn them off.
>> But what concerns me even more is that we may end up with an enormous
>> number of schemas. E.g. consider an object with 10 numeric fields. If all
>> fields could be zero, we may end up with something like 2^10 schemas.
>>
>> Vladimir.
>>
>> On Oct 29, 2016, at 0:37, "Valentin Kulichenko"
>> <valentin.kuliche...@gmail.com> wrote:
>>
>>> Vova,
>>>
>>> Why do we need to write zeros and nulls in the first place? What's the
>>> value of having them in the byte array?
>>>
>>> -Val
>>>
>>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov <voze...@gridgain.com>
>>> wrote:
>>>
>>>> Valya,
>>>>
>>>> Currently a null value is written as one byte, while a zero value of
>>>> long type is written as 9 bytes. I want to improve that and write zeros
>>>> as one byte as well.
>>>>
>>>> As for var-length encoding, I am strongly against it. It saves IO and
>>>> memory at the cost of CPU. If we encode numbers in this way we will
>>>> slow down SQL (which is already not very fast, to be honest), because
>>>> instead of a single memory read we will have to perform multiple
>>>> reads and then apply some mechanics to restore the original value. We
>>>> already have such a problem with Strings: Java stores them as UTF-16,
>>>> but we encode them as UTF-8. As a result, every read of a string field
>>>> in SQL results in decoding overhead.
>>>>
>>>> Vladimir.
>>>>
>>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko
>>>> <valentin.kuliche...@gmail.com> wrote:
>>>>
>>>>> Cross-posting this to dev list.
>>>>>
>>>>> Vladimir,
>>>>>
>>>>> To be honest, I don't see much difference between null values for
>>>>> objects and zero values for primitives. From the BinaryObject semantics
>>>>> standpoint, both are default values for the corresponding types. These
>>>>> values will be returned from the BinaryObject.field() method regardless
>>>>> of whether we actually save them in the byte array or not. Having said
>>>>> that, why don't we just skip them during write?
>>>>>
>>>>> Your optimization will still be useful though, because there are often
>>>>> a lot of ints and longs that are not zeros, but are still small and can
>>>>> fit in 1-2 bytes. We already added such compaction in direct message
>>>>> marshaling and it reduced overall traffic by around 30%.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov <voze...@gridgain.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am not very concerned with null fields overhead, because usually it
>>>>>> won't be significant. However, there is a problem with zeros. A user
>>>>>> object might have lots of int/long zeros; this is not uncommon. And
>>>>>> each zero will consume 4-8 additional bytes. We will probably implement
>>>>>> a special optimization which will write such fields in a special
>>>>>> compact format.
>>>>>>
>>>>>> Vladimir.
>>>>>>
>>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko
>>>>>> <valentin.kuliche...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Yes, null values consume memory. I believe this can be optimized, but
>>>>>>> I haven't seen issues with this so far. Unless you have hundreds of
>>>>>>> fields most of which are nulls (very rare case), the overhead is
>>>>>>> minimal.
>>>>>>>
>>>>>>> -Val
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
>>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
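
The var-length trade-off discussed in the thread (fewer bytes, more work per read) can be illustrated with a generic 7-bit group encoding. This is a sketch under assumptions: the class and method names are hypothetical, and the scheme is a common LEB128-style varint, not necessarily the one used in direct message marshaling.

```java
import java.nio.ByteBuffer;

/** Generic 7-bit variable-length encoding of a long; illustrative only. */
public class VarLongSketch {
    /** Writes v in 7-bit groups, low bits first; the high bit of a byte means "more bytes follow". */
    static void writeVarLong(ByteBuffer buf, long v) {
        while ((v & ~0x7FL) != 0) {
            buf.put((byte) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        buf.put((byte) v);
    }

    /** Restores the value: one byte read, one mask and one shift per 7-bit group. */
    static long readVarLong(ByteBuffer buf) {
        long res = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            res |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return res;
    }

    /** Number of bytes the encoding needs for v (at most 10 for a 64-bit value). */
    static int encodedSize(long v) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        writeVarLong(buf, v);
        return buf.position();
    }

    public static void main(String[] args) {
        System.out.println(encodedSize(0L));              // 1 byte instead of a fixed 8
        System.out.println(encodedSize(300L));            // 2 bytes
        System.out.println(encodedSize(Long.MAX_VALUE));  // 9 bytes

        ByteBuffer buf = ByteBuffer.allocate(10);
        writeVarLong(buf, 300L);
        buf.flip();
        System.out.println(readVarLong(buf));             // 300
    }
}
```

A fixed-width field is read with a single `ByteBuffer.getLong(offset)` call, while `readVarLong` loops over the bytes. That loop is the CPU cost weighed against the space savings in the messages above.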