Makes sense to me, but not sure about -1 in particular. Is this offset relative to object start position? What values can it have?
-Val

On Mon, Oct 31, 2016 at 10:38 AM, Igor Sapego <isap...@gridgain.com> wrote:

> Vladimir,
>
> How about some reserved value? I.e., a -1 offset means a default/null
> value should be used?
>
> Best Regards,
> Igor
>
> On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov <voze...@gridgain.com>
> wrote:
>
>> Valya,
>>
>> Do you have any ideas how to implement this? We write field offsets in
>> the footer. If a field is not written, then what should be used for its
>> offset?
>>
>> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko
>> <valentin.kuliche...@gmail.com> wrote:
>>
>> > Vladimir,
>> >
>> > These are good points, but I'm not suggesting to change the schema. If
>> > one writes five fields, the schema should have five fields in any case,
>> > regardless of values. I only suggest changing the internal
>> > representation of the object and not saving fields with default values
>> > in the byte array, as we don't really need them there.
>> >
>> > -Val
>> >
>> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov
>> > <voze...@gridgain.com> wrote:
>> >
>> >> Valya,
>> >>
>> >> I have several concerns:
>> >> 1) Correctness: hasField() will not work properly. But probably we can
>> >> fix that by adding this info to the schema.
>> >> 2) Performance: we have lots of optimizations which depend on either a
>> >> "stable" object schema or a low number of schemas. We will effectively
>> >> turn them off.
>> >> But what concerns me even more is that we may end up with an enormous
>> >> number of schemas. E.g. consider an object with 10 number fields. If
>> >> all fields could be zero, we may end up with something like 2^10
>> >> schemas.
>> >>
>> >> Vladimir.
>> >>
>> >> On Oct 29, 2016, at 0:37, "Valentin Kulichenko"
>> >> <valentin.kuliche...@gmail.com> wrote:
>> >>
>> >>> Vova,
>> >>>
>> >>> Why do we need to write zeros and nulls in the first place? What's the
>> >>> value of having them in the byte array?
>> >>>
>> >>> -Val
>> >>>
>> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov
>> >>> <voze...@gridgain.com> wrote:
>> >>>
>> >>>> Valya,
>> >>>>
>> >>>> Currently a null value is written as one byte, while a zero value of
>> >>>> long type is written as 9 bytes. I want to improve that and write
>> >>>> zeros as one byte as well.
>> >>>>
>> >>>> As for var-length encoding, I am strongly against it. It saves IO and
>> >>>> memory at the cost of CPU. If we encode numbers in this way we will
>> >>>> slow down SQL (which is already not very fast, to be honest), because
>> >>>> instead of a single memory read we will have to perform multiple
>> >>>> reads and then apply some mechanics to restore the original value. We
>> >>>> already have such a problem with Strings - Java stores them as
>> >>>> UTF-16, but we encode them as UTF-8. As a result every read of a
>> >>>> string field in SQL results in decoding overhead.
>> >>>>
>> >>>> Vladimir.
>> >>>>
>> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko
>> >>>> <valentin.kuliche...@gmail.com> wrote:
>> >>>>
>> >>>>> Cross-posting this to dev list.
>> >>>>>
>> >>>>> Vladimir,
>> >>>>>
>> >>>>> To be honest, I don't see much difference between null values for
>> >>>>> objects and zero values for primitives. From the BinaryObject
>> >>>>> semantics standpoint, both are default values for the corresponding
>> >>>>> types. These values will be returned from the BinaryObject.field()
>> >>>>> method regardless of whether we actually save them in the byte
>> >>>>> array or not. Having said that, why don't we just skip them during
>> >>>>> write?
>> >>>>>
>> >>>>> Your optimization will still be useful though, because there are
>> >>>>> often a lot of ints and longs that are not zeros, but are still
>> >>>>> small and can fit in 1-2 bytes. We already added such compaction in
>> >>>>> direct message marshaling and it reduced overall traffic by around
>> >>>>> 30%.
>> >>>>>
>> >>>>> -Val
>> >>>>>
>> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov
>> >>>>> <voze...@gridgain.com> wrote:
>> >>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> I am not very concerned with null fields overhead, because usually
>> >>>>>> it won't be significant. However, there is a problem with zeros. A
>> >>>>>> user object might have lots of int/long zeros; this is not
>> >>>>>> uncommon. And each zero will consume 4-8 additional bytes. We will
>> >>>>>> probably implement a special optimization which will write such
>> >>>>>> fields in a special compact format.
>> >>>>>>
>> >>>>>> Vladimir.
>> >>>>>>
>> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko
>> >>>>>> <valentin.kuliche...@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> Yes, null values consume memory. I believe this can be optimized,
>> >>>>>>> but I haven't seen issues with this so far. Unless you have
>> >>>>>>> hundreds of fields most of which are nulls (a very rare case),
>> >>>>>>> the overhead is minimal.
>> >>>>>>>
>> >>>>>>> -Val
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> View this message in context:
>> >>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
>> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
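
For reference, the reserved-offset idea discussed in this thread could look roughly like the sketch below. This is a toy layout, not Ignite's actual binary format: the class name SparseFooterSketch, the ABSENT constant, and the fixed 4-byte footer entries are all assumptions made for illustration. Offsets are taken relative to the object start, so 0 is a valid position and -1 cannot collide with a real field. Every field still gets a footer slot, so the set of fields (the "schema") is the same whether or not a value is zero.

```java
import java.nio.ByteBuffer;

public class SparseFooterSketch {
    /** Hypothetical sentinel: a footer offset meaning "field was not written". */
    public static final int ABSENT = -1;

    /**
     * Writes long fields, skipping zeros.
     * Toy layout: [non-zero values...][footer: one int offset per field].
     */
    public static byte[] write(long[] fields) {
        int nonZero = 0;
        for (long f : fields) {
            if (f != 0) nonZero++;
        }

        ByteBuffer buf = ByteBuffer.allocate(
            nonZero * Long.BYTES + fields.length * Integer.BYTES);
        int[] offsets = new int[fields.length];

        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == 0) {
                offsets[i] = ABSENT;           // default value: nothing in the data area
            } else {
                offsets[i] = buf.position();   // offset relative to object start
                buf.putLong(fields[i]);
            }
        }

        for (int off : offsets) {
            buf.putInt(off);                   // footer keeps a slot for every field
        }
        return buf.array();
    }

    /** Reads field idx; an ABSENT offset yields the default value 0. */
    public static long read(byte[] obj, int fieldCount, int idx) {
        ByteBuffer buf = ByteBuffer.wrap(obj);
        int footerStart = obj.length - fieldCount * Integer.BYTES;
        int off = buf.getInt(footerStart + idx * Integer.BYTES);
        return off == ABSENT ? 0L : buf.getLong(off);
    }

    public static void main(String[] args) {
        byte[] obj = write(new long[] {0L, 42L, 0L});
        System.out.println("bytes: " + obj.length);
        System.out.println("field 0: " + read(obj, 3, 0));
        System.out.println("field 1: " + read(obj, 3, 1));
    }
}
```

In this sketch a zero field costs only its footer slot. The open questions from the thread still apply: how hasField() should distinguish "absent" from "default", and what a sentinel would look like if footer offsets were narrower than 4 bytes.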