Igor,

Good catch. Probably some MAX value could help us here.
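To make the reserved-value idea concrete, here is a minimal sketch in Java. It is only an illustration under stated assumptions: it assumes 2-byte footer offsets and reserves the maximum value for that width (0xFFFF) to mean "field not written, use default/null". The class name, the constant, and both helper methods are hypothetical and are not taken from the actual marshaller code.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: footer offsets with one reserved "field not written" value. */
public class OffsetSentinelSketch {
    /** Assumed reserved value: the maximum unsigned 2-byte offset. */
    static final int NO_VALUE_OFFSET = 0xFFFF;

    /** Appends a 2-byte footer entry; a negative offset means the field was skipped. */
    static void writeOffset(ByteBuffer footer, int offset) {
        // A real offset must never collide with the reserved value.
        if (offset >= NO_VALUE_OFFSET)
            throw new IllegalArgumentException("Offset collides with reserved value: " + offset);

        footer.putShort((short) (offset < 0 ? NO_VALUE_OFFSET : offset));
    }

    /** Reads the footer entry at the given index; returns -1 if the field holds its default value. */
    static int readOffset(ByteBuffer footer, int idx) {
        int raw = footer.getShort(idx * 2) & 0xFFFF; // treat the stored short as unsigned

        return raw == NO_VALUE_OFFSET ? -1 : raw;
    }

    public static void main(String[] args) {
        ByteBuffer footer = ByteBuffer.allocate(4);

        writeOffset(footer, 24); // field present at offset 24
        writeOffset(footer, -1); // field skipped (default/null)

        System.out.println(readOffset(footer, 0));
        System.out.println(readOffset(footer, 1));
    }
}
```

The cost of this approach is that the largest offset of each width is no longer usable for real data, so the writer has to reject it (or switch to a wider offset), which is the serialization/deserialization adjustment mentioned below.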
On Mon, Oct 31, 2016 at 9:17 PM, Igor Sapego <isap...@gridgain.com> wrote:

> Valentin,
>
> -1 was just an example. I've checked - currently we use the whole possible
> range of offset values.
> So if we are going to use the suggested approach then we need to reserve
> some value and adjust serialization/deserialization algorithms.
>
> Best Regards,
> Igor
>
> On Mon, Oct 31, 2016 at 8:46 PM, Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Makes sense to me, but not sure about -1 in particular. Is this offset
> > relative to object start position? What values can it have?
> >
> > -Val
> >
> > On Mon, Oct 31, 2016 at 10:38 AM, Igor Sapego <isap...@gridgain.com>
> > wrote:
> >
> >> Vladimir,
> >>
> >> How about some reserved value? I.e. a -1 offset means a default/null
> >> value should be used?
> >>
> >> Best Regards,
> >> Igor
> >>
> >> On Mon, Oct 31, 2016 at 5:05 PM, Vladimir Ozerov <voze...@gridgain.com>
> >> wrote:
> >>
> >>> Valya,
> >>>
> >>> Do you have any ideas how to implement this? We write field offsets in
> >>> the footer. If a field is not written, then what should be used for its
> >>> offset?
> >>>
> >>> On Mon, Oct 31, 2016 at 4:56 PM, Valentin Kulichenko <
> >>> valentin.kuliche...@gmail.com> wrote:
> >>>
> >>> > Vladimir,
> >>> >
> >>> > These are good points, but I'm not suggesting to change the schema. If
> >>> > one writes five fields, the schema should have five fields in any case,
> >>> > regardless of values. I only suggest to change the internal
> >>> > representation of the object and not save fields with default values
> >>> > in the byte array, as we don't really need them there.
> >>> >
> >>> > -Val
> >>> >
> >>> > On Sun, Oct 30, 2016 at 12:24 PM, Vladimir Ozerov <
> >>> > voze...@gridgain.com> wrote:
> >>> >
> >>> >> Valya,
> >>> >>
> >>> >> I have several concerns:
> >>> >> 1) Correctness: hasField() will not work properly. But probably we can
> >>> >> fix that by adding this info to schema.
> >>> >> 2) Performance: we have lots of optimizations which depend on either
> >>> >> a "stable" object schema, or a low number of schemas. We will
> >>> >> effectively turn them off.
> >>> >> But what concerns me even more is that we may end up with an enormous
> >>> >> number of schemas. E.g. consider an object with 10 number fields. If
> >>> >> all fields could be zero, we may end up with something like 2^10
> >>> >> schemas.
> >>> >>
> >>> >> Vladimir.
> >>> >>
> >>> >> On Oct 29, 2016 at 0:37, "Valentin Kulichenko" <
> >>> >> valentin.kuliche...@gmail.com> wrote:
> >>> >>
> >>> >>> Vova,
> >>> >>>
> >>> >>> Why do we need to write zeros and nulls in the first place? What's the
> >>> >>> value of having them in the byte array?
> >>> >>>
> >>> >>> -Val
> >>> >>>
> >>> >>> On Fri, Oct 28, 2016 at 1:18 AM, Vladimir Ozerov <
> >>> >>> voze...@gridgain.com> wrote:
> >>> >>>
> >>> >>>> Valya,
> >>> >>>>
> >>> >>>> Currently a null value is written as one byte, while a zero value of
> >>> >>>> long type is written as 9 bytes. I want to improve that and write
> >>> >>>> zeros as one byte as well.
> >>> >>>>
> >>> >>>> As for var-length encoding, I am strongly against it. It saves IO and
> >>> >>>> memory at the cost of CPU. If we encode numbers in this way we will
> >>> >>>> slow down SQL (which is already not very fast, to be honest), because
> >>> >>>> instead of a single memory read, we will have to perform multiple
> >>> >>>> reads and then apply some mechanics to restore the original value. We
> >>> >>>> already have such a problem with Strings - Java stores them as UTF-16,
> >>> >>>> but we encode them as UTF-8. As a result every read of a string field
> >>> >>>> in SQL results in decoding overhead.
> >>> >>>>
> >>> >>>> Vladimir.
> >>> >>>>
> >>> >>>> On Fri, Oct 28, 2016 at 6:07 AM, Valentin Kulichenko <
> >>> >>>> valentin.kuliche...@gmail.com> wrote:
> >>> >>>>
> >>> >>>>> Cross-posting this to dev list.
> >>> >>>>>
> >>> >>>>> Vladimir,
> >>> >>>>>
> >>> >>>>> To be honest, I don't see much difference between null values for
> >>> >>>>> objects and zero values for primitives. From the BinaryObject
> >>> >>>>> semantics standpoint, both are default values for corresponding
> >>> >>>>> types. These values will be returned from the BinaryObject.field()
> >>> >>>>> method regardless of whether we actually save them in the byte array
> >>> >>>>> or not. Having said that, why don't we just skip them during write?
> >>> >>>>>
> >>> >>>>> Your optimization will still be useful though, because there are
> >>> >>>>> often a lot of ints and longs that are not zeros, but still small and
> >>> >>>>> can fit in 1-2 bytes. We already added such compaction in direct
> >>> >>>>> message marshaling and it reduced overall traffic by around 30%.
> >>> >>>>>
> >>> >>>>> -Val
> >>> >>>>>
> >>> >>>>> On Thu, Oct 27, 2016 at 2:21 PM, Vladimir Ozerov <
> >>> >>>>> voze...@gridgain.com> wrote:
> >>> >>>>>
> >>> >>>>>> Hi,
> >>> >>>>>>
> >>> >>>>>> I am not very concerned with null fields overhead, because usually it
> >>> >>>>>> won't be significant. However, there is a problem with zeros. A user
> >>> >>>>>> object might have lots of int/long zeros, this is not uncommon. And
> >>> >>>>>> each zero will consume 4-8 additional bytes. We probably will
> >>> >>>>>> implement a special optimization which will write such fields in a
> >>> >>>>>> special compact format.
> >>> >>>>>>
> >>> >>>>>> Vladimir.
> >>> >>>>>>
> >>> >>>>>> On Thu, Oct 27, 2016 at 10:55 PM, vkulichenko <
> >>> >>>>>> valentin.kuliche...@gmail.com> wrote:
> >>> >>>>>>
> >>> >>>>>>> Hi,
> >>> >>>>>>>
> >>> >>>>>>> Yes, null values consume memory. I believe this can be optimized, but
> >>> >>>>>>> I haven't seen issues with this so far. Unless you have hundreds of
> >>> >>>>>>> fields most of which are nulls (very rare case), the overhead is
> >>> >>>>>>> minimal.
> >>> >>>>>>>
> >>> >>>>>>> -Val
> >>> >>>>>>>
> >>> >>>>>>> --
> >>> >>>>>>> View this message in context:
> >>> >>>>>>> http://apache-ignite-users.70518.x6.nabble.com/BinaryObject-pros-cons-tp8541p8563.html
> >>> >>>>>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.