Hello!

My take is the following: if we need to conserve memory at all, then we
had better invest in compression (such as dictionary-based row compression)
rather than in implementing varint, compact nulls, etc.

Dictionary-based compression can easily handle varints and null patterns
while also compressing strings, repeated values, and even things we would
never think of on our own.

It also keeps our own code simple, causes no compatibility issues (people
do store binary objects in 3rd-party storage) and has a low incidence of
bugs.
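
To make the idea concrete, here is a minimal sketch of dictionary-based
compression of a single row using only the JDK's java.util.zip classes. It
is not Ignite code: the class name DictRowCompression, the text row format
and the dictionary contents are all made up for illustration, and a real
dictionary would have to be trained on actual data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Ignite.
final class DictRowCompression {
    /** Compresses one row, priming the deflater with byte patterns expected to recur. */
    static byte[] compress(byte[] row, byte[] dict) {
        Deflater def = new Deflater(Deflater.BEST_COMPRESSION);
        def.setDictionary(dict);
        def.setInput(row);
        def.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished())
            out.write(buf, 0, def.deflate(buf));
        def.end();
        return out.toByteArray();
    }

    /** Restores a row; the same dictionary that was used for compression must be supplied. */
    static byte[] decompress(byte[] packed, byte[] dict) {
        Inflater inf = new Inflater();
        inf.setInput(packed);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0) {
                    // The zlib stream asks for the preset dictionary before any output is produced.
                    if (inf.needsDictionary())
                        inf.setDictionary(dict);
                    else if (inf.needsInput())
                        throw new IllegalArgumentException("Truncated input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt input", e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Assumed sample data: the dictionary is simply a typical earlier row.
        byte[] dict = "id=1;name=abc;city=abc;note=null;extra=null".getBytes(StandardCharsets.UTF_8);
        byte[] row = "id=2;name=abc;city=abc;note=null;extra=null".getBytes(StandardCharsets.UTF_8);

        byte[] packed = compress(row, dict);
        System.out.println(row.length + " -> " + packed.length + " bytes");
        System.out.println(new String(decompress(packed, dict), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is that repeated field layouts, null patterns and
repeated values all collapse into back-references to the dictionary
without any change to the binary object format itself.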

Regards,
-- 
Ilya Kasnacheev


Mon, May 25, 2020 at 12:51, Hostettler, Steve <
steve.hostett...@wolterskluwer.com>:

> I went for a simpler approach (only a null mask), and yes, the gain is
> high for smaller objects but low otherwise. I gain between 5% and 20% on
> my objects. But to me it is a stepping stone to easily implementing other
> optimisations like varint and schemaless without using raw. I am working
> through the latest unit tests to give you a better idea. If it is not
> worth it, then let's not do it, but I think it is worth a try.
>
>
> -----Original Message-----
> From: Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> Sent: Monday, May 25, 2020 11:48 AM
> To: dev <dev@ignite.apache.org>
> Subject: Re: IGNITE-6499 Compact NULL fields
>
> Hello!
>
> I can't help but wonder how large a benefit it will give. I have
> checked the ticket description: the proposed scheme looks elaborate,
> and the benefit for non-extreme binary objects rather tiny.
>
> WDYT?
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Mon, May 18, 2020 at 22:54, steve.hostett...@gmail.com <
> steve.hostett...@gmail.com>:
>
> > Hello igniters,
> >
> > while I would like to help with Calcite, because the H2 optimiser (or
> > the lack thereof) is really killing us, I think it would be wiser to
> > start by contributing to something easier.
> >
> > Therefore I will tackle another problem that we have, which is
> > memory consumption. I stumbled upon this IEP
> >
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-2%3A+Binary+object+format+improvements
> >
> > that is about optimising the binary marshaller.
> >
> > The low hanging fruit seemed to be the null compaction so I decided to
> > start with it. Though I am sure I do see some hidden complexity.
> >
> > Here are a couple of questions:
> > - Can I assign myself IGNITE-6499 and attach a patch?
> > - Who can I contact to help with the review? On the following page
> > https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
> > there is no one assigned for marshalling.
> >
> > On the details:
> > The compression is disabled by default as it is not compatible with
> > objects previously marshalled.
> >
> > My approach was to go a bit beyond the JIRA. Not only do I remove the
> > indexes to null fields in the footer, I also remove the 0x65 (null)
> > bytes in the objects. I did not remove them from the collections and
> > arrays because they are using absolute positioning.
> >
> > I gain between 5% and 20% depending on the test case. Obviously the
> > smaller the object and the higher the number of nulls, the higher the
> > compression rate.
> >
> > Based on that I can quite easily add varint compression, which is
> > IGNITE-6418 and should significantly increase the compression rate
> > for objects with a lot of integers and longs that hold only small
> > numbers.
> >
> > The next step is to add a JMH micro-benchmark to check the impact on
> > performance.
> >
> >
> > Example on a simple object w/ null compaction
> >
> > Length=55 FooterPosition=50
> > 0x67 // ValueType
> > 0x01 // FormatVersion
> > 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> > 0x78 0x66 0xbe 0x44 // TypeId
> > 0xf9 0xcd 0x07 0x57 // Hashcode
> > 0x37 0x00 0x00 0x00 // Length
> > 0x3d 0xa8 0x15 0xe4 // SchemaId
> > 0x32 0x00 0x00 0x00 // Footer position = 50
> > 0x03 0x01 0x00 0x00 0x00
> > 0x03 0x01 0x00 0x00 0x00
> > 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> > 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> > Footer length=5
> > 0x18 0x1d 0x22 0x2a 0x47
> >
> > and w/o null compaction
> > Length=60 FooterPosition=53
> > 0x67 // ValueType
> > 0x01 // FormatVersion
> > 0x2b 0x00 // Flags userType=true hasSchema=true offset=1 compactFooter=true
> > 0x78 0x66 0xbe 0x44 // TypeId
> > 0xa4 0x43 0x0e 0xf5 // Hashcode
> > 0x3c 0x00 0x00 0x00 // Length
> > 0x3d 0xa8 0x15 0xe4 // SchemaId
> > 0x35 0x00 0x00 0x00 // Footer position = 53
> > 0x03 0x01 0x00 0x00 0x00
> > 0x03 0x01 0x00 0x00 0x00
> > 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> > 0x65 0x65 0x65
> > 0x09 0x03 0x00 0x00 0x00 0x61 0x62 0x63
> > Footer length=7
> > 0x18 0x1d 0x22 0x2a 0x2b 0x2c 0x2d
> >
>

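For reference, the two footers quoted above are consistent with the
following reading: the compacted footer keeps one offset byte per non-null
field and appends a bit mask in which bit i is set when field i is present
(0x47 = 0b01000111, i.e. fields 0, 1, 2 and 6). This is only my
interpretation of the dump, not something taken from the patch; the class
NullMaskFooter and its methods are hypothetical, and the sketch assumes
1-byte offsets as in the example (offset=1).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of a null-mask footer, not the actual patch.
final class NullMaskFooter {
    /**
     * Builds a compact footer from per-field offsets, where -1 marks a null field.
     * Layout assumed here: offsets of non-null fields, then the mask bytes.
     */
    static byte[] encode(int[] offsets) {
        int maskLen = (offsets.length + 7) / 8;
        byte[] mask = new byte[maskLen];
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        for (int i = 0; i < offsets.length; i++) {
            if (offsets[i] >= 0) {
                out.write(offsets[i]);                // 1-byte offset of a present field
                mask[i / 8] |= (byte) (1 << (i % 8)); // mark field i as present
            }
        }
        out.write(mask, 0, maskLen);
        return out.toByteArray();
    }

    /** Expands a compact footer back to one offset per field, with -1 for nulls. */
    static int[] decode(byte[] footer, int fieldCnt) {
        int maskStart = footer.length - (fieldCnt + 7) / 8;
        int[] offsets = new int[fieldCnt];
        int next = 0;

        for (int i = 0; i < fieldCnt; i++) {
            boolean present = (footer[maskStart + i / 8] & (1 << (i % 8))) != 0;
            offsets[i] = present ? (footer[next++] & 0xFF) : -1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Seven fields, of which fields 3, 4 and 5 are null, as in the quoted example.
        byte[] footer = encode(new int[] {0x18, 0x1d, 0x22, -1, -1, -1, 0x2a});
        StringBuilder sb = new StringBuilder();
        for (byte b : footer)
            sb.append(String.format("0x%02x ", b));
        System.out.println(sb.toString().trim());
    }
}
```

Under that reading, three null fields cost three bits of mask instead of
three footer offsets plus three 0x65 bytes in the body, which matches the
drop from 60 to 55 bytes in the example.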