Re: Custom string encoding

Dmitriy Setrakyan Sat, 01 Jul 2017 00:14:27 -0700

Val, do you know how we compare strings in SQL queries? Will we be able to
use this encoder?


Additionally, I think that the encoder is a bit too abstract. Why not go
even further and allow users to create their own ASCII table for encoding?

D.

On Fri, Jun 30, 2017 at 6:49 PM, Valentin Kulichenko <
[email protected]> wrote:

> Andrey,
>
> Can you elaborate more on this? What is your concern?
>
> -Val
>
> On Fri, Jun 30, 2017 at 6:17 PM Andrey Mashenkov <
> [email protected]>
> wrote:
>
> > Val,
> >
> > Looks like it makes sense.
> >
> > This will not affect the FullText index, as Lucene has its own format for
> > storing data.
> >
> > But would it be compatible with H2 indexing? I doubt it.
> >
> > On July 1, 2017 at 2:27, "Valentin Kulichenko" <
> > [email protected]> wrote:
> >
> > > Folks,
> > >
> > > Currently the binary marshaller always encodes strings in UTF-8. However,
> > > sometimes it can be useful to customize this. For example, if data
> > > contains a lot of Cyrillic, Chinese or other symbols, but not so many
> > > Latin symbols, memory is used very inefficiently. In this case it would
> > > be great to encode the most frequently used symbols in one byte instead
> > > of two or three.
> > >
> > > I propose to introduce BinaryStringEncoder interface that will convert
> > > strings to byte arrays and back, and make it pluggable via
> > > BinaryConfiguration. This will allow users to plug in any encoding
> > > algorithms based on their requirements.
> > >
> > > Thoughts?
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-5655
> > >
> > > -Val
> > >
> >
>
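
For reference, the BinaryStringEncoder idea quoted above could look roughly like the sketch below. The thread only names the interface, so the method names `encode` and `decode` and the `CharsetStringEncoder` class are assumptions made for illustration, not the API from IGNITE-5655. How an encoder would be registered on BinaryConfiguration is not described in the thread and is left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical shape of the proposed interface. The thread names it but does
 * not define its methods, so encode/decode are assumptions.
 */
interface BinaryStringEncoder {
    /** Converts a string to the bytes that would be stored. */
    byte[] encode(String str);

    /** Restores a string from bytes produced by encode. */
    String decode(byte[] bytes);
}

/** Illustrative implementation that delegates to any JDK Charset. */
final class CharsetStringEncoder implements BinaryStringEncoder {
    private final Charset charset;

    CharsetStringEncoder(Charset charset) {
        this.charset = charset;
    }

    @Override
    public byte[] encode(String str) {
        return str.getBytes(charset);
    }

    @Override
    public String decode(byte[] bytes) {
        return new String(bytes, charset);
    }
}
```

With this sketch, `new CharsetStringEncoder(StandardCharsets.UTF_16LE)` stores each Chinese character in two bytes where UTF-8 needs three. A single-byte charset such as windows-1251, on JVMs that provide it, would store each Cyrillic character in one byte, which is the saving the proposal is after.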
