Re: Custom string encoding

2017-07-03 Thread Valentin Kulichenko
Yes, this needs to be tested and confirmed. I will work on it. Would be great to get more details about indexes. I'm not sure I understand the limitation there. -Val On Mon, Jul 3, 2017 at 7:21 AM, Dmitriy Setrakyan wrote: > Agree with Valya on the system-wide default.

Re: Custom string encoding

2017-07-03 Thread Dmitriy Setrakyan
Agree with Valya on the system-wide default. We need to have it. Also, are we certain that the encoding will produce 1 byte per character, as UTF-8 does for ASCII, for different languages? Would be nice to test it to confirm, as it has the potential to decrease the Ignite storage space by 2x in certain cases. D. On
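
One way to run the check asked for above is to encode a few sample strings with the JDK's standard charsets and compare byte counts. This is a minimal sketch, not Ignite code: the class name, method name, and sample strings are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8LengthCheck {
    /** Returns the number of bytes the string occupies in the given charset. */
    public static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Hypothetical samples, written as Unicode escapes so the source
        // file encoding does not affect the result.
        String latin = "hello";
        String cyrillic = "\u043f\u0440\u0438\u0432\u0435\u0442"; // Russian "privet", 6 chars
        String chinese = "\u4f60\u597d";                          // Chinese "ni hao", 2 chars

        for (String s : new String[] {latin, cyrillic, chinese}) {
            System.out.printf("chars=%d utf8=%d utf16=%d%n",
                s.length(),
                encodedLength(s, StandardCharsets.UTF_8),
                encodedLength(s, StandardCharsets.UTF_16LE));
        }
    }
}
```

UTF-8 uses 1 byte per character only for ASCII; Cyrillic letters take 2 bytes and common Chinese characters take 3. A 2x saving for Cyrillic text would therefore have to come from a single-byte charset rather than from UTF-8 itself.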

Re: Custom string encoding

2017-07-02 Thread Valentin Kulichenko
Vova, That's actually a good point. Probably that would be enough and there is no need to introduce an abstract encoder. However, I still think it makes sense to specify default encoding in BinaryConfiguration and BinaryTypeConfiguration. -Val On Sun, Jul 2, 2017 at 10:31 AM Vladimir Ozerov

Re: Custom string encoding

2017-07-02 Thread Vladimir Ozerov
Yes, this is exactly what non-UTF8 encodings do. Sun, Jul 2, 2017 at 20:08, Dmitriy Setrakyan : > On Sun, Jul 2, 2017 at 9:50 AM, Vladimir Ozerov > wrote: > > > There is no need for custom encoders, as they are already built into > Java. > > > >
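
The point about built-in single-byte encodings can be illustrated with the JDK's own charset support. This is a sketch under assumptions: the class and method names are hypothetical, and windows-1251 is not among the charsets every Java platform must provide, so the code checks for it first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SingleByteCharsetDemo {
    /** Encodes the string with the named charset. */
    public static byte[] encode(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName));
    }

    /** Decodes the bytes with the named charset. */
    public static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: windows-1251 is installed in this JRE.
        String name = "windows-1251";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this JRE");
            return;
        }

        String text = "\u043f\u0440\u0438\u0432\u0435\u0442"; // 6 Cyrillic characters
        byte[] singleByte = encode(text, name);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("single-byte length: " + singleByte.length);
        System.out.println("UTF-8 length: " + utf8.length);
        System.out.println("round trip ok: " + decode(singleByte, name).equals(text));
    }
}
```

The trade-off is coverage: `String.getBytes(Charset)` silently substitutes the charset's replacement bytes for characters it cannot map, so a single-byte encoding is safe only when the data is known to stay within that charset.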

Re: Custom string encoding

2017-07-02 Thread Dmitriy Setrakyan
On Sun, Jul 2, 2017 at 9:50 AM, Vladimir Ozerov wrote: > There is no need for custom encoders, as they are already built into Java. > Will non-ASCII encodings fit into 1 byte? The whole point here is to save space. > > Sun, Jul 2, 2017 at 19:16, Dmitriy Setrakyan

Re: Custom string encoding

2017-07-02 Thread Vladimir Ozerov
There is no need for custom encoders, as they are already built into Java. Sun, Jul 2, 2017 at 19:16, Dmitriy Setrakyan : > Vladimir, how would you plug in custom encoders in your design? > > On Sat, Jul 1, 2017 at 11:53 PM, Vladimir Ozerov > wrote:

Re: Custom string encoding

2017-07-02 Thread Dmitriy Setrakyan
Vladimir, how would you plug in custom encoders in your design? On Sat, Jul 1, 2017 at 11:53 PM, Vladimir Ozerov wrote: > Valya, > > Personally I vote against this feature. BinaryConfiguration is proven to be > inconvenient, since it has to be configured before node start,

Re: Custom string encoding

2017-07-01 Thread Dmitriy Setrakyan
On Sat, Jul 1, 2017 at 2:24 AM, Sergi Vladykin wrote: > In SQL indexes we may store partial strings and assume them to be in UTF-8, > I don't think this can be abstracted away. But maybe this is not a big > deal if we will still use UTF-8 in indexes. > Sergi, why does

Re: Custom string encoding

2017-07-01 Thread Sergi Vladykin
In SQL indexes we may store partial strings and assume them to be in UTF-8, I don't think this can be abstracted away. But maybe this is not a big deal if we will still use UTF-8 in indexes. Sergi 2017-07-01 10:13 GMT+03:00 Dmitriy Setrakyan : > Val, do you know how we

Re: Custom string encoding

2017-07-01 Thread Dmitriy Setrakyan
Val, do you know how we compare strings in SQL queries? Will we be able to use this encoder? Additionally, I think that the encoder is a bit too abstract. Why not go even further and allow users to create their own ASCII table for encoding? D. On Fri, Jun 30, 2017 at 6:49 PM, Valentin Kulichenko <

Re: Custom string encoding

2017-06-30 Thread Valentin Kulichenko
Andrey, Can you elaborate more on this? What is your concern? -Val On Fri, Jun 30, 2017 at 6:17 PM Andrey Mashenkov wrote: > Val, > > Looks like it makes sense. > > This will not affect FullText index, as Lucene has its own format for storing > data. > > But.. would it be

Re: Custom string encoding

2017-06-30 Thread Andrey Mashenkov
Val, Looks like it makes sense. This will not affect FullText index, as Lucene has its own format for storing data. But.. would it be compatible with H2 indexing? I doubt it. On Jul 1, 2017 at 2:27, "Valentin Kulichenko" < valentin.kuliche...@gmail.com> wrote: > Folks, > > Currently binary

Custom string encoding

2017-06-30 Thread Valentin Kulichenko
Folks, Currently the binary marshaller always encodes strings in UTF-8. However, sometimes it can be useful to customize this. For example, if data contains a lot of Cyrillic, Chinese or other characters, but not so many Latin characters, memory is used very inefficiently. In this case it would be great to