Also MERGE.
2016-09-29 2:10 GMT+03:00 Denis Magda <dma...@gridgain.com>:
> You need a hash code only for INSERT operation, right?
>
> --
> Denis
>
>> On Sep 28, 2016, at 3:47 PM, Alexander Paschenko
>> <alexander.a.pasche...@gmail.com> wrote:
>>
>> But what if the user works from some kind of console and just types
>> the queries as text in full and does not bind params via JDBC or
>> something alike? What if there's no binary object? I don't see why we
>> should keep the user from usual cache gets in this case. I really like
>> the idea of supplying the values of distinct fields, thus freeing the
>> user of the need to mess with objects and builders, AND then just
>> calculating hash code as suggested before - say, via explicitly
>> listing participating fields in XML or by marking them with transient
>> keyword or some annotation.
>> Actually, I believe that's the only case when we need to generate any
>> hash codes - when the class is present, we can just get hash code from
>> its implementation of its method. When there's no class, we generate.
>> And all that is solely for SQL. For the rest - just throw an exception
>> when there's no hash code manually set for binary object. I don't see
>> why we should try to generate anything when the user already is using
>> Ignite in full, not just via limited interface of SQL.
>>
>> 2016-09-29 0:31 GMT+03:00 Denis Magda <dma...@gridgain.com>:
>>> Hmm, this is a good question.
>>>
>>> If a user doesn’t provide a _key when an INSERT is executed for me it means
>>> that he is not going to use the key later for cache.get/put, DELETE, UPDATE
>>> and other possible operation simply because he doesn’t know how to
>>> reconstruct the key back in his code. If he wants to use the primary key in
>>> the rest of operations then he must provide it at INSERT time.
>>>
>>> Do we need this key only for a case when an object is being inserted into a
>>> cache? If it’s so I would auto-generate a key using ‘long’ as a key type.
>>> I do remember that we provided auto-generation for the Spark module in
>>> some way that may be useful here.
>>>
>>> --
>>> Denis
>>>
>>>> On Sep 28, 2016, at 9:53 AM, Alexander Paschenko
>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>
>>>> Denis,
>>>>
>>>> That's not what I was asking about.
>>>> Currently the DML implementation allows for dynamic instantiation of keys,
>>>> in other words, the user does not have to provide a value for the
>>>> object-typed _key column - instead, he may supply just field values based
>>>> on which _key will be dynamically instantiated/binary built. And that's the
>>>> whole point of this discussion as I see it: what to do when we've
>>>> binary built a classless key that we build ourselves inside the SQL engine
>>>> and don't know how to compute a hash code for it?
>>>>
>>>> - Alex
>>>>
>>>> 2016-09-28 19:48 GMT+03:00 Denis Magda <dma...@gridgain.com>:
>>>>> Alexander,
>>>>>
>>>>> As I guess, if we have a key without a class then it will be constructed
>>>>> using a BinaryBuilder instance and it’s the user's responsibility to set
>>>>> the hash code at the end with the BinaryBuilder.hashCode method. Sure, all
>>>>> these cases must be well-documented in both Java Doc API and Apache Ignite
>>>>> documentation.
>>>>>
>>>>> --
>>>>> Denis
>>>>>
>>>>>> On Sep 28, 2016, at 9:33 AM, Alexander Paschenko
>>>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>>>
>>>>>> Dmitry, Denis,
>>>>>>
>>>>>> OK, but I think it's necessary to address also the cases when there's
>>>>>> no actual class for the key, and its fields are simply declared in
>>>>>> XML. In this case, there are no fields to be marked transient. What do
>>>>>> we do then? List transient fields in XML separately?
>>>>>>
>>>>>> - Alex
>>>>>>
>>>>>> 2016-09-28 4:15 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
>>>>>>> Agree with Denis.
>>>>>>>
>>>>>>> - by default, all non-transient key fields should participate in the
>>>>>>> hashcode generation
>>>>>>> - when working on DDL, then the primary key fields should participate in
>>>>>>> the hashcode
>>>>>>> - we should add a resolver to override the default behavior (please
>>>>>>> propose the interface in Jira)
>>>>>>> - we should print out a warning, only once per type, that the hashcode
>>>>>>> has been automatically generated, and based on which fields and which
>>>>>>> formula
>>>>>>>
>>>>>>> D.
>>>>>>>
>>>>>>> On Tue, Sep 27, 2016 at 5:42 PM, Denis Magda <dma...@gridgain.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Alexander,
>>>>>>>>
>>>>>>>> Vladimir’s proposal sounds reasonable to me. However we must keep in
>>>>>>>> mind one important thing. Binary objects were designed to address the
>>>>>>>> following disadvantages a regular serializer, like optimized
>>>>>>>> marshaller, has:
>>>>>>>> 1. necessity to deserialize an object on a server side every time it’s
>>>>>>>> needed.
>>>>>>>> 2. necessity to hold an object in both serialized and deserialized
>>>>>>>> forms on the server node.
>>>>>>>> 3. necessity to restart the whole cluster each time an object version
>>>>>>>> is changed (new field is added or an old one is removed).
>>>>>>>> If it is necessary to perform step 3 for a default implementation of
>>>>>>>> the binary resolver just because the resolver has to consider new
>>>>>>>> fields or ignore old ones then such an implementation sucks. Overall,
>>>>>>>> the default implementation should use reflection, going over all the
>>>>>>>> fields a key has and ignoring the ones that are marked with the
>>>>>>>> “transient” keyword. If a user wants to control the default resolver's
>>>>>>>> logic then he can label all the fields that mustn’t be part of the
>>>>>>>> final hash code value with “transient”. This has to be well-documented
>>>>>>>> for sure.
>>>>>>>>
>>>>>>>> Makes sense?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>> On Sep 26, 2016, at 12:40 PM, Alexander Paschenko
>>>>>>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello Igniters,
>>>>>>>>>
>>>>>>>>> As DML support is near, it's critical that we agree on how we generate
>>>>>>>>> hash codes for new keys in presence of binary marshaller. Actually,
>>>>>>>>> this discussion isn't new - please see its beginning here:
>>>>>>>>>
>>>>>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/All-BinaryObjects-created-by-BinaryObjectBuilder-stored-at-the-same-partition-by-default-td8042.html
>>>>>>>>>
>>>>>>>>> Still, I'm creating this new thread to make getting to the final
>>>>>>>>> solution as simple and fast as possible.
>>>>>>>>>
>>>>>>>>> I remind everyone that the approach that drew the least criticism was
>>>>>>>>> the one proposed by Vladimir Ozerov:
>>>>>>>>>
>>>>>>>>> <quote>
>>>>>>>>> I think we can do the following:
>>>>>>>>> 1) Add "has hash code" flag as Denis suggested.
>>>>>>>>> 2) If object without a hash code is put to cache, throw an exception.
>>>>>>>>> 3) Add *BinaryEqualsHashCodeResolver* interface.
>>>>>>>>> 4) Add default implementation which will auto-generate hash code.
>>>>>>>>> *Print a warning when auto-generation occurs*, so that user is aware
>>>>>>>>> that he is likely to have problems with normal GETs/PUTs.
>>>>>>>>> 5) Add another implementation which will use encoded string to
>>>>>>>>> calculate a hash code. E.g. *new BinaryEqualsHashCodeResolver("{a} *
>>>>>>>>> 31 + {b}")*. Originally proposed by Yakov some time ago.
>>>>>>>>> </quote>
>>>>>>>>>
>>>>>>>>> After that, Sergi suggested that instead of a "formula" we keep just a
>>>>>>>>> list of the "fields" that participate in hash code evaluation, and
>>>>>>>>> with that list, we simply calculate hash code just like IDE does -
>>>>>>>>> with all its bit shifts and additions.
>>>>>>>>>
>>>>>>>>> I'm planning on settling down with this combined Vlad-Sergi approach.
>>>>>>>>> Any objections?
>>>>>>>>>
>>>>>>>>> And an extra question I have: Vlad, you suggest that we both throw an
>>>>>>>>> exception on hash code absence and that we might generate it as the
>>>>>>>>> last resort. Do I understand you correctly that you suggest generating
>>>>>>>>> random code only in context of SQL, but throw exception for keys
>>>>>>>>> without codes on ordinary put?
>>>>>>>>>
>>>>>>>>> And yes, built-in hash codes for JDK types are supported as well as
>>>>>>>>> items 1-2 from Vlad's list (there's the already fixed issue
>>>>>>>>> IGNITE-3633 for the flag and its presence check).
>>>>>>>>>
>>>>>>>>> - Alex
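
For reference, a minimal sketch of the field-list idea discussed in this thread: hash a classless key from a configured list of field names, combining the values the way a generated hashCode() typically does. This is plain Java and does not use any Ignite API. The class name FieldListHash, the method hash, and the sample field names are hypothetical, and the exact formula (seed 1, multiplier 31) is an assumption, since the thread does not settle on one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper, not part of Ignite: hashes a classless key from a configured field list. */
final class FieldListHash {
    private FieldListHash() {
    }

    /**
     * Combines the hash codes of the listed fields, in list order, as
     * result = 31 * result + hash(value), starting from 1.
     * Fields that are not listed do not affect the result.
     */
    static int hash(Map<String, ?> fieldValues, List<String> hashFields) {
        int result = 1;
        for (String name : hashFields) {
            // Objects.hashCode returns 0 for null, so an absent or null field is still well defined.
            result = 31 * result + Objects.hashCode(fieldValues.get(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Field values as they might arrive from an INSERT without a key class.
        Map<String, Object> key = new LinkedHashMap<>();
        key.put("a", 1);
        key.put("b", "x");
        key.put("cachedAt", 123456789L); // not listed below, so it plays the role of a "transient" field

        List<String> hashFields = Arrays.asList("a", "b");
        System.out.println(hash(key, hashFields));
    }
}
```

Because the result depends only on the listed fields and their order, the same list has to be used wherever the key is rebuilt, otherwise a later cache get would compute a different hash.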