Also MERGE.

2016-09-29 2:10 GMT+03:00 Denis Magda <dma...@gridgain.com>:
> You need a hash code only for INSERT operation, right?
>
> —
> Denis
>
>> On Sep 28, 2016, at 3:47 PM, Alexander Paschenko 
>> <alexander.a.pasche...@gmail.com> wrote:
>>
>> But what if the user works from some kind of console and just types
>> the queries in full as text, without binding params via JDBC or
>> something similar? What if there's no binary object? I don't see why we
>> should keep the user from usual cache gets in this case. I really like
>> the idea of supplying the values of distinct fields, thus freeing the
>> user of the need to mess with objects and builders, AND then just
>> calculating hash code as suggested before - say, via explicitly
>> listing participating fields in XML or by marking them with transient
>> keyword or some annotation.
>> Actually, I believe that's the only case when we need to generate any
>> hash codes - when the class is present, we can just get the hash code
>> from its own hashCode implementation. When there's no class, we generate.
>> And all that is solely for SQL. For the rest - just throw an exception
>> when there's no hash code manually set for binary object. I don't see
>> why we should try to generate anything when the user already is using
>> Ignite in full, not just via limited interface of SQL.
>>
>> 2016-09-29 0:31 GMT+03:00 Denis Magda <dma...@gridgain.com>:
>>> Hmm, this is a good question.
>>>
>>> If a user doesn’t provide a _key when an INSERT is executed, to me it means
>>> that he is not going to use the key later for cache.get/put, DELETE, UPDATE
>>> and other possible operations, simply because he doesn’t know how to
>>> reconstruct the key back in his code. If he wants to use the primary key in
>>> the rest of the operations then he must provide it at INSERT time.
>>>
>>> Do we need this key only for the case when an object is being inserted into a
>>> cache? If so, I would auto-generate a key using ‘long’ as the key type. I
>>> do remember that we provided auto-generation for the Spark module in some
>>> way that may be useful here.
>>>
>>> —
>>> Denis
>>>
>>>> On Sep 28, 2016, at 9:53 AM, Alexander Paschenko 
>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>
>>>> Denis,
>>>>
>>>> That's not what I was asking about.
>>>> Currently the DML implementation allows for dynamic instantiation of keys.
>>>> In other words, the user does not have to provide a value for the
>>>> object-typed _key column - instead, he may supply just the field values
>>>> from which _key will be dynamically instantiated/binary built. And that's
>>>> the whole point of this discussion as I see it: what do we do when we have
>>>> a classless, binary-built key that we construct ourselves inside the SQL
>>>> engine and don't know how to compute a hash code for it?
>>>>
>>>> - Alex
>>>>
>>>> 2016-09-28 19:48 GMT+03:00 Denis Magda <dma...@gridgain.com>:
>>>>> Alexander,
>>>>>
>>>>> As I understand it, if we have a key without a class then it will be
>>>>> constructed using a BinaryBuilder instance, and it’s the user's
>>>>> responsibility to set the hash code at the end with the
>>>>> BinaryBuilder.hashCode method. Sure, all these cases must be
>>>>> well-documented in both the Javadoc and the Apache Ignite documentation.
>>>>>
>>>>> —
>>>>> Denis
>>>>>
>>>>>> On Sep 28, 2016, at 9:33 AM, Alexander Paschenko 
>>>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>>>
>>>>>> Dmitry, Denis,
>>>>>>
>>>>>> OK, but I think it's necessary to address also the cases when there's
>>>>>> no actual class for the key, and its fields are simply declared in
>>>>>> XML. In this case, there are no fields to be marked transient. What do
>>>>>> we do then? List transient fields in XML separately?
>>>>>>
>>>>>> - Alex
>>>>>>
>>>>>> 2016-09-28 4:15 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
>>>>>>> Agree with Denis.
>>>>>>>
>>>>>>> - by default, all non-transient key fields should participate in the
>>>>>>> hashcode generation
>>>>>>> - when working on DDL, then the primary key fields should participate in
>>>>>>> the hashcode
>>>>>>> - we should add a resolver to override the default behavior (please
>>>>>>> propose the interface in Jira)
>>>>>>> - we should print out a warning, only once per type, that the hashcode
>>>>>>> has been automatically generated, stating which fields and which formula
>>>>>>> were used
>>>>>>>
>>>>>>> D.
>>>>>>>
>>>>>>> On Tue, Sep 27, 2016 at 5:42 PM, Denis Magda <dma...@gridgain.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Alexander,
>>>>>>>>
>>>>>>>> Vladimir’s proposal sounds reasonable to me. However, we must keep in
>>>>>>>> mind one important thing. Binary objects were designed to address the
>>>>>>>> following disadvantages of a regular serializer, such as the optimized
>>>>>>>> marshaller:
>>>>>>>> 1. The necessity to deserialize an object on the server side every time
>>>>>>>> it’s needed.
>>>>>>>> 2. The necessity to hold an object in both serialized and deserialized
>>>>>>>> forms on the server node.
>>>>>>>> 3. The necessity to restart the whole cluster each time an object
>>>>>>>> version is changed (a new field is added or an old one is removed).
>>>>>>>> If step 3 has to be performed for the default implementation of the
>>>>>>>> binary resolver just because the resolver has to consider new fields or
>>>>>>>> ignore old ones, then such an implementation sucks. Overall, the default
>>>>>>>> implementation should use reflection, going over all the fields a key
>>>>>>>> has and ignoring the ones that are marked with the “transient” keyword.
>>>>>>>> If a user wants to control the default resolver's logic then he can
>>>>>>>> label all the fields that mustn’t be part of the final hash code value
>>>>>>>> as “transient”. This has to be well-documented for sure.
>>>>>>>>
>>>>>>>> Makes sense?
>>>>>>>>
>>>>>>>> —
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>> On Sep 26, 2016, at 12:40 PM, Alexander Paschenko
>>>>>>>>> <alexander.a.pasche...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello Igniters,
>>>>>>>>>
>>>>>>>>> As DML support is near, it's critical that we agree on how we generate
>>>>>>>>> hash codes for new keys in presence of binary marshaller. Actually,
>>>>>>>>> this discussion isn't new - please see its beginning here:
>>>>>>>>>
>>>>>>>>> http://apache-ignite-developers.2346864.n4.nabble.com/All-BinaryObjects-created-by-BinaryObjectBuilder-stored-at-the-same-partition-by-default-td8042.html
>>>>>>>>>
>>>>>>>>> Still, I'm creating this new thread to make getting to the final
>>>>>>>>> solution as simple and fast as possible.
>>>>>>>>>
>>>>>>>>> Let me remind everyone that the approach that drew the least criticism
>>>>>>>>> was the one proposed by Vladimir Ozerov:
>>>>>>>>>
>>>>>>>>> <quote>
>>>>>>>>> I think we can do the following:
>>>>>>>>> 1) Add "has hash code" flag as Denis suggested.
>>>>>>>>> 2) If object without a hash code is put to cache, throw an exception.
>>>>>>>>> 3) Add *BinaryEqualsHashCodeResolver *interface.
>>>>>>>>> 4) Add default implementation which will auto-generate hash code.
>>>>>>>>> *Print a warning when auto-generation occurs*, so that user is aware
>>>>>>>>> that he is likely to have problems with normal GETs/PUTs.
>>>>>>>>> 5) Add another implementation which will use encoded string to
>>>>>>>>> calculate a hash code. E.g.
>>>>>>>>> *new BinaryEqualsHashCodeResolver("{a} * 31 + {b}")*.
>>>>>>>>> Originally proposed by Yakov some time ago.
>>>>>>>>> </quote>
>>>>>>>>>
>>>>>>>>> After that, Sergi suggested that instead of a "formula" we keep just a
>>>>>>>>> list of the "fields" that participate in hash code evaluation, and
>>>>>>>>> with that list, we simply calculate hash code just like IDE does -
>>>>>>>>> with all its bit shifts and additions.
>>>>>>>>>
>>>>>>>>> I'm planning on settling down with this combined Vlad-Sergi approach.
>>>>>>>>> Any objections?
>>>>>>>>>
>>>>>>>>> And an extra question I have: Vlad, you suggest both that we throw an
>>>>>>>>> exception when the hash code is absent and that we might generate it as
>>>>>>>>> a last resort. Do I understand you correctly that you suggest
>>>>>>>>> generating a hash code only in the context of SQL, but throwing an
>>>>>>>>> exception for keys without hash codes on an ordinary put?
>>>>>>>>>
>>>>>>>>> And yes, built-in hash codes for JDK types are supported, as well as
>>>>>>>>> items 1-2 from Vlad's list (the flag and its presence check are covered
>>>>>>>>> by the already fixed issue IGNITE-3633).
>>>>>>>>>
>>>>>>>>> - Alex
>>>>>>>>
>>>>>>>>
>>>>>
>>>
>
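
For reference, the field-list approach discussed in the thread (hash the explicitly listed key fields the way an IDE-generated hashCode does) could look roughly like the sketch below. It is plain Java and does not use any Ignite API. The class name FieldListHash, the map-based key representation, and the 31-multiplier formula are assumptions made for illustration, not the implementation agreed on in the thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Ignite: hashes a classless key that is
// represented here as a simple field-name -> value map.
final class FieldListHash {
    // Combines the hash codes of the listed fields in order, using the
    // common "31 * h + fieldHash" scheme that IDEs generate.
    // Fields not named in hashFields do not affect the result.
    // A missing or null field contributes 0 (Objects.hashCode(null) == 0).
    static int hash(Map<String, Object> fields, List<String> hashFields) {
        int h = 1;
        for (String name : hashFields) {
            h = 31 * h + Objects.hashCode(fields.get(name));
        }
        return h;
    }

    public static void main(String[] args) {
        Map<String, Object> key = new LinkedHashMap<>();
        key.put("a", 1);
        key.put("b", 2);
        key.put("comment", "not part of the hash");

        // Only "a" and "b" participate, as if they were listed in XML.
        System.out.println(hash(key, List.of("a", "b")));
    }
}
```

With this scheme, two keys built from the same values of the listed fields get the same hash code no matter what the remaining fields contain, which is the property needed for a key inserted via SQL to be found again by an ordinary cache get.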
