Re: [Drizzle-discuss] PROPOSAL: Change to the Cursor interface

Venu Kalyan Mon, 26 Apr 2010 20:16:18 -0700

Just curious; is there any optimization made in drizzle to handle ON
DUPLICATE KEY UPDATE ? as this is one the feature that many people use
currently; and there is a hit in the performance as the probability of
the duplicates increases...


On Mon, Apr 26, 2010 at 1:11 AM, Roland Bouman <[email protected]> wrote:
> Hi Francesco,
>
> On Mon, Apr 26, 2010 at 12:38 AM, Francesco Riosa
> <[email protected]> wrote:
>>>> 2) for many insert a row is considered to be duplicate even if there are no
>>>> collision with unique indexes (seem to be satisfied by the previous 
>>>> example)
>>>
>>>Well, I think that this is something that MERGE can solve, because you
>>>control the match condition yourself explicitly (in the ON clause).
>>>REPLACE INTO or ON DUPLICATE KEY UPDATE have a more limited matching
>>>condition, that requires one of the unique constraints or primary key
>>>to match.
>>
>> exactly, tough it could be solved for current syntax by the "NO ENFORCE"
>> index property introduced by SQL 2011?
>> first seen in http://petereisentraut.blogspot.com/2010/04/news-from-sql-
>> standard.html
>
> I read the blog - do you mean the part where he mentions new features
> in SQL:2011:
>
> "Constraints can optionally be set to NO ENFORCE.  That means the
> database system won't enforce them but still assumes they are valid,
> for example for optimization."
>
> Remarkable feature. I don't have drafts for the 2011 version of the
> standard, so I can't read up on it, but I am not sure I'm too happy
> with this...
> In as far this is relevant to the matching behaviour, I still think
> the MERGE...ON construct is much more powerful (and clear!) than this.
> (Just my opinion)
>
> kind regarrds.
>
>>
>>>
>>>> to be really useful in these cases the sintax would be something like:
>>>> MERGE INTO table_name USING table_name ON (condition)
>>>> WHEN MATCHED THEN
>>>> *UPDATE SET column1 = value1 [, column2 = value2 ...]*
>>>> WHEN NOT MATCHED THEN
>>>> *INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...*
>>>> ALWAYS SELECT (column1 [, column2 ...])
>>>>
>>>> does it make sense?
>>>
>>>Do you mean, the ALWAYS SELECT would return the matched set as it
>>>looks like after the INSERT/UPDATEs have taken place? (which would
>>>include all values as they are stored in the table, including any
>>>generated values caused by auto_increment columns, column level
>>>defaults, and, in case drizzle would support them, timestamp columns
>>>and triggers)
>>>
>>>If so, then yes, that makes sense to me.
>>
>> Yes the SELECT should return values from the matched row _after_ it has been
>> (eventually) modified or inserted
>>
>>>
>>>
>>>>
>>>>>As we can see merge works like a sub query fashion. and UPDATE and INSERT
>>>>>are part of it (3rd and 5th line).
>>>>>so in effect MERGE statement splits into a SELECT statement (1st line)
>>>>>followed by UPDATE or INSERT.
>>>>
>>>>>Here the advantage is, UPDATE and INSERT are explicit. so no ambiguity
>>>>>remains.
>>>>
>>>>>But this is not the case of REPLACE, need a investigation of the return
>> pair
>>>>>to understand actually what happened inside.
>>>>
>>>>>Am i right?
>>>>
>>>>>Thank you,
>>>>>Jobin.
>>>>
>>>>>On Sun, Apr 25, 2010 at 10:24 PM, Jobin Augustine <[email protected]>
>> wrote:
>>>>
>>>>>> Hi Roland,
>>>>>>
>>>>>> Yes!. MERGE can be used as a replacement for REPLACE. :)
>>>>>> but in practical, it is difficult to write REPLACE using MERGE.
>>>>>> i hope nobody will do that,  just give a try. i bet you will give up :)
>>>>>>
>>>>>> MERGE statement is designed to serve other purposes very efficiently.
>>>>>> where we deal with multiple tables.
>>>>>> For example.
>>>>>>
>>>>>>  1. Warehousing / Reporting systems require "merging" data from
>> normalized
>>>>>> transaction tables into different models
>>>>>>  2. Archiving old data, merge new data into your archive tables very
>>>>>> efficiently.
>>>>>>
>>>>>> I am not against any extra features like REPLACE.
>>>>>> but just conscious about the extra complexity it adds to the code bases
>> and
>>>>>> propages to other plugins (yet to born ones also)
>>>>>> more complexity means more bugs.
>>>>>> referring above proposal, insertRecord returning a pair !
>>>>>>
>>>>>> Thank you,
>>>>>> Jobin.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Apr 25, 2010 at 1:53 PM, Roland Bouman
>>>> <[email protected]>wrote:
>>>>>>
>>>>>>> Actually, after some more reading I now get the impression that MERGE
>>>>>>> INTO does allow for the same flexibility in handling UPDATEs
>>>>>>> differently from the INSERTs, and is in fact even more flexible as it
>>>>>>> allows a separate set to identify the target records.
>>>>>>>
>>>>>>> On Sun, Apr 25, 2010 at 1:14 AM, Roland Bouman <[email protected]>
>>>>>>> wrote:
>>>>>>> > Gah...I replied to Jobin, but forgot to reply to Jay and the list....
>>>>>>> >
>>>>>>> > here goes:
>>>>>>> >
>>>>>>> > Hi Jobin,
>>>>>>> >
>>>>>>> > On Sat, Apr 24, 2010 at 10:20 PM, Jobin Augustine <[email protected]>
>>>>>>> wrote:
>>>>>>> >> is this REPLACE is something so useful?.
>>>>>>> >> are developers are aware and using it to a good extend?
>>>>>>> >
>>>>>>> > In my opinion, it is useful. I don't think there is any statistic on
>>>>>>> > whether developers are making legitimate use of it. But then, this
>>>>>>> > information is also not available for any other SQL feature.
>>>>>>> >
>>>>>>> >> I don't see it in SQL standards and other planets (Postgres,
>>>>>>> Oracle,...)
>>>>>>> >
>>>>>>> > In the SQL standard, the MERGE INTO statement offers similar (although
>>>>>>> > not equivalent) functionality
>>>>>>> >
>>>>>>> > <merge statement> ::=
>>>>>>> >   MERGE INTO <target table> [ [ AS ] <merge correlation name> ]
>>>>>>> >   USING <table reference>
>>>>>>> >   ON <search condition> <merge operation specification>
>>>>>>> >
>>>>>>> > (ISO/IEC 9075-2:2003 (E) 14.9   <merge statement>, page 837)
>>>>>>> >
>>>>>>> > It is similar in that matching existing rows are updated, and if there
>>>>>>> > are no such rows, they will be inserted.
>>>>>>> > Personally I find the standardese particular dense and hard to
>>>>>>> > understand, but this wikipedia entry explains it nicely:
>>>>>>> >
>>>>>>> >
>>>>
>> http://en.wikipedia.org/wiki/Merge_(SQL)<http://en.wikipedia.org/wiki/Merge_%28SQL%29>
>>>>>>> >
>>>>>>> > Personally I like MySQL's REPLACE and ON DUPLICATE KEY constructs
>>>>>>> > better because they don't just merge the data, but allow you a IMO
>>>>>>> > flexible and natural way to handle the updates differently than the
>>>>>>> > inserts. But this is besides your point: the standard offers an
>> "Upsert"
>>>>>>> > construct just like MySQL, but it is called MERGE INTO, and not
>>>>>>> > REPLACE INTO (or INSERT ... ON DUPLICATE KEY UPDATE)
>>>>>>> >
>>>>>>> >> for me, REPLACE is as bad as silent commits (in many planets it
>> happens
>>>>>>> >> before DDL)
>>>>>>> >
>>>>>>> > Ok. Then don't use it :)
>>>>>>> >
>>>>>>> >> all these headache may go if we could discontinue support for
>> mysqlism.
>>>>>>> >
>>>>>>> > That would postpone the problem until drizzle would add
>>>>>>> > support for SQL standard MERGE INTO syntax.
>>>>>>> > In other words, it would need to be solved anyway.
>>>>>>> >
>>>>>>> > kind regards,
>>>>>>> >
>>>>>>> > Roland
>>>>>>> >
>>>>>>> >>
>>>>>>> >> Thank you,
>>>>>>> >> Jobin.
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> _______________________________________________
>>>>>>> >> Mailing list: https://launchpad.net/~drizzle-
>>>> discuss<https://launchpad.net/%7Edrizzle-discuss>
>>>>>>> >> Post to     : [email protected]
>>>>>>> >> Unsubscribe : https://launchpad.net/~drizzle-
>>>> discuss<https://launchpad.net/%7Edrizzle-discuss>
>>>>>>> >> More help   : https://help.launchpad.net/ListHelp
>>>>>>> >>
>>>>>>> >>
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > Roland Bouman
>>>>>>> > http://rpbouman.blogspot.com/
>>>>>>> >
>>>>>>> > Author of "Pentaho Solutions: Business Intelligence and Data
>>>>>>> > Warehousing with Pentaho and MySQL",
>>>>>>> > http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Roland Bouman
>>>>>>> http://rpbouman.blogspot.com/
>>>>>>>
>>>>>>> Author of "Pentaho Solutions: Business Intelligence and Data
>>>>>>> Warehousing with Pentaho and MySQL",
>>>>>>> http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html
>>
>> Thanks for taking the time to explain/extend better the original explanation
>>
>
>
>
> --
> Roland Bouman
> http://rpbouman.blogspot.com/
>
> Author of "Pentaho Solutions: Business Intelligence and Data
> Warehousing with Pentaho and MySQL",
> http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470484322.html
>
> _______________________________________________
> Mailing list: https://launchpad.net/~drizzle-discuss
> Post to     : [email protected]
> Unsubscribe : https://launchpad.net/~drizzle-discuss
> More help   : https://help.launchpad.net/ListHelp
>

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Re: [Drizzle-discuss] PROPOSAL: Change to the Cursor interface

Reply via email to