Thank you for your concern, I completely understand.

We have no intention of releasing anything that would do this (data corruption), and testing will ensure this. The main objective here is to bring DBD::mysql on par with all the other drivers. The whole idea behind a driver that DBI can use is that code should work the same regardless of the underlying RDBMS. Having worked with other languages in the last few years (PDO, Go/Gorm, Python, ODBC, etc.), it's something I want for Perl and MySQL as well.

Regards,

Patrick

On 9/19/17 12:10 PM, Darren Duncan wrote:
What Night Light's post says to me is that there is high risk of causing data corruption if any changes are made under the DBD::mysql name where DBD::mysql has not been exhaustively tested to guarantee that its behavior is backwards compatible.

This makes a stronger case to me that the DBD::mysql Git master (that which includes the 4.042 changes and any other default breaking changes) should rename the Perl driver package name, I suggest DBD::mysql2 version 5.0, and that any changes not guaranteed backwards compatible for whatever reason go there.

If the Git legacy maintenance branch 4.041/3 can have careful security patches applied that don't require any changes to user code to prevent breakage, it gets them, and otherwise only DBD::mysql2 gets any changes.

By doing what I said, we can be guaranteed that users with no control over how DBD::mysql gets upgraded for them will not introduce corruption simply by upgrading.

-- Darren Duncan

On 2017-09-19 5:46 AM, Night Light wrote:
Dear Perl gurus,

This is my first post. I'm using Perl with great joy, and I'd like to express my
gratitude for all you are doing to keep Perl stable and fun to use.

I'd like to object to re-releasing this version and to discuss how to
make 4.043 backwards compatible instead.
This change will with 100% certainty corrupt all BLOB data written to the database when the developer did not read the release notes before applying the
latest version of DBD::mysql (and change their code accordingly).
Knowing that sysadmins have the habit of not always reading the release notes of each updated package, the likelihood that this will happen is therefore high. I myself wasn't even shown the release notes as it was a dependency of an
updated package that I applied.
The exposure of this change is big as DBD::mysql affects multiple applications
and many user bases.
I believe deliberately introducing industry wide database corruption is
something that will significantly harm people's confidence in using Perl.
I believe that not providing backwards compatibility is not in line with the Perl policy that has been carefully put together by the community to maintain
the quality of Perl as it is today.
http://perldoc.perl.org/perlpolicy.html#BACKWARD-COMPATIBILITY-AND-DEPRECATION

I therefore believe the only solution is an upgrade that is by default backwards compatible, and where it is the user who decides when to start UTF8 encoding the
input values of a SQL request instead.
If it is too time consuming or too difficult it should be considered to park the
UTF8-encoding "fix" and release a version with the security fix first.

I have the following objections against this release:

1. the upgrade will corrupt more records than it fixes (it does more harm than good)
2. the reason given for not providing backward compatibility ("because it was hard to implement") is not plausible given the level of unwanted side effects.
   This especially knowing that there is already a mechanism in place for the user to signal whether UTF8 encoding is wanted or not (mysql_enable_utf8/mysql_enable_utf8mb4).
3. it costs more resources to coordinate/discuss a "way forward" or options than
to implement a solution that addresses backwards compatibility
4. it is unreasonable to ask for changing existing source knowing that dependent
modules may not be actively maintained or may be proprietary
   It can be argued that such modules should always be maintained but it does not
change the fact that a well-running Perl program becomes unusable
5. it does not inform the user that after upgrading existing code will start
writing corrupt BLOB records
6. it does not inform the user about the fact that a code review of all existing
code is necessary, and how it needs to be changed and tested
7. it does not give the user the option to decide how the BLOBs should be
stored/encoded (opt in)
8. it does not provide backwards compatibility
   By doing so it does not respect the Perl policy that has been carefully put together by the community to maintain the quality of Perl as it is today.
http://perldoc.perl.org/perlpolicy.html#BACKWARD-COMPATIBILITY-AND-DEPRECATION
9. it blocks users from using DBD::mysql upgrades as long as they have not
rewritten their existing code
10. not all users of DBD::mysql can be warned beforehand about the side effects as it is not known which private parties have code that uses DBD::mysql
12. I believe development will go faster when support for backwards
compatibility is addressed
13. having to write 1 extra line for each SQL query value is a monk's job that
will make the module less attractive to use

About forking to DBD::mariadb?:
The primary reason to create such a module is when the communication protocol of
MariaDB has become incompatible with MySQL.
Using this namespace to fix a bug in DBD::mysql does not meet that criterion and causes confusion for developers and unnecessary pollution of the DBD namespace.

---

For people that do not know the impact of the change that is pending to be
committed:
(see Github issue that includes 3 reports of companies that suffered data loss
https://github.com/perl5-dbi/DBD-mysql/issues/117 )

Issue: some UTF8 characters are not properly displayed after retrieval
Cause: SQL query values are not UTF8 encoded when sent to the database but they
are all decoded once retrieved.
Occurrence: Only records with string data that can only be written with UTF8. It can be considered rare as people haven't reported this issue after 10 years of
usage.
Regional impact: Only affects countries whose characters need UTF8 encoding and
only affects string values.
Steps to recover from it: Read string data unencoded and write it encoded.
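
The asymmetry described under "Cause" can be sketched in plain Perl without a database. This is only an illustration under stated assumptions: `fetch_decoded` is a hypothetical stand-in for the driver's read path, and the "raw" write path assumes the driver stores Perl's internal bytes unchanged. It is not the actual DBD::mysql code.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical stand-in for the read path: stored bytes are always
# decoded as UTF-8 when fetched.
sub fetch_decoded {
    my ($stored_bytes) = @_;
    return decode('UTF-8', $stored_bytes);
}

my $name = "caf\x{e9}s";    # character string; the e-acute is code point U+00E9

# Assumed old write path: the single byte 0xE9 is stored as-is.
my $stored_raw = $name;

# Encoded write path: U+00E9 becomes the two bytes 0xC3 0xA9.
my $stored_utf8 = encode('UTF-8', $name);

my $roundtrip_raw  = fetch_decoded($stored_raw);
my $roundtrip_utf8 = fetch_decoded($stored_utf8);

print "raw write round-trips:     ", ($roundtrip_raw  eq $name ? "yes" : "no"), "\n";
print "encoded write round-trips: ", ($roundtrip_utf8 eq $name ? "yes" : "no"), "\n";
```

Only the value that was encoded before being written comes back unchanged, which is why the recovery step is to read the data unencoded and write it back encoded.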

Changes of upgrade pending to be re-released:
SQL query values are UTF8 encoded when sent to the database and UTF8 decoded when
retrieved (including BLOB fields).
BLOB fields will be excluded from encoding only if you specify its data type.

Side effects from installing upgrade:
- BLOB data will be written after UTF8 encoding and will therefore be corrupt
- no possibility to detect if a BLOB field is corrupt or not, unless it is known
when the INSERT/UPDATE took place and when the upgrade was installed
- existing data will still display incorrectly

Occurrence: every INSERT/UPDATE statement will start writing corrupted BLOB data
Regional impact: worldwide
Steps to recover from corrupted BLOBs? You cannot. Your binary blobs are encoded as if they were UTF8 strings. Your binary data is unrecoverable (as in
"gone forever").
If you are a dentist you have to ask your customers to come back to make another
x-ray as the photos taken are gone.
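
How UTF8 encoding changes binary data can be seen in a few lines of Perl. This is a minimal sketch, assuming the driver treats every byte of an untyped parameter as a character and UTF8 encodes it. The sample bytes are the standard 8-byte PNG file signature, and `utf8::encode` stands in for whatever the driver does internally.

```perl
use strict;
use warnings;

# The 8-byte PNG file signature, used here as sample binary data.
my $blob = "\x89PNG\x0d\x0a\x1a\x0a";

# Assumption: each byte is treated as a character and UTF-8 encoded.
# Bytes 0x80-0xFF each become two bytes; bytes below 0x80 are unchanged.
my $written = $blob;
utf8::encode($written);

printf "original: %d bytes (%s)\n", length($blob),    unpack('H*', $blob);
printf "written:  %d bytes (%s)\n", length($written), unpack('H*', $written);
```

The stored value no longer starts with the byte 0x89, so a reader expecting the original bytes will not recognise the file.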

What is asked from the developer to prevent this from happening?
- do not miss reading the release notes before upgrading
- review all source code (including code in other included modules) and
specify the data type of each SQL parameter value
  before: $dbh->do('INSERT INTO test (BLOB1,BLOB2,BLOB3,BLOB4) VALUES(?,?,?,?)',undef,$col1,$col2,$col3,$col4);
  after:  # SQL_BLOB needs: use DBI qw(:sql_types);
          $sth = $dbh->prepare('INSERT INTO test (BLOB1,BLOB2,BLOB3,BLOB4) VALUES(?,?,?,?)');
          $sth->bind_param(1, $col1, SQL_BLOB);
          $sth->bind_param(2, $col2, SQL_BLOB);
          $sth->bind_param(3, $col3, SQL_BLOB);
          ...
          $sth->execute();
  One extra line for each SQL parameter value. This will be a time-consuming monk's task, during which the user will ask why this is necessary when it worked before.
- upgrade scripts need to be written to UTF8 encode existing string data
- retest all source code
