Re: DBD::mysql path forward

2017-09-13 Thread H.Merijn Brand
On Wed, 13 Sep 2017 13:27:58 -0700, Darren Duncan wrote:

> On 2017-09-13 12:58 PM, Dan Book wrote:
> > On Wed, Sep 13, 2017 at 3:53 AM, Peter Rabbitson wrote:
> >
> > On 09/12/2017 07:12 PM, p...@cpan.org wrote:
> > And here is promised script:
> > 
> >
> > The script side-steps showcasing the treatment of BLOB/BYTEA columns, which
> > was one of the main ( albeit not the only ) reason the userbase lost data.
> >
> > Please extend the script with a BLOB/BYTEA test.
> >
> > I'm not sure how to usefully make such a script, since correct insertion of
> > BLOB data (binding with the SQL_BLOB type or similar) would work correctly
> > both before and after the fix.
> 
> Perhaps the requirement of the extra tests is to ensure that BLOB/BYTEA data
> is NOT mangled during input or output, that on input any strings with a true
> utf8 flag are rejected and that on output any strings have a false utf8 flag.
> Part of the idea is to give regression testing that changes regarding Unicode
> handling with text don't inadvertently break blob handling. -- Darren Duncan

BYTE/BLOB/TEXT tests require three types of data

• Pure ASCII
• Correct UTF-8 (with complex combinations)
• Pure binary
  - random bytes ranging 0x00..0xFF
  - Images (png, jpg, gif, tiff)

None of those is hard to generate.

1. create two tables with a field of the type to check
2. insert the data in table 1
3. use SQL to copy the data to table 2
4. extract data from table 2
5. compare to original data
6. drop tables
7. goto 1

If MySQL supports different ways to do this, test all of them (values
inlined in the SQL string, placeholders, bind_columns, bind_params,
other ...)
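
The numbered procedure above can be sketched roughly as follows. This is
only an illustration, not the test script discussed in this thread: it is
written in Python against the standard-library sqlite3 module as a stand-in
for DBI and DBD::mysql, and the table names t1/t2, the column name payload
and the helper round_trip are made up for the example. Image files are left
out; they would simply be read from disk as bytes and added to the samples.

```python
import os
import sqlite3


def round_trip(conn, col_type, samples):
    """Insert samples into t1, copy them to t2 in SQL, read back, compare."""
    cur = conn.cursor()
    # Step 1: two tables with a field of the type to check.
    cur.execute(f"CREATE TABLE t1 (id INTEGER PRIMARY KEY, payload {col_type})")
    cur.execute(f"CREATE TABLE t2 (id INTEGER PRIMARY KEY, payload {col_type})")
    try:
        # Step 2: insert the data into table 1 using placeholders.
        cur.executemany(
            "INSERT INTO t1 (id, payload) VALUES (?, ?)",
            list(enumerate(samples)),
        )
        # Step 3: copy the data to table 2 purely in SQL.
        cur.execute("INSERT INTO t2 (id, payload) SELECT id, payload FROM t1")
        # Step 4: extract the data from table 2.
        rows = cur.execute("SELECT payload FROM t2 ORDER BY id").fetchall()
        # Step 5: compare to the original data (str != bytes in Python,
        # so a text/binary mix-up also fails the comparison).
        return [row[0] for row in rows] == list(samples)
    finally:
        # Step 6: drop the tables so the next column type starts clean.
        cur.execute("DROP TABLE t1")
        cur.execute("DROP TABLE t2")


samples = [
    "pure ASCII",                    # pure ASCII text
    "cafe\u0301 \u00e9 \U0001F600",  # UTF-8 text with a combining mark
    bytes(range(256)),               # every byte value 0x00..0xFF
    os.urandom(4096),                # random binary data
]

conn = sqlite3.connect(":memory:")
# Step 7 ("goto 1"): repeat the whole cycle for each column type.
results = {col_type: round_trip(conn, col_type, samples)
           for col_type in ("TEXT", "BLOB")}
print(results)
```

A real test for DBD::mysql would repeat the insert step once per binding
style listed above rather than only with placeholders.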

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.27   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/   http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/



Re: DBD::mysql path forward

2017-09-13 Thread Darren Duncan

On 2017-09-13 12:58 PM, Dan Book wrote:
> On Wed, Sep 13, 2017 at 3:53 AM, Peter Rabbitson wrote:
>> On 09/12/2017 07:12 PM, p...@cpan.org wrote:
>>> And here is promised script:
>>
>> The script side-steps showcasing the treatment of BLOB/BYTEA columns, which
>> was one of the main ( albeit not the only ) reason the userbase lost data.
>>
>> Please extend the script with a BLOB/BYTEA test.
>
> I'm not sure how to usefully make such a script, since correct insertion of
> BLOB data (binding with the SQL_BLOB type or similar) would work correctly
> both before and after the fix.


Perhaps the requirement of the extra tests is to ensure that BLOB/BYTEA data is 
NOT mangled during input or output, that on input any strings with a true utf8 
flag are rejected and that on output any strings have a false utf8 flag.  Part 
of the idea is to give regression testing that changes regarding Unicode 
handling with text don't inadvertently break blob handling. -- Darren Duncan
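
The two checks described above (reject character strings on input, get only
byte strings back on output) might look like this as a regression test.
This is a loose analogy under stated assumptions, not DBD::mysql code:
Python has no utf8 flag, so the str/bytes distinction stands in for it, the
standard-library sqlite3 module stands in for the driver, and the names
insert_blob, fetch_blobs and the table blobs are hypothetical.

```python
import sqlite3


def insert_blob(cur, value):
    """Refuse character strings so only raw bytes reach the BLOB column."""
    if not isinstance(value, (bytes, bytearray, memoryview)):
        raise TypeError("blob input must be bytes, not " + type(value).__name__)
    cur.execute("INSERT INTO blobs (payload) VALUES (?)", (bytes(value),))


def fetch_blobs(cur):
    """Return all stored blobs, insisting that each one comes back as bytes."""
    rows = cur.execute("SELECT payload FROM blobs ORDER BY rowid").fetchall()
    values = [row[0] for row in rows]
    for value in values:
        if not isinstance(value, bytes):
            raise TypeError("blob output was decoded as text")
    return values


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE blobs (payload BLOB)")

raw = bytes(range(256))
insert_blob(cur, raw)           # byte string: accepted

try:
    insert_blob(cur, "caf\u00e9")  # character string: must be rejected
    rejected = False
except TypeError:
    rejected = True

stored = fetch_blobs(cur)
print(rejected, stored == [raw])
```

Run alongside the text tests, a check of this kind would flag a Unicode
change that starts encoding or decoding blob data.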


Re: DBD::mysql path forward

2017-09-13 Thread Peter Rabbitson

On 09/13/2017 09:58 PM, Dan Book wrote:
> On Wed, Sep 13, 2017 at 3:53 AM, Peter Rabbitson wrote:
>> On 09/12/2017 07:12 PM, p...@cpan.org wrote:
>>> On Tuesday 12 September 2017 12:27:25 p...@cpan.org wrote:
>>>> To prove fact that other DBI drivers (e.g. Pg or SQLite) had fixed
>>>> similar/same UTF-8 issue as MySQL has and behave Perl-correctly, I
>>>> would provide test cases so you would see difference between Pg,
>>>> SQLite and mysql DBI drivers.
>>>
>>> And here is promised script:
>>
>> The script side-steps showcasing the treatment of BLOB/BYTEA
>> columns, which was one of the main ( albeit not the only ) reason
>> the userbase lost data.
>>
>> Please extend the script with a BLOB/BYTEA test.
>
> I'm not sure how to usefully make such a script, since correct insertion
> of BLOB data (binding with the SQL_BLOB type or similar) would work
> correctly both before and after the fix.


If you were to actually try to make the requested addition you'd 
notice that DBD::Pg *DOES NOT* require one to bind with SQL_BLOB. 
For more info please go back to the old ( 2013 ) RT threads on the 
"mysql unicode issue" and read them again carefully.


Please try to actually implement this - it will give you additional 
perspective.


Re: DBD::mysql path forward

2017-09-13 Thread Dan Book
On Wed, Sep 13, 2017 at 3:53 AM, Peter Rabbitson wrote:

> On 09/12/2017 07:12 PM, p...@cpan.org wrote:
>
>> On Tuesday 12 September 2017 12:27:25 p...@cpan.org wrote:
>>
>>> To prove fact that other DBI drivers (e.g. Pg or SQLite) had fixed
>>> similar/same UTF-8 issue as MySQL has and behave Perl-correctly, I
>>> would provide test cases so you would see difference between Pg,
>>> SQLite and mysql DBI drivers.
>>>
>>
>> And here is promised script:
>>
>>
> 
>
> The script side-steps showcasing the treatment of BLOB/BYTEA columns,
> which was one of the main ( albeit not the only ) reason the userbase lost
> data.
>
> Please extend the script with a BLOB/BYTEA test.
>

I'm not sure how to usefully make such a script, since correct insertion of
BLOB data (binding with the SQL_BLOB type or similar) would work correctly
both before and after the fix.

-Dan


Re: DBD::mysql path forward

2017-09-13 Thread Darren Duncan

On 2017-09-13 6:31 AM, p...@cpan.org wrote:
> On Tuesday 12 September 2017 11:32:36 Darren Duncan wrote:
>> Regardless, following point 2, mandate that all Git pull requests are made
>> against the new 5.x master; the 4.x legacy branch would have no commits
>> except minimal back-porting.
>
> New pull requests are by default created against the "master" branch. So if
> 5.x development happens in "master" and 4.x in e.g. "legacy-4.0",
> then no changes are needed.
>
> But on GitHub it is not possible to prevent users from creating pull
> requests against a non-default branch. When creating a pull request there
> is a button which opens a dialog to change the target branch.


When I said "mandate" I meant as a matter of project policy, not on what GitHub 
enforces.  Strictly speaking there are times where a pull request against the 
legacy branch is appropriate. -- Darren Duncan


Re: DBD::mysql path forward

2017-09-13 Thread pali
On Tuesday 12 September 2017 11:32:36 Darren Duncan wrote:
> On 2017-09-12 11:05 AM, p...@cpan.org wrote:
> >On Tuesday 12 September 2017 19:00:59 Darren Duncan wrote:
> >>I strongly recommend that another thing happen, which is
> >>re-versioning DBD::mysql to 5.0.
> >>
> >>1. From now on, DBD::mysql versions 4.x would essentially be frozen
> >>at 4.041/4.043.
> >>
> >>2. From now on, DBD::mysql versions 5.x and above would be where all
> >>active development occurs, would be the only place where Unicode
> >>handling is fixed and new MySQL versions and features are supported,
> >>and where other features are added.
> >
> >I'm fine with it. Basically it means reverting to pre-4.043 code and
> >continue *git* development there with increased version to 5.x. And once
> >code is ready it can be released to cpan as normal (non-engineering) 5.x
> >release.
> 
> Yes, as you say.  With respect to Git I propose doing this immediately:
> 
> 1. Create a Git tag/branch off of 4.043 which is the 4.x legacy support 
> branch.

This is up to Patrick.

> 2. Revert Git master to the pre-4.043 code and then follow that with a
> commit to master that changes the DBD::mysql version to 5.0.

If everybody agrees with this step, I can prepare a pull request which
reverts the tree to this state, plus re-applies/rebases the commits which
were newly accepted.

> Regardless, following point 2, mandate that all Git pull requests are made
> against the new 5.x master; the 4.x legacy branch would have no commits
> except minimal back-porting.

New pull requests are by default created against the "master" branch. So if
5.x development happens in "master" and 4.x in e.g. "legacy-4.0",
then no changes are needed.

But on GitHub it is not possible to prevent users from creating pull
requests against a non-default branch. When creating a pull request there
is a button which opens a dialog to change the target branch.


Re: DBD::mysql path forward

2017-09-13 Thread Peter Rabbitson

On 09/12/2017 07:12 PM, p...@cpan.org wrote:
> On Tuesday 12 September 2017 12:27:25 p...@cpan.org wrote:
>> To prove fact that other DBI drivers (e.g. Pg or SQLite) had fixed
>> similar/same UTF-8 issue as MySQL has and behave Perl-correctly, I
>> would provide test cases so you would see difference between Pg,
>> SQLite and mysql DBI drivers.
>
> And here is promised script:

The script side-steps showcasing the treatment of BLOB/BYTEA columns, 
which was one of the main ( albeit not the only ) reason the userbase 
lost data.


Please extend the script with a BLOB/BYTEA test.


Re: DBD::mysql path forward

2017-09-13 Thread pali
I would suggest communicating with all the people behind CPAN modules which
use mysql_enable_utf8.

https://grep.metacpan.org/search?size=20&q=mysql_enable_utf8&qd=&qft=

They should know whether or not their modules expect the old broken
behavior of DBD::mysql. I would expect that there are modules which need
the correct behavior, as not all Perl people expect such bugs...

They can update their modules to be compatible with both the old buggy
behavior and the new correct one. Months ago I wrote an update for the
DBD::mysql POD documentation which includes an example of how to achieve it:

https://github.com/perl5-dbi/DBD-mysql/pull/119/files

Meanwhile DBD::mysql can try to use the "conflicts" relationship in
META.json to prevent installing any of the above CPAN modules which
depend on the broken DBD::mysql until the maintainer of that module fixes it.

https://metacpan.org/pod/CPAN::Meta::Spec#Prereq-Spec

I have never used "conflicts" in META.json, so I do not know if it
works correctly in all CPAN clients, but according to the documentation it
is a way to ensure that users do not upgrade DBD::mysql to a new
version if they have installed another module which needs the old
DBD::mysql.

Does anybody have experience with "conflicts" in META.json?

Also this would mean preparing a list of all of those CPAN modules and
adding them to the "conflicts" section in DBD::mysql's META.json.

This can be tested e.g. by releasing an engineering/beta version on CPAN.
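
For reference, the shape of such a "conflicts" section can be sketched as
below. This is only an illustration of the structure described in the
CPAN::Meta::Spec prereqs section (phase, then relationship, then module
name mapped to a version range); the module names and version ranges are
invented, the helper conflicts_prereqs is hypothetical, and Python is used
merely to print the JSON. Whether CPAN clients honor the relationship is
exactly the open question above.

```python
import json


def conflicts_prereqs(modules):
    """Build a META.json 'prereqs' fragment with runtime conflicts.

    modules maps a module name to the version range that is known to
    depend on the old behavior.
    """
    return {"prereqs": {"runtime": {"conflicts": dict(sorted(modules.items()))}}}


# Hypothetical entries: in practice this would be the list of CPAN modules
# found to rely on the old mysql_enable_utf8 behavior.
affected = {
    "Example::ORM::Helper": "< 0.50",
    "Example::App::Storage": "<= 1.02",
}

fragment = conflicts_prereqs(affected)
print(json.dumps(fragment, indent=2))
```

The printed fragment would be merged into the distribution's existing
META.json rather than replacing it.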

On Tuesday 12 September 2017 21:45:11 Patrick M. Galbraith wrote:
> Darren,
> 
> I agree with this as well, with the exception of 4 and 5, keeping 5.0 "pure"
> to the new way of doing things.
> 
> For releases, I think I want to understand what this will mean. Sooner or
> later, a release shows up as a distribution package, installed on the OS,
> and I want to know what the way of communicating that to the users and
> vendors is, so expectations are met. That's another question of "how do we
> handle that?" and "how do we inform an OS packager/vendor what to do?"
> 
> Thank you for the great discussion!
> 
> Patrick
> 
> On 9/12/17 2:05 PM, p...@cpan.org wrote:
> >On Tuesday 12 September 2017 19:00:59 Darren Duncan wrote:
> >>On 2017-09-12 8:54 AM, Dan Book wrote:
> >>>On Tue, Sep 12, 2017 at 11:04 AM, Patrick M. Galbraith wrote:
> >>> Pali,
> >>> Yes, I agree, we'll have to create a fork pre revert and stop
> >>> accepting PRs. How might we allow people time to test the fixes
> >>> to give them time? Just have them use the fork, I would
> >>> assume?
> >>>
> >>>To be clear, this sounds like a branch not a fork. If your plan is
> >>>to reinstate the mysql_enable_utf8 behavior as in 4.042 rather
> >>>than adding a new option for this behavior, then branching from
> >>>4.042 seems reasonable to me; but you should be very clear if this
> >>>is your intended approach, as this is what led to many people
> >>>corrupting data as they send blobs to mysql with the same
> >>>mysql_enable_utf8 option, and expect them to accidentally not get
> >>>encoded.
> >>Assuming that broken Unicode handling has been in DBD::mysql for a
> >>long time and that users expect this broken behavior and that fixing
> >>DBD::mysql may break user code making those assumptions...
> >>
> >>I strongly recommend that another thing happen, which is
> >>re-versioning DBD::mysql to 5.0.
> >>
> >>A declared major version change would signify to a lot of people that
> >>there are significant changes from what came before and that they
> >>should be paying closer attention and expecting the possibility that
> >>their code might break unless they make some adjustments.  Without
> >>the major version change they can more easily and reasonably expect
> >>not having compatibility breaks.
> >>
> >>Part and parcel with this is that only DBD::mysql 5.0 would have the
> >>other changes required for compatibility with newer MySQL versions
> >>or features that would be the other more normal-sounding reasons to
> >>use it.
> >>
> >>So I specifically propose the following:
> >>
> >>1. From now on, DBD::mysql versions 4.x would essentially be frozen
> >>at 4.041/4.043.  They would expressly never receive any breaking
> >>changes (but see point 3) and in particular no Unicode handling
> >>changes.  They would expressly never receive any new features.  This
> >>is the option for people whose codebases and environments work now
> >>and want to leave it alone.
> >>
> >>2. From now on, DBD::mysql versions 5.x and above would be where all
> >>active development occurs, would be the only place where Unicode
> >>handling is fixed and new MySQL versions and features are supported,
> >>and where other features are added.  Version 5.0 specifically would
> >>have all of the backwards-breaking changes at once that are
> >>currently planned or anticipated in the short term, in particular
> >>fixing the Unicode.  Anyone who is already making changes to their
> >>environment by moving to a newer MySQL version or want newer feature
> >>support will have to use