RE: New NestedHash module needs home

2011-09-21 Thread Byrd, Brendan
I think I'm going to end up going the route of a new module called DBD::Object. 
 While AnyData seems like a good fit, the whole table vs. database thing seems 
to hamper that choice too much.  Besides, if we're talking about multiple 
tables, it's pretty much a database at that point, anyway.

FYI, I'm also (simultaneously) working on a few other DBI modules that you guys 
might be interested in:

DBD::SNMP - This will be a merging between the Net-SNMP module (for OID data) 
and the Net::SNMP/::XS module (for everything else), with built-in multi-device 
support (via a virtual temp table).  If David doesn't hurry up and get those 
dispatch patches in, I guess they will be included here, too.

DBD::FusionTables - Interaction with Google's Fusion Tables API, which has a 
SQL interface.  And when I say "SQL interface", I mean a really stripped-down, 
bare-bones SQL interface.  Still, it can do INSERTs, SELECTs, UPDATEs, etc., so 
let's build an interface and see if it even works with DBIC.

SQL::Statement::Functions::CAST - This is what happens when I say (for the 
thousandth time) "Oh, that should be an easy function to implement!"  Then I 
buried my head in the SQL-99 specs and the madness began.  I am making good headway 
on this one, though.  Depending on what other SQL-99/ODBC functions are already 
in Perl, it may just turn into ::SQL99 and implement nearly everything.
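
To give a flavour of the work involved, here is a minimal, hypothetical sketch of the 
Perl side of one CAST rule (casting to INTEGER); the sub name and error text are 
invented, and wiring it into SQL::Statement's user-defined-function mechanism is 
not shown:

    use strict;
    use warnings;

    # Hypothetical helper: apply the SQL-99 CAST rules for casting a value to
    # INTEGER.  Numeric strings are truncated toward zero; anything else is a
    # cast error (the exact diagnostics are left to the engine).
    sub cast_to_integer {
        my ($value) = @_;
        return undef unless defined $value;        # CAST(NULL AS INTEGER) is NULL
        die "invalid character value for cast: '$value'\n"
            unless $value =~ /^\s*[+-]?\d+(?:\.\d+)?\s*$/;
        return int($value);                        # truncate toward zero
    }

    print cast_to_integer(' -42.9 '), "\n";        # prints -42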

From: dub...@gmail.com [mailto:dub...@gmail.com] On Behalf Of Jeff Zucker
Sent: Tuesday, September 13, 2011 12:57 PM
To: dbi-dev@perl.org
Cc: Byrd, Brendan
Subject: Re: New NestedHash module needs home

On Tue, Sep 13, 2011 at 7:50 AM, Jonathan Leffler 
jonathan.leff...@gmail.com wrote:

On Mon, Sep 12, 2011 at 10:29, Byrd, Brendan 
byr...@insightcom.com wrote:

  I currently have a working and tested model for a nested hash to table
 conversion.  [...]

 AnyData::Format::NestedHash - [...]

 DBD::AnyData::Format::NestedHash - [...]

 DBD::NestedHash - [...]


I'm not sure it fits there; it may be more driver-related than built atop
DBI.  But you didn't mention the DBIx namespace...

It sounds to me like it fits very well in the AnyData/DBD::AnyData namespace 
because it would provide a driver for using DBI against things not usually 
considered to be databases.  Although I gave birth to the AnyDatas, I'm not 
very involved in them at the moment, so if you go that route, you should check 
with Jens Rehsack, who is currently the primary maintainer.

--
Jeff


Re: New NestedHash module needs home

2011-09-21 Thread H.Merijn Brand
On Mon, 12 Sep 2011 17:29:57 +, Byrd, Brendan
byr...@insightcom.com wrote:

 DBD::NestedHash - This could also be its own Perl module within CPAN.
 However, the hash to table conversion is such a thin wrapper around
 DBD::AnyData that it just seems to make more sense to actually tie it
 into that module somehow, so that developers can benefit from the
 integration.

Feel free to loan/borrow/steal from my Tie::Hash::DBD, which supports
nested hashes using a serializer.
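
For anyone curious, a minimal sketch of that approach might look like this (the 
tbl and str option names follow Tie::Hash::DBD's documentation of the time; treat 
them as an assumption to verify against your installed version):

    use strict;
    use warnings;
    use Tie::Hash::DBD;

    # Nested values are serialized with Storable before being stored in the table,
    # so hash-of-hash structures survive the round trip.
    tie my %hash, "Tie::Hash::DBD", "dbi:SQLite:dbname=tied.db", {
        tbl => "t_tie",        # table to store the key/value pairs in
        str => "Storable",     # serializer for nested references
    };

    $hash{route} = { name => "I-94", exits => [ 190, 191, 194 ] };
    print $hash{route}{name}, "\n";   # nested data comes back intact

    untie %hash;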

-- 
H.Merijn Brand  http://tux.nl  Perl Monger  http://amsterdam.pm.org/
using 5.00307 through 5.14 and porting perl5.15.x on HP-UX 10.20, 11.00,
11.11, 11.23 and 11.31, OpenSuSE 10.1, 11.0 .. 11.4 and AIX 5.2 and 5.3.
http://mirrors.develooper.com/hpux/   http://www.test-smoke.org/
http://qa.perl.org  http://www.goldmark.org/jeff/stupid-disclaimers/


Re: New NestedHash module needs home

2011-09-21 Thread Tim Bunce
Would it be an actual DBI driver?

Could you post some examples or docs?

I'm uncomfortable with it having such a generic name.
Two words after the DBD would be better.

Tim.

On Wed, Sep 21, 2011 at 02:58:09AM +, Byrd, Brendan wrote:
 I think I'm going to end up going the route of a new module called 
 DBD::Object.  While AnyData seems like a good fit, the whole table vs. 
 database thing seems to hamper that choice too much.  Besides, if we're 
 talking about multiple tables, it's pretty much a database at that point, 
 anyway.
 
 [...]


Re: New NestedHash module needs home

2011-09-21 Thread Tim Bunce
On Wed, Sep 21, 2011 at 03:17:53PM +, Byrd, Brendan wrote:
 
 It basically takes whatever Perl tree object you throw at it and turns
 it into a database.

Cool.

Given that description, I'd suggest a name like DBD::TreeDataMumble; 
e.g., DBD::TreeDataWrapper with tdw_ as the prefix.
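
Concretely, the connect call quoted below would then look something like this 
(illustrative only; the dbi:TreeDataWrapper: DSN and the tdw_* attribute names 
are hypothetical):

    use DBI;

    my $tree = { map_routes => [ { summary => 'I-94 E' } ] };   # any nested Perl data

    # Hypothetical driver name and attributes, following the tdw_ prefix suggestion.
    my $dbh = DBI->connect('dbi:TreeDataWrapper:', undef, undef, {
        tdw_object         => $tree,
        tdw_1st_table_name => 'map_routes',
    });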

Tim.

 This is achieved with a custom recursive sub to
 normalize the tree, and a thin wrapper around DBD::AnyData to dump the
 tables via ad_import.

 $dbh = DBI->connect('dbi:Object:', undef, undef, {
    obj_object => JSON::Any->jsonToObj($json),
    obj_prefix => 'json',
    obj_1st_table_name => 'map_routes',
    obj_table_rename   => {
       'html_instructions' => 'instructions',
       'southwests'        => 'sw_corners',
       'northeasts'        => 'ne_corners',
    },
 });
 
 [...]


Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
DBI peeps,

Sorry for the delayed response; I've been busy, but I'm looking to reply to this thread 
now.

On Sep 9, 2011, at 8:06 PM, Greg Sabino Mullane wrote:

 One thing I see bandied about a lot is that Perl 5.14 is highly preferred. 
 However, it's not clear exactly what the gains are and how bad 5.12 is 
 compared to 5.14, how bad 5.10 is, how bad 5.8 is, etc. Right now 5.8 is 
 the required minimum for DBI: should we consider bumping this? I know TC 
 would be horrified to see us attempting to talk about Unicode support 
 with a 5.8.1 requirement, but how much of that will affect database 
 drivers? I have no idea myself.

I think I'd just follow TC's recommendations here. DBI should stay compatible 
as far back as is reasonable without unduly affecting further development and 
improvement (not that there's much of that right now). So if proper encoding is 
important to you, use at least 5.12 and prefer 5.14. And if proper encoding is 
not important to you, well, it is, you just don't know it yet.

 Another aspect to think about that came up during some offline DBD::Pg 
 talks was the need to support legacy scripts and legacy data. While the 
 *correct* thing is to blaze forward and use "Do Things Correctly" everywhere, 
 I think we at least need some prominent knobs so that we can maintain 
 backwards compatibility for existing scripts that expect a bunch of 
 Latin1, or need the data to come back in the current, undecoded, 
 un-utf8-flagged way.

Agreed. I suspect the existing behavior should remain the default, with a knob 
to make it "do things correctly", and perhaps a deprecation plan to turn on 
the "correctly" knob by default in a year or so.
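
Purely as a hypothetical illustration of that kind of knob (the attribute name 
below is invented; it is not an existing DBI or DBD option):

    use DBI;

    # Invented attribute: the old behaviour stays the default until a deprecation
    # period has passed, then the "correct" behaviour becomes the default.
    my $dbh = DBI->connect('dbi:Pg:dbname=test', undef, undef, {
        dbi_handle_encoding_correctly => 1,   # opt in to the new behaviour
    });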

Best,

David

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 10, 2011, at 7:44 AM, Lyle wrote:

 Right now 5.8 is the required minimum for DBI: should we consider bumping 
 this?
 
 I know a lot of servers in the wild are still running RHEL5 and its 
 variants, which are stuck on 5.8 in the standard package management. The new 
 RHEL6 only has 5.10...
 So at this time the impact of such a change could be significant.

Yes, which is why we can't just impose a solution on people.

Best,

David

Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 10, 2011, at 3:08 AM, Martin J. Evans wrote:

 I'm not sure any change is required to DBI to support unicode. As far as I'm 
 aware unicode already works with DBI if the DBDs do the right thing.

Right, but the problem is that, IME, none of them do the right thing. As I 
said, I've submitted encoding-related bug reports for every DBD I've used in 
production code. And they all have different interfaces for tweaking things.

 If you stick to the rule that all data Perl receives must be decoded and all 
 data Perl exports must be encoded it works (ignoring any issues in Perl 
 itself).

Er, was there supposed to be a , then … statement there?

 I bow to Tom's experience, but I'm still not sure how that applies to DBI; so 
 long as the interface between the database and Perl always encodes and 
 decodes, the issues Tom describes are all Perl ones - no?

The trouble is that:

1. They don't always encode or decode
2. When they do, they tend to get subtle bits wrong
3. And they all have different interfaces and philosophies for doing so

 Surely Oracle should return the data encoded as you asked for it and if it 
 did not Oracle is broken.
 I'd still like to see this case and then we can see if Oracle is broken and 
 if there is a fix for it.

Oh I don't doubt that Oracle is broken.

 In some places DBD::Oracle does sv_utf8_decode(scalar) or SvUTF8_on(scalar) 
 (depending on your Perl) and in some places it just does SvUTF8_on(scalar). I 
 believe the latter is much quicker as the data is not checked. Many people 
 (myself included) are particularly interested in DBD::Oracle being fast and 
 if all the occurrences were changed to decode I'd patch that out in my copy 
 as I know the data I receive is UTF-8 encoded.

IME it needs an "assume Oracle is broken" knob. That is, I should have the 
option to enforce encoding and decoding, rather than just flipping SvUTF8. And I 
think that such an interface should be standardized in the DBI, along with 
detailed information for driver authors on how to get it right.
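
To make the distinction concrete, here is a small pure-Perl sketch of the two 
behaviours under discussion (Encode::_utf8_on stands in for the XS-level 
SvUTF8_on; the variable names are illustrative):

    use strict;
    use warnings;
    use Encode qw(decode);

    my $good_octets = "caf\xc3\xa9";    # valid UTF-8 for "café"
    my $bad_octets  = "caf\xe9";        # Latin-1 bytes, not valid UTF-8

    # The "enforce" path: decode() checks the octets, so broken input is caught.
    my $text = decode('UTF-8', $good_octets, Encode::FB_CROAK);
    eval { decode('UTF-8', "caf\xe9", Encode::FB_CROAK) };
    print "caught: $@" if $@;

    # The "fast" path: just flip the flag with no validation -- this is what
    # SvUTF8_on amounts to, and the bad bytes go unnoticed.
    Encode::_utf8_on($bad_octets);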

 See above. I'd like the chance to go with speed and take the consequences 
 rather than go with the slower option and know incorrect UTF-8 is spotted.

And maybe that's the default. But I should be able to tell it to be pedantic 
when the data is known to be bad (see, for example, data from an 
SQL_ASCII-encoded PostgreSQL database).

 I thought UTF-8 when used in Perl used the strict definition and utf8 used 
 Perl's looser definition - see 
 http://search.cpan.org/~dankogai/Encode-2.44/Encode.pm#UTF-8_vs._utf8_vs._UTF8

That's right. So if I want to ensure that I'm getting strict encoding in my 
database, it needs to encode and decode, not simply flip SvUTF8.
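
A short sketch of that strict-versus-lax difference in Encode terms (the sample 
byte sequence is just one illustrative case):

    use strict;
    use warnings;
    use Encode ();

    # "\xed\xa0\x80" is the UTF-8-style encoding of the surrogate U+D800: Perl's
    # lax "utf8" dialect lets it through (possibly with a warning, depending on
    # your Encode/Perl version), while strict "UTF-8" rejects it.
    my $lax = Encode::decode('utf8', "\xed\xa0\x80");
    eval { Encode::decode('UTF-8', "\xed\xa0\x80", Encode::FB_CROAK) };
    print "strict UTF-8 rejected it: $@" if $@;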

 Don't DBDs do this now? I know the encoding of the data I receive in 
 DBD::ODBC and decode it when I get it and encode it when I send it and I 
 believe that is what DBD::Oracle does as well. There is one exception in ODBC 
 for drivers which don't truly abide by ODBC spec and send 8 bit data back 
 UTF-8 encoded (see later).

There is no single API for configuring this in the DBI, and I argue there 
should be.

 I've spent a lot of effort getting unicode working in DBD::ODBC (for UNIX and 
 with patches from Alexander Foken for Windows) which is implemented in an 
 awkward fashion in ODBC. I'd like to hear from DBD authors what support they 
 already have and how it is implemented so we can see what ground is already 
 covered and where the problems were.

DBD::Pg's approach is currently broken. Greg is working on fixing it, but for 
compatibility reasons the fix is non-trivial (and the API might be, too). In a 
perfect world DBD::Pg would just always do the right thing, as the database 
tells it what encodings to use when you connect (and *all* data is encoded as 
such, not just certain data types). But the world is not perfect; there's a lot 
of legacy stuff.

Greg, care to add any other details?

 as I remain unconvinced a problem exists other than incorrectly coded DBDs. 
 I'm happy to collate that information. As a start, I'll describe DBD::ODBC:
 
 1. ODBC has 2 sets of APIs, SQLxxxA (each chr is 8 bits) and SQLxxxW (each 
 chr is 16 bits and UCS-2). This is how Microsoft did it and yes I know that 
 does not support all of unicode but code pages get involved too.
 
 2. You select which API you are using with a macro when you compile your 
 application so you cannot change your mind.
 You can in theory call SQLxxxA or SQLxxxW functions directly but if you use 
 SQLxxx you get the A or W depending on what the macro is set to.
 Problem: DBD::ODBC has to be built one way or the other.
 
 3. When using the SQLxxxA functions you can still bind columns/parameters as 
 wide characters but the ODBC driver needs to support this.
 
 4. When using SQLxxxW functions all strings are expected in UCS-2. You can 
 bind columns and parameters as whatever type you like but obviously if you 
 bind a unicode column as SQLCHAR instead of SQLWCHAR you probably get the 

RE: New NestedHash module needs home

2011-09-21 Thread Byrd, Brendan
Depends on your definition of "actual".  Or, for that matter, your definition of 
"Object".

I have some draft code at the beginning of this thread.  It basically takes 
whatever Perl tree object you throw at it and turns it into a database.  This 
is achieved with a custom recursive sub to normalize the tree, and a thin 
wrapper around DBD::AnyData to dump the tables via ad_import.  So it would 
indeed be a DBI object.  I would even tie it to the standard DBD driver code, 
so that the traditional DBI->connect string would work.  For example:

$dbh = DBI->connect('dbi:Object:', undef, undef, {
   obj_object => JSON::Any->jsonToObj($json),
   obj_prefix => 'json',
   obj_1st_table_name => 'map_routes',
   obj_table_rename   => {
      'html_instructions' => 'instructions',
      'southwests'        => 'sw_corners',
      'northeasts'        => 'ne_corners',
   },
});
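
Once connected, the flattened tables would then be queried like any other DBI 
source.  A hypothetical follow-on, continuing from the $dbh above (the column 
names are invented purely for illustration):

# Hypothetical: only the map_routes table name comes from the example above.
my $routes = $dbh->selectall_arrayref(
    'SELECT summary, distance FROM map_routes',
    { Slice => {} },
);
printf "%s (%s)\n", $_->{summary}, $_->{distance} for @$routes;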

I'm a stickler for details, so this will probably end up with the full gamut 
of ODBC variables and *_info subs.  (Just processed that for 
DBD::FusionTables::GetInfo; man, that's a mess of 185 different bitmaps...)

As far as the name goes, I was going to use NestedHash, but it works off of both 
arrays and hashes (and looks at other ref types as well), so I figured something 
more generic would work better.  Technically, it would work for any Perl object, 
even a single hash.  Some potential names could be:

DBD::Object
DBD::Tree
DBD::TreeObject
DBD::NestedObject

Which one would work best?

--
Brendan Byrd byr...@insightcom.com
System Integration Analyst (NOC Web Developer)

-Original Message-
From: Tim Bunce [mailto:tim.bu...@pobox.com] 
Sent: Wednesday, September 21, 2011 9:45 AM
To: Byrd, Brendan
Cc: dbi-dev@perl.org
Subject: Re: New NestedHash module needs home

Would it be an actual DBI driver?

Could you post some examples or docs?

I'm uncomfortable with it having such a generic name.
Two words after the DBD would be better.

Tim.

 [...]


Re: DBI drivers by duck-typing

2011-09-21 Thread Greg Sabino Mullane


 I believe that DBI should go away as an actual piece of code and instead be 
 replaced by an API specification document, taking PSGI as inspiration.

I'm having a hard time envisioning how this would work in practice. What I 
see is lots of duplicated code across the DBDs. So a DBI bug would be 
handled by updating the API, bumping the version, and waiting for individual 
DBDs to implement it? A recipe for a large collection of supported API versions 
out in the wild, I would imagine.

 The concept of driver-specific methods, like pg_*, just become ordinary DBD 
 methods that are beyond what is defined by the DBI spec.

Seems a sure recipe for namespace collisions. Also makes it much harder to 
spot any DBMS-specific hacks in your Perl code.

--
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201109211632
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




Re: Add Unicode Support to the DBI

2011-09-21 Thread Greg Sabino Mullane


...
 And maybe that's the default. But I should be able to tell it to be pedantic 
 when the 
 data is known to be bad (see, for example data from an SQL_ASCII-encoded 
 PostgreSQL database).
...
 DBD::Pg's approach is currently broken. Greg is working on fixing it, but for 
 compatibility 
 reasons the fix is non-trivial (and the API might be, too). In a perfect world 
 DBD::Pg would 
 just always do the right thing, as the database tells it what encodings to 
 use when you 
 connect (and *all* data is encoded as such, not just certain data types). But 
 the world is 
 not perfect, there's a lot of legacy stuff.

 Greg, care to add any other details?

My thinking on this has changed a bit. See the DBD::Pg in git head for a 
sample, but basically, 
DBD::Pg is going to:

* Flip the flag on if the client_encoding is UTF-8 (and server_encoding is not 
SQL_ASCII)
* Flip it off if not

The single switch will be pg_unicode_flag, which will basically override the 
automatic choice above, just in case you really want your SQL_ASCII byte soup 
marked as utf8 for some reason, or (more likely) you want your data unmarked 
as utf8 despite being so.

This does rely on PostgreSQL doing the right thing when it comes to 
encoding/decoding/storing 
all the encodings, but I'm pretty sure it's doing well in that regard.
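
For reference, the two settings that rule keys off of can be checked from client 
code with plain SQL.  A sketch of the check, assuming an existing DBD::Pg handle 
in $dbh (inside DBD::Pg itself the equivalent test would of course live in the 
driver):

# Sketch: the same test the driver would apply, done by hand against a $dbh.
my ($client_enc) = $dbh->selectrow_array('SHOW client_encoding');
my ($server_enc) = $dbh->selectrow_array('SHOW server_encoding');

my $flag_on = ($client_enc =~ /\AUTF-?8\z/i) && uc($server_enc) ne 'SQL_ASCII';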

...

Since nobody has actually defined a specific interface yet, let me throw out a 
straw man. It may look familiar :)

===
* $h->{unicode_flag}

If this is set on, data returned from the database is assumed to be UTF-8, and 
the utf8 flag will be set. DBDs will decode the data as needed.

If this is set off, the utf8 flag will never be set, and no decoding will be 
done 
on data coming back from the database.

If this is not set (undefined), the underlying DBD is responsible for doing the 
correct thing. In other words, the behaviour is undefined.
===
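
A rough sketch of how a driver could honour those three states at fetch time 
follows; the apply_unicode_flag helper is hypothetical, not part of DBI or any 
existing DBD:

use Encode ();

# Hypothetical helper: apply the straw-man unicode_flag semantics to one
# fetched row (an array ref of column values).
sub apply_unicode_flag {
    my ($h, $row) = @_;
    my $flag = $h->{unicode_flag};
    return $row unless defined $flag;     # undefined: leave it to the DBD
    if ($flag) {
        for my $val (@$row) {
            next if !defined $val || utf8::is_utf8($val);
            $val = Encode::decode('UTF-8', $val, Encode::FB_CROAK);
        }
    }
    # flag set off: return the octets untouched; the utf8 flag is never set
    return $row;
}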

I don't think this will fit into DBD::Pg's current implementation perfectly, as 
we wouldn't want people to simply leave $h->{unicode_flag} on, as that would 
force SQL_ASCII text to have utf8 flipped on. Perhaps we simply never, ever 
allow that.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201109211651
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




Re: Add Unicode Support to the DBI

2011-09-21 Thread David E. Wheeler
On Sep 21, 2011, at 1:52 PM, Greg Sabino Mullane wrote:

 Since nobody has actually defined a specific interface yet, let me throw out a 
 straw man. It may look familiar :)
 
 ===
 * $h->{unicode_flag}
 
 If this is set on, data returned from the database is assumed to be UTF-8, 
 and 
 the utf8 flag will be set.

I assume you also mean to say that data sent *to* the database has the flag 
turned off, yes?

 DBDs will decode the data as needed.

I don't understand this sentence. If the flag is flipped, why will it decode?

 If this is set off, the utf8 flag will never be set, and no decoding will be 
 done 
 on data coming back from the database.

What if the data coming back from the database is Big5 and I want to decode it?

 If this is not set (undefined), the underlying DBD is responsible for doing 
 the 
 correct thing. In other words, the behaviour is undefined.
 ===
 
 I don't think this will fit into DBD::Pg's current implementation perfectly, as 
 we wouldn't want people to simply leave $h->{unicode_flag} on, as that would 
 force SQL_ASCII text to have utf8 flipped on. Perhaps we simply never, ever 
 allow that.

You mean never allow it to be flipped when the database encoding is SQL_ASCII?

Best,

David



Re: DBI drivers by duck-typing

2011-09-21 Thread Darren Duncan

Greg Sabino Mullane wrote:
  I believe that DBI should go away as an actual piece of code and instead be 
  replaced by an API specification document, taking PSGI as inspiration.

 I'm having a hard time envisioning how this would work in practice. What I 
 see is lots of duplicated code across the DBDs. So a DBI bug would be 
 handled by updating the API, bumping the version, and waiting for individual 
 DBDs to implement it? A recipe for a large collection of supported API versions 
 out in the wild, I would imagine.

I envision that DBDs will tend to still have a shared dependency with each other, 
as they do now, with that shared code essentially being what DBI is now.  So when 
the API is updated to handle such a DBI bug, generally only the shared 
dependency would need to be updated, same as now.

The main difference in my proposal is that this code sharing is no longer 
*mandatory* to use the API.

  The concept of driver-specific methods, like pg_*, just become ordinary DBD 
  methods that are beyond what is defined by the DBI spec.

 Seems a sure recipe for namespace collisions. Also makes it much harder to 
 spot any DBMS-specific hacks in your Perl code.

You can still use namespaces under my proposal.  The difference is that it is no 
longer mandatory for the shared code to be updated for a new DBMS to be supported.

As for DBMS-specific hacks, those still abound today, because every DBMS has 
unique SQL syntax and unique behaviors for some of the same syntax; DBI has 
never abstracted these away; it just prevents the need for some other kinds of 
DBMS-specific hacks.


-- Darren Duncan


Re: DBI drivers by duck-typing

2011-09-21 Thread David Nicol
On Wed, Sep 21, 2011 at 7:27 PM, Darren Duncan wrote:
 As for DBMS-specific hacks [...]

Another possible approach would be a strict interface that only allows
some kind of "DBI creole" -- well, I suppose a lot of other
persistence frameworks are that, pretty much.