Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-15 Thread Roan Kattouw
Tim Starling schreef:
> Roan Kattouw wrote:
>   
>> Just some basic comments, I'm sure Brion has more.
>> 
>
> You should probably send them to CodeReview these days.
>   
Yeah, I know, but since I wanted to reply to code and the reply was 
quite sizable, I thought I'd do it the old-fashioned way.
> SQLite doesn't have a built-in full-text search engine either, and
> PostgreSQL only has one starting in 8.3. The schema I'll use for SQLite is
> pretty much the same. It keeps the scripts happy. They'll use
> SearchEngineDummy by default and so the table won't be populated. Lucene
> can be used instead.
>   
Ah, so it's a dummy table. That explains.

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Chad
On Wed, Jan 14, 2009 at 6:27 PM, Roan Kattouw  wrote:
[snip]
> I knew about those, yes, but I didn't know LIMIT/OFFSET was
> non-standard, even though it seems to be the more widely used variant.
> Of course if offset handling isn't implemented at all, that's something
> to worry about.
[/snip]

MSSQL does something different too. The following bit of
MySQL:

SELECT * FROM tbl LIMIT 5;

Would become the following in MSSQL:

SELECT TOP 5 FROM tbl;

Was definitely a headache on more than one occasion, being
so used to typing limit...

-Chad
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Tim Starling
Roan Kattouw wrote:
> Just some basic comments, I'm sure Brion has more.

You should probably send them to CodeReview these days.

>> +/**
>> + * This represents a column in a DB2 database
>> + * @ingroup Database
>> + */
>> +class IBM_DB2Field {
[...]
>>   
> Why do you need this? The other Database backends don't have it.

Actually SQLite is going to have one after I commit my working copy. And
MySQL has had one for a while.

>> +/**
>> + * USE INDEX clause
>> + * DB2 doesn't have them and returns ""
>> + * @param sting $index
>> + */
>> +public function useIndexClause( $index ) {
>> +return "";
>> +}
>>   
> What do you mean DB2 "doesn't have them"? FORCE INDEX isn't supported in 
> DB2? Then unless its index choosing algorithm is extremely good, it 
> won't be able to run certain queries with satisfactory efficiency.

SQLite doesn't have them either. It's not an issue.

>> +-- should be replaced with OmniFind, Contains(), etc
>> +CREATE TABLE searchindex (
>> +  si_page int NOT NULL,
>> +  si_title varchar(255) NOT NULL default '',
>> +  si_text clob NOT NULL
>> +);
>>   
> Don't you need some index on this table to enable efficient searching?

SQLite doesn't have a built-in full-text search engine either, and
PostgreSQL only has one starting in 8.3. The schema I'll use for SQLite is
pretty much the same. It keeps the scripts happy. They'll use
SearchEngineDummy by default and so the table won't be populated. Lucene
can be used instead.

-- Tim Starling


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Aryeh Gregor
On Wed, Jan 14, 2009 at 6:27 PM, Roan Kattouw  wrote:
> I knew about those, yes, but I didn't know LIMIT/OFFSET was
> non-standard, even though it seems to be the more widely used variant.
> Of course if offset handling isn't implemented at all, that's something
> to worry about.

You can always emulate it by rewriting

SELECT * FROM foo ORDER BY bar LIMIT m, n

as

SELECT * FROM (SELECT * FROM foo ORDER BY bar LIMIT m+n) ORDER BY bar
DESC LIMIT n

or whatever (replacing LIMIT as appropriate).  Other major DBMSes tend
to implement subqueries more efficiently than MySQL too, AFAIK.

> Ah, didn't know pgsql did those things too. I guess it's alright then,
> as long as performance stays acceptable.

Well, on small sites database performance isn't a big issue.  On large
sites, well, hopefully they'll have a DBA who can optimize it and give
us patches.  :)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Roan Kattouw
Aryeh Gregor schreef:
> On Wed, Jan 14, 2009 at 6:10 PM, Roan Kattouw  wrote:
>   
>> So DB2 renames LIMIT $n to something else and doesn't even implement
>> offset handling, even though both are in the SQL specification?
>> 
>
> LIMIT/OFFSET are completely nonstandard.  See, e.g., this Google
> result confirms this:
>
> http://troels.arvin.dk/db/rdbms/#select-limit
>
> FETCH FIRST n ROWS ONLY is the standard way, not LIMIT.  A lot of
> stuff that you take for granted in MySQL is completely nonstandard --
> REPLACE INTO... and INSERT IGNORE INTO... are two other major
> examples.
>   
I knew about those, yes, but I didn't know LIMIT/OFFSET was 
non-standard, even though it seems to be the more widely used variant. 
Of course if offset handling isn't implemented at all, that's something 
to worry about.
> As a general rule, SQL seems to be such a terrible, slowly developed,
> impractical standard that a typical app doesn't implement most of what
> it says to do and implements a metric ton of things that it doesn't
> mention.  Which is why we need such a complicated abstraction layer in
> the first place.  Ideally we could just write in standard SQL with the
> occasional minor workaround, the way we write standard HTML and CSS
> and so on, but real life doesn't seem to work that way in this case.
>
>   
>> What do you mean DB2 "doesn't have them"? FORCE INDEX isn't supported in
>> DB2? Then unless its index choosing algorithm is extremely good, it
>> won't be able to run certain queries with satisfactory efficiency.
>> 
>
> My impression is that MySQL is completely stupid when it comes to
> choosing indexes.
It's pretty stupid at times, yes.
>   Most other serious databases are much smarter and
> don't need babysitting.  Our pgsql wrapper also doesn't implement
> this.  I don't think anything does other than DatabaseMysql.
>
> Index behavior is very database-dependent -- even if other DBs did
> implement forcing a particular index, they probably wouldn't need to
> force the same indexes in the same cases.  They might not even *have*
> the same indexes.  I know our pgsql schema doesn't have the same
> indexes as MySQL in most cases: it often can make do with fewer,
> because of its ability to do things like take intersections of
> multiple indexes efficiently, or even retrieve by one index and order
> by another.
>
>   
>> You shouldn't rename indices like that, because index names are used in
>> FORCE INDEX clauses (oh wait, but they weren't supported, right?)
>> 
>
> Right.  pgsql also renames indexes (because as noted, it doesn't use
> the same ones).  I'd imagine the same is true for the other DBs we
> support or semi-support.
Ah, didn't know pgsql did those things too. I guess it's alright then, 
as long as performance stays acceptable.

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Aryeh Gregor
On Wed, Jan 14, 2009 at 6:10 PM, Roan Kattouw  wrote:
> So DB2 renames LIMIT $n to something else and doesn't even implement
> offset handling, even though both are in the SQL specification?

LIMIT/OFFSET are completely nonstandard.  See, e.g., this Google
result confirms this:

http://troels.arvin.dk/db/rdbms/#select-limit

FETCH FIRST n ROWS ONLY is the standard way, not LIMIT.  A lot of
stuff that you take for granted in MySQL is completely nonstandard --
REPLACE INTO... and INSERT IGNORE INTO... are two other major
examples.

As a general rule, SQL seems to be such a terrible, slowly developed,
impractical standard that a typical app doesn't implement most of what
it says to do and implements a metric ton of things that it doesn't
mention.  Which is why we need such a complicated abstraction layer in
the first place.  Ideally we could just write in standard SQL with the
occasional minor workaround, the way we write standard HTML and CSS
and so on, but real life doesn't seem to work that way in this case.

> What do you mean DB2 "doesn't have them"? FORCE INDEX isn't supported in
> DB2? Then unless its index choosing algorithm is extremely good, it
> won't be able to run certain queries with satisfactory efficiency.

My impression is that MySQL is completely stupid when it comes to
choosing indexes.  Most other serious databases are much smarter and
don't need babysitting.  Our pgsql wrapper also doesn't implement
this.  I don't think anything does other than DatabaseMysql.

Index behavior is very database-dependent -- even if other DBs did
implement forcing a particular index, they probably wouldn't need to
force the same indexes in the same cases.  They might not even *have*
the same indexes.  I know our pgsql schema doesn't have the same
indexes as MySQL in most cases: it often can make do with fewer,
because of its ability to do things like take intersections of
multiple indexes efficiently, or even retrieve by one index and order
by another.

> You shouldn't rename indices like that, because index names are used in
> FORCE INDEX clauses (oh wait, but they weren't supported, right?)

Right.  pgsql also renames indexes (because as noted, it doesn't use
the same ones).  I'd imagine the same is true for the other DBs we
support or semi-support.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [MediaWiki-CVS] SVN: [45755] trunk/phase3 (IBM DB2 database layer)

2009-01-14 Thread Roan Kattouw
Just some basic comments, I'm sure Brion has more.

leo...@svn.wikimedia.org schreef:
> Revision: 45755
> Author:   leonsp
> Date: 2009-01-14 22:20:15 + (Wed, 14 Jan 2009)
>
> Log Message:
> ---
> (bug 17028) Added support for IBM DB2 database. config/index.php has new 
> interface elements that only show up if PHP has ibm_db2 module enabled. 
> AutoLoader knows about the new DB2 classes. GlobalFunctions has a new 
> constant for DB2 time format. Revision class fixed slightly. Also includes 
> new PHP files containing the Database and Search API implementations for IBM 
> DB2.
>
> [...]
> Modified: trunk/phase3/includes/Revision.php
> ===
> --- trunk/phase3/includes/Revision.php2009-01-14 22:15:50 UTC (rev 
> 45754)
> +++ trunk/phase3/includes/Revision.php2009-01-14 22:20:15 UTC (rev 
> 45755)
> @@ -961,6 +961,10 @@
>*/
>   static function getTimestampFromId( $title, $id ) {
>   $dbr = wfGetDB( DB_SLAVE );
> + // Casting fix for DB2
> + if ($id == '') {
> + $id = 0;
> + }
>   
You should probably use intval($id) here.
>  [...]
>
> Added: trunk/phase3/includes/SearchIBM_DB2.php
> ===
> --- trunk/phase3/includes/SearchIBM_DB2.php   (rev 0)
> +++ trunk/phase3/includes/SearchIBM_DB2.php   2009-01-14 22:20:15 UTC (rev 
> 45755)
> @@ -0,0 +1,247 @@
> + +# Copyright (C) 2004 Brion Vibber 
>   
If you wrote this file, you should attribute yourself.
>
> Added: trunk/phase3/includes/db/DatabaseIbm_db2.php
> ===
> --- trunk/phase3/includes/db/DatabaseIbm_db2.php  
> (rev 0)
> +++ trunk/phase3/includes/db/DatabaseIbm_db2.php  2009-01-14 22:20:15 UTC 
> (rev 45755)
>
> +/**
> + * Utility class for generating blank objects
> + * Intended as an equivalent to {} in Javascript
> + * @ingroup Database
> + */
> +class BlankObject {
> +}
>   
Just use $obj = new stdClass; here.
> +
> +/**
> + * This represents a column in a DB2 database
> + * @ingroup Database
> + */
> +class IBM_DB2Field {
> + private $name, $tablename, $type, $nullable, $max_length;
> +
> + /**
> +  * Builder method for the class 
> +  * @param Object $db Database interface
> +  * @param string $table table name
> +  * @param string $field column name
> +  * @return IBM_DB2Field
> +  */
> + static function fromText($db, $table, $field) {
> + [...]
> + }
> + /**
> +  * Get column name
> +  * @return string column name
> +  */
> + function name() { return $this->name; }
> + /**
> +  * Get table name
> +  * @return string table name
> +  */
> + function tableName() { return $this->tablename; }
> + /**
> +  * Get column type
> +  * @return string column type
> +  */
> + function type() { return $this->type; }
> + /**
> +  * Can column be null?
> +  * @return bool true or false
> +  */
> + function nullable() { return $this->nullable; }
> + /**
> +  * How much can you fit in the column per row?
> +  * @return int length
> +  */
> + function maxLength() { return $this->max_length; }
> +}
>   
Why do you need this? The other Database backends don't have it.
> +
> +/**
> + * Wrapper around binary large objects
> + * @ingroup Database
> + */
> +class IBM_DB2Blob {
> + private $mData;
> +
> + function __construct($data) {
> + $this->mData = $data;
> + }
> +
> + function getData() {
> + return $this->mData;
> + }
> +}
>   
Why do you need these wrapper objects?
> [...]
> + public function is_numeric_type( $type ) {
> + switch (strtoupper($type)) {
> + case 'SMALLINT':
> + case 'INTEGER':
> + case 'INT':
> + case 'BIGINT':
> + case 'DECIMAL':
> + case 'REAL':
> + case 'DOUBLE':
> + case 'DECFLOAT':
> + return true;
> + }
> + return false;
> + }
>   
Indentation looks wrong here.
> + /**
> +  * Construct a LIMIT query with optional offset
> +  * This is used for query pages
> +  * $sql string SQL query we will append the limit too
> +  * $limit integer the SQL limit
> +  * $offset integer the SQL offset (default false)
> +  */
> + public function limitResult($sql, $limit, $offset=false) {
> + if( !is_numeric($limit) ) {
> + throw new DBUnexpectedError( $this, "Invalid 
> non-numeric limit passed to limitResult()\n" );
> + }
> + if( $offset ) {
> + wfDebug("Offset parameter not supported in 
> limitResult()\n");
> + }
> + // TODO implement