On Wed, Feb 28, 2018 at 5:07 PM, Jaime Crespo <jcre...@wikimedia.org> wrote:
> On Wed, Feb 28, 2018 at 5:26 PM, Brad Jorsch (Anomie) <bjor...@wikimedia.org
>> wrote:
>
>> On Wed, Feb 28, 2018 at 8:47 AM, Jaime Crespo <jcre...@wikimedia.org>
>> wrote:
>>
>> > Very recently I have been experimenting with recursive Common Table
>> > Expressions [2], which are or will be available on the latest versions of
>> > MySQL and MariaDB.
>> >
>>
>> Do the other databases MediaWiki tries to support have that feature?
>>
>
> Actually, MySQL/MariaDB is the *last* database to conform to the sql:1999
> WITH standard: https://modern-sql.com/feature/with#compatibility Even
> sqlite suported a limited set of those!
>
> The good news is that, probably because it arrived last, it got a pretty
> feature-full implementation:
> https://twitter.com/MarkusWinand/status/852862475699707904
>
>
>>
>> > With a single query on can obtain all titles directly or indirectly in a
>> > category:
>> >
>> > WITH RECURSIVE cte (cl_from, cl_type) AS
>> > (
>> >     SELECT cl_from, cl_type FROM categorylinks WHERE cl_to =
>> > 'Database_management_systems' -- starting category
>> >     UNION
>> >     SELECT categorylinks.cl_from, categorylinks.cl_type FROM cte JOIN
>> page
>> > ON
>> > cl_from = page_id JOIN categorylinks ON page_title = cl_to WHERE
>> > cte.cl_type
>> > = 'subcat' -- subcat addition on each iteration
>> > )
>> > SELECT page_title FROM cte JOIN page ON cl_from = page_id WHERE
>> > page_namespace = 0 ORDER BY page_title; -- printing only articles in the
>> > end
>> > , ordered by title
>> >
>>
>> Does that work efficiently on huge categories, or does it wind up fetching
>> millions of rows and filesorting?
>
>
> Needs more testing-- that query worked for me well enough on my laptop to
> expose it directly on a webrequest (<0.1 s), but I only imported the
> categorylinks and page tables, so I was working with memory. Obviously, the
> more complex the query, and the more results it returns, the less likely it
> is to be able to be exposed to, e.g. a public API. But at least there are
> configurable limits on recursivitiy and max execution time.
>
> Honestly, given it is a new feature, I don't expect mediawiki --which at
> the moment has to support 5.5- to embrace it any time soon. However, I
> wanted to ask if it was interesting enough to setup some test hosts for
> mediawiki to "play" with it --e.g. evaluate performance--, and maybe (?)
> some upgraded mariadb/mysql servers for WMF labsdb (for long-running
> analytics or gadgets that generates reports).
>

I certainly think labsdb users would welcome this feature. Traversing
category trees (Or sometimes even the graph of page links) is
certainly a common thing that people want to do.

--
Brian

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to