On Sonntag, 17. August 2008, Michael Dale wrote:
> I do sql queries that do selects with lots of results across several
> custom tables. To pull in semantic data I would add the following to
> mediaWikis database abstraction query variables:
> (assuming I have a bunch of mv_index_page results as part of a complex
> join query and wanted property like "spoken_by", or "other properties"
> that I could add conditions on I would add the following (where
> 'spoken_by' is the $prop variable
>
> $vars[] = $prop.object_title as $prop;
> $from_tables.=' ' .
>         LEFT JOIN 'smw_relations'
>                  as $prop
>         ON (mv_mvd_index.mv_page_id= $prop . subject_id
>                 AND
>         $prop . relation_title  = '$prop'
>
>
> This is a LOT more difficult to do with the new database structure....
> Any advice how I would approach this with the new db structure?

Okay, let's see. It's surely not so obvious any more, since the new DB is not 
optimised for joins with MW page IDs at all.


The most straightforward solution (with the most complex query as a result) is 
to add joins with further tables as needed. smw_rels2 has a triple structure 
s_id, p_id, o_id, and instead of using direct property string names or MW 
page ids, these ids now can only be resolved using the table smw_ids. The 
latter realises a mapping between the store's ids and MediaWiki page names, 
namespaces, and possibly interwiki names. No MediaWiki page id is tracked 
there, so if your mv_mvd_index table only holds ids, then you will need 
another join with page to get from names to ids. Probably it would be faster 
to hold titles+namespaces in mv_mvd_index right away.

This is the canonical modification to use a DB query as before, and again it 
depends completely on the SMW internal DB layout. It has many more joins, but 
things might be not too bad, given that all joins with smw_ids join on a 
single numeric primary key.


Another option should be considered as well: if the number of objects in 
either table is typically very small (e.g. I am not sure how few objects you 
have in mv_mvd_index), then one can realise the left join in PHP, iterating 
through the result. This is never as fast as a direct SQL query, but it is 
the solution that is most robust and requires least maintenance, since one 
could use SMW API calls to do the individual checks, yielding complete 
storage abstraction.


None of these options are really convincing to me. I think a better way to go 
would be to move the relevant data (the data that is used to build 
mv_mvd_index) into SMW, storing it there with the help of (predefined) SMW 
properties. This must work somehow, since SMW is fairly universal in its data 
model. The work then would be in extending the storage methods to make sure 
SMW contains all relevant data. After this, the query part would no longer be 
direct SQL, but merely an SMW query to the (current) SMW store, whatever it 
may be. 

It depends on the details of the data needed for mv_mvd_index whether or not 
this would work well. I have to see more details here.


Finally, it would also be possible to move the construction of the above left 
join into the SMW storage implementation. It is possible to subclass SMW 
stores to extend/overwrite any of their functions, and such a new store can 
easily be shipped and set by extensions. This would enable the tightest 
integration and it is certainly the preferred way when private SQL queries 
are really unavoidable. It still is easier to maintain because the extension 
can presume a certain SMW store to be used as a basis for its customisation, 
and it is clear that users cannot simply change the store without breaking 
something. Also, minor changes in SMW's storage implementation are simply 
inherited by the subclassed store. For instance, one could overwrite/extend 
the initialisation and update methods of SMW to ensure that all data 
for "special" properties (like spoken_by) is duplicated into another internal 
table that is easier to use in the above SQL statement. See the current Halo 
1.2 release for an example of subclassing another store (including error 
messages when an admin tries to use another unsupported store).


There is yet another solution that requires work on our side: we could extend 
the query API of SMW stores so that they are able to put query results into a 
temporary DB table (returning its name), which might then be used for further 
joins. Optionally this table could use MW page ids or title/namespace pairs. 
This solution would be good as a general path for SQL-level extensions of 
SMW, but it might force some inefficient behaviour for some SMW stores. It 
also precludes some join-order-optimisations that could be possible when 
everything is done by a single query.

Best regards,

Markus


>
> peace,
> --michael
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's
> challenge Build the coolest Linux based applications with Moblin SDK & win
> great prizes Grand prize is a trip for two to an Open Source event anywhere
> in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel



-- 
Markus Krötzsch
Semantic MediaWiki    http://semantic-mediawiki.org
http://korrekt.org    [EMAIL PROTECTED]

Attachment: signature.asc
Description: This is a digitally signed message part.

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel

Reply via email to