I'm working on a Subversion interface to MediaWiki and am struggling with the SQL to respond to Subversion's update-report: http://www.mediawiki.org/wiki/WebDAV
MediaWiki's revision table contains unique revision ids and the corresponding page id. The page table contains unique page ids and the corresponding page title. Suversion's update-report intends to get a list of changes between the current state of the working copy and a target revision: often HEAD, MAX(revision.rev_id) By converting update-report entries to an SQL condition, I can select rows of the revision table which come after the current state of the working copy: <S:update-report send-all="true" xmlns:S="svn:"> <S:src-path>http://ket/~jablko/mediawiki/webdav.php</S:src-path> <S:target-revision>27</S:target-revision> <S:entry rev="18"></S:entry> <S:entry rev="20">Test</S:entry> <S:entry rev="20">Main_Page</S:entry> </S:update-report> - becomes: ((page_title = 'Main_Page' OR page_title LIKE 'Main\_Page/%') AND revision.rev_id > '20' OR NOT (page_title = 'Main_Page' OR page_title LIKE 'Main\_Page/%') AND ((page_title = 'Test' OR page_title LIKE 'Test/%') AND revision.rev_id > '20' OR NOT (page_title = 'Test' OR page_title LIKE 'Test/%') AND revision.rev_id > '18')) Using GROUP BY page_id, I get a list of pages which changed between the current state of the working copy and the target revision. My problem: I also need to know if these pages have revisions before <current-state>, or if they are newly created. I don't want to do a second query for revisions of <page-list> before <current-state> because on an initial checkout, <page-list> could be huge, making an enormous SQL query. Instead, I think I should do a LEFT JOIN on another instance of the revision table ("old"), where revisions are before <current-state>. NULL rows in this table correspond to newly created pages: SELECT page_title, MAX(new.rev_id), old.rev_id FROM page JOIN revision AS new LEFT JOIN revision AS old ON new.rev_page = old.rev_page WHERE new.rev_page = page_id AND old.rev_id < new.rev_id AND new.rev_id <= <target-revision> AND <new.rev_id greater than current-state> GROUP BY page_id The problem with this query is that "old.rev_id < new.rev_id" means each row in "old" is less than _a_ row in "new", not necessarily less than the _minimum_ row in "new". I don't want to replace this condition with <old.rev_id less than current-state> because the <current-state> expression can be long and complex; I prefer to evaluate it only once, for <new.rev_id greater than current-state> I tried replacing "old.rev_id < new.rev_id" with "old.rev_id < MIN(new.rev_id)" but got a MySQL error: Invalid use of group function Can anyone recommend how best to query the database for a list of pages which changed since the current state of the working copy, and whether those pages are newly created? Much thanks, Jack
signature.asc
Description: Digital signature