Hi Daniel,

On 10.11.2014 20:31, Herzig, Daniel (AIFB) wrote:
>
> Problem #1: export of RDF fails with "Data for a subobject of PageA
> cannot be added to PageB"

Yes, I encountered the same problem recently. The problem is the same 
when using Special:ExportRDF and possibly also when trying to use a 
SPARQL store (though I am not sure about this). It occurs whenever PageB 
redirects to PageA, and PageA has subobjects (including #ask queries).

The problem stems from SMW's handling of redirects. In query answering 
etc., SMW follows redirects: when you use a redirect page in any request 
or query, you get the answer as if you had used the redirect target 
instead. There are some places where this can be switched off, but the 
SMWStore method getSemanticData() in particular does not support this. 
This means that any attempt to get the semantic data (the stuff shown in 
the Factbox) for a redirect page will return the data for the redirect 
target.

You can see this effect when using Special:Browse on a redirect page. 
Notably, however, this does not cause any exception, so something else 
must be going wrong in the RDF export. My guess is that the call to 
getSubSemanticData() in SMW_ExportController.php(128) generally throws 
an exception for SemanticData objects that have been created from 
redirects. Special:Browse would not run into this because the data for 
the subobjects is not expanded there (you only get links to show the 
subobject data, and following these links triggers new requests that 
are not based on redirects).

The exception itself occurs because a subobject of PageA is added to the 
data for PageB, which should not happen. In the redirect case, the 
semantic data returned needs to use the redirect target as its subject 
if it is to carry the subobjects. This should be changed in the SQL 
store method. I am pretty confident that this will fix the exception.
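
To make the mechanism concrete, here is a toy model in Python. It is only 
a sketch: the class, the function, and the subobject name "_ask1" are 
made up for illustration and are not SMW's PHP API, and subobjects are 
loaded eagerly here for brevity. It shows the failing case (requested 
page kept as subject) next to the proposed fix (redirect target used as 
subject).

```python
# Toy model only: hypothetical stand-ins, not SMW's actual PHP classes.

class SemanticData:
    def __init__(self, subject):
        self.subject = subject      # page name this container describes
        self.subobjects = {}        # subobject name -> SemanticData

    def add_sub_semantic_data(self, owner, name, data):
        # Mirrors the check behind the reported exception: a subobject may
        # only be attached to the container of the page that owns it.
        if owner != self.subject:
            raise ValueError(
                f"Data for a subobject of {owner} cannot be added to {self.subject}"
            )
        self.subobjects[name] = data


REDIRECTS = {"PageB": "PageA"}      # redirect page -> target page
SUBOBJECTS = {"PageA": ["_ask1"]}   # page -> names of its subobjects


def get_semantic_data(page, subject_is_target):
    target = REDIRECTS.get(page, page)  # the store always follows redirects
    # Failing variant: keep the requested page as subject.
    # Proposed fix: use the redirect target as subject.
    data = SemanticData(target if subject_is_target else page)
    for name in SUBOBJECTS.get(target, []):
        data.add_sub_semantic_data(target, name, SemanticData(f"{target}#{name}"))
    return data
```

Calling get_semantic_data("PageB", False) raises the error from the 
report, while get_semantic_data("PageB", True) returns a container whose 
subject is PageA and which holds PageA's subobject.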

This fix is not a complete solution for RDF, since it means that every 
redirect page exports exactly the same data as its target page. So the 
dump will contain duplicate triples, and it will lack any information 
about redirects (they will not even be mentioned). A better approach 
would be to recognize that we are dealing with a redirect page and to 
export it as such (in particular, without calling the store's 
getSemanticData()). The right place to do this would be the method 
getSemanticData() in SMW_ExportController.php(294). It could check 
whether a page is a redirect and then create a suitable SemanticData 
response manually for this case.
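
As a rough sketch of that idea, again in Python rather than SMW's PHP: 
the export side checks for a redirect first and builds a minimal record 
by hand instead of asking the store. All names here are hypothetical, 
including the "redirects_to" label and the sample property value.

```python
# Hypothetical sketch; none of these names are SMW's real API.

REDIRECTS = {"PageB": "PageA"}                  # redirect page -> target
FACTS = {"PageA": {"Has population": "1000"}}   # made-up sample data


def load_from_store(page):
    # Stand-in for the store, which always follows redirects.
    target = REDIRECTS.get(page, page)
    return {"subject": target, "properties": dict(FACTS.get(target, {}))}


def get_export_data(page):
    if page in REDIRECTS:
        # Redirect page: do not ask the store at all. Export only the
        # fact that this page redirects to its target.
        return {"subject": page,
                "properties": {"redirects_to": REDIRECTS[page]}}
    return load_from_store(page)
```

With this, the redirect page is exported once as a redirect, and the 
target's data appears in the dump only under the target itself.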

Another option would be to extend the store so that redirect resolution 
can be turned off for getSemanticData() via some parameter. Care is 
needed to make sure that the caching does not get confused. This seems 
easier if the subject is correctly converted to the redirect target: 
requests with and without redirect resolution will then use distinct 
subjects, and these should be the basis for caching.

Cheers,

Markus



> ----------------------------------------------------------
> When running SemanticMediaWiki/maintenance/dumpRDF.php following
> exception occurs, stack below [2]: 'Data for a subobject of PageA cannot
> be added to PageB'
>
> PageB is a redirect to PageA. PageB used lots of templates with semantic
> data. When the information on PageB became outdated, a redirect was
> inserted to point to the newer page PageA. PageA also uses lots of
> templates with subobjects.
>
> Deleting page PageB solves the issue. However, there are thousands of
> redirects in this wiki and they can't be deleted.
> Unfortunately, I cannot reproduce this issue in my test environment
> (same SMW and MW versions) and I have no developer access to the system
> having this problem.
>
> Did anybody encounter this issue too? Any ideas how to fix this?
>
> My assumption was that the reason for this bug lies in inconsistent
> semantic data. Maybe because some refreshJobs aren't done yet.
> When trying to debug redirects and data refresh jobs in my test
> environment, I encountered the problem with job loops and redirects,
> which might be connected:
>
>
> Problem #2: Infinite jobqueue runs
> ----------------------------------------------------------
> There was a discussion on infinite jobqueue runs recently on this list.
> However, I found an easier way to reproduce the problem.
> When inserting a redirect into a page that existed before and had
> semantic data on it, two SMW_UpdateJobs are inserted into the queue.
> When executed, these jobs spawn identical jobs, which causes the queue
> to loop forever.
> I can reproduce this behavior with the following procedure:
>
> * Create a page and insert a subobject. Save the page.
> * Edit the page again and insert a redirect, i.e. #REDIRECT
> [[AnotherPage]].  Save the page.
> ** The table 'smw_fpt_redi' has no entry for this page. I think there
> should be one, right?
> ** The subobject and its properties are still in table 'smw_object_ids'
> and the page is not marked as redirect.
> ** Now two new SMW_UpdateJobs are in the queue (one for origin page, one
> for target page of the redirect). When executed, these jobs spawn
> identical jobs again -> loops forever
> ** The RDF dump still produces the page with its subobject.
>
> Running "truncate smw_object_ids, smw_prop_stats;" followed by
> rebuildData.php shows the page as a redirect in smw_object_ids. Just
> running rebuildData.php alone does not help.
>
> I assume the solution would be to detect that a redirect has been
> inserted into a page that has semantic data and to remove its semantic
> data properly.
> Unfortunately, I haven't found a way to do this.
>
> Any hints and suggestions are appreciated!
>
>
>
> Thanks!
> Daniel
>
>
>
>
> [1]
> MediaWiki 1.22.7
> Semantic MediaWiki 2.0
>
>
> [2]
>
> Exception from line 601 of
> www/htdocs/extensions/SemanticMediaWiki/includes/SemanticData.php: Data
> for a subobject of PageA cannot be added to PageB.
> Backtrace:
> #0
> www/htdocs/extensions/SemanticMediaWiki/includes/storage/SQLStore/SMW_Sql3StubSemanticData.php(320):
> SMW\SemanticData->addSubSemanticData(SMWSql3StubSemanticData)
> #1
> www/htdocs/extensions/SemanticMediaWiki/includes/storage/SQLStore/SMW_Sql3StubSemanticData.php(171):
> SMWSql3StubSemanticData->addSubSemanticDataToInternalCache(SMW\DIProperty)
> #2
> www/htdocs/extensions/SemanticMediaWiki/includes/export/SMW_ExportController.php(128):
> SMWSql3StubSemanticData->getSubSemanticData()
> #3
> www/htdocs/extensions/SemanticMediaWiki/includes/export/SMW_ExportController.php(452):
> SMWExportController->serializePage(SMW\DIWikiPage, integer)
> #4
> www/htdocs/extensions/SemanticMediaWiki/includes/export/SMW_ExportController.php(422):
> SMWExportController->printAll(boolean, integer, integer)
> #5 www/htdocs/extensions/SemanticMediaWiki/maintenance/dumpRDF.php(158):
> SMWExportController->printAllToOutput(boolean, integer, integer)
> #6 www/htdocs/extensions/SemanticMediaWiki/maintenance/dumpRDF.php(92):
> SMW\Maintenance\DumpRdf->generateRdfToChannel()
> #7 www/htdocs/maintenance/doMaintenance.php(113):
> SMW\Maintenance\DumpRdf->execute()
> #8 www/htdocs/extensions/SemanticMediaWiki/maintenance/dumpRDF.php(164):
> require_once(string)
> #9 {main}
>
>
>
> ------------------------------------------------------------------------------
> Comprehensive Server Monitoring with Site24x7.
> Monitor 10 servers for $9/Month.
> Get alerted through email, SMS, voice calls or mobile push notifications.
> Take corrective actions from your mobile device.
> http://pubads.g.doubleclick.net/gampad/clk?id=154624111&iu=/4140/ostg.clktrk
>
>
>
> _______________________________________________
> Semediawiki-devel mailing list
> Semediawiki-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>

