TSV Output may be invalid but TSV Input reads it fine
-----------------------------------------------------
Key: JENA-198
URL: https://issues.apache.org/jira/browse/JENA-198
Project: Jena
Issue Type: Bug
Components: ARQ
Affects Versions: ARQ 2.9.1
Environment: Any
Reporter: Rob Vesse
Priority: Minor
Fix For: ARQ 2.9.1
Attachments: TSVIllegalInput.patch, TSVIllegalOutput.patch
I noticed today that TSVOutput may produce output that contains prefixed names,
which is invalid per my reading of the relevant specification:
http://www.w3.org/TR/sparql11-results-csv-tsv/
This happens because TSVOutput calls FmtUtils.stringForNode() with only a Node,
so the ARQ default prefix mapping is used for output.
Attached is a simple patch which fixes the issue. It should also speed up
TSVOutput marginally, as the existing code requires a SerializationContext to
be created for every term serialized and incurs the cost of trying to turn URIs
into prefixed names. Essentially, the patch creates a null SerializationContext
variable and passes it to every call to FmtUtils.stringForNode(), so that the
ARQ default prefix mapping never gets used.
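To illustrate the effect of the output fix, here is a minimal, self-contained
sketch. It does not use Jena's API: TsvTermFormatSketch and formatIri() are
hypothetical stand-ins, and the prefix map is a plain java.util.Map. It only
shows the general idea that a formatter given no prefix mapping has to write
every IRI in full, and it ignores details such as whether the local part would
be a legal prefixed name.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not Jena's FmtUtils.
final class TsvTermFormatSketch {

    /**
     * Formats an IRI as a result term. If a prefix map is supplied and one of
     * its namespaces matches, the IRI is abbreviated to a prefixed name.
     * With a null map, the IRI is always written in full as <IRI>.
     */
    static String formatIri(String iri, Map<String, String> prefixes) {
        if (prefixes != null) {
            for (Map.Entry<String, String> entry : prefixes.entrySet()) {
                String namespace = entry.getValue();
                if (iri.startsWith(namespace)) {
                    return entry.getKey() + ":" + iri.substring(namespace.length());
                }
            }
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        // Assumed example of a default mapping; the real ARQ defaults may differ.
        Map<String, String> defaults = Collections.singletonMap(
                "rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        String type = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

        // With a default mapping the term is abbreviated.
        System.out.println(formatIri(type, defaults));
        // With no mapping the term is written as a full IRI.
        System.out.println(formatIri(type, null));
    }
}
```

Skipping the prefix lookup entirely is also where the marginal speed-up would
come from in a formatter shaped like this one.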
The second part of the issue is that this malformed TSV input may be accepted.
TSVInputIterator uses NodeFactory.parseNode() to parse terms, which calls
SSE.parseNode() without any prefix mapping and thus internally ends up using
the default SSE prefix mapping. As a result, some prefixed names are accepted
as valid when they should be rejected.
The second attached patch fixes this part of the issue by keeping an empty
static prefix map and passing it directly to SSE.parseNode().
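The input side can be sketched the same way. The code below is again a
self-contained illustration rather than Jena code: TsvTermParseSketch,
parseIri() and NO_PREFIXES are hypothetical names. It shows how parsing against
an explicitly empty prefix map causes prefixed names to be rejected, whereas
parsing against a default mapping would silently expand them.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is not Jena's SSE parser.
final class TsvTermParseSketch {

    // Shared empty map, so that no prefixed name can ever be resolved.
    static final Map<String, String> NO_PREFIXES = Collections.emptyMap();

    /**
     * Parses an IRI term. Full IRIs in angle brackets are always accepted.
     * A prefixed name is accepted only if its prefix is in the given map.
     */
    static String parseIri(String term, Map<String, String> prefixes) {
        if (term.length() >= 2 && term.startsWith("<") && term.endsWith(">")) {
            return term.substring(1, term.length() - 1);
        }
        int colon = term.indexOf(':');
        if (colon >= 0) {
            String namespace = prefixes.get(term.substring(0, colon));
            if (namespace != null) {
                return namespace + term.substring(colon + 1);
            }
        }
        throw new IllegalArgumentException("Not a valid IRI term: " + term);
    }

    public static void main(String[] args) {
        // A full IRI parses regardless of the prefix map.
        System.out.println(parseIri("<http://example.org/a>", NO_PREFIXES));
        try {
            parseIri("rdf:type", NO_PREFIXES);
        } catch (IllegalArgumentException e) {
            // With the empty map the prefixed name is rejected.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```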