[ https://issues.apache.org/jira/browse/JENA-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244037#comment-17244037 ]
ASF subversion and git services commented on JENA-2006:
-------------------------------------------------------

Commit 19b4ed3d8a6e39573a78d5a0c39aeb3b3f95fe7b in jena's branch refs/heads/master from Andy Seaborne
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=19b4ed3 ]

JENA-2006: DatasetGraph prefixes

> Dataset prefixes
> ----------------
>
> Key: JENA-2006
> URL: https://issues.apache.org/jira/browse/JENA-2006
> Project: Apache Jena
> Issue Type: Improvement
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Priority: Major
> Fix For: Jena 3.18.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Summary:
> Add API calls:
> {{DatasetGraph.prefixes()}} -> {{PrefixMap}}
> {{Dataset.getPrefixMapping()}} -> {{PrefixMapping}}
> Rework internal implementation code to reflect this.
> Clean up the different handling of prefixes; switch to a consistent provision
> of a dataset prefix map. Remove {{DatasetPrefixStorage}} (multiple prefix
> maps per dataset).
>
> My first attempt at this work was to use {{DatasetPrefixStorage}}
> consistently, but it ended up as a lot of classes mirroring {{PrefixMap}}
> implementations. Because input formats only have prefixes per dataset, not
> per individual graph, the extra feature of multiple prefix maps can only be
> used through the API, and it just doesn't seem worth the effort and extra
> code. It was quicker doing the final form - "one prefix map per dataset" -
> than the more complicated form.
>
> More details:
> "TDB" means both TDB1 and TDB2.
>
> The main use case for prefixes is that they are set as part of data parsing
> and used on output to abbreviate URIs.
>
> For output, we know that URI->prefixed name is a performance-critical
> operation. It is optimized in {{PrefixMapStd}}. This does not change. The
> writers copy prefixes into a {{PrefixMapStd}}, which has a fast path for the
> common case of a split at the last "/" or "#" and a reverse map from URI to
> prefix.
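
The fast path described above can be sketched roughly as follows. This is only an illustration of the idea (split at the last "/" or "#", then look the namespace up in a reverse map), not Jena's actual {{PrefixMapStd}} code. The class name `PrefixAbbrevSketch`, its methods, and the longest-match slow path are assumptions made for this example, and it does not check that the local part is legal in any particular output syntax, which a real writer would need to do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; NOT Jena's PrefixMapStd.
class PrefixAbbrevSketch {
    // Forward map: prefix -> namespace URI.
    private final Map<String, String> prefixToUri = new HashMap<>();
    // Reverse map: namespace URI -> prefix, used by the fast path.
    private final Map<String, String> uriToPrefix = new HashMap<>();

    void add(String prefix, String uri) {
        String old = prefixToUri.put(prefix, uri);
        // If the prefix was rebound, drop the stale reverse entry.
        if (old != null && prefix.equals(uriToPrefix.get(old))) {
            uriToPrefix.remove(old);
        }
        uriToPrefix.put(uri, prefix);
    }

    // Returns "prefix:local", or null if no prefix applies.
    String abbreviate(String uri) {
        // Fast path: split after the last '/' or '#', one reverse-map lookup.
        int split = Math.max(uri.lastIndexOf('/'), uri.lastIndexOf('#'));
        if (split >= 0) {
            String prefix = uriToPrefix.get(uri.substring(0, split + 1));
            if (prefix != null) {
                return prefix + ":" + uri.substring(split + 1);
            }
        }
        // Slow path (assumed for this sketch): scan for the longest
        // namespace that the URI starts with.
        String bestPrefix = null;
        String bestNs = "";
        for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
            String ns = e.getValue();
            if (uri.startsWith(ns) && ns.length() > bestNs.length()) {
                bestPrefix = e.getKey();
                bestNs = ns;
            }
        }
        return bestPrefix == null ? null : bestPrefix + ":" + uri.substring(bestNs.length());
    }
}
```

In the common case the cost is two `lastIndexOf` scans and one hash lookup, independent of how many prefixes are registered; only URIs whose namespace does not end at the last "/" or "#" fall through to the linear scan.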
> Mostly, up to now, the implementation has been "store the prefixes in the
> default graph", and while TDB stores multiple sets of prefixes for each
> dataset, so that there is the possibility of graphs in the same dataset
> having different prefixes, it used the default graph as well. Output has
> never made use of multiple prefixes per dataset.
>
> The {{PrefixMapping}} API presumes a reverse mapping, and the API contract is
> part of the Model API (Model extends PrefixMapping). The other odd feature of
> {{PrefixMapping}} is that there is no direct access to the prefixes as a map,
> only as a copy.
>
> {{PrefixMap}} is simpler, with the needs of parsers and storage
> implementations in mind.
>
> The idea is that {{PrefixMapping}} is to be considered part of the
> Dataset/Model/Statement/Resource APIs. There is a legacy quirk that Graph has
> "getPrefixMapping".
>
> There will be adapters between the two viewpoints. Aside from the implicit
> contract of {{PrefixMapping}} following XML qname rules, while Turtle is less
> restrictive, the functionality can be mapped both ways.
>
> Mostly, the XML-rules contract has been moved into the writers themselves in
> previous iterations of implementation improvement. The adapters are
> lightweight objects, with no state other than the object that they adapt, and
> "double adapting" actually removes wrappers and returns the underlying
> prefixes object.
>
> The improved way:
>
> Basic datasets (DatasetGraphMap and DatasetGraphMapLink) - dataset prefixes
> are the default graph prefixes.
>
> TIM: All graphs in the dataset have the same prefix map. The PrefixMap is
> thread-safe but isn't transactional (possible future work if needed).
>
> TDB1, TDB2: These have their own, more general prefix storage, but the
> additional feature is not exposed. All graphs in the dataset have the same
> prefix map. There is no change to the on-disk format.
>
> SDB: As before. There is no change to the on-disk format.
>
> The nulls (DatasetGraphZero and DatasetGraphSink): Sink is "forget updates",
> Zero is "empty, no updates": suitably misbehaved implementations of the
> {{PrefixMap}} API.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)