This isn't really a question about Jena usage but about application and
data design.
Typically, for reference data such as geography, you would want your
own data to be linkable to existing managed IDs for these things.
At least for UK geography there's a wealth of well-maintained identifier
schemes including support and open data resources for using them in
linked data.
There's the OS linked data service, which provides URIs and descriptions
for both administrative geography (from Boundary-Line) and place names
from both OS Names and the 50K Gazetteer. These URIs build on and link
to OS identifiers such as TOIDs: https://data.ordnancesurvey.co.uk/
Then for statistical geography there's the ONS linked data service which
provides URIs that build on and link to GSS codes:
https://www.ons.gov.uk/methodology/geography/geographicalproducts/geographylinkeddata
So you at least have the option to reuse either or both of these schemes
and include in your store whatever level of description of these
entities you need for your purpose. The whole of the OS linked data,
excluding postcodes (Code-Point Open), is only about 30M triples.
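
To illustrate the point about parish names not being unique, here is a rough sketch (plain Python, no Jena needed) of minting local URIs from a managed code rather than from the name, and linking each one to an external identifier. Everything specific in it is an assumption for illustration: the local namespace borrows the www.test.com domain from your example, the ONS URI pattern should be checked against the service itself, the two GSS-style codes are invented placeholders, and owl:sameAs is just one possible link predicate.

```python
from urllib.parse import quote

# Assumption: a local namespace for our own place URIs (domain taken from the question).
LOCAL_BASE = "http://www.test.com/id/parish/"
# Assumption: ONS URI pattern built on GSS codes; verify against the ONS service.
ONS_BASE = "http://statistics.data.gov.uk/id/statistical-geography/"

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def parish_uri(gss_code):
    """Build the local URI from the code, which is unique, not from the name."""
    return LOCAL_BASE + quote(gss_code, safe="")


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def parish_triples(gss_code, name):
    """Return N-Triples lines: a label and a link to the external URI."""
    subject = parish_uri(gss_code)
    external = ONS_BASE + quote(gss_code, safe="")
    return [
        f'<{subject}> <{RDFS_LABEL}> "{escape_literal(name)}" .',
        f'<{subject}> <{OWL_SAME_AS}> <{external}> .',
    ]


# Placeholder codes, invented for this sketch; real ones come from the ONS data.
parishes = [("E04999901", "Abberton"), ("E04999902", "Abberton")]

lines = [t for code, name in parishes for t in parish_triples(code, name)]
print("\n".join(lines))
```

Two parishes with the same name get distinct URIs because the code is the key, and the name is kept in the store as a label, so no external lookup is needed to display it. The output is N-Triples, which can be loaded into Jena like any other RDF file.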
Dave
On 02/11/2021 16:46, Matt Whitby wrote:
I'll try and keep my question relatively succinct if I can.
The top level question is we're trying to decide whether to have reference
data within the triplestore, or whether we have it externally in a standard
relational database.
Wikidata implements each SUBJECT as a URI (Q Code), which we would assume
is allocated a number in an RDBMS, etc. somewhere. It then resolves the
code back to a label with its label service. We can certainly do this, but
it's an overhead to have to resolve all the names back.
Alternatively, do we have - say - our County, District, Parish data entries
within the triplestore? So, if we go that route how would we construct a
URI without going outside of Jena?
We can't have a URL along the lines of
www.test.com/schema/spatial/parish/abberton because parish names are
non-unique.
I hope that makes some semblance of sense.
Matt.