On 19/08/14 10:19, Stefan Henke wrote:
> Hi,
> I would like to get some idea if there is any best practice in application
> design/architecture for what I want to achieve. My use case is to enrich
> application data stored in a jena model with public data coming from
> dbpedia.org. I want to do some reasoning over this combined data. The
> question is how to best achieve this reasoning over multiple data sources.
> Spanning a kind of "remote" jena model over dbpedia.org seems not to be
> possible, to my knowledge. Correct me if I'm wrong. So, the only option I
> can see is to copy the data I need from dbpedia.org via sparql, but store
> it locally in a jena model. Alternatively, there are dbpedia datasets I
> can import directly into jena. While this will work well for dbpedia data,
> it might not be well suited for more volatile data. Let's imagine
> weather-related data, which might change within minutes.
Not sure there's a "best practice" here.
Just querying distributed sources is expensive, and arbitrary inference
over unconstrained distributed sources is more so. So there is no
one-size-fits-all answer.
Your best bet is to dynamically fetch just the relevant fragments of the
different sources into a local model and perform inference only over
that. However, what constitutes a "relevant fragment" and how to obtain
it will depend on your application.
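
As a rough illustration of the "fetch the relevant fragment, then infer locally" pattern, here is a minimal, library-agnostic Python sketch. It is not Jena code. Triples are plain tuples, `fake_remote` is a hypothetical stand-in for whatever would actually query a remote endpoint, all `ex:` and `app:` names are invented for the example, and the only inference rule is type propagation along `rdfs:subClassOf`. The notion of "relevant" used here (every resource the application data already mentions) is just one assumption among many possible ones.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Invented data standing in for a remote source (assumption, not real DBpedia content).
REMOTE: Set[Triple] = {
    ("ex:Berlin", RDF_TYPE, "ex:City"),
    ("ex:Paris", RDF_TYPE, "ex:City"),
    ("ex:City", SUBCLASS_OF, "ex:Settlement"),
    ("ex:Settlement", SUBCLASS_OF, "ex:Place"),
}


def fake_remote(resources: Set[str]) -> Set[Triple]:
    """Hypothetical fetcher: return schema triples plus triples about `resources`."""
    return {t for t in REMOTE if t[0] in resources or t[1] == SUBCLASS_OF}


def build_local_model(
    app_triples: Iterable[Triple],
    fetchers: Iterable[Callable[[Set[str]], Iterable[Triple]]],
) -> Set[Triple]:
    """Merge application data with fragments fetched for the resources it mentions."""
    model = set(app_triples)
    # "Relevant" here means: every subject or object already in the application data.
    resources = {s for s, _, _ in model} | {o for _, _, o in model}
    for fetch in fetchers:
        model.update(fetch(resources))
    return model


def infer_types(model: Set[Triple]) -> Set[Triple]:
    """Apply (x type C), (C subClassOf D) => (x type D) until nothing new appears."""
    supers = {}
    for s, p, o in model:
        if p == SUBCLASS_OF:
            supers.setdefault(s, set()).add(o)
    inferred = set(model)
    while True:
        new = {
            (x, RDF_TYPE, d)
            for x, p, c in inferred
            if p == RDF_TYPE
            for d in supers.get(c, ())
        } - inferred
        if not new:
            return inferred
        inferred |= new


if __name__ == "__main__":
    app_data = [("app:office1", "app:locatedIn", "ex:Berlin")]
    local = build_local_model(app_data, [fake_remote])
    for triple in sorted(infer_types(local)):
        print(triple)
```

Only the fragment about `ex:Berlin` is pulled in, so the inference step never sees `ex:Paris`. For volatile sources, the fetch step would simply be repeated whenever fresh data is needed, and inference rerun over the rebuilt local model.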
Dave