Stephen Allen wrote:
Hi Paulo,
You may be interested in this paper from ISWC 2009: "Parallel
Materialization of the Finite RDFS Closure for Hundreds of Millions of
Triples" [1].
-Stephen
Thank you Stephen.
Paolo (*)
(*) the Italian version, not the Brazilian one :-)
[1] http://data.semanticweb.org/pdfs/iswc/2009/paper241.pdf
On Thu, Nov 3, 2011 at 11:52 AM, Paolo Castagna
<[email protected]> wrote:
Hi,
I wonder if anyone here has had a look at Apache Giraph (in the
incubator), a Google Pregel clone, here: http://incubator.apache.org/giraph/
I am curious to know if and how this could be used to implement a
rule-based inference engine. :-)
Pregel seems to me a better model/architecture (than MapReduce) for
things such as the RETE algorithm.
Having said that, a thing you can easily do with MapReduce is to
distribute all your vocabularies/ontologies to all your nodes (via
DistributedCache), load them in RAM (they are typically not that
big) and then perform inference in parallel.
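
To make the idea concrete, here is a minimal sketch of per-triple RDFS inference with the vocabulary held in memory. It is only an illustration under stated assumptions: terms are plain strings, triples are 3-tuples, and the names `Schema`, `closure`, `typed` and `infer` are hypothetical. It is not the RIOT or Jena API, and it does not distinguish literals from resources (a real implementation would not assert a type for a literal object when applying rdfs:range).

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def closure(pairs):
    """Map each child to the set of all its ancestors (transitive closure)."""
    parents = {}
    for child, parent in pairs:
        parents.setdefault(child, set()).add(parent)
    result = {}
    for start in parents:
        seen, stack = set(), list(parents[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(parents.get(node, ()))
        result[start] = seen
    return result


class Schema:
    """Vocabulary loaded once per worker and kept in RAM (hypothetical helper)."""

    def __init__(self, vocab_triples):
        vocab = list(vocab_triples)
        self.super_classes = closure((s, o) for s, p, o in vocab if p == SUBCLASS)
        self.super_props = closure((s, o) for s, p, o in vocab if p == SUBPROP)
        self.domains, self.ranges = {}, {}
        for s, p, o in vocab:
            if p == DOMAIN:
                self.domains.setdefault(s, set()).add(o)
            elif p == RANGE:
                self.ranges.setdefault(s, set()).add(o)


def typed(schema, node, cls):
    """rdf:type triples for cls and every superclass of cls."""
    classes = {cls} | schema.super_classes.get(cls, set())
    return {(node, RDF_TYPE, c) for c in classes}


def infer(schema, triple):
    """Everything derivable from ONE instance triple plus the schema."""
    s, p, o = triple
    out = {triple}
    for q in {p} | schema.super_props.get(p, set()):
        out.add((s, q, o))                      # rdfs:subPropertyOf
        for c in schema.domains.get(q, ()):     # rdfs:domain
            out |= typed(schema, s, c)
        for c in schema.ranges.get(q, ()):      # rdfs:range
            out |= typed(schema, o, c)
    if p == RDF_TYPE:                           # rdfs:subClassOf
        out |= typed(schema, s, o)
    return out
```

Because `infer` looks only at the schema and the current triple, it can run inside a mapper on each input record independently. The price is that the same inferred triple can be emitted more than once across records, so duplicates have to be removed downstream.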
An example of this is here: [1,2]. This uses the RIOT infer approach,
but I was wondering if I could just use the Jena inference engine there
and whether that would work on one triple at a time. I don't think
everything works in this case.
However, perhaps there is a way to group the RDF data so that
inference would work as if all the instance data were available.
Any ideas/suggestions?
Since I am here and I was looking at RIOT infer command implementation
yesterday, I found this comment here [3]:
* TDB Infer
* RDFS
* owl:sameAs (in T-Box, not A-Box)
* owl:equivalentClass, owl:equivalentProperty
* owl:TransitiveProperty, owl:SymmetricProperty
(by the way, interesting comments in the RIOT infer command ;-))
I see how RDFS is implemented (very elegant and compact).
I can see how we could do similar things for:
- owl:equivalentClass
- owl:equivalentProperty
- owl:SymmetricProperty
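
A minimal sketch of how these three could be handled one triple at a time, in the same spirit as the RDFS case. The names `equivalence_map` and `expand` are hypothetical, terms are plain strings, and the equivalence and symmetry declarations are assumed to have been read from the vocabulary beforehand. This is not how RIOT does it, only one way it might look.

```python
RDF_TYPE = "rdf:type"


def equivalence_map(pairs):
    """Map each term to all OTHER terms in its equivalence group.

    Built from owl:equivalentClass or owl:equivalentProperty pairs; the
    groups are closed under symmetry and transitivity.
    """
    group = {}
    for a, b in pairs:
        merged = group.get(a, {a}) | group.get(b, {b})
        for term in merged:
            group[term] = merged
    return {term: members - {term} for term, members in group.items()}


def expand(triple, eq_classes, eq_props, symmetric_props):
    """Triples derivable from one instance triple, with no other instance data."""
    s, p, o = triple
    out = {triple}
    preds = {p} | eq_props.get(p, set())
    # If any property in the equivalence group is symmetric, all of them are.
    is_symmetric = any(q in symmetric_props for q in preds)
    for q in preds:
        out.add((s, q, o))              # owl:equivalentProperty
        if is_symmetric:
            out.add((o, q, s))          # owl:SymmetricProperty
    if p == RDF_TYPE:
        for c in eq_classes.get(o, ()):
            out.add((s, RDF_TYPE, c))   # owl:equivalentClass
    return out
```

All three rules have a single instance triple in their premise, which is why they fit the streaming model: the rest of the premise comes from the vocabulary already in memory.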
I think I know how to do:
- owl:sameAs (in T-Box, not A-Box)
But, I do not really see how we could do owl:TransitiveProperty in a
similar streaming fashion. Is that possible?
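
The difficulty is that deriving (a p c) from (a p b) and (b p c) needs two instance triples at once, so a stateless per-triple pass cannot do it. One possible workaround, sketched below under assumptions, is to stay streaming but keep state: remember the closure seen so far for each transitive property and emit new triples incrementally. The class name `TransitiveStreamer` is hypothetical, terms are plain strings, and the memory needed grows with the size of the closure, so this only suits properties with a manageable number of edges.

```python
from collections import defaultdict


class TransitiveStreamer:
    """Incremental transitive closure over a stream of triples (sketch)."""

    def __init__(self, transitive_properties):
        self.transitive = set(transitive_properties)
        # Per property: node -> nodes reachable from it / nodes that reach it.
        self.succ = defaultdict(lambda: defaultdict(set))
        self.pred = defaultdict(lambda: defaultdict(set))

    def process(self, triple):
        """Return the input triple followed by any newly derivable triples."""
        s, p, o = triple
        out = [triple]
        if p not in self.transitive:
            return out
        succ, pred = self.succ[p], self.pred[p]
        sources = {s} | pred[s]   # everything that already reaches s, plus s
        targets = {o} | succ[o]   # everything already reachable from o, plus o
        for x in sorted(sources):
            for y in sorted(targets):
                if y not in succ[x]:
                    succ[x].add(y)
                    pred[y].add(x)
                    if (x, y) != (s, o):
                        out.append((x, p, y))
        return out
```

For example, after feeding ("a", p, "b") and then ("b", p, "c") for a transitive p, the second call returns the input triple followed by ("a", p, "c"). In a MapReduce setting the same state could be obtained by grouping triples by property, so that a single reducer sees all edges of one transitive property.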
I was also wondering whether those "fragments" of OWL should be added
to the existing Java classes, or whether we should keep RDFS and OWL
separate and have an InferenceProcessorOWL.java.
Cheers,
Paolo
[1] https://github.com/castagna/tdbloader3/blob/master/src/main/java/org/apache/jena/tdbloader3/InferDriver.java
[2] https://github.com/castagna/tdbloader3/blob/master/src/main/java/org/apache/jena/tdbloader3/InferMapper.java
[3] http://svn.apache.org/repos/asf/incubator/jena/Jena2/ARQ/trunk/src/main/java/riotcmd/infer.java