On Tue, Nov 9, 2010 at 1:07 PM, Andy Seaborne <[email protected]> wrote:

>
>
> On 09/11/10 07:13, Reto Bachmann-Gmuer wrote:
>>
>> On the incubator mailing list a project for commons around the semantic
>> oriented projects has been suggested as a possibility.
>>
>> I'm wondering which parts of Clerezza could be moved to such a project,
>> thinking at:
>> - core graph/mgraph api
>> - serializers
>> - graph isomorphism code (to be improved there)
>>
>> Obviously it only makes sense to move things if somebody wants to use
>> these
>> things without using Clerezza.
>>
>> Reto
>>
>
> Does that mean these pieces of code work with other systems?

They are all based on the interfaces in the org.apache.clerezza.rdf.core package, and the goal is to provide wrappers for systems that do not natively expose these interfaces (as we do for Jena and Sesame).

>
> I'm curious:
> How does the graph isomorphism code compare to Jena's?

CLEREZZA-67 was closed with the comment "The current algorithm is highly inefficient in some situations, but as with most real world graphs it's reasonably fast I suggest this to be solved in a separate issue: CLEREZZA-81."
CLEREZZA-81 is described as "The current GraphMatcher used in AbstractGraph.equals is efficient when it can map all bnodes of the two Graphs by computing hashes on them. When the hashes cannot be refined further it simply tries all permutations of the bnodes with the same hash. In some situations this latter brute-force fallback is terribly inefficient. For example, if the compared graphs contain circles of bnodes connected with the same property; in this and similar cases we should switch back to hash-code based matching after randomly equating just two nodes of the two graphs."

I vaguely remember it being massively slower than Jena on such bnode circles, while being slightly faster where the hash-based matching succeeds.

Looking at the parser API at http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/index.html?org/apache/clerezza/rdf/core/serializedform/package-summary.html I think it might be better not to require ParsingProviders to return Graphs, but to allow any TripleCollection, which does not carry the requirements on the equals and hashCode methods that (immutable) Graphs do. Currently it is hard for an implementation to provide correct implementations of these methods without depending on more code.

Reto.

>
> Andy
>
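
To make the behaviour described in CLEREZZA-81 concrete, here is a rough sketch of hash refinement on bnodes and of why circles defeat it. It is not the Clerezza GraphMatcher code: the class BnodeHashSketch, the Edge type and all method names are made up for this example. It assumes graphs reduced to bnode-to-bnode triples, and it only looks at one graph (a real matcher would compare the hash classes of two graphs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only; NOT the Clerezza GraphMatcher. All names are hypothetical.
class BnodeHashSketch {

    // A triple between two blank nodes, identified here by plain string labels.
    static final class Edge {
        final String subject, predicate, object;
        Edge(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Number of distinct hash values, i.e. number of bnode classes.
    static int distinct(Map<String, Integer> hashes) {
        return new HashSet<>(hashes.values()).size();
    }

    // Repeatedly rehashes every bnode from its neighbours' hashes and stops
    // as soon as a round no longer splits any class of equal hashes.
    static Map<String, Integer> refine(List<Edge> edges) {
        Map<String, Integer> hash = new HashMap<>();
        for (Edge e : edges) {
            hash.put(e.subject, 0);
            hash.put(e.object, 0);
        }
        while (true) {
            Map<String, Integer> next = new HashMap<>();
            for (String node : hash.keySet()) {
                List<Integer> parts = new ArrayList<>();
                for (Edge e : edges) {
                    if (e.subject.equals(node)) {
                        parts.add(Objects.hash("out", e.predicate, hash.get(e.object)));
                    }
                    if (e.object.equals(node)) {
                        parts.add(Objects.hash("in", e.predicate, hash.get(e.subject)));
                    }
                }
                Collections.sort(parts); // the order of triples must not matter
                next.put(node, Objects.hash(hash.get(node), parts));
            }
            if (distinct(next) <= distinct(hash)) {
                return hash; // hashes cannot be refined further
            }
            hash = next;
        }
    }

    // Assignments a brute-force fallback would have to consider:
    // the product of k! over all classes of k bnodes sharing a hash.
    static long candidateMappings(Map<String, Integer> hashes) {
        Map<Integer, Integer> classSizes = new HashMap<>();
        for (int h : hashes.values()) {
            classSizes.merge(h, 1, Integer::sum);
        }
        long total = 1;
        for (int size : classSizes.values()) {
            for (int k = 2; k <= size; k++) {
                total *= k;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Edge> circle = Arrays.asList(
                new Edge("a", "p", "b"), new Edge("b", "p", "c"), new Edge("c", "p", "a"));
        List<Edge> chain = Arrays.asList(
                new Edge("a", "p", "b"), new Edge("b", "p", "c"));
        System.out.println("circle: " + candidateMappings(refine(circle)));
        System.out.println("chain:  " + candidateMappings(refine(chain)));
    }
}
```

In the three-node circle every bnode sees the same neighbourhood, so refinement never separates them and all 3! = 6 assignments remain candidates. In the chain each bnode ends up with its own hash and only one assignment is left. With n bnodes in a circle the fallback grows as n!, which is the inefficiency the issue describes.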
