These days I am a big fan of RDF* and SPARQL*, which unify RDF with the property graph model. On the other hand, I used to hate blank nodes, but I learned to stop worrying and love them. In any case, I am hoping that Neo4J and its ilk become a gateway drug to the RDF world.
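
As a rough illustration of the modelling contrast in the quoted message below, here is a small Python sketch. It is not from either message and uses no RDF library: triples are plain tuples, the `match` helper and the `DRAFT` node are made up for the example, and the dictionary of edge attributes is only a stand-in for PG edge properties or a reified statement. The predicate names are the ones from the quoted example.

```python
# Triples are plain (subject, predicate, object) tuples, not a real RDF store.

# Anti-pattern: attributes hung on the sends_email_to edge (a stand-in for
# PG edge properties or RDF reification). The message itself has no identity.
edge_style = {
    ("A", "sends_email_to", "B"): {"cc": ["C", "D"], "sentOn": "Tuesday"},
}

# Event-based modelling: the message is a first-class node.
event_style = {
    ("A", "sends", "MSG"),
    ("MSG", "receivedBy", "B"),
    ("MSG", "cc_to", "C"),
    ("MSG", "cc_to", "D"),
    ("MSG", "sentOn", "Tuesday"),
    ("MSG", "hasChecksum", "0xABCDEF"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None is a wildcard.

    Hypothetical helper for this sketch only.
    """
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everyone who got the message, directly or by cc.
recipients = sorted(
    o for (_, p, o) in match(event_style, s="MSG")
    if p in ("receivedBy", "cc_to")
)
print(recipients)  # ['B', 'C', 'D']

# A draft is just a message that nobody has sent yet (DRAFT is a made-up node).
event_style.add(("DRAFT", "contents", "URL_to_content"))
print(match(event_style, p="sends", o="DRAFT"))  # []
```

With the message as its own node, new facts such as a checksum or a draft attach to it directly, whereas in the edge-attribute form there is nothing to attach them to until someone sends something.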
On Fri, Oct 9, 2015 at 9:44 AM, Andy Seaborne <[email protected]> wrote:

> It is nice that the Titan guys see RDF as something to compare to.
> Coincidentally, I was giving a talk about Property Graph / Linked Data just
> recently at the European ApacheCon BigData conference.
>
> The Property Graph (PG) market is maybe x2 the size of the RDF market, and
> both are small. The challenge is growing the graph market, not one form
> taking market share away from the other.
>
> And the key difference between graph databases and other data systems is
> modelling. The differences between graph systems are not the key here.
>
> About reification, they are somewhat off track. Reification is a quite
> specialised feature for limited use. It is not RDF's equivalent to
> attributes on links in PG.
>
> Let me make that concrete with an example simplified from Graph databases
> / chapter 3 (page 52 in my copy). The book is written by the Neo4J folks.
>
> Email provenance.
>
>    A sends_email_to B
>
> Now, you could reify that statement (the act by A of sending the email to
> B).
>
> Reification is way more powerful than just being able to add data to the
> triple. It says "claim: A sends_email_to B" - several different and
> competing claims can be made. But let's continue assuming reification and
> assertion of the triple ... [*]
>
>    <<A sends email to B>>
>        cc C
>        cc D
>        sentOn Tuesday
>
> In the same modelling way you could add attributes to a PG graph edge for
> sends_email_to.
>
> Both are anti-patterns (as chapter 3 notes).
>
> The email sent is an important concept so model it explicitly:
>
>    A sends MSG
>    MSG receivedBy B
>    MSG cc_to C
>    MSG cc_to D
>    MSG sentOn "Tuesday"
>
> By modelling the email message as a first class concept, not implicit in
> the activity via reification/link attributes, you can better add
> information, e.g. which servers it was transferred by and stored on, when
> it was received (this is email - that might be twice), and better query it
> (who else accessed it on receipt). Modelling those on the act of sending
> is making life hard (how do you talk about a draft email?)
>
>    MSG contents URL_to_content
>    MSG hasChecksum 0xABCDEF
>    MSG receivedHeader "from nm15-vm2.bullet.mail.ne1.yahoo.com ...."
>
> That last one is tricky - one sending of a message can result in different
> receivedHeaders depending on the receiver.
>
> This is event-based modelling, not reification.
>
> If you wanted a highly efficient reification-supporting RDF store, then
> build one. No need to blindly store as multiple triples (it's called
> compression!). You don't see such stores because reification is a minor
> feature of RDF. Event-based modelling and named graphs are often better.
>
>    Andy
>
> [*]
> << >> is syntax that I proposed in early SPARQL drafts pre 1.0 for
> reification support but didn't gain much support. It is still in the ARQ
> parser source but not active.

--
Paul Houle
*Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes*
(607) 539 6254    paul.houle on Skype    [email protected]

:BaseKB -- Query Freebase Data With SPARQL
http://basekb.com/gold/

Legal Entity Identifier Lookup
https://legalentityidentifier.info/lei/lookup/

Join our Data Lakes group on LinkedIn
https://www.linkedin.com/grp/home?gid=8267275
