On 18/12/2012, at 9:21, Daz DeBoer <[email protected]> wrote:
> Thanks Luke for your deep thinking on this problem. It's sometimes
> easy to treat every problem as a series of small steps, and sometimes
> we need to step back and look at the big picture. If you can send some
> recommended reading, that would be awesome.

I have been reading http://workingontologist.org/, and I highly recommend it. Next is http://shop.oreilly.com/product/0636920020547.do, which focuses on the query language, SPARQL.

> Daz
>
> On 11 December 2012 18:41, Luke Daley <[email protected]> wrote:
>> Hi all,
>>
>> I'm back on this bandwagon. While away, I've been reading and researching
>> this topic. I'm more convinced than ever that our future lies in this
>> direction. I've got more learning to do, but I wanted to share some
>> interesting things at this point.
>>
>> At the heart of the stack of practices/technologies collectively called
>> the Semantic Web is RDF, the Resource Description Framework (you can
>> substitute “resource” with “it” or “thing”). RDF can be thought of as a
>> similar initiative to XML, in some ways. Unlike XML, RDF is not a
>> serialisation format. RDF is a formal system for stating facts about
>> things, fundamentally in a graph structure. This has serious implications
>> when compared to a hierarchical model (e.g. XML) or a relational one. Also
>> built into the concept is what is known as the AAA principle: Anyone can
>> say Anything about Anything. Said more practically, built into the system
>> is the idea of enriching a graph with new facts/connections by aggregating
>> graphs. There are many sources of RDF data and many ways to embed RDF in
>> common serialisation formats (e.g. XML, JSON), and even ways to convert
>> relational information into RDF on the fly. Once you have data in RDF
>> (however it got there), it becomes trivial to aggregate the data and use
>> the enriched model. The key thing here is that built into the system is
>> the idea that there are always more things to learn about something.
>> More facts can be discovered over time. Said a different way, the
>> distributed nature of the data is embraced.
>>
>> I am convinced that this is the toolset and mindset by which we should be
>> modelling our domain. We all know the power of graph structures, and this
>> idea of collecting facts about things resonates very strongly with the
>> direction that we are heading in with our dependency management model.
>>
>> There are other aspects as well.
>>
>> There are specialised data stores known as triple stores. RDF is based on
>> the concept of triple statements: «subject» «predicate» «object»
>> (e.g. luke is-a male, london is-in uk, luke lives-in london). Triple stores
>> can store huge numbers of facts, which can be queried. There are many
>> interesting things about triple stores, but one of the most pertinent for us
>> is that they are effectively schemaless. If we are going to collect facts
>> about things that we don't explicitly model (and I can guarantee we will be)
>> then this is critical.
>>
>> There are different querying options, but the emerging standard is SPARQL.
>> Think SQL but for graph data. SPARQL can be used to select paths through the
>> graph of facts. Given our simple graph (luke is-a male, london is-in uk,
>> luke lives-in london), you can query like this:
>>
>> SELECT ?who
>> WHERE {
>>   ?who is-a male .
>>   ?who lives-in ?city .
>>   ?city is-in uk .
>> }
>>
>> Who are the males who live in cities in the uk? That's pretty standard graph
>> querying stuff. If you've ever done logic programming, you might recognise
>> the idea. To bring it home in practical terms, think queries like: what are
>> all the source jars for libraries that are java 7 compatible?
>>
>> There is a very strong emphasis on modelling within RDF/SemWeb, but also the
>> acceptance that no model is completely correct/perfect/universal. You can
>> merge/adapt models without having to transform one schema into another with
>> code.
>> You just need to make connections between the two models, linking the
>> entities. This reduces the pressure to model everything you'll need, or get
>> it exactly right up front.
>>
>> One aspect that is still unclear to me is the role that reasoning could
>> play. Going beyond RDF, you can start to use richer modelling languages to
>> describe higher order concepts. As an example, we could model dependency
>> “compatibility” in such a way that a reasoning tool could reason out a
>> compatible set of dependencies from a graph. The appeal here is that
>> extending the capability of the system (e.g. adding variants, architectures
>> etc) would be a matter of extending the model and not writing more
>> algorithms to interpret the data graph. This is an appealing idea. I haven't
>> gone too far down this road yet as I'm just getting started, but it's pretty
>> easy to see the potential. The potential is especially staggering when you
>> consider the enterprise component graph and what kind of facts you could
>> reason out.
>>
>> I've got much more reading/study to do on this topic. I'm convinced it's in
>> our best interests to pursue this, though I don't expect everyone else to be
>> convinced at this point. It's worth noting that this is not an experimental,
>> academic technology. This is in serious use in some very large-scale,
>> sophisticated systems.
>>
>> I'm mentioning all this at this time to try and pique your interest and
>> encourage you to take any opportunity that might arise in the near future to
>> learn more about this stuff.
>>
>> --
>> Luke Daley
>> Principal Engineer, Gradleware
>> http://gradleware.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe from this list, please visit:
>>
>> http://xircles.codehaus.org/manage_email
>
> --
> Darrell (Daz) DeBoer
> Principal Engineer, Gradleware
> http://www.gradleware.com
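
The triple-store and query ideas in the quoted message can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how a real triple store or SPARQL engine is implemented: the helper names (`match`, `query`, `males_in_uk_cities`) are hypothetical, the extra facts about alex, sam, leeds and paris are made up for the example, and the join is deliberately naive.

```python
# Facts are (subject, predicate, object) tuples; a graph is just a set of them.
facts = {
    ("luke", "is-a", "male"),
    ("london", "is-in", "uk"),
    ("luke", "lives-in", "london"),
}


def is_var(term):
    """Terms starting with '?' are variables, loosely following SPARQL."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or fail if it is already bound to something else.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(graph, patterns):
    """Return every variable binding that satisfies all patterns (naive join)."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


def males_in_uk_cities(graph):
    """The query from the message: males who live in cities in the uk."""
    rows = query(graph, [
        ("?who", "is-a", "male"),
        ("?who", "lives-in", "?city"),
        ("?city", "is-in", "uk"),
    ])
    return sorted({row["?who"] for row in rows})


# A second, made-up source of facts. Aggregating graphs is plain set union;
# no schema has to change to accept the new facts.
more_facts = {
    ("alex", "is-a", "male"),
    ("alex", "lives-in", "leeds"),
    ("leeds", "is-in", "uk"),
    ("sam", "is-a", "male"),
    ("sam", "lives-in", "paris"),
    ("paris", "is-in", "france"),
}

print(males_in_uk_cities(facts))               # ['luke']
print(males_in_uk_cities(facts | more_facts))  # ['alex', 'luke']
```

The second call shows the aggregation point: merging the two sets enriches the answers without touching the query or the stored data. A real store would index the triples rather than scan every one per pattern.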
