Hi William, Friedrich. This is an excellent email. My replies are inline. Hope I can help.

On Wed, Nov 24, 2010 at 9:47 AM, William Waites <w...@styx.org> wrote:
> Friedrich, I'm forwarding your message to one of the W3 lists.
>
> Some of your questions could be easily answered (e.g. for euro in your
> context, you don't have a predicate for that, you have an Observation
> with units of a currency and you could take the currency from
> dbpedia, the predicate is "units").
>
> But I think your concerns are quite valid generally and your
> experience reflects that of most web site developers that encounter
> RDF.
>
> LOD list, Friedrich is a clueful developer, responsible for
> http://bund.offenerhaushalt.de/ amongst other things. What can we
> learn from this? How do we make this better?
>
> -w
>
>
> ----- Forwarded message from Friedrich Lindenberg <friedr...@pudo.org> -----
>
> From: Friedrich Lindenberg <friedr...@pudo.org>
> Date: Wed, 24 Nov 2010 11:56:20 +0100
> Message-Id: <a9089567-6107-4b43-b442-d09dcc0c3...@pudo.org>
> To: wdmmg-discuss <wdmmg-disc...@lists.okfn.org>
> Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo
>
> (reposting to list):
>
> Hi all,
>
> As an action from OGDCamp, Rufus and I agreed that we should resume porting
> WDMMG to RDF in order to make the data model more flexible and to allow a
> merger between WDMMG, OffenerHaushalt and similar other projects.
>
> After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having
> written a long technical pro/con email before (that I assume contained
> nothing you don't already know), I think the net effect of using RDF would be
> the following:
>
> * Lots of coolness, sucking up to linked data people.
> * Further research regarding knowledge representation.

I will quickly outline some points that I think are advantages from a developer POV (once you tackle the problems you outline below, of course).

* A highly expressive language (SPARQL)
* Ease of creating workflows where data moves from one app to another. And this is not just buzz.
The self-contained nature of triples and IDs means that you can SPARQL SELECT on one side and SPARQL INSERT on another. I do this all the time, creating "data pipelines". I admit it has taken some time to master, but I can perform "magic" from my customer's point of view.

>
> vs.
>
> * Unstable and outdated technological base. No triplestore I have seen so far
> seemed on par with MySQL 4.

You definitely need to give Virtuoso a try. It is a mature SQL database that grew into RDF. I strongly disagree with this point, as I have personally built highly demanding projects for large companies using Virtuoso's Quad Store. To give you a real-life case, the recent Brazilian election portal by Globo.com ( http://g1.globo.com/especiais/eleicoes-2010/ ) has Virtuoso under the hood and, being a highly important, mission-critical app in a major (4th) media company, it is not a toy application. I know of many others, but I participated in this one, so I can tell you without fear of being mistaken that it runs on Virtuoso.

> * No freedom wrt to schema, instead modelling overhead. Spent 30 minutes
> trying to find a predicate for "Euro".

Yes! This is a major problem and we as a community need to tackle it. I am intrigued to see what ideas come up in this thread. Thanks for bringing it up.

As an alternative, you can initially model everything using a simple urn:foo:xxx or http://mydomain.com/id/xxx schema (this is what I do), and as you move forward you can refactor the model. Or not. You can leave it as is and it will still be integrable (able to live alongside other datasets in the same store). Deploying the "Linked" part of Linked Data (the dereferencing protocols) later on is another game.

> * Scares off developers. Invested 2 days researching this, which is how long
> it took me to implement OHs backend the first time around. Project would need
> to be sustained through linked data grad students.
> * Less flexibility wrt to analytics, querying and aggregation. SPARQL not so
> hot.
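
To make the "own predicates first" suggestion and the aggregation worry a bit more concrete, here is a minimal sketch in plain Python. It is not SPARQL and not any triplestore's API: the select() helper is only a stand-in for a triple pattern match. Every identifier under urn:foo: is hypothetical, the amounts are invented illustration values, and the DBpedia URI used for the euro is an assumption.

```python
from collections import defaultdict

# Hypothetical predicates in a private urn:foo: namespace (assumed names,
# not taken from any published vocabulary).
DEPARTMENT = "urn:foo:department"
AMOUNT = "urn:foo:amount"
UNITS = "urn:foo:units"
EURO = "http://dbpedia.org/resource/Euro"  # assumed DBpedia identifier

# Invented spending observations, stored as (subject, predicate, object).
source = {
    ("urn:foo:obs/1", DEPARTMENT, "urn:foo:dept/health"),
    ("urn:foo:obs/1", AMOUNT, 120),
    ("urn:foo:obs/1", UNITS, EURO),
    ("urn:foo:obs/2", DEPARTMENT, "urn:foo:dept/health"),
    ("urn:foo:obs/2", AMOUNT, 80),
    ("urn:foo:obs/2", UNITS, EURO),
    ("urn:foo:obs/3", DEPARTMENT, "urn:foo:dept/defence"),
    ("urn:foo:obs/3", AMOUNT, 50),
    ("urn:foo:obs/3", UNITS, EURO),
}


def select(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def total_by_department(store):
    """Sum the amounts of all observations, grouped by department."""
    dept_of = {s: o for s, _, o in select(store, p=DEPARTMENT)}
    totals = defaultdict(int)
    for s, _, amount in select(store, p=AMOUNT):
        totals[dept_of[s]] += amount
    return dict(totals)


# "Pipeline" step: select from one store, insert into another. Because each
# triple carries its own identifiers, no schema migration is needed.
target = set()
target.update(select(source, p=DEPARTMENT))
target.update(select(source, p=AMOUNT))

print(total_by_department(target))
```

The urn:foo: names can later be swapped for shared vocabulary terms without changing the shape of the data, which is the refactoring path described above.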

Did you try Virtuoso? Seriously. It provides common aggregates out of the box and is highly extensible. You basically have a development platform at your disposal.

> * Good chance of chewing up the UI, much harder to implement editing.

Definitely hard. This is something I hope will be alleviated once we start getting more demos into the wild. But take note: the Active Record + MVC pattern works. This is not as alien as it seems. Also, SPARQL removes the joins, as some of the major NoSQL offerings do. I find it terribly easy to create UIs over RDF, but I have been doing it for a while already.

>
> I normally enjoy learning new stuff. This is just painful. Most of the above
> points are probably based on my ignorance, but it really shouldn't take a PhD
> to process some gov spending tables.
>
> I'll now start a mongo effort because I really think this should go
> schema-free + I want to get stuff moving. If you can hold off loading Uganda
> and Israel for a week that would of course be very cool, we could then try to
> evaluate how far this went. Progress will be at:
> http://bitbucket.org/pudo/wdmmg-core

My exec summary to you is this:

* Instead of mongo, use Virtuoso with your own predicates. You will get a lot of power and you will be able to make your data live natively as RDF. This means it will be easily importable and meshable with other datasets, initially.
* If UI is an issue, you can throw in your questions to public-lod and lots of us will answer with patterns, strategies, etc.

Regards,
A

>
> Friedrich
>
>
>
> _______________________________________________
> wdmmg-discuss mailing list
> wdmmg-disc...@lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/wdmmg-discuss
>
> ----- End forwarded message -----
>
> --
> William Waites
> http://eris.okfn.org/ww/foaf#i
> 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
>
>

--
Aldo Bucchi
@aldonline
skype:aldo.bucchi
http://aldobucchi.com/