Re: FW: Failed to port datastore to RDF, will go Mongo

Aldo Bucchi Wed, 24 Nov 2010 05:35:46 -0800

Hi William, Friedrich.

This is an excellent email. My replies are inline. I hope they help.

On Wed, Nov 24, 2010 at 9:47 AM, William Waites <w...@styx.org> wrote:
> Friedrich, I'm forwarding your message to one of the W3 lists.
>
> Some of your questions could be easily answered (e.g. for euro in your
> context, you don't have a predicate for that, you have an Observation
> with units of a currency and you could take the currency from
> dbpedia, the predicate is "units").
>
> But I think your concerns are quite valid generally and your
> experience reflects that of most web site developers that encounter
> RDF.
>
> LOD list, Friedrich is a clueful developer, responsible for
> http://bund.offenerhaushalt.de/ amongst other things. What can we
> learn from this? How do we make this better?
>
> -w
>
>
> ----- Forwarded message from Friedrich Lindenberg <friedr...@pudo.org> -----
>
> From: Friedrich Lindenberg <friedr...@pudo.org>
> Date: Wed, 24 Nov 2010 11:56:20 +0100
> Message-Id: <a9089567-6107-4b43-b442-d09dcc0c3...@pudo.org>
> To: wdmmg-discuss <wdmmg-disc...@lists.okfn.org>
> Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo
>
> (reposting to list):
>
> Hi all,
>
> As an action from OGDCamp, Rufus and I agreed that we should resume porting 
> WDMMG to RDF in order to make the data model more flexible and to allow a 
> merger between WDMMG, OffenerHaushalt and similar other projects.
>
> After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having 
> written a long technical pro/con email before (that I assume contained 
> nothing you don't already know), I think the net effect of using RDF would be 
> the following:
>
> * Lots of coolness, sucking up to linked data people.
> * Further research regarding knowledge representation.

I will quickly outline some points that I think are advantages from a
developer POV (once you tackle the problems you outline below, of
course).
* A highly expressive query language (SPARQL).
* Ease of creating workflows where data moves from one app to another.
And this is not just buzz. Because triples and IDs are self-contained,
you can SPARQL select on one side and SPARQL insert on the other. I do
this all the time, creating "data pipelines". I admit it has taken
some time to master, but I can perform "magic" from my customer's
point of view.
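
To make the "select on one side, insert on the other" idea concrete,
here is a minimal sketch in Python. It assumes the source endpoint
returns rows in the standard SPARQL JSON results format with variables
named ?s, ?p and ?o, and that the target accepts a SPARQL 1.1 Update
style INSERT DATA request. The function names and the graph URI are
made up for illustration, and some stores use a slightly different
insert syntax, so treat this as a sketch and not as a finished tool.

```python
def term_to_sparql(term):
    """Render one binding from the SPARQL JSON results format as a SPARQL term."""
    kind = term["type"]
    value = term["value"]
    if kind == "uri":
        return "<" + value + ">"
    if kind == "bnode":
        return "_:" + value
    # Anything else is treated as a literal: escape it, then add the
    # language tag or datatype if the binding carries one.
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    literal = '"' + escaped + '"'
    if "xml:lang" in term:
        return literal + "@" + term["xml:lang"]
    if "datatype" in term:
        return literal + "^^<" + term["datatype"] + ">"
    return literal


def results_to_insert(results, graph_uri):
    """Turn SELECT ?s ?p ?o results into one INSERT DATA request string.

    graph_uri is a hypothetical target graph chosen by the caller.
    """
    lines = []
    for row in results["results"]["bindings"]:
        s, p, o = (term_to_sparql(row[name]) for name in ("s", "p", "o"))
        lines.append("    %s %s %s ." % (s, p, o))
    body = "\n".join(lines)
    return "INSERT DATA {\n  GRAPH <%s> {\n%s\n  }\n}" % (graph_uri, body)
```

The point is that nothing app-specific sits between the two sides: the
rows that come out of one store are already complete statements for
the next one.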

>
> vs.
>
> * Unstable and outdated technological base. No triplestore I have seen so far 
> seemed on par with MySQL 4.

* You definitely need to give Virtuoso a try. It is a mature SQL
database that grew into RDF. I strongly disagree with this point, as I
have personally built highly demanding projects for large companies
using Virtuoso's Quad Store. To give you a real-life case, the recent
Brazilian election portal by Globo.com (
http://g1.globo.com/especiais/eleicoes-2010/ ) has Virtuoso under the
hood. As a highly important, mission-critical app at a major (4th)
media company, it is not a toy application.
I know of many others, but I participated in this one, so I can tell
you without fear of being mistaken that it runs on Virtuoso.

> * No freedom wrt to schema, instead modelling overhead. Spent 30 minutes 
> trying to find a predicate for "Euro".

Yes!
This is a major problem and we as a community need to tackle it.
I am intrigued to see what ideas come up in this thread. Thanks for
bringing it up.

As an alternative, you can initially model everything using a simple
urn:foo:xxx or http://mydomain.com/id/xxx naming scheme (this is what
I do) and refactor the model as you move forward. Or not.

You can leave it as is and it will still be integrable (able to
live alongside other datasets in the same store).

Deploying the "Linked" part of Linked Data ( the dereferencing
protocols ) later on is another game.
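
As a rough illustration of the "own predicates" approach, the sketch
below writes one spending entry as N-Triples using nothing but names
minted under the placeholder base http://mydomain.com/id/ mentioned
above. The field names (amount, currency), the entry ID and the helper
functions are invented for the example; every value is emitted as a
plain literal to keep it short.

```python
BASE = "http://mydomain.com/id/"  # placeholder base URI, not a real vocabulary


def uri(local_name):
    """Mint a URI under our own base instead of hunting for a vocabulary."""
    return "<" + BASE + local_name + ">"


def literal(value):
    """Render a value as a plain N-Triples literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def entry_to_ntriples(entry_id, fields):
    """Return one N-Triples line per field of a spending entry."""
    subject = uri("entry/" + entry_id)
    return [
        "%s %s %s ." % (subject, uri(name), literal(value))
        for name, value in sorted(fields.items())
    ]


# Hypothetical entry: 1500 in the currency "EUR".
for line in entry_to_ntriples("2010-001", {"amount": 1500, "currency": "EUR"}):
    print(line)
```

If a better predicate turns up later, swapping it in is a rename over
the data, not a redesign.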

> * Scares off developers. Invested 2 days researching this, which is how long 
> it took me to implement OHs backend the first time around. Project would need 
> to be sustained through linked data grad students.
> * Less flexibility wrt to analytics, querying and aggregation. SPARQL not so 
> hot.

Did you try Virtuoso? Seriously.
It provides common aggregates out of the box and is highly extensible.
You basically have a development platform at your disposal.

> * Good chance of chewing up the UI, much harder to implement editing.

Definitely hard. This is something I hope will be alleviated once we
start getting more demos into the wild. But, take note: the Active
Record + MVC pattern works. This is not as alien as it seems.

SPARQL also removes the "joins", as some of the major NoSQL
offerings do. I find it terribly easy to create UIs over RDF, but I
have been doing it for a while already.

>
> I normally enjoy learning new stuff. This is just painful. Most of the above 
> points are probably based on my ignorance, but it really shouldn't take a PhD 
> to process some gov spending tables.
>
> I'll now start a mongo effort because I really think this should go 
> schema-free + I want to get stuff moving. If you can hold off loading Uganda 
> and Israel for a week that would of course be very cool, we could then try to 
> evaluate how far this went. Progress will be at: 
> http://bitbucket.org/pudo/wdmmg-core

My exec summary to you is this:
* Instead of mongo, use Virtuoso with your own predicates. You will
get a lot of power and you will be able to make your data live
natively as RDF. This means it will be easily importable and meshable
with other datasets, initially.
* If UI is an issue, you can throw in your questions to public-lod and
lots of us will answer with patterns, strategies, etc.

Regards,
A

>
> Friedrich
>
>
>
> _______________________________________________
> wdmmg-discuss mailing list
> wdmmg-disc...@lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/wdmmg-discuss
>
> ----- End forwarded message -----
>
> --
> William Waites
> http://eris.okfn.org/ww/foaf#i
> 9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664
>
>

-- 
Aldo Bucchi
@aldonline
skype:aldo.bucchi
http://aldobucchi.com/
