Re: Failed to port datastore to RDF, will go Mongo

2010-11-25 Thread Dave Reynolds
Hi Friedrich,

On Thu, 2010-11-25 at 00:43 +0100, Friedrich Lindenberg wrote:

> Anyway, I'd like to raise some additional points for the future: 
> 1. I'd like to get a better picture of who is currently developing end-user 
> open government data applications based on linked data. Given that there is a 
> massive push towards releasing OGD as LD, I'd be eager to find out who is 
> consuming it in which kind of (user-facing) context, especially regarding 
> government transparency. More precisely: is RDF used primarily as an 
> interchange format or are there many people actively running sites using it? 

We have been doing a little of this. In particular, we developed a
simple data explorer for the LOD local government spend data which we're
working out how best to make public.

This uses the Linked Data API [1] to expose the data from a triple store
and client-side javascript for the UI, though could equally well have
been done server side.

However, we are not a fair test case! [2]

[Aside: Having been actively involved in the LOD side of UK open government
data then the term "massive push" isn't one that I would use, at least
not in that context! There has been genuine interest, some very hard
work by a few motivated people and some promising results but not that
much in the way of, say, resourcing. There *has* been some effective
publicity thanks to TimBL and Nigel Shadbolt but that has emphasized the
opening of data more than it has particular data representations, which
I'd regard as a good thing.]



[2] We co-developed the ontology for publishing the data, co-developed
the spec (and an implementation) for the Linked Data API and are active
developers of the open source Jena RDF toolkit on which the backend of
this small app is based.

Re: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Friedrich Lindenberg
Hi all, 

first off: thanks a lot for the many comments and the advice that this has 

On Nov 24, 2010, at 11:44 PM, Toby Inkster wrote:

> On Wed, 24 Nov 2010 18:12:50 +
> Ben O'Steen  wrote:
>> That's not the point that is being made. A competent developer, using
>> all the available links and documentation, spending days researching
>> and learning and trying to implement, is unable to make an app using a
>> triplestore that is on a par with one they can create very quickly
>> using a relational database.
> Or, to put a different slant on it: a competent developer who has spent
> years using SQL databases day-to-day finds it easier to use SQL and the
> relational data model than a different data model and different query
> language that he's spent a few days trying out.

That is probably a fair description of my position. (Although I'm now using a 
database I've only known for 4 months and that is definitely not relational.) I 
want to finish this particular project by the end of next week, so I decided to 
default back to technologies that are more familiar to me and where it seemed 
easier to select the required components. Have a running prototype now :-)  

Anyway, I'd like to raise some additional points for the future: 

1. I'd like to get a better picture of who is currently developing end-user 
open government data applications based on linked data. Given that there is a 
massive push towards releasing OGD as LD, I'd be eager to find out who is 
consuming it in which kind of (user-facing) context, especially regarding 
government transparency. More precisely: is RDF used primarily as an 
interchange format or are there many people actively running sites using it? 

2. (Trying to figure out the intended process:) Several people have suggested 
that I should iteratively develop a mapping of the data to RDF, starting with an 
entirely independent ontology and then incrementally adopting other 
vocabularies. While this seems fine in theory, I'm curious how it works in 
practice: wouldn't I a) screw anyone using my data via REST and dumps and b) 
have to refactor all of my loaders, search indexing, templates, ... essentially 
every part of the system using the data for each change? Or would you recommend 

Thanks a lot, 


Re: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Kingsley Idehen

On 11/24/10 6:19 PM, William Waites wrote:

> * [2010-11-24 22:44:53 +] Toby Inkster  wrote:
> ] Or, to put a different slant on it: a competent developer who has spent
> ] years using SQL databases day-to-day finds it easier to use SQL and the
> ] relational data model than a different data model and different query
> ] language that he's spent a few days trying out.
>
> I don't think that's what's happening here, or at least not
> entirely. People coming from a RDB background expect things like SUM,
> COUNT, INSERT, DELETE, not to mention GROUP BY to work. But SPARQL 1.1
> is still very new, each store implements them in slightly different
> ways with slightly different syntax, sometimes requiring workarounds
> in application code.

Now if you put it that way, YES!
Any RDBMS developer that comes to RDF triple / quad store land and 
discovers that main query language only just learned how to count 
(recently) would be SPOOKED!
Virtuoso is a hybrid (multi-model) DBMS and that insulates it from these 
kinds of transition concerns. It is also why we give you the ability to 
leverage SQL in SPARQL or SPARQL in SQL etc..

> With RDBs we have good libraries for abstracting
> away these differences.

Hmm.. in my experience there are very few ODBC or JDBC or Native CLIs 
that pull off such abstractions. We have gone to great lengths to make 
this so re. our ODBC, JDBC, ADO.NET, OLEDB drivers for all the major 
RDBMS engines via implementation of metadata calls. Trouble is, Web 
Developers don't even use any of these APIs; they write RDBMS-specific 
apps :-(

Ironically, the abstraction you describe is one that could manifest via 
an ODBC ontology i.e., an RDBMS abstraction ontology. Naturally, this 
is what we do inside Virtuoso's Virtual DBMS for Relational Data 
Sources, but it isn't OWL based since it precedes OWL etc..

At some point, we might consider making a public ontology for ODBC, but 
for now, making Linked Data simple for end-users and power-users is a 
much higher priority :-)

> We still require people to pay a lot closer
> attention to what the underlying plumbing is and how it works (and if
> the binary package they got with their OS might be out of date or has
> to be compiled from source or even patched - the horror!).

Well how is that different from any other Open Source experience? 
Basically, this is what Toby is saying: those used to MySQL and LAMP 
stack are fine, they go through pains that are invisible since that's 
home territory. Tweak the model a little, and all hell breaks loose. 
Tweak can even be as simple as writing RDBMS independent apps. via ODBC 
using iODBC or unixODBC.

Virtuoso can run LAMP apps (modulo MySQL via ODBC or JDBC based data 
access) across Windows, Mac OS X, Linux, Solaris, and many other UNIX 
platforms. You install and go, but to many typical LAMP folks, that is 
very, very confusing :-)

> These things prevent people from getting on with what they see as the
> task at hand.

Yes and No, there is an element of discomfort that comes from changing 
the norm that makes this awfully subjective.




Kingsley Idehen 
President & CEO
OpenLink Software
Twitter/ kidehen

Re: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread William Waites
* [2010-11-24 22:44:53 +] Toby Inkster  wrote:
] Or, to put a different slant on it: a competent developer who has spent
] years using SQL databases day-to-day finds it easier to use SQL and the
] relational data model than a different data model and different query
] language that he's spent a few days trying out.

I don't think that's what's happening here, or at least not
entirely. People coming from a RDB background expect things like SUM,
COUNT, INSERT, DELETE, not to mention GROUP BY to work. But SPARQL 1.1
is still very new, each store implements them in slightly different
ways with slightly different syntax, sometimes requiring workarounds
in application code. With RDBs we have good libraries for abstracting
away these differences. We still require people to pay a lot closer
attention to what the underlying plumbing is and how it works (and if
the binary package they got with their OS might be out of date or has
to be compiled from source or even patched - the horror!). These
things prevent people from getting on with what they see as the task
at hand. 
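
To make the "workarounds in application code" concrete, here is a minimal Python sketch of the client-side aggregation one tends to write when a store does not (yet) support SUM and GROUP BY. The row shape, the `sum_by` helper and the column names are hypothetical; they are not taken from any particular store or client library.

```python
from collections import defaultdict
from decimal import Decimal


def sum_by(rows, key, value):
    """Group rows by `key` and sum `value`, as GROUP BY + SUM would."""
    totals = defaultdict(Decimal)  # Decimal() starts each group at 0
    for row in rows:
        totals[row[key]] += Decimal(row[value])
    return dict(totals)


# Hypothetical rows, shaped like the output of a plain SELECT without aggregates.
rows = [
    {"department": "Health", "amount": "120.50"},
    {"department": "Defence", "amount": "80.00"},
    {"department": "Health", "amount": "30.25"},
]

totals = sum_by(rows, "department", "amount")
```

On a store that implements SPARQL 1.1 aggregates, the same result should be obtainable in the query itself, along the lines of `SELECT ?department (SUM(?amount) AS ?total) WHERE { ... } GROUP BY ?department`.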

William Waites
9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664

Re: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Toby Inkster
On Wed, 24 Nov 2010 18:12:50 +
Ben O'Steen  wrote:

> That's not the point that is being made. A competent developer, using
> all the available links and documentation, spending days researching
> and learning and trying to implement, is unable to make an app using a
> triplestore that is on a par with one they can create very quickly
> using a relational database.

Or, to put a different slant on it: a competent developer who has spent
years using SQL databases day-to-day finds it easier to use SQL and the
relational data model than a different data model and different query
language that he's spent a few days trying out.

It's not surprising. I often find it difficult to code things in Python
and end up switching to Perl. Why? Is it because Perl's an inherently
easier language? Or is it because Perl has been one of my main
development tools for the best part of a decade whereas I dig out
Python only occasionally.

Toby A Inkster

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Kingsley Idehen

On 11/24/10 3:57 PM, Ben O'Steen wrote:

> On Wed, 2010-11-24 at 14:40 -0600, Juan Sequeda wrote:
> > Just like we can
> >
> > 1) download xampp and install php, apache, mysql with one click
> > 2) open a browser, open phpmyadmin, create my db
> > 3) copy paste any snippet of code I can find on the web about
> > connecting php/java etc to a mysql database
> > 4) write code to select/insert/update my db
> >
> > ... you are asking for these same 4 simple steps but for an RDF
> > database?
>
> Not me personally, but in my experience of talking to developers in the
> HE/FE sector as well as commercial devs through JISC, running Dev8D and
> so on, being able to achieve those steps in the manner you have
> suggested is crucial.
>
> Yes, I exaggerated about my hearing the same tale a thousand times, but
> I have heard that perception of RDF/triplestores many, many times,
> as unfounded as some may argue it is.
>
> This will sound like heresy, but the closest parallel I've found to step
> 1) is with mulgara (excepting that a Java runtime of some sort is
> required.) Run the jar, open browser, and run through the web-based
> examples that cover input, update and query.


You should make a basic RDBMS your yardstick e.g. FoxPRO, Access, 
Filemaker, SQL Server etc..

If you have enterprise DBMS experience then: Oracle, SQL Server, Sybase, 
Ingres, Informix, Progress (OpenEdge), Firebird (once Interbase), 
PostgreSQL, MySQL.

In all cases, this is what developers do:

1. Install DBMS
2. Load (using various data loaders and import utilities) and Create Data
3. Create Views and Queries
4. Put Forms and Reports atop Views (or Tables)
5. Enjoy power of RDBMS apps.

This is what end-users (basic or power users) do:

1. Get a productivity tool of choice (Word Processor, Spreadsheet, 
Report Writer etc)
2. Connect to RDBMS via an ODBC Data Source Name (which via ODBC Driver 
Manager is bound to Drivers for each DBMS)
3. Enjoy power of RDBMS via their preferred Desktop tool.

Here is what so called Web Developers do:

1. Find an Open Source DBMS
2. Compile it
3. Work through LAMP stack to PHP, Python, Ruby, TCL, others
4. Ignore DBMS independent API of ODBC (available via iODBC or unixODBC 
effort) and couple HTML pages directly to DBMS
5. Ignore DBMS for user account management and stick that in HTML page 
layer too.

Irrespective of where you fit in re. the above, this is what you should 
be able to do with Relational Property Graph Databases that support 
resolvable URIs as Unique Keys:

1. Load data - there are a myriad of paths including transient and 
materialized views over ODBC or JDBC accessible RDBMS data sources, Web 
Services, many other data container formats (spreadsheets, CSV files, etc..)
2. Use HTML+RDFa (or basic HTML) pages as Forms and Report Writer tool 
re. data browsing

3. Enjoy power of Linked Data.

Note re. above:

1. No re-write rules coding
2. No 303 debates re. how to make Unique Keys resolve
3. No exposure to Name | Address disambiguation re. Unique Keys .

Mulgara already mandates Java. Java != Platform Independent either. I am 
mandating nothing bar installing a DBMS and then simply leveraging HTTP, 
EAV Data Model, and the power of a Relational Property Graph Database 
that may or may not provide output in RDF format.


Juan Sequeda

On Wed, Nov 24, 2010 at 12:12 PM, Ben O'Steen wrote:
> On Wed, 2010-11-24 at 12:51 -0500, Kingsley Idehen wrote:
> > What does MySQL 4 do with this data that can't be done with a
> > moderately capable RDF quad / triplestore?
> >
> > If I am going to run rings around this thing, I need a starting
> > point :-)
>
> That's not the point that is being made. A competent developer, using
> all the available links and documentation, spending days researching and
> learning and trying to implement, is unable to make an app using a
> triplestore that is on a par with one they can create very quickly using
> a relational database.
>
> This is about the 1000th time I have heard this story, and the ability
> range of those saying the same thing is huge - from 9-5 devs who learn
> what they need to people who research and teach artificial intelligence
> and other cutting edge areas and who actively learn new, complex skills
> just because they can.
>
> The point is not whether someone who (co?)developed the virtuoso
> triplestore can make RDF work, it's whether someone with the time,
> current documentation and inclination can.
>
> > --
> > Regards,
> > Kingsley Idehen
> > President & CEO
> > OpenLink Software
> > Web:
> > Weblog:
> > Twitter/ k

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Ben O'Steen
On Wed, 2010-11-24 at 14:40 -0600, Juan Sequeda wrote:
> Ben,
> Just like we can
> 1) download xampp and install php, apache, mysql with one click
> 2) open a browser, open phpmyadmin, create my db
> 3) copy paste any snippet of code I can find on the web about
> connecting php/java etc to a mysql database
> 4) write code to select/insert/update my db
> ... you are asking for these same 4 simple steps but for an RDF
> database?

Not me personally, but in my experience of talking to developers in the
HE/FE sector as well as commercial devs through JISC, running Dev8D and
so on, being able to achieve those steps in the manner you have
suggested is crucial.

Yes, I exaggerated about my hearing the same tale a thousand times, but
I have heard that perception of RDF/triplestores many, many times,
as unfounded as some may argue it is.

This will sound like heresy, but the closest parallel I've found to step
1) is with mulgara (excepting that a Java runtime of some sort is
required.) Run the jar, open browser, and run through the web-based
examples that cover input, update and query.


> Juan Sequeda
> +1-575-SEQ-UEDA
> On Wed, Nov 24, 2010 at 12:12 PM, Ben O'Steen wrote:
> > On Wed, 2010-11-24 at 12:51 -0500, Kingsley Idehen wrote:
> > > What does MySQL 4 do with this data that can't be done with a
> > > moderately capable RDF quad / triplestore?
> > >
> > > If I am going to run rings around this thing, I need a starting
> > > point :-)
> >
> > That's not the point that is being made. A competent developer, using
> > all the available links and documentation, spending days researching and
> > learning and trying to implement, is unable to make an app using a
> > triplestore that is on a par with one they can create very quickly using
> > a relational database.
> >
> > This is about the 1000th time I have heard this story, and the ability
> > range of those saying the same thing is huge - from 9-5 devs who learn
> > what they need to people who research and teach artificial intelligence
> > and other cutting edge areas and who actively learn new, complex skills
> > just because they can.
> >
> > The point is not whether someone who (co?)developed the virtuoso
> > triplestore can make RDF work, it's whether someone with the time,
> > current documentation and inclination can.
> >
> > Ben
> >
> > > --
> > > Regards,
> > > Kingsley Idehen
> > > President & CEO
> > > OpenLink Software
> > > Web:
> > > Weblog:
> > > Twitter/ kidehen

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Juan Sequeda

Just like we can

1) download xampp and install php, apache, mysql with one click
2) open a browser, open phpmyadmin, create my db
3) copy paste any snippet of code I can find on the web about connecting
php/java etc to a mysql database
4) write code to select/insert/update my db

... you are asking for these same 4 simple steps but for an RDF database?

Juan Sequeda

On Wed, Nov 24, 2010 at 12:12 PM, Ben O'Steen  wrote:

> On Wed, 2010-11-24 at 12:51 -0500, Kingsley Idehen wrote:
> > What does MySQL 4 do with this data that can't be done with a
> > moderately capable RDF quad / triplestore?
> >
> > If I am going to run rings around this thing, I need a starting
> > point :-)
> That's not the point that is being made. A competent developer, using
> all the available links and documentation, spending days researching and
> learning and trying to implement, is unable to make an app using a
> triplestore that is on a par with one they can create very quickly using
> a relational database.
> This is about the 1000th time I have heard this story, and the ability
> range of those saying the same thing is huge - from 9-5 devs who learn
> what they need to people who research and teach artificial intelligence
> and other cutting edge areas and who actively learn new, complex skills
> just because they can.
> The point is not whether someone who (co?)developed the virtuoso
> triplestore can make RDF work, it's whether someone with the time,
> current documentation and inclination can.
> Ben
> >
> > --
> >
> > Regards,
> >
> > Kingsley Idehen
> > President & CEO
> > OpenLink Software
> > Web:
> > Weblog:
> > Twitter/ kidehen
> >
> >
> >
> >

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Kingsley Idehen

On 11/24/10 1:12 PM, Ben O'Steen wrote:

> On Wed, 2010-11-24 at 12:51 -0500, Kingsley Idehen wrote:
> > What does MySQL 4 do with this data that can't be done with a
> > moderately capable RDF quad / triplestore?
> >
> > If I am going to run rings around this thing, I need a starting
> > point :-)
>
> That's not the point that is being made. A competent developer, using
> all the available links and documentation, spending days researching and
> learning and trying to implement, is unable to make an app using a
> triplestore that is on a par with one they can create very quickly using
> a relational database.

Fine, so there lies the problem.

Here is my simple guide for said competent RDBMS programmer:

1. Keep your presentation templates for tabular data manipulation intact 
(e.g. your PHP, Ruby etc. based dynamic HTML pages)

2. Keep variable bindings intact
3. Make SPARQL SELECT queries against RDF triple / quad store
4. See if you have the equivalent of what you had using SQL (it should 
be, re. tabular data representation as the basis for page generation).
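
As a rough illustration of steps 3 and 4, the sketch below flattens a SPARQL SELECT response in the standard SPARQL JSON results layout into column names plus row tuples, which is the shape a tabular page template typically expects. The `bindings_to_rows` helper and the sample response (including its example.org URIs) are assumptions made for illustration, not output from a real endpoint.

```python
import json


def bindings_to_rows(sparql_json):
    """Flatten a SPARQL JSON result document into (columns, rows)."""
    doc = json.loads(sparql_json)
    columns = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from a binding; represent them as None.
        rows.append(tuple(
            binding[c]["value"] if c in binding else None for c in columns
        ))
    return columns, rows


# Hypothetical response body for a SELECT ?s ?label ?amount query.
sample = """
{
  "head": {"vars": ["s", "label", "amount"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/item/1"},
     "label": {"type": "literal", "value": "Health"},
     "amount": {"type": "literal", "value": "120.50"}},
    {"s": {"type": "uri", "value": "http://example.org/item/2"},
     "label": {"type": "literal", "value": "Defence"}}
  ]}
}
"""

columns, rows = bindings_to_rows(sample)
```

The resulting `rows` can be handed to an existing template loop unchanged; the only difference from a SQL result set is that one of the columns happens to hold a URI.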

Now, how do we go beyond that?

1. Just add subject URIs to the Tabular result set so that you can see 
the power of a URI when used as DBMS key (of course this should be a 
Linked Data URI that de-references to a structured description in the 
form of an EAV graph).

The beauty of the single step above is this:

1. You realize that RDBMS keys are impoverished - a function of how 
RDBMS engines are implemented i.e., most don't support Reference values

Impoverishment is accentuated by:

1. URIs are portable -- drop them in any browser and a description of 
the Referent will manifest (e.g. in a Web Page that may or may not also 
double as a machine readable descriptor document courtesy of RDFa)

2. URIs can be Joined in powerful ways that expand the range of data 
they expose (even when they are sole focus of a query) -- this is where 
an owl:sameAs relation (just another record in the RDF DBMS) comes into 
play, no schema alteration whatsoever

3. Then all of a sudden you discover new data about your URI's Referent 
(101 re. Web since it's global), all you do is add records (SPO or EAV 
3-tuples) to your DBMS.

You can't pull this off with MySQL in "schema last" mode. And you can't 
do anything with Reference values let alone walk the relational property 
graph and all the goodies a basic follow-your-nose pattern will unveil 
to you etc..
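
A toy sketch of the "just add records" idea, using a plain Python set of 3-tuples rather than a real store. The example.org URIs, the vocabulary terms and the `aliases`/`describe` helpers are all hypothetical; only the owl:sameAs URI is a real identifier. It is meant to show the data model, not how any particular DBMS implements it.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

CITY = "http://example.org/id/city1"          # hypothetical subject URI
OTHER = "http://other.example/place/city-1"   # hypothetical alias elsewhere

# Each record is a (subject, predicate, object) 3-tuple.
triples = {(CITY, "http://example.org/vocab/name", "Example City")}

# New facts arrive later: add records, there is no schema to alter.
triples.add((CITY, "http://example.org/vocab/population", "100000"))
triples.add((CITY, OWL_SAME_AS, OTHER))
triples.add((OTHER, "http://example.org/vocab/country", "Exampleland"))


def aliases(uri, store):
    """Return uri plus everything linked to it by owl:sameAs, in either direction."""
    seen, todo = {uri}, [uri]
    while todo:
        current = todo.pop()
        for s, p, o in store:
            if p != OWL_SAME_AS:
                continue
            other = o if s == current else s if o == current else None
            if other is not None and other not in seen:
                seen.add(other)
                todo.append(other)
    return seen


def describe(uri, store):
    """Collect (predicate, object) pairs for uri and all of its aliases."""
    names = aliases(uri, store)
    return {(p, o) for s, p, o in store if s in names and p != OWL_SAME_AS}


description = describe(CITY, triples)
```

Here the description of `CITY` picks up the country recorded under the other URI purely because one owl:sameAs record was added.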

> This is about the 1000th time I have heard this story, and the ability
> range of those saying the same thing is huge - from 9-5 devs who learn
> what they need to people who research and teach artificial intelligence
> and other cutting edge areas and who actively learn new, complex skills
> just because they can.

Not true, but I do understand how you could arrive at that conclusion.

> The point is not whether someone who (co?)developed the virtuoso
> triplestore can make RDF work, it's whether someone with the time,
> current documentation and inclination can.

Anyone can exploit Linked Data. The trouble is that the Linked Data 
narrative could be better, esp. for your particular developer/user 
profile :-)


1. - simple guide re. RDF based Linked Data and 
Virtuoso (load RDF and that's it, no coding re. Linked Data deployment, 
just get-going etc..)
2. -- sample 
collection of demos using PivotViewer
3. -- general Linked 
Data demo collection
4. -- example that shows a 
bookmark++ use pattern (install the extension and/or bookmarklet) visit 
places, query later using a myriad of tools that are on offer (Virtuoso 
or from elsewhere)
5. - a service that shows what can exist in your 
own data space post installation of Virtuoso + Linked Data middleware 
component (the sponger)
6. -- 
discussion with John F. Sowa about Webs & Fabrics, I make some RDF and 
RDBMS comments of relevance, plus links to live examples of hypermedia 
based structured data using an iCalendar document .





Kingsley Idehen 
President & CEO
OpenLink Software
Twitter/ kidehen




Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Ben O'Steen
On Wed, 2010-11-24 at 12:51 -0500, Kingsley Idehen wrote:
> What does MySQL 4 do with this data that can't be done with a
> moderately capable RDF quad / triplestore? 
> If I am going to run rings around this thing, I need a starting
> point :-)

That's not the point that is being made. A competent developer, using
all the available links and documentation, spending days researching and
learning and trying to implement, is unable to make an app using a
triplestore that is on a par with one they can create very quickly using
a relational database.

This is about the 1000th time I have heard this story, and the ability
range of those saying the same thing is huge - from 9-5 devs who learn
what they need to people who research and teach artificial intelligence
and other cutting edge areas and who actively learn new, complex skills
just because they can.

The point is not whether someone who (co?)developed the virtuoso
triplestore can make RDF work, it's whether someone with the time,
current documentation and inclination can.


> -- 
> Regards,
> Kingsley Idehen 
> President & CEO 
> OpenLink Software 
> Web:
> Weblog:
> Twitter/ kidehen 

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Kingsley Idehen

On 11/24/10 8:29 AM, William Waites wrote:

> ... on the plus side, Friedrich wrote:
>
> ] * Lots of coolness, sucking up to linked data people.
>
> I don't see these as particularly good things in themselves. The
> solutions have to be obviously technically sound and convenient to
> use. Drinking the kool-aid is not helpful.
>
> * [2010-11-24 08:05:08 -0500] Kingsley Idehen  wrote:
> ] Is your data available as a dump?
>
> UK data for 2009 that I made is available at:
>
> But this was done more or less by hand and repurposing the CSV ->
> SDMX (this was done before QB became best practice)  scripts is not
> easy. Still, from a modeling perspective they might be a good starting
>
> But having to ask a question in the right place and the answer being a
> good starting point is maybe different from doing a google search and
> finding easy to follow recipes that can immediately be plugged into some
> web app.



What does MySQL 4 do with this data that can't be done with a moderately 
capable RDF quad / triplestore?

If I am going to run rings around this thing, I need a starting point :-)



Kingsley Idehen 
President & CEO
OpenLink Software
Twitter/ kidehen

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Aldo Bucchi
Sorry, I forgot to add something critical.

Ease of integration ( moving triples ) is just the beginning. Once you
get a hold on the power of ontologies and inference as "views" your
data starts becoming more and more useful.

But the first step is getting your data into RDF and the return on
that investment is SPARQL and the ease to integrate.

I usually end up with several transformation pipelines and accessory
TTL files which get all combined into one dataset. TTLs are easily
editable by hand, collaboratively versioned, while giving you full

TTL files alone are why some developers fall in love with Linked Data.

On Wed, Nov 24, 2010 at 10:33 AM, Aldo Bucchi  wrote:
> Hi William, Friedrich.
> This is an excellent email. My replies inlined. Hope I can help.
> On Wed, Nov 24, 2010 at 9:47 AM, William Waites  wrote:
>> Friedrich, I'm forwarding your message to one of the W3 lists.
>> Some of your questions could be easily answered (e.g. for euro in your
>> context, you don't have a predicate for that, you have an Observation
>> with units of a currency and you could take the currency from
>> dbpedia, the predicate is "units").
>> But I think your concerns are quite valid generally and your
>> experience reflects that of most web site developers that encounter
>> RDF.
>> LOD list, Friedrich is a clueful developer, responsible for
>> amongst other things. What can we
>> learn from this? How do we make this better?
>> -w
>> - Forwarded message from Friedrich Lindenberg  -----
>> From: Friedrich Lindenberg 
>> Date: Wed, 24 Nov 2010 11:56:20 +0100
>> Message-Id: 
>> To: wdmmg-discuss 
>> Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo
>> (reposting to list):
>> Hi all,
>> As an action from OGDCamp, Rufus and I agreed that we should resume porting 
>> WDMMG to RDF in order to make the data model more flexible and to allow a 
>> merger between WDMMG, OffenerHaushalt and similar other projects.
>> After a few days, I'm now over the whole idea of porting WDMMG to RDF. 
>> Having written a long technical pro/con email before (that I assume 
>> contained nothing you don't already know), I think the net effect of using 
>> RDF would be the following:
>> * Lots of coolness, sucking up to linked data people.
>> * Further research regarding knowledge representation.
> I will quickly outline some points that I think are advantages from a
> developer POV. ( once you tackle the problems you outline below, of
> course ).
> * A highly expressive language ( SPARQL )
> * Ease of creating workflows where data moves from one app to another.
> And this is not just buzz. The self-contained nature of triples and
> IDs make it so that you can SPARQL select on one side and SPARQL
> insert on another. I do this all the time, creating "data pipelines".
> I admit it has taken some time to master, but I can perform "magic"
> from my customer's point of view.
>> vs.
>> * Unstable and outdated technological base. No triplestore I have seen so 
>> far seemed on par with MySQL 4.
> * You definitely need to give Virtuoso a try. It is a mature SQL
> database that grew into RDF. I strongly disagree with this point as I
> have personally created highly demanding projects for large companies
> using Virtuoso's Quad Store. To give you a real life case, the recent
> Brazilian Election portal by (
> ) has Virtuoso under the
> hood and, being a highly important, mission critical app in a major (
> 4th ) media company  it is not a toy application.
> I know many others but in this one I participated so I can tell you it
> is Virtuoso w/o fear of mistake.
>> * No freedom wrt to schema, instead modelling overhead. Spent 30 minutes 
>> trying to find a predicate for "Euro".
> Yes!
> This is a major problem and we as a community need to tackle it.
> I am intrigued to see what ideas come up in this thread. Thanks for
> bringing it up.
> As an alternative, you can initially model everything using a simple
> urn:foo:xxx or schema ( this is what I do )
> and as you move fwd you can refactor the model. Or not.
> You can leave it as is and it will still be integratable ( able to
> live along other datasets in the same store ).
> Deploying the "Linked" part of Linked Data ( the dereferencing
> protocols ) later on is another game.
>> *

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Aldo Bucchi
Hi William, Friedrich.

This is an excellent email. My replies inlined. Hope I can help.

On Wed, Nov 24, 2010 at 9:47 AM, William Waites  wrote:
> Friedrich, I'm forwarding your message to one of the W3 lists.
> Some of your questions could be easily answered (e.g. for euro in your
> context, you don't have a predicate for that, you have an Observation
> with units of a currency and you could take the currency from
> dbpedia, the predicate is "units").
> But I think your concerns are quite valid generally and your
> experience reflects that of most web site developers that encounter
> RDF.
> LOD list, Friedrich is a clueful developer, responsible for
> amongst other things. What can we
> learn from this? How do we make this better?
> -w
> - Forwarded message from Friedrich Lindenberg  -
> From: Friedrich Lindenberg 
> Date: Wed, 24 Nov 2010 11:56:20 +0100
> Message-Id: 
> To: wdmmg-discuss 
> Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo
> (reposting to list):
> Hi all,
> As an action from OGDCamp, Rufus and I agreed that we should resume porting 
> WDMMG to RDF in order to make the data model more flexible and to allow a 
> merger between WDMMG, OffenerHaushalt and similar other projects.
> After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having 
> written a long technical pro/con email before (that I assume contained 
> nothing you don't already know), I think the net effect of using RDF would be 
> the following:
> * Lots of coolness, sucking up to linked data people.
> * Further research regarding knowledge representation.

I will quickly outline some points that I think are advantages from a
developer POV. ( once you tackle the problems you outline below, of
course ).
* A highly expressive language ( SPARQL )
* Ease of creating workflows where data moves from one app to another.
And this is not just buzz. The self-contained nature of triples and
IDs make it so that you can SPARQL select on one side and SPARQL
insert on another. I do this all the time, creating "data pipelines".
I admit it has taken some time to master, but I can perform "magic"
from my customer's point of view.
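
A minimal sketch of such a "data pipeline" step, assuming the source side returns `?s ?p ?o` bindings in the SPARQL JSON results layout and the target side accepts a SPARQL 1.1 Update `INSERT DATA` request. The helper names, the graph URI and the sample binding are hypothetical, and the sketch deliberately ignores blank nodes, datatypes and language tags.

```python
def term_to_sparql(term):
    """Serialise one SPARQL JSON result term as a SPARQL term."""
    if term["type"] == "uri":
        return "<%s>" % term["value"]
    # Simplification: treat everything else as a plain literal and escape
    # backslashes, double quotes and newlines.
    escaped = (
        term["value"]
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return '"%s"' % escaped


def bindings_to_insert(bindings, graph_uri):
    """Build an INSERT DATA request from ?s ?p ?o bindings."""
    lines = [
        "    %s %s %s ." % tuple(term_to_sparql(b[v]) for v in ("s", "p", "o"))
        for b in bindings
    ]
    return "INSERT DATA {\n  GRAPH <%s> {\n%s\n  }\n}" % (graph_uri, "\n".join(lines))


# Hypothetical bindings, as selected from the source store.
bindings = [
    {
        "s": {"type": "uri", "value": "http://example.org/item/1"},
        "p": {"type": "uri", "value": "http://example.org/vocab/label"},
        "o": {"type": "literal", "value": 'Say "hi"'},
    }
]

update = bindings_to_insert(bindings, "http://example.org/graph/target")
```

Because each triple carries its own identifiers, the update string can be sent to a different store without any schema negotiation between the two sides.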

> vs.
> * Unstable and outdated technological base. No triplestore I have seen so far 
> seemed on par with MySQL 4.

* You definitely need to give Virtuoso a try. It is a mature SQL
database that grew into RDF. I strongly disagree with this point as I
have personally created highly demanding projects for large companies
using Virtuoso's Quad Store. To give you a real life case, the recent
Brazilian Election portal by ( ) has Virtuoso under the
hood and, being a highly important, mission critical app in a major (
4th ) media company  it is not a toy application.
I know many others but in this one I participated so I can tell you it
is Virtuoso w/o fear mistake.

> * No freedom wrt to schema, instead modelling overhead. Spent 30 minutes 
> trying to find a predicate for "Euro".

This is a major problem and we as a community need to tackle it.
I am intrigued to see what ideas come up in this thread. Thanks for
bringing it up.

As an alternative, you can initially model everything using a simple
urn:foo:xxx or schema ( this is what I do )
and as you move fwd you can refactor the model. Or not.

You can leave it as is and it will still be integratable ( able to
live along other datasets in the same store ).
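
A rough sketch of what "refactor the model later" can look like, assuming
triples are held as plain Python tuples. The urn:foo: names and the
http://example.org/vocab# target are placeholders, not real vocabularies.

```python
def refactor(triples, mapping):
    """Return new triples with every term found in mapping replaced.

    Terms are compared as plain strings, so a literal that happens to
    equal a mapped URI would be rewritten as well; good enough for a sketch.
    """
    return [tuple(mapping.get(term, term) for term in triple)
            for triple in triples]


# First pass: everything modelled with throwaway urn:foo: identifiers.
triples = [
    ("urn:foo:payment1", "urn:foo:amount", "100"),
    ("urn:foo:payment1", "urn:foo:currency", "urn:foo:euro"),
]

# Later: swap one placeholder predicate for a (hypothetical) shared one.
mapping = {"urn:foo:currency": "http://example.org/vocab#currency"}

refactored = refactor(triples, mapping)
for triple in refactored:
    print(triple)
```

Untouched terms pass through unchanged, so the refactoring can be done one
predicate at a time, or not at all.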

Deploying the "Linked" part of Linked Data ( the dereferencing
protocols ) later on is another game.

> * Scares off developers. Invested 2 days researching this, which is how long 
> it took me to implement OHs backend the first time around. Project would need 
> to be sustained through linked data grad students.
> * Less flexibility wrt to analytics, querying and aggregation. SPARQL not so 
> hot.

Did you try Virtuoso? Seriously.
It provides out of the box common aggregates and is highly extensible.
You basically have a development platform at your disposal.

> * Good chance of chewing up the UI, much harder to implement editing.

Definitely hard. This is something I hope will be alleviated once we
start getting more demos into the wild. But, take note: the Active
Record + MVC pattern works. This is not as alien as it seems.

SPARQL also removes the "joins", as some of the major NoSQL
offerings do. I find it terribly easy to create UIs over RDF, but I
have been doing it for a while already.

> I normally enjoy learning new stuff. This is just painful. Most of the above 
> points are probably based on my ignorance, but it really shouldn't take a PhD 

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread William Waites
... on the plus side, Friedrich wrote:

] * Lots of coolness, sucking up to linked data people.

I don't see these as particularly good things in themselves. The
solutions have to be obviously technically sound and convenient to
use. Drinking the kool-aid is not helpful.

* [2010-11-24 08:05:08 -0500] Kingsley Idehen wrote:
] Is your data available as a dump?

UK data for 2009 that I made is available at:

But this was done more or less by hand, and repurposing the CSV ->
SDMX scripts (this was done before QB became best practice) is not
easy. Still, from a modeling perspective they might be a good starting
point.

But having to ask a question in the right place and the answer being a
good starting point is maybe different from doing a google search and
finding easy to follow recipes that can be immediately plugged into some
web app.

William Waites
9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664

Re: FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread Kingsley Idehen

On 11/24/10 7:47 AM, William Waites wrote:

Friedrich, I'm forwarding your message to one of the W3 lists.

Some of your questions could be easily answered (e.g. for euro in your
context, you don't have a predicate for that, you have an Observation
with units of a currency and you could take the currency from
dbpedia, the predicate is "units").

But I think your concerns are quite valid generally and your
experience reflects that of most web site developers that encounter
RDF.

LOD list, Friedrich is a clueful developer, responsible for amongst other things. What can we
learn from this? How do we make this better?


- Forwarded message from Friedrich Lindenberg  -

From: Friedrich Lindenberg
Date: Wed, 24 Nov 2010 11:56:20 +0100
To: wdmmg-discuss
Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo

(reposting to list):

Hi all,

As an action from OGDCamp, Rufus and I agreed that we should resume porting 
WDMMG to RDF in order to make the data model more flexible and to allow a 
merger between WDMMG, OffenerHaushalt and similar other projects.

After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having 
written a long technical pro/con email before (that I assume contained nothing 
you don't already know), I think the net effect of using RDF would be the
following:

* Lots of coolness, sucking up to linked data people.
* Further research regarding knowledge representation.

vs.


* Unstable and outdated technological base. No triplestore I have seen so far 
seemed on par with MySQL 4.
* No freedom wrt to schema, instead modelling overhead. Spent 30 minutes trying to find a 
predicate for "Euro".
* Scares off developers. Invested 2 days researching this, which is how long it 
took me to implement OHs backend the first time around. Project would need to 
be sustained through linked data grad students.
* Less flexibility wrt to analytics, querying and aggregation. SPARQL not so
hot.
* Good chance of chewing up the UI, much harder to implement editing.

I normally enjoy learning new stuff. This is just painful. Most of the above 
points are probably based on my ignorance, but it really shouldn't take a PhD 
to process some gov spending tables.

I'll now start a mongo effort because I really think this should go schema-free 
+ I want to get stuff moving. If you can hold off loading Uganda and Israel for 
a week that would of course be very cool, we could then try to evaluate how far 
this went. Progress will be at:


wdmmg-discuss mailing list

- End forwarded message -


Which Triple or Quad stores have you tested? Your MySQL 4 assertion 
doesn't compute.

MySQL is a Relational Database, it doesn't compare to even a moderately 
capable RDF Triple or Quad store (Relational Property Graph Databases 
that support URI based Keys) when it comes to:

1. Heterogeneously sourced data
2. Disparately shaped data
3. Volatile Schema.
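
A toy illustration of the three points above, not a benchmark and not a
comparison with any particular database. It assumes two hypothetical
spending records with different fields and shows that both fit into one
set of triples without agreeing on a schema first. All names are made up.

```python
def record_to_triples(subject, record):
    """Flatten one dict-shaped record into (subject, predicate, object) triples."""
    return {(subject, "urn:foo:" + key, str(value))
            for key, value in record.items()}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in store
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# Two differently shaped records from two hypothetical sources.
uk = {"amount": 100, "department": "Health"}
de = {"amount": 250, "titel": "Gesundheit", "year": 2010}

store = record_to_triples("urn:foo:uk1", uk) | record_to_triples("urn:foo:de1", de)

# The shared "amount" predicate can be queried across both sources.
amounts = match(store, p="urn:foo:amount")
print(amounts)
```

Adding a new field to either source only adds triples; nothing already in
the store has to be altered.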

Is your data available as a dump?



Kingsley Idehen 
President & CEO
OpenLink Software
Twitter/ kidehen

FW: Failed to port datastore to RDF, will go Mongo

2010-11-24 Thread William Waites
Friedrich, I'm forwarding your message to one of the W3 lists.

Some of your questions could be easily answered (e.g. for euro in your
context, you don't have a predicate for that, you have an Observation
with units of a currency and you could take the currency from
dbpedia, the predicate is "units").
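
A small sketch of the modelling described above, written as N-Triples
generated from Python. The message only says the predicate is "units"; the
concrete URIs used here (the Data Cube Observation class, the SDMX
unitMeasure attribute and obsValue measure, and the DBpedia Euro resource)
are assumptions chosen for illustration, as are the function name and the
urn:foo:obs1 identifier.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed vocabulary terms; swap in whatever your model actually uses.
OBSERVATION = "http://purl.org/linked-data/cube#Observation"
VALUE = "http://purl.org/linked-data/sdmx/2009/measure#obsValue"
UNIT = "http://purl.org/linked-data/sdmx/2009/attribute#unitMeasure"
EURO = "http://dbpedia.org/resource/Euro"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def observation(uri, amount, unit=EURO):
    """Describe one amount as an Observation whose unit is a currency resource."""
    return [
        "<%s> <%s> <%s> ." % (uri, RDF_TYPE, OBSERVATION),
        '<%s> <%s> "%s"^^<%s> .' % (uri, VALUE, amount, XSD_DECIMAL),
        "<%s> <%s> <%s> ." % (uri, UNIT, unit),
    ]


lines = observation("urn:foo:obs1", "1250.00")
print("\n".join(lines))
```

The currency is a value of the unit predicate rather than a predicate of
its own, which is why searching for a "Euro" predicate leads nowhere.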

But I think your concerns are quite valid generally and your
experience reflects that of most web site developers that encounter
RDF.

LOD list, Friedrich is a clueful developer, responsible for amongst other things. What can we
learn from this? How do we make this better?


- Forwarded message from Friedrich Lindenberg  -

From: Friedrich Lindenberg 
Date: Wed, 24 Nov 2010 11:56:20 +0100
To: wdmmg-discuss 
Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo

(reposting to list):

Hi all, 

As an action from OGDCamp, Rufus and I agreed that we should resume porting 
WDMMG to RDF in order to make the data model more flexible and to allow a 
merger between WDMMG, OffenerHaushalt and similar other projects. 

After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having 
written a long technical pro/con email before (that I assume contained nothing 
you don't already know), I think the net effect of using RDF would be the
following:

* Lots of coolness, sucking up to linked data people.
* Further research regarding knowledge representation.

vs.


* Unstable and outdated technological base. No triplestore I have seen so far 
seemed on par with MySQL 4. 
* No freedom wrt to schema, instead modelling overhead. Spent 30 minutes trying 
to find a predicate for "Euro".
* Scares off developers. Invested 2 days researching this, which is how long it 
took me to implement OHs backend the first time around. Project would need to 
be sustained through linked data grad students.
* Less flexibility wrt to analytics, querying and aggregation. SPARQL not so
hot.
* Good chance of chewing up the UI, much harder to implement editing.

I normally enjoy learning new stuff. This is just painful. Most of the above 
points are probably based on my ignorance, but it really shouldn't take a PhD 
to process some gov spending tables. 

I'll now start a mongo effort because I really think this should go schema-free 
+ I want to get stuff moving. If you can hold off loading Uganda and Israel for 
a week that would of course be very cool, we could then try to evaluate how far 
this went. Progress will be at: 


wdmmg-discuss mailing list

- End forwarded message -

William Waites
9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664