Re: Content negotiation for Turtle files

2013-02-10 Thread Chris Beer
Hi all

While I promised a response, time is never my friend, despite best intentions.

+1 to Tim on crispness, and on a protocol. I note that the
content-negotiation error which was at the core of this discussion
hasn't really been talked about, and it was what I was planning to
comment on.

So, noting the latest in the discussion, I'll fast-track and suggest the
following as an interim measure (note: this is a drive-by comment, and
possibly at the risk of over-simplifying):

Couldn't a lot of this discussion be solved if a server is correctly
configured to return a 300 response in these cases? (300 Multiple Choices,
i.e. "there is more than one format available, Mr. Client - please choose
which one you'd like".)

We can't assume that clients or users will ask for something we have, or
ask for it in a correct manner, which is the reason 300 and the other
not-often-used responses exist.
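
A sketch of what such an exchange might look like (the URI and variants
here are hypothetical; the Alternates header follows the RFC 2295 style
that Apache's mod_negotiation emits):

GET /ontology HTTP/1.1
Host: example.org

HTTP/1.1 300 Multiple Choices
TCN: list
Vary: negotiate, accept
Alternates: {"ontology.ttl" 0.5 {type text/turtle}},
            {"ontology.rdf" 0.4 {type application/rdf+xml}}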

Cheers

Chris

 I feel we should be crisp about these things.
 It's not a question of thinking about what kinds of things tend
 to enhance interoperability; it is defining a protocol
 which 100% guarantees interoperability.

 Here are three distinct protocols which work,
 i.e. guarantee that each client can understand each server.

 A) Client accepts various formats including RDF/XML.
   Server provides various formats including RDF/XML.

 B) Client accepts various formats including RDF/XML AND turtle.
   Server provides various formats including either RDF/XML OR turtle.

 C) Client accepts various formats including turtle.
   Server provides various formats including turtle.
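
 (As Accept headers, those client profiles might read, for example:

 A: Accept: application/rdf+xml, text/html;q=0.5
 B: Accept: text/turtle, application/rdf+xml
 C: Accept: text/turtle, text/html;q=0.5

 where the extra types and q-values are illustrative only.)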

 These may never have been named.
 The RDF world in fact used A for a while, but the
 Linked Data Platform at last count was using C.
 Obviously B has its own advantages but I think that
 we need lightweight clients more than we need lightweight
 servers and so being able to build a client without an
 XML parser is valuable.

 Obviously there is a conservative middle ground D in
 which all clients and servers support both formats,
 which could be defined as a practical best practice,
 but we should have a name for, say, C.

 We should see whether the LDP group will define
 a word for compliance with C.  I hope so, and then
 we can all provide that and test for it.

 Tim

 On 2013-02-06, at 11:38, Leigh Dodds wrote:

 From an interoperability point of view, having a default format that
 clients can rely on is reasonable. Until now, RDF/XML has been the
 standardised format that we can all rely on, although shortly we may
 all collectively decide to prefer Turtle. So ensuring that RDF/XML is
 available seems like a reasonable thing for a validator to try and
 test for.

 But there are several ways that test could have been carried out. E.g.
 Vapour could have checked that there was an RDF/XML version and
 provided you with some reasons why that would be useful. Perhaps as a
 warning, rather than a fail.

 The explicit check for RDF/XML being available AND being the default
 preference of the server raises the bar slightly, but it's still
 aiming for interop.

 Personally, I think I'd implement this kind of check as: ensure there
 is at least one valid RDF serialisation available, either RDF/XML or
 Turtle. I wouldn't force a default on a server, particularly as we
 know that many clients can consume multiple formats.
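
 (Sketched at the HTTP level, such a check might be, illustratively:

 GET /ontology HTTP/1.1
 Host: example.org
 Accept: text/turtle, application/rdf+xml

 passing if the status is 200 and the Content-Type is text/turtle or
 application/rdf+xml. The URI and pass criteria here are hypothetical.)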

 This is where automated validation tools have to tread carefully:
 while they play an excellent role in encouraging consistency, the
 tests they perform and the feedback they give need to have some
 nuance.







Re: Content negotiation for Turtle files

2013-02-06 Thread Chris Beer
Bernard, Ivan

(At last! Something I can speak semi-authoritatively on ;P )

@ Bernard - no - there is no reason to go back if you do not want to, and
every reason to serve both formats plus more.

Your comment about UAs complaining about a content negotiation issue is
key to what you're trying to do here. I'd like to provide some clear
guidance or suggestions in return, but first, if possible, can you please
post the HTTP request headers for the four user agents (and any others you
have) you've used to attempt to request your rdf+xml files, and which have
either choked on or accepted the .ttl file. Extra points if you can also
post the server's response headers.

@ Ivan - while I wince a little at the trick, the question comes down to
the same thing: what is the HTTP response header that is sent back to the
client? I would be interested to see whether what you're doing ISN'T a
trick but is in fact a compliant way to approach this.

Personally, I think you shouldn't actually need to resort to using .var
(which is Apache-specific) when what is essentially a content negotiation
issue can simply be configured properly at the server level, so that a
single approach could be used by IIS, Apache, nginx etc.
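
For Apache, at least, a minimal sketch of that kind of server-level
configuration (the file names are hypothetical; it assumes
mod_negotiation is enabled):

# .htaccess: register the correct media type for each serialisation
AddType text/turtle .ttl
AddType application/rdf+xml .rdf
# with MultiViews, a request for /ontology is then negotiated between
# ontology.ttl and ontology.rdf using the client's Accept header
Options +MultiViews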

Look forward to the responses (excuse the pun)

Cheers

Chris

--

Chris Beer
Manager - Online Services
Department of Regional Australia, Local Government, Arts and Sport

 Bernard,

 (forget my W3C hat, I am not authoritative on Apache tricks, for
 example...)

 When I put up a vocabulary onto www.w3.org/ns/, for example, I publish it
 both in ttl and rdf/xml. Actually, we also publish the file in HTML+RDFa
 (which very often is the master copy; I convert it into ttl and rdf/xml
 before publishing). Additionally, we put a .var file there. This is the
 .var file for http://www.w3.org/ns/r2rml:

 r2rml.var
 -
 URI: r2rml

 URI: r2rml.html
 Content-Type: text/html

 URI: r2rml.rdf
 Content-Type: application/rdf+xml; qs=0.4

 URI: r2rml.ttl
 Content-Type: text/turtle; qs=0.5
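 # (qs expresses the server's relative preference among the variants: with
 # the values above, text/turtle wins over application/rdf+xml whenever the
 # client's own Accept header doesn't decide between them.)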

 That seems to work well; at least I have not heard complaints :-)

 One can do a further trick by adding .htaccess entries to convert, say,
 r2rml.html to r2rml.ttl on the fly; I did not do that, to reduce the load
 on our servers.

 There is somewhere a flag in the Apache configuration allowing Apache to
 handle these .var files; I am not sure it is there by default.
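
 (Presumably the directives in question are along these lines - a minimal
 sketch, assuming mod_negotiation is loaded:

 # treat .var files as type maps
 AddHandler type-map .var
 # let a request for the extensionless URI, e.g. /ns/r2rml, find r2rml.var
 Options +MultiViews

 - though whether they are on by default will vary by distribution.)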

 I hope this helps

 Ivan





 On Feb 6, 2013, at 24:49 , Bernard Vatant bernard.vat...@mondeca.com
 wrote:

 Hello all

 Back in 2006, I thought I had understood, with the help of folks around
 here, how to configure my server for content negotiation at lingvoj.org.
 Both vocabulary and instances were published in RDF/XML.

 I updated the ontology last week, and since, after years of happy living
 with RDF/XML, people eventually convinced me that it was a bad, prehistoric
 and ugly syntax, I decided to be trendy and published the new version in
 Turtle at http://www.lingvoj.org/ontology_v2.0.ttl

 The vocabulary URI is still the same: http://www.lingvoj.org/ontology,
 and the namespace http://www.lingvoj.org/ontology# (cool URIs don't
 change).

 Then I turned to Vapour to test this new publication, and found out that,
 to be happy with the vocabulary URI, it has to find some answer when
 requesting application/rdf+xml. But since I no longer have an RDF/XML file
 for this version, what should I do?
 I turned to the best practices document at
 http://www.w3.org/TR/swbp-vocab-pub, but it does not provide examples
 with Turtle, only RDF/XML.

 So I blindly put the following in the .htaccess: AddType
 application/rdf+xml .ttl
 I found it a completely stupid and dirty trick ... but amazingly it makes
 Vapour happy.

 But now Firefox chokes on http://www.lingvoj.org/ontology_v2.0.ttl
 because it seems to expect an XML file. Chrome does not have this issue.
 The LOV-Bot says there is a content negotiation issue and can't get the
 file. So does Parrot.
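
 (For comparison, the mapping that matches the file's actual syntax would
 presumably be:

 AddType text/turtle .ttl

 i.e. advertising the Turtle file under its registered media type rather
 than mislabelling it as RDF/XML, which is what XML-expecting parsers then
 choke on.)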

 I feel dumb, but I'm certainly not the only one: I've stumbled upon a
 certain number of vocabularies published in Turtle for which the conneg
 does not seem to be perfectly clear either.

 What am I missing, folks? Should I forget about it, and switch back to good
 ol' RDF/XML?

 Bernard

 --
 Bernard Vatant
 Vocabularies & Data Engineering
 Tel :  + 33 (0)9 71 48 84 59
 Skype : bernard.vatant
 Blog : the wheel and the hub
 
 Mondeca
 3 cité Nollez 75018 Paris, France
 www.mondeca.com
 Follow us on Twitter : @mondecanews
 --

 Meet us at Documation in Paris, March 20-21




 
 Ivan Herman, W3C Semantic Web Activity Lead
 Home: http://www.w3.org/People/Ivan/
 mobile: +31-641044153
 FOAF: http://www.ivan-herman.net/foaf.rdf











Re: Linked Data Book in Early Access Release

2012-12-05 Thread Chris Beer
snip

http://www.manning.com/dwood/ itself doesn't seem to have any Linked Data
 to consume ;)

Makes sense to me - if you know enough to look for LD resources at the
manning.com/dwood/ URI, you've just self-evaluated that you probably don't
need the book! :P

(Although it reminds me of the classic newspaper ad - "Can't read? Call this
number and we'll tell you about learning-to-read courses...")

- Chris




Re: CC Version 4.0 (and government data)

2012-07-09 Thread Chris Beer

Hi all

Very much agreed - it's somewhat of an honour to have been publicly corrected 
by a true expert in the field :) You'll also be happy to know, Anne, that I've 
since firmly entered the AusGOAL (ausgoal.gov.au for those reading) fold and 
have been working with Baden et al. on the CC 4.0 implications for data in the 
Australian context.

Cheers

Chris



Sent from Samsung Mobile

Bernadette Hyland bhyl...@3roundstones.com wrote:

Hi Anne,
As always, you are thorough!  Thank you for the detail on CC licenses for use 
by governments.  It is a topic that requires expertise that few can provide and 
therefore it only makes sense to leverage the excellent work you all have done. 

The URLs and some description, with proper attribution of course, will be 
folded into the Gov Linked Data Working Group's forthcoming Best Practices 
deliverable.  Thanks again and I'm so glad you are part of this effort.

Cheers,

Bernadette Hyland, co-chair 
W3C Government Linked Data Working Group
Charter: http://www.w3.org/2011/gld/

On Jul 5, 2012, at 9:47 PM, Anne Fitzgerald wrote:

Hi all
 
I thought it might be useful to post some clarifications on the points raised 
by Chris Beer in comments posted on 12 December 2011.
 
(1) The version 3.0 CC Australia licences ARE suitable for use on 
copyright-protected datasets, data compilations and databases.  If the dataset 
is not copyright-protected, the CC licences (which are based on the rights held 
by copyright owners) are unsuitable.  While copyright does not apply to mere 
facts or unoriginal data collections, there are many datasets, data 
compilations and databases that will qualify for copyright under the tests set 
out by the Australian courts in cases decided in 2010 and 2011.  A summary of 
the position is contained in my chapter (“Copyright”) in the recently-published 
book “Australian Media Law” 4th ed, Thomson Reuters, November 2011, or in our 
Guide “CC and Government”, available here: http://eprints.qut.edu.au/38364/
 
(2) CC licences are in fact being widely used on datasets and data collections 
by government agencies and educational and research institutions around 
Australia, ranging from the Australian Bureau of Statistics (www.abs.gov.au) 
and Geoscience Australia (www.ga.gov.au) at the federal government level, 
through to the Queensland Police Service 
(http://www.police.qld.gov.au/copyright.htm) and Brisbane City Council 
(http://data.brisbane.qld.gov.au/). The most widely used licence for data is CC 
BY.  (Please note that CC 0 is not used in Australia as it is not legally 
effective; all these government agencies are using CC BY as the default.)
 
(3) Based on wide-ranging consultations and feedback over the last several 
years, there is little interest in other, more complex licences such as ODbL.  
Reasons for this are that Australia does not recognise sui generis database 
rights and there is no discernible advocacy in favour of extending statutory 
database rights to factual data collections that are not sufficiently original 
to warrant copyright protection.  In the absence of a statutory database right, 
protection of non-copyright data collections would require parties to enter 
into a contractual arrangement to firstly, describe their respective rights and 
obligations and, secondly, to set out the consequences of breach of those 
obligations.
 
(4) The Australian legal position with respect to copyright in datasets, data 
compilations and databases is appropriately dealt with in the CC version 3.0 
Australia licences.  The revisions in version 4.0 are primarily directed at 
addressing the situation in Europe (and a few other countries, such as Korea) 
which recognise sui generis database rights; version 3.0 is based on copyright 
interests but does not deal with the licensing of database rights that may 
exist in the same material to which the CC licence is applied.
 
(5) There has been little interest in Australia in the development of licences 
based on rights (such as sui generis database rights) that do not exist under 
Australian law.  As the Creative Commons licences (up to and including version 
3.0) have been “ported” so they are effective under the laws existing in 
individual jurisdictions (countries) where the licence is applied, unless and 
until a truly “international” licence is developed it is inappropriate to 
mention (and even more inappropriate to purport to grant a licence over) 
rights that are not recognised at all under that country’s laws.  In countries 
which do recognise sui generis database rights there has, of course, been 
extensive consideration of those rights, and their operation has been examined 
in several important cases in the UK and Europe in recent years.
 
(6) There is now considerable experience with using CC licences in the 
Australian public sector as well as in education and publicly funded research.  
This is also increasingly the case worldwide as national and local authorities 
develop data.gov

Fwd: Re: CC Version 4.0 (and government data)

2011-12-11 Thread Chris Beer
...And this time to the list and not just Sandro...

Sent from Samsung Mobile

 Original message 
Subject: Re: CC Version 4.0 (and government data) 
From: Chris Beer ch...@codex.net.au 
To: san...@w3.org 
CC:  

Thanks Sandro

This is of immediate interest here in Australia, where at a very recent (last 
week) federal-level meeting concerning a WoG licensing framework I raised, with 
general acknowledgement from others, that CC 3.0 was unsuitable for data, and 
that CC itself said as much.

My suggestion then was that the ODbL should be actively considered as the 
suitable third critical part of an open licence triumvirate formed by CC for 
objects, GPL/BSD for software, and ODbL for object containers, noting for 
instance the most common scenario wherein the displayed results of a query are 
considered a derivative work where the database or dataset is CC licenced.

This new CC 4.0 development does appear to change things. My questions to the 
list are:

a) how much has been invested by Gov / Academia / Orgs anywhere or at any level 
in ODbL?

b) how much has been invested by Gov / Academia / Orgs anywhere or at any 
level in CC with datasets, databases or datacubes, and has suitability been an 
issue?

and 

c) to anyone's knowledge, has CC 3.0 (or earlier) on data, datasets/bases/cubes 
been tested in court in a real copyright/left case (preferably with Gov as 
plaintiff)?

Cheers

Chris Beer
Australia

Sent from Samsung Mobile


Re: New Open Government Platform code released

2011-12-09 Thread Chris Beer
Hi Jeanne

a) Is there a dedicated group, list or contact for the dev cycle? (alpha, beta, 
RC, rel). Feel free to send details direct and I'll talk off list with a view 
to getting the .gov.au Drupal/open data community into the loop as an ongoing 
exercise.


b) erewhon.gov (I originally called it example.gov or something) was always a 
nice idea. But it is probably far easier for every gov to make use of 
handles/PIDs, RDF, standard domains etc. to make an example.any.data.gov(.*) 
which acts as a community cloud with localisation. Anyway, while off-topic a 
little, it would seem there exists a strong business case for an agreement 
between W3, ICANN, IETF, the UN and associated groups such as ISO, OGC, 
EU, G20 etc., to establish a gTLD along the lines of .* (obviously it would need 
to be something other than the actual wildcard character for DNS reasons). This 
would allow for a variety of test beds and sandboxes to be used in cross 
org/state developments such as a true 'data.gov.*' for instance, or 
'alpha.semweb.*', or even health.gov.* etc. (It also seems logical in a SemWeb 
and e-gov sense - simple triple lists etc. of all relevant sites/services in an 
internationally accepted domain - e.g. hospital.gov.* should just redirect you 
to the closest Government's hospital and health directory, space.gov.* should 
take you to the closest space agency website, etc.)

Pie in the sky maybe, but certainly there appears to be a need. And the timing 
is right with new gTLD applications being taken as of Jan. 2012. As it would be 
a high level agreement, it could be done at little to no cost as it is in the 
public interest (as opposed to the $185k being charged to business).

(example.data.gov for our needs here would always have been ideal, bar for the 
de facto position which saw the US take over .gov rather than using .gov.us as 
they really should have under the standard.)

Interested in what others think. Is this something all of our standards 
communities/UN/international orgs should be looking to implement? If so, how do 
we propose/drive it, and what would be the next step? We are probably a little 
too far down the chain to really influence such a thing, but...

Cheers

Chris Beer
Australian .gov.au IT type (with all opinions my own of course)

Sent from Samsung Mobile


 Holm, Jeanne M (1760) jeanne.m.h...@jpl.nasa.gov wrote: 

Thanks for the feedback, Gannon.  I really appreciate it.

As you know, this kind of feedback is critical to us being able to iterate with 
you and the broad community on getting this from Alpha to Beta.  We've gotten 
some other feedback as well and are working on some fixes.  I'll check out 
erewhon.gov…

Thanks!

--Jeanne

**
Jeanne Holm
Evangelist, Data.gov
U.S. General Services Administration
Cell: (818) 434-5037
Twitter/Facebook/LinkedIn: JeanneHolm
**

From: Gannon Dick gannon_d...@yahoo.com
Reply-To: Gannon Dick gannon_d...@yahoo.com
Date: Fri, 9 Dec 2011 14:09:24 -0800
To: ch...@codex.net.au ch...@codex.net.au, Jeanne Holm 
jeanne.m.h...@jpl.nasa.gov
Subject: Re: New Open Government Platform code released

After 20 hrs I succeeded in getting a minimal Drupal 7.10 installation working 
on the loop-back of an old laptop.  I started from scratch, as it were, 
installing Linux (desktop), then Apache, etc.  Nonetheless, a little boy named 
Ed S. is getting coal in his stocking.  I'm still not sure what the symlink to 
the 'dgib_dms' directory is all about.  I think that directory is created on 
installation.  How did you come out, Chris?  If I could make a suggestion, 
Jeanne ... a test deployment, refreshed overnight, would be real handy.  May I 
suggest www.erewhon.gov, and you should list it as a deployment in the 
Community Directory as well.

--Gannon  

From: Chris Beer ch...@codex.net.au
To: Holm, Jeanne M (1760) jeanne.m.h...@jpl.nasa.gov 
Cc: public-egov...@w3.org public-egov...@w3.org 
Sent: Thursday, December 8, 2011 12:58 AM
Subject: Re: New Open Government Platform code released

OMG OMG OMG OMG OMG!!!

Something something Dark Side... something something something Data Management 
and Repository Software Requirements complete...

Thank you for letting us know! I know what I'm installing tonight...

Cheers

Chris



4th Australian Metadata Conference - 2011 - Call for Presentations and Case Studies

2011-02-25 Thread Chris Beer


Speaker reimbursement

- The conference is being run on a cost recovery basis to keep costs as 
low as possible for delegates
- Presenters will be required to register for the conference but the 
registration fee will be waived
- There may be some possibility to reimburse travel and accommodation 
costs for presenters outside of the ACT (within Australia only).  For 
further information on this please contact i...@metalounge.org


Sponsorships

- Limited sponsorship opportunities are available for this event. 
Further details are available here or contact us at 
i...@metalounge.org


Critical Timeline

- Deadline for proposals:  19th March, 2011
- Notification of acceptance:  2nd April, 2011
- Deadline for final abstracts and bios:  29th April, 2011
- Deadline for final presentation slides and related documents:  13th 
May, 2011
- Conference:  Wednesday 25th - Friday 27th May, 2011

Institute of Metadata Management Committee

- Lisa Baldwin, Fuji Xerox Australia
- Oliver Bell, Microsoft Australia
- Michele Berkhout, Digital Brand (links to online profiles)
- David Bromage, National Archives of Australia
- Karen Dexter, Department of Defence
- Terry Hanisch, Department of Finance and Deregulation
- Anni Rowland-Campbell, Digital Brand
- Mel Taylor, Australian Institute of Health and Welfare Services
- Simon Wall, Australian Bureau of Statistics
- Chris Beer, National Occupational Licensing Authority

The conference is being organised by Digital Brand Pty Ltd on behalf of 
the Institute of Metadata Management.

--
Chris Beer
Invited Expert (Public Member), W3 eGovernment Interest Group & W3-WAI 
WCAG Working Group

EM: ch...@e-beer.net.au
TW: @zBeer http://www.twitter.com/zBeer
LI: http://au.linkedin.com/in/zbeer/


Re: Organization ontology

2010-06-01 Thread Chris Beer

Good point!

Sent from my iPhone

On 02/06/2010, at 15:06, Stuart A. Yeates syea...@gmail.com wrote:


On Tue, Jun 1, 2010 at 7:50 PM, Dave Reynolds
dave.e.reyno...@googlemail.com wrote:

We would like to announce the availability of an ontology for
description of organizational structures including government
organizations.

This was motivated by the needs of the data.gov.uk project. After some
checking we were unable to find an existing ontology that precisely met
our needs and so developed this generic core, intended to be extensible
to particular domains of use.

[1] http://www.epimorphics.com/public/vocabulary/org.html


I think this is great, but I'm a little worried that a number of
Western (and specifically Westminster) assumptions may have been
built into it.

What would be great would be to see a handful of different
organisations (or portions of them) from different traditions
modelled. Maybe:
* The tripartite system at the top of US government, which seems
pretty complex to me, with former Presidents apparently retaining some
control after they leave office
* The governance model of the Vatican City and Catholic Church
* The Asian royalty model, in which an informal royalty commonly
appears to sit above a formal constitution

cheers
stuart





Re: Organization ontology

2010-06-01 Thread Chris Beer

Cool! Let me know when that's ready. End of the week ok? ;P lol

Sent from my iPhone

On 02/06/2010, at 15:47, Mike Norton xsideofparad...@yahoo.com wrote:

Or, in the U.S. we could just partition a new web with top level  
domains reflective of the agencies and departments financed by our  
tax dollars.  Open Gov!


Michael A. Norton



From: Chris Beer ch...@e-beer.net.au
To: Stuart A. Yeates syea...@gmail.com
Cc: Dave Reynolds dave.e.reyno...@googlemail.com; Linked Data community 
public-lod@w3.org; public-egov...@w3.org
Sent: Tue, June 1, 2010 10:22:12 PM
Subject: Re: Organization ontology

snip





Re: [agenda] eGov IG Call, 25 Nov 2009, item 6

2009-11-26 Thread Chris Beer

Thanks for the reply Kingsley.

Kingsley Idehen wrote:

 Chris Beer wrote:

  I think Thomas makes some excellent points.

  Is it possible as a group to agree on something akin to the following?

  1) Open Data refers to how data is accessed and is primarily a
  political/policy consideration

 Structured Data based on industry standard data representation
 formats. Just as UNIX came down to POSIX. Ditto the Internet re. TCP/IP.
 Openness is about Standards, and has nothing to do with politics or
 philosophy.

 You can institute policies that mandate the use of industry standard
 data formats re. data placed in the public domain or simply published
 for reuse by others.

  2) Linked (Open) Data refers to how data is structured and delivered
  and is primarily a technological/standards consideration

 To be precise: HTTP-based Linked Open Data. This is about the
 incorporation of HTTP scheme Identifiers into data that has been
 published using a standard data representation format.

 Note: to get data into any standard data representation format there
 has to be a formal data model. At the most basic, said model takes the
 form: Entity-Attribute-Value. In the case of Linked Open Data, you
 have the intersection of the following:

 1. EAV model
 2. Standard Data Formats
 3. HTTP scheme Identifiers (HTTP URIs)
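
 (As a concrete illustration of that intersection - a single
 Entity-Attribute-Value statement written in Turtle, with a hypothetical
 example.org entity URI:

 <http://example.org/agency/abs> <http://www.w3.org/2000/01/rdf-schema#label> "Australian Bureau of Statistics" .

 entity, attribute and value, with the entity and attribute named by
 HTTP URIs in a standard data format.)
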
What I was in fact suggesting here is that we clearly define the 
difference between Data being Open as in the access and policy surrounding 
it - the political/philosophical side of the coin - and Data being Open 
as in Standards and, as you put it better, structured data - the 
technical side of the coin. The semantics surrounding the two are 
important: to date we have basically said in the e-Gov IG "Let's make Open 
(Standard) Data Open (to the Public)" - anyone coming in with no 
background knowledge, potentially such as those working in policy 
from a non-IT background (the audience covered by the initial Working Draft 
http://www.w3.org/TR/gov-data/'s "To: Any government wishing to 
set-up data.gov.*" (wiki version)), is simply going to start to find it 
confusing. We have discussions on defining open data that are centered 
around the access/policy question, and we have discussions on Linked 
Open Gov Data that are veering into the technical. For the good of all 
involved, I feel we need to define, set, and stick to some basic 
terminology that doesn't confuse the two.
  3) The majority of datasets, LOD or not, that are of real value are
  developed, maintained and delivered by Government, like it or not. We
  know this without even looking at the LOD Project's work
  http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData#head-277d7f68544ce1a9e252f5c0080b6402cd983a49
  (which, interestingly, contains very little Government data - a
  worry, as it possibly indicates that Governments just AREN'T getting
  on board with early take-up of LOD, despite the various legal
  requirements coming out worldwide).

 How have you arrived at the above, bearing in mind the pivotal role of
 DBpedia? Basically, this is about a Linked Open Data Space derived
 from Wikipedia snapshots which have little or no Govt. data. Of
 course, things get much better across the depth, quality, and link
 density dimensions when Govt. data is cross-linked with LOD spaces
 like DBpedia etc.
Quite simply: it is Government that conducts the majority of hard 
statistical research and collates data. DBpedia, or indeed any other 
commercial enterprise, including Academia, does not equal the sum total 
of Linked Open Data. They do indeed provide a pivotal and valuable 
service - but only in the sense that Google does with searches. They are 
a reseller of Data in that sense - but Government is, and will remain 
for a long time, the primary producer of raw datasets.


If there were huge chunks of Government Datasets floating around in the 
public domain waiting to be linked, it would have been done by someone 
already, and we wouldn't be having this discussion. As Thomas points 
out: "we have tons of government data with a legal obligation to make 
them available to the public (at least in Europe, and especially 
environmental data), and we are looking for means to do so in the most 
efficient way." While he is referring to the technical aspect here, the 
inference and reality to us all is clear. Government datasets are a 
small percentage of what is openly available and being linked, primarily 
due to access, which has much to do with issues that Government 
considers important, such as politics, provenance, authority and trust. 
I am not counting resellers in this, as that in itself raises further 
issues about why some organisations have access to this first-hand data, 
and why the man on the street often doesn't.



4) We accept that Linked (Open) Data is the purview of the Linking 
Open Data W3C Project - there is probably little we can add to the 
discussion here apart from supporting them in their own work of IDing

Re: [agenda] eGov IG Call, 25 Nov 2009, item 6

2009-11-25 Thread Chris Beer

I think Thomas makes some excellent points.

Is it possible as a group to agree on something akin to the following?

1) Open Data refers to how data is accessed and is primarily a 
political/policy consideration
2) Linked (Open) Data refers to how data is structured and delivered and 
is primarily a technological/standards consideration
3) The majority of datasets, LOD or not, that are of real value are 
developed, maintained and delivered by Government, like it or not. We 
know this without even looking at the LOD Project's work 
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData#head-277d7f68544ce1a9e252f5c0080b6402cd983a49 
(which, interestingly, contains very little Government data - a worry, 
as it possibly indicates that Governments just AREN'T getting on 
board with early take-up of LOD, despite the various legal requirements 
coming out worldwide).


4) We accept that Linked (Open) Data is the purview of the Linking Open 
Data W3C Project - there is probably little we can add to the discussion 
here apart from supporting them in their own work of IDing datasets that 
can be linked.


In support of this point, e-Government will be like any other entity in 
this regard, and the methodologies for delivering LOD will not likely 
differ from those of the rest of the world or society, much as there is 
little difference in Web Content Delivery between Government models and 
Commercial/Public models. In that sense I agree with Thomas 100% when it 
comes to a technology model. It will be Semantic, and RDF is likely to 
become the dominant paradigm, if not the only one.


5) Open Data, therefore, is what we SHOULD be focused on - not in the 
sense of forcing a standard on Gov in terms of Open Data Delivery 
policy, but in Education and Outreach.


The question of non-RDF data consumers is almost moot. Given the time 
scales we are operating on, it is akin to asking, at the start of the 
first version of HTML, how hyperlinked content supports .txt-based 
users such as BBS systems. Non-semantic, non-RDF, pre-HTML5 browsers 
and technologies will be legacy before we know it, probably while we are 
still discussing all this. I mean it.


This leaves us with two outcomes. The first is that the current user 
base that Thomas identifies as professional RDF consumers will 
inevitably drive the conversion of their suppliers' data into RDF/XML 
formats, essentially as a snowball effect. GIS Data is a good example of 
where this is already happening.


The second is that, as Thomas says, human-readable formats HAVE to be 
provided - ultimately the user is human, and the transition on the tech 
side between how the machine reads it and how it is displayed to the 
user in a usable, displayable form should be seamless. Ultimately the 
user should not even realise that they are doing anything but looking at 
a web page of results that they have asked a server for.


This is where I do disagree with Thomas. A Federation of providers is a 
nice concept, but it is too far off to think about, and will be 
inevitable in the end so probably doesn't need to be focused on. I 
believe that the key to overcoming the mistrust issue is three-fold:


a) Focusing on educating Governments on WOG methodologies in adopting 
inter-agency delivery on a National level - i.e. promote the creation of 
the data.gov.* model. The international model is far too scary a prospect 
for most Governments to contemplate.

b) Educating Government on the ROI in making Data open to the public
c) Educating Government in ways in which clearly marked-off data spaces 
with a trusted provenance can still mean open data delivery for all - 
essentially this already happens whenever data is published, even in an 
HTML/PDF format - having data in the public domain does not mean giving 
access to the original uncorrupted dataset.


Just some thoughts.

Cheers

Chris

Thomas Bandholtz wrote:

There has been much discussion about *Open* Data in the eGov list these
days, which is a rather political question.
I am currently not so much concerned about openness, more about *Linked*
Data, as we have tons of government data with a legal obligation to make
them available to the public (at least in Europe, and especially
environmental data), and we are looking for means to do so in the most
efficient way.

So, among the six items of today's agenda, I find number 6 the most
challenging:
  

6. Discussion: Government Linked Data, Techniques and Technologies
[35min]


some considerations:
  

+ how does linked data support (non-RDF) data consumers?


First of all: Linked Data supports RDF data consumers.

Human-readable formats should also be provided, based on content
negotiation. Some providers have dedicated HTML formats, others have
not. Those who haven't depend on some available general-purpose linked
data browser.
The latest discussion about the state of such tools has been started by