Re: Survey: Use of this list for Calls for Papers

2016-03-31 Thread John Erickson
+1 to stuff like "[CfP]" (and also "[RFP]," etc) added to Subject lines.

+100 to people refraining from submitting bogusly long [CfP]s. These
simply AREN'T necessary! [CfP] emails should be limited to very brief
summaries of the CfP --- possibly "structured," but with very few
elements, and a LINK.

There's this thing called "The Web" with which the bogusly-long
content can be presented in full, and "linked"-to by this thing called
a "URL..."

ALSO: Phil, the max size of allowed emails can be set to something
like 10K bytes. That would stop all sorts of badness, including
overly-copied discussion threads.

-- 
John S. Erickson, Ph.D.
Director of Operations, The Rensselaer IDEA
Deputy Director, Web Science Research Center (RPI)
 
Twitter & Skype: olyerickson



Re: Algorithm evaluation on the complete LOD cloud?

2015-04-23 Thread John Erickson
I'm somewhat amused by the idea of a "complete" LOD cloud... ;)

On Thu, Apr 23, 2015 at 7:21 AM, Laurens Rietveld
laurens.rietv...@vu.nl wrote:
 Hi all,

 I'm doing some research on evaluating algorithms on the complete LOD cloud
 (via http://lodlaundromat.org), and am looking for existing papers and
 algorithms to evaluate

 The criteria for such an algorithm are:

 It should be open source
 Domain independent
 No dependency on third data sources, such as query logs or a gold standard
 No particular hardware dependencies (e.g. a cluster)
 The algorithm should take a dataset as input, and produce results as output

 Many thanks in advance for any suggestions
 Best, Laurens


 --

 VU University Amsterdam

 Faculty of Exact Sciences

 Department of Computer Science

 De Boelelaan 1081 A

 1081 HV Amsterdam

 The Netherlands

 www.laurensrietveld.nl

 laurens.rietv...@vu.nl

 Visiting address:

 De Boelelaan 1081

 Science Building Room T312



-- 
John S. Erickson, Ph.D.
Director of Operations, The Rensselaer IDEA
Deputy Director, Web Science Research Center (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Microsoft OLE

2014-12-16 Thread John Erickson
This is not crazy, just a bit insane ;)

OLE structured storage is analogous to XML; indeed, it is in many ways
a binary precursor to XML. Your biggest problem will be that it
represents structure and not relationships; converting information
packed in OLE-SS into sensible RDF presents the same challenges as
converting an arbitrary XML document. Yuck...

To do it you need a target information model for generating the RDF.
You need to ask if you really must convert all of the data or simply
the metadata (i.e., the properties). If you only (or, at least, at
first) focus on the metadata, there is potentially much you can do.
Also, since Office structures are well known, you might make some
progress into the data itself. But yuck...
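
To make the metadata-first idea concrete, here's a minimal sketch of a
target model --- hypothetical document URI and values --- mapping OLE
summary-information properties onto Dublin Core terms in Turtle:

@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical mapping of OLE SummaryInformation fields
<http://example.org/docs/report.doc>
    dct:title   "Q3 Report" ;                          # PIDSI_TITLE
    dct:creator "J. Smith" ;                           # PIDSI_AUTHOR
    dct:created "2014-10-01T09:00:00Z"^^xsd:dateTime . # PIDSI_CREATE_DTM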

Why not just save out the documents as Office XML and convert that
way? Still painful, but there are plenty of code examples (cf.
LibreOffice).

John

On Mon, Dec 15, 2014 at 6:27 PM, Hugh Glaser h...@glasers.org wrote:
 Thanks Paul,
 On 15 Dec 2014, at 19:07, Paul Houle ontolo...@gmail.com wrote:

 Most Windows programmers would instantiate OLE objects in the applications 
 and query them to get results;
 Ah, the first problem - I'm not a Windows programmer :-)
 In fact, I want to access OLE-published stuff without any need to have 
 knowledge of Windows at all.
 http resolution to IIS or Apache running on the Windows machine seemed like a 
 good choice.
 commonly people write XML or JSON APIs,  but writing RDF wouldn't be too 
 different.

 The next step up is to have a theory that converts OLE data structures to 
 and from RDF either in general or in a specific case with help from a 
 schema.  Microsoft invested a lot in making SOAP work well with OLE,  so you 
 might do best with a SOAP to RDF mapping.
 So yes - a service that did some mapping from the retrieved OLE data 
 structure to RDF; and a general one was what I was thinking of.

 The incoming URI would be interpretable as an OLE data object (I guess with 
 some server config), which then got fetched and converted to RDF.
 In fact, it seems an obvious way of exposing Word docs, Excel spreadsheets 
 and even Access DBs live, but there is probably some stuff I don't understand 
 that means it is crazy.

 I suspect the silence (except you and Barry) means that this isn't something 
 anyone has done, at least yet.

 Best
 Hugh

 This caught my eye though,  because I've been looking at the relationships 
 between RDF and OMG,  a distant outpost of standardization.  You can find 
 competitive products on the market,  one based on UML and another based on 
 RDF, OWL, SKOS and so forth.  The products do more or less the same thing,  
 but described in such different language and vocabulary that it's hard to 
 believe that they compete for any sales.

 There is lots of interesting stuff there,  but the big theme is ISO Common 
 logic,  which adds higher-arity predicates and a foundation for inference 
 that people will actually want to use.  It's not hard to convince the 
 enterprise that first-order-logic is ready for the big time because banks 
 and larger corporations all use FOL-based systems on production rules to 
 automate decisions.



 On Sat, Dec 13, 2014 at 7:30 AM, Hugh Glaser h...@glasers.org wrote:
 Anyone know of any work around exposing OLE linked objects as RDF?
 I could envisage a proxy that gave me URIs and metadata for embedded objects.

 Is that even a sensible question? :-)

 --
 Hugh Glaser
20 Portchester Rise
Eastleigh
SO50 4QS
 Mobile: +44 75 9533 4155, Home: +44 23 8061 5652





 --
 Paul Houle
 Expert on Freebase, DBpedia, Hadoop and RDF
 (607) 539 6254paul.houle on Skype   ontolo...@gmail.com
 http://legalentityidentifier.info/lei/lookup

 --
 Hugh Glaser
20 Portchester Rise
Eastleigh
SO50 4QS
 Mobile: +44 75 9533 4155, Home: +44 23 8061 5652






-- 
John S. Erickson, Ph.D.
Deputy Director, Web Science Research Center
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: scientific publishing process (was Re: Cost and access)

2014-10-06 Thread John Erickson
This is an incredibly rich and interesting conversation. I think there
are two separate themes:
1. What is required and/or asked-for by the conference organizers...
a. ...that is needed for the review process
b. ...that is needed to implement value-added services for the conference
c. ...that contributes to the body of work

2. What is required and/or asked for by the publisher?

All of (1) is about the meat of the contributions, including establishing
a long-term legacy. (2) is about (presumably) prestigious output.

What added services could a tool like EasyChair provide that would go
beyond 1.a. and contribute to 1.b. and 1.c., etc.? Are there any
EasyChair committers watching this thread? ;)

John

On Mon, Oct 6, 2014 at 11:17 AM, Kingsley Idehen kide...@openlinksw.com
wrote:

  On 10/6/14 10:25 AM, Paul Houle wrote:

 Frankly I don't see the reason for the hate on PDF files.

  I do a lot of reading on a tablet these days because I can take it to
 the gym or on a walk or in the car.  Network reliability is not universal
 when I leave the house (even if I had a $10 a GB LTE plan) so downloaded
 PDFs are my document format of choice.

  There might be a lot of hypothetical problems with PDFs,  and I am sure
 there is a better way to view files on a small screen,  but practically I
 have no trouble reading papers from arXiv.org,  books from oreilly.com,
 be these produced by TeX-derived or Word-derived toolchains or a
 toolchain that involves a real page layout tool for that matter.


 Paul,

 As I see it, the issue here is more to do with PDF being the only option,
 rather than no PDFs at all. Put differently, we are not using our
 horses-for-courses technology (the Web that emerges from AWWW exploitation) to
 produce horses-for-courses conference artifacts. Instead, we continue to
 impose (overtly or covertly) specific options that are contradictory, and
 of diminishing value.

 Conferences (associated with themes like Semantic Web and Linked Open
 Data) should accept submissions that provide open access to relevant
 research data. In a sense, imagine if PDFs were submitted without
 bibliographic references. Basically, that's what's happening here with
 research data circa 2014, where we have a functioning Web of Linked (Open)
 Data, which is based on AWWW.

 Loosely coupling the print-friendly documents (PDFs, LaTeX, etc.),
 http-browser-friendly documents (HTML), and actual raw data references
 (which take the form of 5-Star Linked Open Data) is a practical starting
 point. Adding experiment workflow (which is also becoming the norm in the
 bioinformatics realm) is a nice bonus, as already demonstrated by examples
 provided by Hugh Glaser (see: this weekend's thread).

 Kingsley






 On Sun, Oct 5, 2014 at 5:43 PM, Mark Diggory mdigg...@atmire.com wrote:


 On Sun, Oct 5, 2014 at 2:39 PM, Mark Diggory mdigg...@atmire.com wrote:

 Hello Community,

  On Sun, Oct 5, 2014 at 1:19 PM, Luca Matteis lmatt...@gmail.com
 wrote:

 On Sun, Oct 5, 2014 at 4:34 PM, Ivan Herman i...@w3.org wrote:
  The real problem is still the missing tooling. Authors, even if
 technically savvy like this community, want to do what they set out to do:
 write their papers as quickly as possible. They do not want to spend their
 time going through some esoteric CSS massaging, for example. Let us face
 it: we are not yet there. The tools for authoring are still very poor.

 But are they still very poor? I mean, I think there are more tools for
 rendering HTML than there are for rendering Latex. In fact there are
 probably more tools for rendering HTML than anything else out there,
 because HTML is used more than anything else. Because HTML powers the
 Web!


 You can write in Word, and export in HTML. You can write in Markdown
 and export in HTML. You can probably write in Latex and export in HTML
 as well :)


 The tools are not the problem. The problem to me is the printing
 afterwards. Conferences/workshops need to print the publications.
 Printing consistent Latex/PDF templates is a lot easier than printing
 inconsistent (layout-wise) HTML pages.


   There are tools; for example, there's already a bit of work to provide
 a plugin for semantic markup in Microsoft Word (
 https://ucsdbiolit.codeplex.com/) and similar efforts on the Latex side
 (https://trac.kwarc.info/sTeX/)

  But, this is not a question of technology available to authors, but of
 requirements defined by publishers. If authors are too busy for this
 effort, then publishers facilitate that added value when it is in their
 best interest.

 For example, PLoS has published format guidelines using Word and Latex
 (http://www.plosone.org/static/guidelines), a workflow for semantically
 structuring their resulting output and their final output is well
 structured and available in XML based on a known standard (
 http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd), PDF and
 the published HTML on their website (
 

Re: Sharing George Thomas' legacy with colleagues & scholars

2014-09-23 Thread John Erickson
Thanks very much for this update, Bernadette!

It is exciting to see George's growing legacy in the community his
wisdom, hard work and energy helped create!

John

On Tue, Sep 23, 2014 at 11:17 AM, Bernadette Hyland
bhyl...@3roundstones.com wrote:
 Hi,
 Some of you worked with George Thomas, my fellow W3C Government Linked Data
 Working Group co-chair (2011-2012) and my friend, so I wanted to share a
 good news story & give a progress update.

 In June, a public announcement at the US National Health Datapalooza
 (Washington DC) was made to honor George's work in the field of health data
 science, specifically in relation to publishing Linked Data on human health.
 A generous post-graduate fellowship in health data science & engineering was
 made to honor the life & legacy of George.[1]  George pioneered the site
 healthdata.gov (not to be confused with the site healthcare.gov).  For
 years, George tirelessly spearheaded efforts to make healthcare & life
 sciences more discoverable on the Web & more usable for us all.[2]

 George literally taught several US Government executives the words "open" and
 "machine readable" until they became embodied in a Presidential mandate.[3]
 This week, the inaugural George Thomas Post-Graduate Fellowship was awarded
 to Kevin Emmett, a doctoral student in the Department of Physics at
 Columbia University.[4] In the open letter
 below, George's wife, Suzanne Thomas (copied), congratulates Kevin (copied)
 and reflects on the influence of George's work on the next generation of data
 scientists and engineers.[5]

 Thanks to all of you who helped make this happen & who publish & consume
 open & machine readable information on human health.

 Cheers,

 Bernadette Hyland
 CEO, 3 Round Stones, Inc.

 http://3roundstones.com
 http://about.me/bernadettehyland


 [1]
 http://healthdatapalooza.org/health-data-consortium-announces-memorial-health-data-sciences-fellowship/

 [2] http://healthdata.gov/

 [3]
 http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf

 [4] https://systemsbiology.columbia.edu/people/kevin-emmett

 [5] http://www.healthdataconsortium.org/blog_home.asp?Display=100




-- 
John S. Erickson, Ph.D.
Deputy Director, Web Science Research Center
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: ORCID as Linked Data

2014-06-17 Thread John Erickson
I agree with Leigh, this is a great addition.

Our team is working on a VIVO extension (http://vivoweb.org) that will
do bibliographic RDF import based on DOIs using CrossRef's linked data
access capability. Now we'll take a look at a similar capability for
ORCID identifiers!
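
For reference, the CrossRef capability is plain HTTP content negotiation
against a DOI; a sketch using CrossRef's well-known test DOI (the exact
set of supported formats varies):

curl -L -H "Accept: text/turtle" http://dx.doi.org/10.5555/12345678

The hope is that the same Accept-header pattern will work against ORCID
URIs.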

John

On Tue, Jun 17, 2014 at 10:00 AM, Leigh Dodds le...@ldodds.com wrote:
 I discovered this today:

 curl -v -L -H "Accept: text/turtle" http://orcid.org/-0003-0837-2362

 A fairly new addition to the ORCID service I think.

 With many DOIs already supporting Linked Data views, this makes a nice
 addition to the academic linked data landscape.

 Still lots of room for improvement, but definitely a step forwards.

 Cheers,

 L.

 --
 Leigh Dodds
 Freelance Technologist
 Open Data, Linked Data Geek
 t: @ldodds
 w: ldodds.com
 e: le...@ldodds.com




-- 
John S. Erickson, Ph.D.
Deputy Director, Web Science Research Center
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: What happened to my trusted Turtle validator and converter?

2014-05-06 Thread John Erickson
I believe Gregg meant: http://linter.structured-data.org/

On Tue, May 6, 2014 at 11:45 AM, Gregg Kellogg gr...@greggkellogg.net wrote:
 Phil, if you could add my distiller (http://rdf.greggkellogg.net/distiller)
 and the Structured Data Linter (http://linter-structured-data.org/), they
 would serve a couple of purposes. The RDF distiller processes pretty much
 every RDF variant, and the distiller also validates against the schema.org,
 FOAF, and a number of other vocabularies and generates a human-usable
 representation of that data.

 Gregg Kellogg
 gr...@greggkellogg.net

 On May 6, 2014, at 5:05 PM, Phil Archer ph...@w3.org wrote:

 I should have said this in reply to Frans' original mail - there's a list of 
 validators on the W3C Sem Web wiki, see
 http://www.w3.org/2001/sw/wiki/Category:Validator

 That's pulled from the more general list of tools
 http://www.w3.org/2001/sw/wiki/Tools

 There are recent entries on that wiki so I dare to think that it is still 
 doing its job of providing useful info.

 HTH

 Phil.


 On 05/05/2014 08:58, Frans Knibbe | Geodan wrote:
 On 2014-04-28 22:58, Ghislain Atemezing wrote:
 Hi Frans,
 According to the creator of the tool on Twitter
 (https://twitter.com/JoshData/), he does not have any plan at the
 moment to fix the issue.
 Yes, after writing him (Josh Tauberer) he explained that the validator
 broke after an upgrade of the host OS, and that he does not have time to
 look into it.

 This is a great list! Already three alternatives for the Turtle
 validator and converter are noted. Thanks a lot for developing and
 sharing those. I will try them all out.

 However, he kindly shared the code at
 https://gist.github.com/JoshData/11370152.
 So, maybe you might be interested in hosting it? (Or even someone on
 this list?)
 I agree, it was/is a nice tool that deserves a « second life » ;)

 That is a good idea. I will try to see if I can find an open time window
 somewhere.

 Regards,
 Frans

 Best,
 Ghislain




 
 Frans Knibbe
 Geodan
 President Kennedylaan 1
 1079 MB Amsterdam (NL)

 T +31 (0)20 - 5711 347
 E frans.kni...@geodan.nl
 www.geodan.nl http://www.geodan.nl | disclaimer
 http://www.geodan.nl/disclaimer
 


 --


 Phil Archer
 W3C Data Activity Lead
 http://www.w3.org/2013/data/

 http://philarcher.org
 +44 (0)7887 767755
 @philarcher1






-- 
John S. Erickson, Ph.D.
Deputy Director, Web Science Research Center
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: [datahub-discuss] Lot's of eggs, but where is the chicken? Foundation of a Data ID Unit

2014-04-18 Thread John Erickson
This is interesting... So, could *someone* please provide context
w.r.t. a couple of prior and/or ongoing efforts:

* W3C DCAT http://www.w3.org/TR/vocab-dcat/
* Schema.org/Dataset http://schema.org/Dataset
* Various Research Data Alliance (RDA) WGs on data citation,
persistent identification and typing of data/datasets
http://rd-alliance.org/
* The new W3C Data activity http://www.w3.org/2013/data/

etc...
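
For grounding, DCAT-style dataset metadata --- the kind of description
DataID appears to be after --- looks roughly like this minimal Turtle
sketch (hypothetical URIs):

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<http://example.org/dataset/dbpedia-3.9> a dcat:Dataset ;
    dct:title "DBpedia 3.9" ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <http://example.org/dumps/dbpedia-3.9.nt.gz> ;
        dct:format "application/n-triples"
    ] .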

On Fri, Apr 18, 2014 at 3:10 AM, Sebastian Hellmann
hellm...@informatik.uni-leipzig.de wrote:
 Hi Michel,
 this looks very similar. DataID is just a fancy name to pool efforts under
 it.

 Here is a Gretchenfrage (the crucial question): do you have a list of URLs
 with existing HCLSDatasetDescriptions?

 If yes, could you submit this list here:
 https://github.com/dbpedia/dataId/blob/master/TheListOfDataIDUrlsOnTheWeb.txt
 If we end up with a slightly different format (which I don't think will
 happen) we can write a small converter.
 We are also looking for people to join the group and provide more client
 implementations, i.e. DataId to Jena Assembler or Virtuoso Load Scripts or
 LOD2 Debian packages.

 All the best,
 Sebastian



 On 17.04.2014 22:55, Michel Dumontier wrote:

 Hi,
  we have just released a draft version of our W3C note on dataset
 description [1] for comment. It would be worthwhile to compare this with
 your DataID proposal, so as to see what considerations our community had,
 and whether we have inadvertently missed something that you may require.

 Cheers,

 m.

 [1]
 http://htmlpreview.github.io/?https://github.com/joejimbo/HCLSDatasetDescriptions/blob/master/Overview.html


 Michel Dumontier
 Associate Professor of Medicine (Biomedical Informatics), Stanford
 University
 Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
 http://dumontierlab.com


 On Thu, Apr 17, 2014 at 6:29 AM, Sebastian Hellmann
 hellm...@informatik.uni-leipzig.de wrote:

 Dear all,
 I would like to wish you a Happy Easter. At the same time, I have an
 issue, which concerns LOD and data on the web in general.

 As a community, we have all contributed to this image:
 http://lod-cloud.net/versions/2011-09-19/lod-cloud.html  (which is now three
 years old)
 You can see a lot of eggs on it, but the metadata (the chicken) is:
 - inaccurate
 - outdated
 - infeasible to maintain manually
 (this is my opinion. I find it hard to believe that we will start updating
 triple and link counts manually)

 Here is one pertinent example:
 http://datahub.io/dataset/dbpedia
 - (this is still linking to DBpedia 3.5.1)


 Following the foundation of the DBpedia Association we would like to start
 to solve this problem with the help of a new group called DataIDUnit
 (http://wiki.dbpedia.org/coop/DataIDUnit), using "rough consensus and working
 code" as their codex: http://wiki.dbpedia.org/coop/

 The first goal will be to find some good existing vocabularies and then
 provide a working version for DBpedia.
 A student from Leipzig (Markus Freudenberg) will implement a "push DataID
 to Datahub via its API" feature.
 This will help us describe the chicken better, that laid all these eggs.

 Happy Easter, all feedback is welcome, we hope not to duplicate efforts.
 Sebastian


 --
 Sebastian Hellmann
 AKSW/NLP2RDF research group
 DBpedia Association
 Institute for Applied Informatics (InfAI)
 Events:
 * 21st March, 2014: LD4LT Kick-Off @European Data Forum
 * Sept. 1-5, 2014 Conference Week in Leipzig, including
 ** Sept 2nd, MLODE 2014
 ** Sept 3rd, 2nd DBpedia Community Meeting
 ** Sept 4th-5th, SEMANTiCS (formerly i-SEMANTICS)
 Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
 Projects: http://dbpedia.org, http://nlp2rdf.org,
 http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
 Homepage: http://aksw.org/SebastianHellmann
 Research Group: http://aksw.org
 Thesis:
 http://tinyurl.com/sh-thesis-summary
 http://tinyurl.com/sh-thesis

 ___
 datahub-discuss mailing list
 datahub-disc...@lists.okfn.org
 https://lists.okfn.org/mailman/listinfo/datahub-discuss




 --
 Sebastian Hellmann
 AKSW/NLP2RDF research group
 Institute for Applied Informatics (InfAI) affiliated with DBpedia

 Events:
 * 21st March, 2014: LD4LT Kick-Off @European Data Forum
 * Sept. 1-5, 2014 Conference Week in Leipzig, including
 ** Sept 2nd, MLODE 2014
 ** Sept 3rd, 2nd DBpedia Community Meeting
 ** Sept 4th-5th, SEMANTiCS (formerly i-SEMANTICS)
 Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
 Projects: http://dbpedia.org, http://nlp2rdf.org,
 http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
 Homepage: http://aksw.org/SebastianHellmann
 Research Group: http://aksw.org
 Thesis:
 http://tinyurl.com/sh-thesis-summary
 http://tinyurl.com/sh-thesis



-- 
John S. Erickson, Ph.D.
Deputy Director, Web Science Research Center
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: OpenRefine

2013-10-28 Thread John Erickson
Hugh, I wonder if you could be more specific regarding the troubles
you had with OpenRefine?

One of our students also had trouble, and I'm wondering if it might be
the same problem.

Like you, reconciliation with Refine has worked for me in the past but
I haven't tried the same process using OpenRefine...

On Mon, Oct 28, 2013 at 2:41 PM, Hugh Glaser h...@ecs.soton.ac.uk wrote:
 Hi.
 I’m not sure where to ask, so I’ll try my friends here.
 I was having a go at OpenRefine yesterday, and I can’t get it to reconcile, 
 try as I might - I have even watched the videos again.
 I’m doing what I remember, but it is a while ago.
 Are there others currently using it successfully?
 Or is it possibly a Mavericks (OSX) upgrade thing, which I did recently.
 Cheers
 --
 Hugh




-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: License LINK Headers and Linked Data

2013-08-12 Thread John Erickson
Leigh et al., you may be interested in (the great) Henry Perritt's
seminal _1993_ paper "Knowbots, Permissions Headers and Contract Law"
[1]

Prof. Perritt wrote this in the very early days of the Web, when few
people were thinking (clearly) about copyright on the Internet and
fewer still were contemplating making licenses and offers easily
obtained and processed by rules-based agents --- his "knowbots."
Leigh's ideas seem very much in that spirit!

John
PS: One can see aspects of Perritt's ideas also in the way Creative
Commons is implemented.

[1] Henry H. Perritt, "Knowbots, Permissions Headers and Contract
Law." Paper for the Conference on Technological Strategies for
Protecting Intellectual Property in the Networked Multimedia
Environment (April 1993). http://cyber.eserver.org/property.txt
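
To make the connection to Leigh's proposal concrete, a hypothetical HTTP
exchange (placeholder URL and license; rel="license" is a registered link
relation):

$ curl -I http://data.example.org/dataset/foo
HTTP/1.1 200 OK
Content-Type: text/turtle
Link: <http://creativecommons.org/licenses/by/3.0/>; rel="license"

A rules-based agent --- a knowbot, if you like --- can thus check the
license with a HEAD request before retrieving any data.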

On Mon, Aug 12, 2013 at 12:14 PM, Leigh Dodds le...@ldodds.com wrote:
 Hi,

 There's one aspect of my document on publishing machine-readable
 rights statements that I want to flag to this community.

 Specifically it's the section on including references to licence and
 rights statements from LINK headers in HTTP responses:

 https://github.com/theodi/open-data-licensing/blob/master/guides/publisher-guide.md#linking-to-rights-statements-from-web-apis

 While that information can also be published in RDF, as part of the
 Linked Data response, I think adding LINK headers is very important
 too, for several reasons:

 Linked Data applications and browsers will commonly encounter new
 resources and the licensing information should be immediately clear.
 Having this be accessible outside of the response will allow user
 agents to be able to clearly detect licences before they start
 retrieving data from a new source. This will allow users to place
 pre-conditions on what type of data they want to
 harvest/collect/process.

 A HEAD request can be made on a resource to check its licensing,
 before data is actually retrieved.

 Cheers,

 L.

 --
 Leigh Dodds
 Freelance Technologist
 Open Data, Linked Data Geek
 t: @ldodds
 w: ldodds.com
 e: le...@ldodds.com




-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Licensing advice

2013-07-25 Thread John Erickson
My two cents: In many legal regimes it has been successfully argued
that "code is speech." The imperative vs. declarative distinction is
likely to fail; if the code conveys information intended to control
the operation of another system, it can be argued that it is a form of
speech (and not merely data, for which different IP rules may
apply). Consider the DeCSS trials (and tribulations) of the last
decade: http://digital-law-online.info/lpdi1.0/treatise50.html

People interested in this topic might enjoy Gabriella Coleman's "Code
is Speech" (2009) http://bit.ly/CodeIsSpeech and "Coding Freedom"
(2013) http://bit.ly/CodingFreedom

On Thu, Jul 25, 2013 at 8:30 AM, Víctor Rodríguez Doncel
vrodrig...@fi.upm.es wrote:

 Oh! I didn't know... but if you can insert a SQL expression then R2RML is
 certainly imperative.
 Now I am very curious about the Prolog question, too, and I would like to
 hear more opinions.

 To foster the discussion, I have posted about RDF Mappings and Licenses
 here: http://licensius.com/blog/MappingsAndLicenses

 Víctor

 On 25/07/2013 13:13, Barry Norton wrote:


 Interesting distinction, but I'm not sure I buy it.

 Does that mean software licenses don't apply to PROLOG code?

 I can actually make R2RML mappings more imperative than PROLOG cuts by using
 control flow features of SQL.

 Barry


 On 25/07/13 12:04, Víctor Rodríguez Doncel wrote:

 Dear Roberto, all

 Well, I have not heard about any case in a trial court about this and the
 legal texts seem somewhat ambiguous. Also, I have not heard other qualified
 opinions on this particular regard. So, this can be matter for a friendly
 discussion.

 But I still lean towards not considering a mapping (for example the R2RML
 below) as a computer program.
 The mapping is declarative, not imperative. They are not instructions, as
 required in the legal text.

 Think of HTML pages. I dont think they are regarded as software. People
 don't license them with a BSD license. They use CreativeCommons licenses,
 intended for general works. You declare a table, a computer program will
 process it. (Yet, a Javascript piece would be made up of instructions).

 I hope I clarified my point.
 Víctor



 @prefix rr: <http://www.w3.org/ns/r2rml#> .
 @prefix ex: <http://example.com/ns#> .

 <#TriplesMap1>
 rr:logicalTable [ rr:tableName "EMP" ];
 rr:subjectMap [
 rr:template "http://data.example.com/employee/{EMPNO}";
 rr:class ex:Employee;
 ];
 rr:predicateObjectMap [
 rr:predicate ex:name;
 rr:objectMap [ rr:column "ENAME" ];
 ].



 On 25/07/2013 10:32, Roberto García wrote:

 Dear Víctor, Tom, all,

 Maybe I've missed something but if what is going to be licensed are R2RML
 mappings, for me this is code.

 As Víctor quoted, a computer program is (WIPO): "a set of instructions,
 which controls the operations of a computer in order to enable it to perform
 a specific task."

 This is just what happens with R2RML mappings: they are based on a
 metalanguage that is read by a computer using an R2RML interpreter
 (implemented using another programming language, rather like a
 compiler) that ultimately executes a set of instructions that read data from a
 source and generate a data stream in the output...

 My 2c,


 Roberto


 On Wed, Jul 24, 2013 at 11:01 AM, Víctor Rodríguez Doncel
 vrodrig...@fi.upm.es wrote:


 Well, ODC data licenses include both copyrights and database rights.
 So you don't give up your claims for having made a creative work...

 Víctor

 On 24/07/2013 10:38, Tom Heath wrote:

 Just seen this thread, apols for the slow response Barry...

 Of course IANAL and all that, but I disagree with Victor's conclusion.

 I would argue that the individual mappings are creative works (as you
 say), and therefore a CC license would apply (better still, why not
 apply a public domain waiver so they're totally open?).

 The collection as a whole would probably qualify as a database, at
 which point Victor's points about a DB license would be relevant.

 As others have mentioned, the data created by the execution of these
 mappings is another issue altogether, which you seem to have covered.

 My 2p worth -- hope it helps :)

 Tom.


 On 12 July 2013 21:38, Víctor Rodríguez Doncel vrodrig...@fi.upm.es
 wrote:

 Barry,

 My opinion is the following:

 1. Code license NO. A computer program is (WIPO): "a set of instructions,
 which controls the operations of a computer in order to enable it to
 perform a specific task"
 2. Intellectual Property. I'd say no in this case. Some databases are
 protected by IP law. They are if they can be assumed to be "collections of
 literary or artistic works such as encyclopaedias and anthologies which,
 by reason of the selection and arrangement of their contents, constitute
 intellectual creations, are to be protected as such, without prejudice to
 the copyright in each of the works forming part of such collections."
 So, if you have made your mapping automatically, they are NOT 

Re: Licensing advice

2013-07-25 Thread John Erickson
On Thu, Jul 25, 2013 at 12:48 PM, Gannon Dick gannon_d...@yahoo.com wrote:
 My two cents: Isn't Linked Data supererogatory in any jurisdiction?
 http://plato.stanford.edu/entries/supererogation/

"(Linked Data) raises interesting problems both on the meta-ethical
level of deontic logic and on the normative level of the justification
of moral demands..." ;)


-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Linked Data version of ICD

2013-07-15 Thread John Erickson
Marcello, this is not precisely what you are looking for but it might prove
helpful if you decide to build an ICD instance hub yourself:
http://www.icd10data.com/

URIs from the above site are finer-grained than on the WHO site; compare
the following:

* http://www.icd10data.com/ICD10CM/Codes/H60-H95/H60-H62/H60-/H60.321 =
finest resolution possible via icd10data
* http://apps.who.int/classifications/icd10/browse/2010/en#/H60 = finest
resolution possible via WHO

Unfortunately, neither has an API, but you might ask... ;)

John


On Mon, Jul 15, 2013 at 3:50 AM, Marcello Leida marcello.le...@kustar.ac.ae
 wrote:

  Hi all,


 I was wondering if there is any available linked data set on the International
 Classification of Diseases (ICD) [1].

 I could not find any in the linkeddata.org page, but maybe someone has
 translated this data.


 Kind Regards.

 [1] http://apps.who.int/classifications/icd10/browse/2010/en

 Marcello Leida

 Senior Researcher

 http://www.ebtic.org/

 EBTIC (Etisalat BT Innovation Center)

 PO BOX 127788

 Abu Dhabi, U.A.E.

 Tel. +9724018144

 Mob.+971566936698





-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson

Re: Open Data Rights Statements

2013-07-02 Thread John Erickson
Leigh, this is *fantastic*; thanks for your contributions and bringing
ODRS to our attention!

I think something that could help with ODRS adoption is (are?) some
lightweight, community-sourced examples that demonstrate its use in
different (esp. legal) contexts.
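
In that spirit, one possible shape, based on my reading of the draft
schema --- all URIs hypothetical, and the property names worth checking
against the spec:

@prefix odrs: <http://schema.theodi.org/odrs#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<http://example.org/dataset> dct:rights <http://example.org/dataset/rights> .

<http://example.org/dataset/rights> a odrs:RightsStatement ;
    odrs:dataLicense <http://opendatacommons.org/licenses/by/1.0/> ;
    odrs:attributionText "Example Data Publisher" ;
    odrs:attributionURL <http://example.org/> .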

John

On Tue, Jul 2, 2013 at 4:23 AM, Leigh Dodds le...@ldodds.com wrote:
 Hi,

 At the UK Open Data Institute we've been working on some guidance and
 a new vocabulary to help support the publication of machine-readable
 rights statements for open data. The vocabulary builds on existing
 work in this area (e.g. Dublin Core and Creative Commons) but
 addresses a few issues that we felt were underspecified.

 The vocabulary is intended to work in a wide variety of contexts, from
 simple JSON documents and data packaging formats through to Linked
 Data and Web APIs.

 The work is now at a stage where we're keen to get wider feedback from
 the community.

 You can read a background on the work in this introductory blog post
 on the UK ODI blog:

 http://theodi.org/blog/machine-readable-rights-statements

 The draft schema can be found here:

 http://schema.theodi.org/odrs/

 And there are publisher and re-user guides to accompany it:

 https://github.com/theodi/open-data-licensing/blob/master/guides/publisher-guide.md
 https://github.com/theodi/open-data-licensing/blob/master/guides/reusers-guide.md

 We would love to hear your feedback on the work. If you do have issues
 or comments, then can I ask that you submit them as an issue to our
 github project:

 https://github.com/theodi/open-data-licensing/issues

 Thanks,

 L.

 --
 Leigh Dodds
 Freelance Technologist
 Open Data, Linked Data Geek
 t: @ldodds
 w: ldodds.com
 e: le...@ldodds.com




-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Linked Data Glossary is published!

2013-07-01 Thread John Erickson
Thanks for your comments, Kanzaki!

You wrote:
 1. 5 Star ...:
 I'm afraid I don't understand why XML is 2-star (implying proprietary
 format), lower rated than CSV. Does it mean Excel ? (but even Excel uses
 non-proprietary OpenXML now).
 I'd suggest this section should include a link to TimBL's original
 scheme.

Please note that the formats listed are EXAMPLES...
I think the point w.r.t. spreadsheets is that they are not merely
published using OpenXML, but also using a well-defined document model
that makes the semantics of their content more accessible.

I believe the thinking on XML being listed under two stars is that
often data has been dumped out of some government's data catacombs
using XML serialization with no clear explanation of structure or
semantics. OpenXML has a clearly (and openly) defined structure and a
protocol for interpretation, and thus should get more stars than
XML. A dumb XML artifact is somewhat more usable than dumping the
data in PDF, but is not as usable or accessible as a CSV, which
usually provides the data in a tabular format wherein the semantics of
each cell are more accessible. This is a sweeping generalization, but
it is often the case.

Thus, I think we should distinguish between plain old XML and Office
Open XML/OOXML/OpenXML; based on my understanding and what I read,
OpenXML could be listed as an example three-star format.

 58. Machine Readable Data:
 I think "without access to proprietary libraries" is not a good fit for the
 definition of this widely used term (it could be called "Open Machine
 Readable Data" or something, if this condition is necessary). Moreover, I
 don't believe PDF and Microsoft Excel are good counter examples. There are
 many open source PDF libraries, and parsing OpenXML is not very difficult.

* I agree that "Open, Machine Readable Data" or "Openly Machine
Readable Data" is a better fit with our definition than "Machine
Readable." I believe the distinctions are:
** "Machine Readable Data" is readily accessed using available
libraries or protocols
** "Open (or Openly) Machine Readable Data" is readily accessed using
freely-available libraries or protocols
* I think the POINT is that the data should be published in a way
suited for machine consumption. A format should NOT be considered
machine readable simply because someone cooked up a hack on
Scraperwiki for getting the data out of an otherwise opaque data dump
on a site
* On that point, data may be structured within PDF documents in a
variety of ways, depending upon the source program and the code used
to generate the PDF (including such tools as LaTeX).
** IF there were a well-documented and widely-adopted approach for
embedding tabular data in PDFs and IF an API existed for identifying and
pulling out such data from PDFs, then perhaps PDF-published data could be
considered. NOTE: I am NOT talking about metadata...
** One could even embed RDF Linked Data in PDFs using such a
technique. If it only existed...

 Probably, what is needed here is a sort of Machine Readable Structured
 Data, which PDF and Excel data are sometimes not. However, unstructured
 Excel data is not the fault of the format, but the usage of it, IMHO.

I get your meaning, and this *might* be a useful distinction to make.
I'm wondering however if we really want/need an additional entry in
the Glossary.
* The argument against having a separate term is simply that
(arguably) the common case for publishing machine readable data *is*
structured data, and adding a special "structured" category merely
confuses adopters.
* The argument for a new term is, if the reason we want machine
readable data is because we expect (and usually get) structured data,
then we should specify that what we REALLY want is machine readable
structured data... (and explain what that means)

John


 cheers,



 2013/6/28 Bernadette Hyland bhyl...@3roundstones.com

 Hi,
 On behalf of the editors, I'm pleased to announce the publication of the
 peer-reviewed Linked Data Glossary published as a W3C Working Group Note
 effective 27-June-2013.[1]

 We hope this document serves as a useful glossary containing terms defined
 and used to describe Linked Data, and its associated vocabularies and best
 practices for publishing structured data on the Web.

 The LD Glossary is intended to help foster constructive discussions
 between the Web 2.0 and 3.0 developer communities, encouraging all of us
 appreciate the application of different technologies for different use
 cases.  We hope the glossary serves as a useful starting point in your
 discussions about data sharing on the Web.

 Finally, the editors are grateful to David Wood for contributing the
 initial glossary terms from Linking Government Data, (Springer 2011). The
 editors wish to also thank members of the Government Linked Data Working
 Group with special thanks to the reviewers and contributors: Thomas Baker,
 Hadley Beeman, Richard Cyganiak, Michael Hausenblas, Sandro Hawke, Benedikt
 Kaempgen, James McKinney, Marios 

Re: Percentages in Linked Data

2013-06-24 Thread John Erickson
Frans, I think you may be interested in the W3C RDF Data Cube
vocabulary: http://www.w3.org/TR/vocab-data-cube/

Abstract: "There are many situations where it would be useful to be
able to publish multi-dimensional data, such as statistics, on the web
in such a way that it can be linked to related data sets and concepts.
The Data Cube vocabulary provides a means to do this using the W3C RDF
(Resource Description Framework) standard. The model underpinning the
Data Cube vocabulary is compatible with the cube model that underlies
SDMX (Statistical Data and Metadata eXchange), an ISO standard for
exchanging and sharing statistical data and metadata among
organizations. The Data Cube vocabulary is a core foundation which
supports extension vocabularies to enable publication of other aspects
of statistical data flows or other multi-dimensional data sets..."
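
To make that concrete for the percentage case, a minimal sketch of a
single observation (hypothetical URIs), using the SDMX unit-of-measure
attribute to declare that the value is a percentage:

@prefix qb:  <http://purl.org/linked-data/cube#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix ex:  <http://example.org/stats#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:obs1 a qb:Observation ;
    qb:dataSet ex:unemployment2013 ;
    ex:refArea ex:Amsterdam ;
    ex:unemploymentRate "7.5"^^xsd:decimal ;
    sdmx-attribute:unitMeasure <http://dbpedia.org/resource/Percentage> .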

John

On Mon, Jun 24, 2013 at 12:37 PM, Frans Knibbe | Geodan
frans.kni...@geodan.nl wrote:
 Hello,

 I would like to publish some statistical data. A few of these numbers are
 percentages. What is the best way to make it clear to data consumers that
 the numbers are to be treated as percentages? As far as I can tell, the XSD
 data types do not suffice.

 Regards,
 Frans Knibbe





-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: The need for RDF in Linked Data

2013-06-17 Thread John Erickson
On Mon, Jun 17, 2013 at 6:32 AM, Luca Matteis lmatt...@gmail.com wrote:

 So how do we define the line between Documents and Data? Is HTML a document
 and RDF data? But then HTML annotated as RDFa becomes data?


No. I think you're losing the "linked" distinction.

* You can use the Web to distribute data, but...
* ...if that data doesn't name and distinguish entities using URIs,
then it isn't (usefully) linked
* ...if that data doesn't declare relationships between entities with
URI-named predicates, then it isn't (usefully) linked

So, given a client's ability to interpret relationships between
entities expressed via e.g. RDFa, an HTML page can easily contribute
to the Web of Data.
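
For instance, in a trivial Turtle fragment (hypothetical URIs) both the
entities and the relationships are URI-named, so a client can follow its
nose:

@prefix ex: <http://example.org/ns#> .

<http://example.org/people/alice>
    ex:worksFor <http://example.org/orgs/rpi> ;
    ex:knows    <http://example.org/people/bob> .

The same statements carried via RDFa in an HTML page are every bit as
"linked" as in a standalone Turtle document.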

John

--
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: [White House announcement] We're making a lot more data open to the public

2013-05-10 Thread John Erickson
NOTE: Getting various US government agencies up to speed on
providing their open data as Linked Data (tm) is a *process* that
includes preparing useful/actionable best practices, accompanied by a
set of relevant standards. This has been the work of the W3C
Government Linked Data Working Group over the past couple years. See
http://www.w3.org/2011/gld/wiki/Main_Page

Over the past few weeks and indeed days we have been pushing to CR
status a number of vocabularies, including DCAT, ORG, and DATA CUBE,
with ADMS and RegOrg soon to follow, as well as a Best Practices
document. All of this is necessary infrastructure to help get
Data.gov and its peers across the world publishing their OGD as
LOGD...

I won't attempt the footnotes-in-email game here; just go to the W3C
GLD WG page (http://www.w3.org/2011/gld/wiki/Main_Page) and see all
the deliverables.

John
PS: Huge debts of gratitude to all of my GLD peers, esp. current
co-chairs Hadley Beeman and Bernadette Hyland, co-chair emeritus
George Thomas, and W3C operative Sandro Hawke!


On Fri, May 10, 2013 at 9:53 AM, Kingsley Idehen kide...@openlinksw.com wrote:
 On 5/10/13 9:39 AM, Luca Matteis wrote:

 Actually I don't see Linked Data mentioned specifically. They mostly talk of
 Open Data and the Project Open Data: http://project-open-data.github.io/

 This interesting project also doesn't seem to mention Linked Data, which
 scares me a little.


 As I stated in the tweet [1], as a community of Linked Data supporters, we
 too need to remember that it's the meaning (delivered via relation
 semantics) that matters not the labels.

 The phrase Linked Data isn't what determines the principled approach to
 constructing and publishing RDF model based structured data; one that
 leverages HTTP URIs as a denotation mechanism for entities in a manner that
 ensures they resolve to description documents -- where the content is
 comprised of one or more entity relationship graphs endowed with varying
 levels of machine- and/or human-comprehensible entity relationship
 semantics.

 Links:

 1. https://twitter.com/kidehen/status/332530949264916482 -- it's the meaning
 (discernible from relations/sentence semantics) that matters
 2. https://twitter.com/kidehen/status/332508521314799618 -- complete tweet
 thread.


 Kingsley




 On Fri, May 10, 2013 at 3:16 PM, David Wood da...@3roundstones.com wrote:

 On May 10, 2013, at 09:06, Kingsley Idehen kide...@openlinksw.com wrote:

  On 5/10/13 7:45 AM, David Wood wrote:
  On May 9, 2013, at 19:24, Kingsley Idehen kide...@openlinksw.com
  wrote:
 
  On 5/9/13 6:23 PM, David Booth wrote:
  FYI. Progress!  Note that they mention machine-readable formats like
  CSV, XML, and JSON.  Now, how can we get these organizations to provide
  that information in Turtle, or provide mappings from those formats to 
  RDF so
  that the semantics will be clear and information will be easier to link 
  with
  other information?
  Yes, a lot of progress.
 
  Remember, sometimes less is more, Turtle and RDF wouldn't add any
  critical value to consumers of the directive. Ultimately, this is 
  actually
  about Open Data and Linked Data [1]. The narrative was deftly crafted by
  politicians who are masters of abstract communication.
  Actually, it was crafted by Deloitte and Microsoft who hope to make
  money on the newly released data.  It is still progress ;)
 
   And they are instances & subClasses with a lot of political attributes :-)

 Yeah, good point :)


 
  BTW -- the best practices section is all about Open Data and Linked Data
  [1].

 Good stuff.

 Regards,
 Dave
 --
 http://about.me/david_wood


 
 
  Links:
 
  1. https://twitter.com/kidehen/status/332508521314799618 -- thread about
  this matter (read from the bottom up).
 
  Kingsley
 
  Regards,
  Dave
  --
  http://about.me/david_wood
 
 
  The only downside for me was the hour I lost trying to the get the PDF
  [2][3] transformed into HTML via the plethora of PDF to HTML converters 
  out
  there.
 
  Links:
 
  1. https://twitter.com/kerfors/status/332528853098561538 -- thread
  that developed around the announcement
  2.
  http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf
  -- PDF
  3.
  http://www.pdf.investintech.com/preview/35b2ec7e-b8c9-11e2-8a0e-003048d80846/index.html
  -- Converted PDF .
 
 
  Kingsley
  On a related note, there will be a workshop on RDF as a Universal
  Healthcare Exchange Language at the upcoming SemTech conference in San
  Francisco:
 
   http://semtechbizsf2013.semanticweb.com/sessionPop.cfm?confid=70&proposalid=5296
 
   Original Message 
  Subject: We're making a lot more data open to the public
  Date: Thu, 09 May 2013 16:05:11 -0500
  From: Todd Park, The White House i...@messages.whitehouse.gov
  Reply-To: i...@messages.whitehouse.gov
  To: da...@dbooth.org
 
  We're making a lot more data open to the public
  The White House Thursday, May 9, 2013
 
  Hi, all --
 
  Earlier today, 

Re: ORCID no longer relevant?

2013-03-12 Thread John Erickson
Regarding the specific question of the orcid.org proxy returning
correct HTTP conneg results --- a must in order to be linked data
savvy --- a couple of years ago a similar observation was made of
crossref.org, and they remedied the situation nicely.
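
The failure mode is easy to check for yourself; a sketch with a
placeholder identifier:

curl -i -L -H "Accept: application/rdf+xml" http://orcid.org/xxxx-xxxx-xxxx-xxxx

If the Content-Type header claims application/rdf+xml but the body is
ORCID's own XML format, the conneg is broken.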

Since a few of the people involved in ORCID
(http://orcid.org/about/team) are familiar with the CrossRef.org
situation, maybe a similar result can happen?

John

On Tue, Mar 12, 2013 at 7:21 AM, Stian Soiland-Reyes
soiland-re...@cs.manchester.ac.uk wrote:
 In my projects, we have been wanting to recommend using ORCID [1] as
 part of identifying authors and contributors. ORCID is receiving
 increasing attention in the scientific publishing community as it
 promises a unified way to identify authors of scientific publications.


 I was going to include an ex:orcid property on foaf:Agents in our
 specifications, perhaps as an owl:sameAs subproperty (I know, I
 know!).

 There's no official property for linking to a ORCID profile at the
 moment [5] - I would be careful about using foaf:account to the ORCID
 URI, as the ORCID identifies the person (at least in a scientific
 context), and not an OnlineAccount - has someone else tried a
 structure here?



 There are other long-standing issues in using ORCID in Linked Data:


 For one, the URI to use is unclear [2], but the form
 http://orcid.org/-0001-9842-9718 is what is currently being
 promoted [3]:

 "The ORCID iD should always be expressed and stored as a URI:
 http://orcid.org/--- (with the protocol (http://), and with
 hyphens in the number ---)."

 (Strangely this advice is not reflected on orcid.org itself)


 Another issue is that there is actually no RDF exposed from orcid.org [4].


 But the last issue is that if you request the ORCID URI with Accept:
 application/rdf+xml - then the REST API wrongly returns its own XML
 format - but still claims Content-Type application/rdf+xml.  The issue
 for this [5] has just been postponed 'for several months', even though
 it should be a simple fix.


 This raises the question if ORCIDs would still be relevant on the
 semantic web. Does anyone else have views, alternatives or
 suggestions?



 [1] http://orcid.org/
 [2] 
 http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/3641532
 [3] 
 http://support.orcid.org/knowledgebase/articles/116780-structure-of-the-orcid-identifier
 [4] 
 http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/3283848
 [5] 
 http://support.orcid.org/forums/175591-orcid-ideas-forum/suggestions/3291844


 --
 Stian Soiland-Reyes, myGrid team
 School of Computer Science
 The University of Manchester




-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Linked Data Book in Early Access Release

2012-12-05 Thread John Erickson
Is the O'Reilly site linked data savvy? If so, perhaps that could be
used as an incentive for Manning to do likewise...

On Wed, Dec 5, 2012 at 9:52 AM, Melvin Carvalho
melvincarva...@gmail.com wrote:


 On 5 December 2012 15:36, Melvin Carvalho melvincarva...@gmail.com wrote:



 On 5 December 2012 14:56, David Wood da...@3roundstones.com wrote:

 On Dec 5, 2012, at 08:46, Kingsley Idehen kide...@openlinksw.com wrote:

  On 12/5/12 7:55 AM, David Wood wrote:
   On Dec 5, 2012, at 06:34, Chris Beer ch...@codex.net.au wrote:
 
  snip
  
   http://www.manning.com/dwood/  itself doesn't seem to have any
   Linked Data
  to consume;)
  
  Makes sense to me - if you know enough to look for LD resources at
   the
  manning.com/dwood/ URI, you've just self evaluated that you probably
   don't
  need the book! :P
  I agree - and have been speaking with Manning about this.
  Unfortunately, I haven't made any progress yet.  I'll keep trying!
 
  Thanks for the mail.  Perhaps I can use it as proof to Manning that
  people do want LD on their site.
 
  Regards,
  Dave
 
 
  Dave,
 
  They have to understand that it's sorta contradictory if they need to be
  convinced of this matter :-)

 Oh, I see your point and have made it myself.  Unfortunately, economics
 seems to be dictating otherwise to them for right or wrong.

 The only productive suggestion that has been made to me is to put up a
 parallel site for the book that includes LD.  Michael Hausenblas has offered
 the domain linkeddatadeveloper.com, which was his original site for the book
 but has fallen into disuse.  Of course, I would need to be willing to pay
 for the site and take the time to operate it.


 How about this?

 http://linked.data.fm/book.html

 It's also LDP compliant ;)


 BTW you can do quite a lot in design mode :
 http://www.quirksmode.org/dom/execCommand/





 Would the community find that a useful thing to do?  I am willing to go
 to the effort if I receive a good number of positive responses.

 Regards,
 Dave







-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Access Control Lists, Policies and Business Models

2012-08-16 Thread John Erickson
Kingsley, thanks for forwarding that post. It ranks up there with
Daniel Jacobson's work w.r.t. the inherent value of keyed APIs...

I believe that like other API providers, Twitter is merely realizing
they can achieve better analytics over their delivery of services with
more diligent access control. Although policies are part of the
equation, I think it has more to do with being adaptive and reactive,
and understanding what API users want and how they are using it.

API intermediaries like Mashery, apigee, 3scale, etc. are all about
providing analytics dashboards so their clients can understand in
detail how, e.g., their data is being consumed, and by whom... THAT is
the value of keys...

On Thu, Aug 16, 2012 at 7:39 PM, Kingsley Idehen kide...@openlinksw.com wrote:
 All,

 Here's Twitter pretty much expressing the inevitable reality re. Web-scale
 business models: https://dev.twitter.com/blog/changes-coming-to-twitter-api

 There's no escaping the importance of access control lists and policy based
 data access.

 --

 Regards,

 Kingsley Idehen
 Founder  CEO
 OpenLink Software
 Company Web: http://www.openlinksw.com
 Personal Weblog: http://www.openlinksw.com/blog/~kidehen
 Twitter/Identi.ca handle: @kidehen
 Google+ Profile: https://plus.google.com/112399767740508618350/about
 LinkedIn Profile: http://www.linkedin.com/in/kidehen








-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: best practice RDF in HTML

2012-06-12 Thread John Erickson
Sebastian, is the requirement that the RDF not be *integrated* with
the content of the page --- in other words, you just want to embed a
dump of some RDF?

Why not link to an RDF or TTL file?
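
For example, something like this in the document head (hypothetical URL;
rel="alternate" with a media-type hint is the usual convention):

<link rel="alternate" type="text/turtle" href="http://example.org/data/triples.ttl" />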

On Tue, Jun 12, 2012 at 10:02 AM, Sebastian Hellmann
hellm...@informatik.uni-leipzig.de wrote:
 Dear list,
 What is the best practice for including a set of RDF triples in HTML?
 *Please note*: I am not looking for the RDFa way to include triples. I just
 want to add a set of triples somewhere in an HTML document. They are not
 supposed to show up like "Wikinomics, Don Tapscott" in the following
 example:

 <div xmlns:dc="http://purl.org/dc/elements/1.1/"
   about="http://www.example.com/books/wikinomics">
   <span property="dc:title">Wikinomics</span>
   <span property="dc:creator">Don Tapscott</span>
   <span property="dc:date">2006-10-01</span>
 </div>

 I don't want to use the strings in the HTML document as objects in the
 triples. My use case is that I just have a large set of triples, e.g. 1000,
 that I want to include in bulk somewhere and ship along with the HTML.
 Which way is the best? Do the examples below work?
 All the best,
 Sebastian

 ***
 Include in head
 **
 <html>
 <head>
 <script type="application/rdf+xml">
 <rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:cd="http://www.recshop.fake/cd#">

 <rdf:Description
 rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
 <cd:artist>Bob Dylan</cd:artist>
 <cd:dbpedia rdf:resource="http://dbpedia.org/resource/Empire_Burlesque" />
 </rdf:Description>
 </rdf:RDF>
 </script>
 </head>
 <body>
 </body>
 </html>
 **
 attach after html
 *
 <html>
 <head>
 </head>
 <body>
 </body>
 </html>
 <rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:cd="http://www.recshop.fake/cd#">

 <rdf:Description
 rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
 <cd:artist>Bob Dylan</cd:artist>
 <cd:dbpedia rdf:resource="http://dbpedia.org/resource/Empire_Burlesque" />
 </rdf:Description>
 </rdf:RDF>


 --
 Dipl. Inf. Sebastian Hellmann
 Department of Computer Science, University of Leipzig
 Projects: http://nlp2rdf.org , http://dbpedia.org
 Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
 Research Group: http://aksw.org





-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: best practice RDF in HTML

2012-06-12 Thread John Erickson
If I understand correctly, the problem is to embed an "RDF island" in
an HTML document, to be managed by a (presumably RDF-clueless) CMS.

How about following the approach described in the W3C "Microdata to
RDF: Transformation from HTML+Microdata to RDF" working draft:
http://www.w3.org/TR/2012/WD-microdata-rdf-20120112/

There are a number of examples in that draft which seem compatible
with the problem statement, including embedded JSON (that can be
easily extracted and consumed as RDF)

On Tue, Jun 12, 2012 at 11:12 AM, Gannon Dick gannon_d...@yahoo.com wrote:
 Hello Sebastian,

 You are making me nostalgic for a dispute I lost by shout-down with with the
 developers of RDFa :o)

 Oops.  Mr Erickson just beat me to the punch ... the critical point is that
 HTML has two bowls of tag soup (HEAD, BODY) related by proximity not by
 authority.  It's easy to assume that the HEAD is global to the BODY or
 vice-versa.  What you really want to do is cite a bibliographic reference to
 a set of RDF triples.

 You can link to that file, or if you want to get fancy, embed an XML
 Bibliographic Reference format like MODS from the LoC[1].  Embedding in the
 BODY is more polite, and reassuring if questions arise about download size.

 --Gannon

 [1] Sorry, I have not looked at this in years so there will be some syntax
 issues.  The idea is simple, MathML for people who do math, MODS for people
 who keep track of written stuff.
 http://www.rustprivacy.org/FunForLibrarians.pdf


 
 From: John Erickson olyerick...@gmail.com
 To: Sebastian Hellmann hellm...@informatik.uni-leipzig.de
 Cc: public-lod public-lod@w3.org; semantic-web semantic-...@w3.org
 Sent: Tuesday, June 12, 2012 9:32 AM
 Subject: Re: best practice RDF in HTML

 Sebastian, is the requirement that the RDF not be *integrated* with
 the content of the page --- in other words, you just want to embed a
 dump of some RDF?

 Why not link to a RDF or TTL file?

 On Tue, Jun 12, 2012 at 10:02 AM, Sebastian Hellmann
 hellm...@informatik.uni-leipzig.de wrote:
 Dear list,
 What are the best practice to include a set of RDF triples in HTML.
 *Please note*: I am not looking for the RDFa way to include triples. I
 just
 want to add a set of triples somewhere in an HTML document. They are not
 supposed to show up like Wikinomics, Don Tapscott in  the following
 example:

 <div xmlns:dc="http://purl.org/dc/elements/1.1/"
  about="http://www.example.com/books/wikinomics">
  <span property="dc:title">Wikinomics</span>
  <span property="dc:creator">Don Tapscott</span>
  <span property="dc:date">2006-10-01</span>
 </div>

 I don't want to use the strings in the HTML document as objects in the
 triples. My use case is that I just have a large set of triples, e.g. 1000
 that I want to include as a bulk somewhere and ship along with the html.
 Which way is the best? Do the examples below work?
 All the best,
 Sebastian

 ***
 Include in head
 **
 <html>
 <head>
 <script type="application/rdf+xml">
 <rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:cd="http://www.recshop.fake/cd#">

 <rdf:Description
 rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
 <cd:artist>Bob Dylan</cd:artist>
 <cd:dbpedia rdf:resource="http://dbpedia.org/resource/Empire_Burlesque" />
 </rdf:Description>
 </rdf:RDF>
 </script>
 </head>
 <body>
 </body>
 </html>
 **
 attach after html
 *
 <html>
 <head>
 </head>
 <body>
 </body>
 </html>
 <rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:cd="http://www.recshop.fake/cd#">

 <rdf:Description
 rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
 <cd:artist>Bob Dylan</cd:artist>
 <cd:dbpedia rdf:resource="http://dbpedia.org/resource/Empire_Burlesque" />
 </rdf:Description>
 </rdf:RDF>


 --
 Dipl. Inf. Sebastian Hellmann
 Department of Computer Science, University of Leipzig
 Projects: http://nlp2rdf.org , http://dbpedia.org
 Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
 Research Group: http://aksw.org





 --
 John S. Erickson, Ph.D.
 Director, Web Science Operations
 Tetherless World Constellation (RPI)
 http://tw.rpi.edu olyerick...@gmail.com
 Twitter & Skype: olyerickson






-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Datatypes with no (cool) URI

2012-04-03 Thread John Erickson
Gannon raises a valid point, BUT it is important to remember that ISO
is a *publisher* and DOI is fundamentally a publishing industry thing.

So while they might not be inclined to support Cool URIs for their own
sake, they might be DOI adopters for the sake of The Bottom Line...

On Tue, Apr 3, 2012 at 11:19 AM, Gannon Dick gannon_d...@yahoo.com wrote:
 There are just some things outside of the Web's bailiwick, and the
 properties of people in that class.  The problem is that you are never sure
 if you are naming the property or rudely calling the property holder names.
 ISO declines to play, the LOC declines differently
 http://id.loc.gov/authorities/subjects/sh91003756 and simple classes don't
 exist.  I think you've hit a limit, not on Cool Uri's necessarily, but maybe
 on philosophy.

 
 From: John Erickson olyerick...@gmail.com
 To: David Booth da...@dbooth.org
 Cc: Phil Archer ph...@w3.org; public-lod@w3.org public-lod@w3.org
 Sent: Tuesday, April 3, 2012 9:53 AM
 Subject: Re: Datatypes with no (cool) URI

 On Tue, Apr 3, 2012 at 10:38 AM, David Booth da...@dbooth.org wrote:
 On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote:
 [ . . . ] The actual URI for it is

 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266
 (or rather, that's the page about the spec but that's a side issue for
 now).

 That URI is just horrible and certainly not a 'cool URI'. The Eurostat
 one is no better.

 Does the datatype URI have to resolve to anything (in theory no, but in
 practice? Would a URN be appropriate?

 It's helpful to be able to click on the URI to figure out what exactly
 was meant.  How about just using a URI shortener, such as tinyurl.com or
 bit.ly?

 David's good point raises an even bigger point: why isn't ISO minting
 DOI's for specs?

 Or, at least, why can't ISO manage a DOI-equivalent space that would
 rein-in bogusly-long URIs, make them more manageable, and perhaps more
 functional e.g. CrossRef's Linked Data-savvy DOI proxy
 http://bit.ly/HcStYl


 --
 John S. Erickson, Ph.D.
 Director, Web Science Operations
 Tetherless World Constellation (RPI)
 http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson






-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Datatypes with no (cool) URI

2012-04-03 Thread John Erickson
So David's solution (using PURLs) provides a bit of transparency and
manageablity, but it has the disadvantage of having no official
status.

Maybe (probably) I'm missing something here?

On Tue, Apr 3, 2012 at 11:19 AM, David Booth da...@dbooth.org wrote:
 Okay, then maybe a PURL would help?  purl.org now supports partial
 redirects:
 http://purl.org/docs/faq.html#toc1.9
 That may not quite work with your ISO URIs though.

 Personally, I don't think you should worry too much about a machine
 expecting to be able to dereference the datatype URI to get data back.
 I would expect most datatype URIs would lead to human-oriented
 information, though that could gradually change.

 David


 On Tue, 2012-04-03 at 15:58 +0100, Phil Archer wrote:
 Hi David,

 Yes, one could use URL shorteners and that's probably the only sane way
 to go but it's still not ideal because:

 1. Both Bitly and Tinyurl come with no guarantee of service (and  a
 lot of tracking) - Google's goo.gl is all wrapped up with their services
 too - not the kind of thing public administrations will be happy about
 using. Yves Lafon's http://kwz.me is a pure shortener with no tracking
 of any kind but it's a one man project so, again, it won't be 'good
 enough' for public sector data.

 2. Neither a shortened URL nor the long form tell a human reader a lot
 whereas something (non-standard I know) like urn:iso/iec:5218:2004 tells
 you that it's an ISO standard that a human can look up. The ISO
 catalogue URLs point to Web pages or PDFs available from those Web pages
 so you still need to be a human to get the information. The danger would
 be that a machine would look up the datatype URI and expect to get data
 back, not ISO's paywall :-)

 So, not ideal, but still the best (practical) solution?



 On 03/04/2012 15:38, David Booth wrote:
  On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote:
  [ . . . ] The actual URI for it is
  http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266
  (or rather, that's the page about the spec but that's a side issue for
  now).
 
  That URI is just horrible and certainly not a 'cool URI'. The Eurostat
  one is no better.
 
  Does the datatype URI have to resolve to anything (in theory no, but in
  practice? Would a URN be appropriate?
 
  It's helpful to be able to click on the URI to figure out what exactly
  was meant.  How about just using a URI shortener, such as tinyurl.com or
  bit.ly?
 
 


 --
 David Booth, Ph.D.
 http://dbooth.org/

 Opinions expressed herein are those of the author and do not necessarily
 reflect those of his employer.





-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: Yahoo! patent 7,747,648 court case

2012-03-13 Thread John Erickson
Elaborating on what Henry said, occasionally a member of a standards
organization will (a) pursue a patent on a particular technology
deemed essential to implementing a standard and (b) wave royalties
for using the patented technology, as long as such use implements the
standard and remains royalty-free (think of this as donating). But
they still hold the patent and may prosecute companies that attempt to
make money off e.g. commercial and non-standard implementations.

Although it might seem perverse, this is one way for stakeholders to
protect the openness of standards by keeping out trolls, etc. The
*theory* is, you'd rather have a friend patent it than an enemy...

John
DISCLAIMER: I don't know that Yahoo! is in fact doing this.

On Tue, Mar 13, 2012 at 9:07 AM, Henry Story henry.st...@bblfish.net wrote:
 How would anyone know?

 Given that Yahoo is a member of the W3C, it cannot affect the w3c standards,
 since they signed up to a no patent policy.

 On 13 Mar 2012, at 13:54, Melvin Carvalho wrote:

 You may have seen in the news facebook are getting sued for using the
 following patented technology

 http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1Sect2=HITOFFp=1u=/netahtml/PTO/search-bool.htmlr=1f=Gl=50d=PALLRefSrch=yesQuery=PN/7747648%0A

 Abstract

  Systems and methods for information retrieval and communication employ a
 world model. The world model is made up of interrelated entity models, each
 of which corresponds to an entity in the real world, such as a person,
 place, business, other tangible thing, community, event, or thought. Each
 entity model provides a communication channel via which a user can contact a
 real-world person responsible for that entity model. Entity models also
 provide feedback information, enabling users to easily share their
 experiences and opinions of the corresponding real-world entity.



 Does this affect Linked Open Data too?


 Social Web Architect
 http://bblfish.net/




-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: redirects and relative URLs

2012-01-29 Thread John Erickson
Henry asked:
 If I dereference a URL which contains a redirect to another resource, and that
 resource contains relative URLs, how should the relative URLs of the returned
 document be completed? With the initial URL? Or with the one given in the 
 Location
 header (or some other header?) of the last document?

 Perhaps this has been written up somewhere?

Doesn't RFC 3986 http://www.ietf.org/rfc/rfc3986.txt speak to this?

5.1.3.  Base URI from the Retrieval URI

If no base URI is embedded and the representation is not encapsulated
within some other entity, then, if a URI was used to retrieve the
representation, that URI shall be considered the base URI.  Note that
if the retrieval was the result of a redirected request, the last URI
used (i.e., the URI that resulted in the actual retrieval of the
representation) is the base URI.
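
In code, the rule is just "resolve against the URI that actually
retrieved the representation"; a minimal sketch, assuming the requests
library (URLs illustrative):

# A sketch of the RFC 3986 rule: resolve relative references against
# the final, post-redirect URL, not the URL originally requested.
# Assumes the requests library; the URLs are illustrative.
from urllib.parse import urljoin
import requests

resp = requests.get("http://example.org/old-location")  # may 30x-redirect
base = resp.url  # "the last URI used", per RFC 3986 section 5.1.3
print(urljoin(base, "relative/doc.rdf"))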


-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: recording meetings

2011-12-20 Thread John Erickson
Brand Niemann bniem...@cox.net asks:
 Interesting discussion - Are we having these presentations because there is
 consideration being given to licensing LOD to protect it (from what), allow
 someone to make money off of it (really), or some other reason?

I think it's more fundamental than that; the eGov community needs a
clear understanding of

* What legal mechanisms (and precedent) are available and/or required
to declare Open w.r.t. data?
* What technical mechanisms (vocabulary, markup methodologies) are
available and/or required to assert such terms to machines and people?

There is no LOD without the O, stated clearly. Providers' claims
must be legally valid, unambiguous and accessible by both people and
machines...

-- 
John S. Erickson, Ph.D.
Director, Web Science Operations
Tetherless World Constellation (RPI)
http://tw.rpi.edu olyerick...@gmail.com
Twitter & Skype: olyerickson



Re: CAS, DUNS and LOD (was Re: Cost/Benefit Anyone? Re: Vote for my Semantic Web presentation at SXSW)

2011-08-23 Thread John Erickson
This is an important discussion that (I believe) foreshadows how
canonical identifiers are managed moving forward.

Both CAS and DUNS numbers are a good example. Consider the challenge
of linking EPA data; it's easy to create a list of toxic chemicals
that are common across many EPA datasets. Based on those chemical
names, its possible to further find (in most cases) references in
DBPedia and other sources, such as PubChem:

* ACETALDEHYDE
* http://dbpedia.org/page/Acetaldehyde
* http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=177
* etc...

Now, add to this a sensible agency-rooted URI design and a
DBPedia-like infrastructure and one has a very powerful hub that
strengthens the Linked Data ecosystem. It would arguably be stronger
if CAS identifiers were also (somehow) included, but even the bits of
linking shown above change the value proposition of traditional
proprietary naming schemes...
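
As a concrete example, a minimal sketch that asks DBpedia which other
URIs Acetaldehyde is sameAs, assuming the SPARQLWrapper library (the
query itself is illustrative):

# A sketch of the hub idea: follow a chemical's owl:sameAs links out of
# DBpedia. Assumes the SPARQLWrapper library; query is illustrative.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    SELECT ?same WHERE {
      <http://dbpedia.org/resource/Acetaldehyde> owl:sameAs ?same .
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["same"]["value"])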

John
PS: At TWC we are about to go live with a registry called Instance
Hub that will demonstrate the association of agency-based URI schemes
--- think EPA, HHS, DOE, USDA, etc --- with instance data over which
the agency has some authority or interest...More very soon!

On Tue, Aug 23, 2011 at 8:31 AM, Patrick Durusau patr...@durusau.net wrote:
 David,

 On 8/22/2011 9:55 PM, David Booth wrote:

 On Mon, 2011-08-22 at 20:27 -0400, Patrick Durusau wrote:
 [ . . . ]

 The use of CAS identifiers supports searching across vast domains of
 *existing* literature. Not all, but most of it for the last 60 or so
 years.

 That is non-trivial and should not be lightly discarded.

 BTW, your objection is that non-licensed systems cannot use CAS
 identifiers? Are these commercial systems that are charging their
 customers? Why would you think such systems should be able to take
 information created by others?

 Using the information associated with an identifier is one thing; using
 the identifier itself is another.  I'm sure the CAS numbers have added
 non-trivial value that should not be ignored.  But their business model
 needs to change.  It is ludicrous in this web era to prohibit the use of
 the identifiers themselves.

 If there is one principle we have learned from the web, it is enormous
 value and importance of freely usable universal identifiers.  URIs rule!
 http://urisrule.org/

 :)

 Well, I won't take the bait on URIs, ;-), but will note that re-use of
 identifiers of a sort was addressed quite a few years ago.

 See: Feist Publications, Inc., v. Rural Telephone Service Co., 499 U.S. 340
 (1991) or follow this link:

 http://en.wikipedia.org/wiki/Feist_v._Rural

 The circumstances with CAS numbers is slightly different because to get
 access to the full set of CAS numbers I suspect you have to sign a licensing
 agreement on re-use, which makes it a matter of *contract* law and not
 copyright.

 Perhaps they should increase the limits beyond 10,000 identifiers but the
 only people who want the whole monty as it were are potential commercial
 competitors.

 The people who publish the periodical Brain for example at $10,000 a year.
 Why should I want the complete set of identifiers to be freely available to
 help them?

 Personally I think given the head start that the CAS maintainers have on the
 literature, etc., that different models for use of the identifiers might
 suit their purposes just as well. Universal identifiers change over time and
 my concern is with the least semantic friction and not as much with how we
 get there.

 Hope you are having a great day!

 Patrick




 --
 Patrick Durusau
 patr...@durusau.net
 Chair, V1 - US TAG to JTC 1/SC 34
 Convener, JTC 1/SC 34/WG 3 (Topic Maps)
 Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
 Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

 Another Word For It (blog): http://tm.durusau.net
 Homepage: http://www.durusau.net
 Twitter: patrickDurusau




-- 
John S. Erickson, Ph.D.
http://bitwacker.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: Making Linked Data Fun

2010-11-19 Thread John Erickson
This "single most powerful demo available" is an epic fail on Ubuntu
10.10 + Chrome. The most recent release of Moonlight just doesn't cut
it (and shouldn't have to).

Could we as a community *possibly* work towards a rich data
visualization/presentation toolkit built on, say, HTML5?

On Fri, Nov 19, 2010 at 11:20 AM, Aldo Bucchi aldo.buc...@gmail.com wrote:
 Kingsley,

 On Fri, Nov 19, 2010 at 1:07 PM, Kingsley Idehen kide...@openlinksw.com 
 wrote:
 All,

 Here is an example of what can be achieved with Linked Data, for instance
 using BBC Wild Life Finder's data:

 1. http://uriburner.com/c/DI463N -- remote SPARQL queries between two
 instances (URIBurner and LOD Cloud Cache) with results serialized in CXML
 (image processing part of the SPARQL query pipeline) .

 This is excellent!
 Single most powerful demo available. Really looking fwd to what's coming next.

 Let's see how this shifts gears in terms of Linked Data comprehension.
 Even in its current state, this is an absolute game changer.

 I know this was not easy. My hat goes off to the team for their focus.

 Now, just let me send this link out to some non-believers that have
 been holding back my evangelization pipeline ;)

 Regards,
 A



 Enjoy!

 --

 Regards,

 Kingsley Idehen
 President & CEO
 OpenLink Software
 Web: http://www.openlinksw.com
 Weblog: http://www.openlinksw.com/blog/~kidehen
 Twitter/Identi.ca: kidehen








 --
 Aldo Bucchi
 @aldonline
 skype:aldo.bucchi
 http://aldobucchi.com/





-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: Call for Chapters: Linking Government Data

2010-10-07 Thread John Erickson
With all due respect, as with any monograph this is a call to
*contribute*; the benefits if accepted are being part of an important
work. Recipients are free to not submit!

On Thu, Oct 7, 2010 at 11:23 AM, Kingsley Idehen kide...@openlinksw.com wrote:
 On 10/7/10 10:02 AM, David Wood wrote:

 Hi all,

 Please find below a Call for Chapters for a new contributed book to be
 entitled Linking Government Data.  Please distribute this information as
 widely as possible to help us collect useful success stories, techniques and
 benefits to using Linked Data in governments.  Thanks in advance.

 Regards,
 Dave

 --

 David Wood announces a Call for Chapters for a new book to be entitled
 Linking Government Data. First proposal submissions are due November 30,
 2010 to da...@3roundstones.com.

 The book is intended to be published in print, ebooks format and on the Web,
 but a publisher has not yet been chosen. More than one publisher is
 interested.

 CHAPTER PROPOSALS INVITED FROM RESEARCHERS AND PRACTITIONERS IN LINKED DATA,
 DATA MANAGEMENT AND WEB INFORMATION SYSTEMS

 1st Proposal Submission Deadline: November 30, 2010
 Full Chapter Submission Deadline: March 1, 2011

 Linking Government Data
 A book edited by David Wood, Talis, USA

 I. Introduction

 Linking Government Data is the application of Semantic Web architecture
 principles to real-world information management issues faced by government
 agencies. The term LGD is a play on Linking Open Data (LOD), a community
 project started by the World Wide Web Consortium’s Semantic Web Education
 and Outreach Interest Group aimed at exposing data sets to the Web in
 standard formats and actively relating them to one another with hyperlinks.

 Data in general is growing at a much faster rate than traditional
 technologies allow. The World Wide Web is the only information system we
 know that scales to the degree that it does and is robust to both changes
 and failure of components. Most software does not work nearly as well as the
 Web does. Applying the Web’s architectural principles to government
 information distribution programs may be the only way to effectively address
 the current and future information glut. Challenges remain, however, because
 the publication of data to the Web requires government agencies to give up
 the central control and planning traditionally applied by IT departments.

 A primary goal of this book is to highlight both costs and benefits to
 broader society of the publication of raw data to the Web by government
 agencies. How might the use of government Linked Data by the Fourth Estate
 of the public press change societies?

 How can agencies fulfill their missions with less cost? How must
 intra-agency culture change to allow public presentation of Linked Data?

 This book follows the successful publication of Linking Enterprise Data by
 Springer Science+Business Media in October 2011.

 II. Objective of the Book

 This book aims to provide practical approaches to addressing common
 information management issues by the application of Semantic Web and Linked
 Data research to government environments and to report early experiences
 with the publication of Linked Data by government agencies. The approaches
 taken are based on international standards. The book is to be written and
 edited by leaders in Semantic Web and Linked Data research and standards
 development and early adopters of Semantic Web and Linked Data standards and
 techniques.

 III. Target Audience

 This book is meant for Semantic Web researchers and academicians, and CTOs,
 CIOs, enterprise architects, project managers and application developers in
 commercial, not-for-profit and government organizations concerned with
 scalability, flexibility and robustness of information management systems.
 Not-for-profit organizations specifically include the library and museum
 communities.

 Recommended topics include, but are not limited to, the following:
 – social, technical and mission values of applying Web architecture to
 government content, such as the means by which deployment agility,
 resilience and reuse of data may be accomplished
 – Relating to other eGov initiatives
 – Building of social (human-centered) communities to curate distributed data
 – Enterprise infrastructure for Linking Government Data
 – Persistent Identifiers
 – Linking the government cloud
 – Applications of Linked Data to government transparency, organizational
 learning or curation of/access to distributed information
 – Publishing large-scale Linked Data.

 Contributions from those working with government Linked Data projects of all
 sizes are sought. Many stories exist from the U.S. and U.K. government
 agencies, but contributions from Estonia, Germany, New Zealand, Norway, etc,
 etc, are more than welcome.

 IV. Publisher

 The book is intended to be published in print, ebooks format and on the Web,
 but a publisher has not yet been chosen. More than one publisher is
 interested. This 

Re: New LOD Cloud

2010-09-24 Thread John Erickson
Didn't one of the original color-enhanced versions of the LOD Cloud
highlight by license?

2010/9/24 Egon Willighagen egon.willigha...@gmail.com:
 2010/9/24 François Dongier francois.dong...@gmail.com:
 Would be nice to go a bit beyond this relatively crude categorisation.
 Enabling someone interested in, say, wine or Alabama farming, to highlight
 the datasets that are relevant to this interest.

 Indeed, it would be great to select the classification ontology by
 which the nodes are colored, which then dynamically changes the
 coloring... perhaps also coloring by license could be interesting?

 Egon

 --
 Dr E.L. Willighagen
 Post-doc @ Uppsala University (only until 2010-09-30)
 Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
 Homepage: http://egonw.github.com/
 LinkedIn: http://se.linkedin.com/in/egonw
 Blog: http://chem-bla-ics.blogspot.com/
 PubList: http://www.citeulike.org/user/egonw/tag/papers





-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: empirical cloud

2010-09-15 Thread John Erickson
It *does* do this; are you using e.g. Chrome? It takes a few seconds.

On Wed, Sep 15, 2010 at 10:23 AM, Juan Sequeda juanfeder...@gmail.com wrote:
 Really cool
 Would it be possible to know what each node is (hovering over the node).
 Or could you tell us what are the 4-5 main nodes. I'm guessing DBpedia...
  Is Foaf one of the main nodes?

 Juan Sequeda
 +1-575-SEQ-UEDA
 www.juansequeda.com


 On Wed, Sep 15, 2010 at 1:59 AM, Ed Summers e...@pobox.com wrote:

 The discussion about updating the Linking Open Data cloud got me
 thinking it would be interesting to try to visualize the actual
 owl:sameAs links in the Billion Triple Challenge dataset [1]. It
 turned out to be relatively easy to get something somewhat workable
 with tools like zgrep, sort, uniq, a couple of custom scripts, and the
 handy ProtoVis library [2].

 You can view the result at:

  http://inkdroid.org/empirical-cloud/

 If you notice the links to the rdf/xml and Turtle you'll see I tried
 my hand at representing the underlying data using void, foaf and
 dcterms. There's definitely room for improvement, so if you are so
 inclined please tinker with it and send me a pull request on GitHub
 [3]. Thanks to Gunnar Grimnes and Dan Brickley for ideas and help.

 //Ed

 [1] http://challenge.semanticweb.org/
 [2] http://vis.stanford.edu/protovis/
 [3] http://github.com/edsu/empirical-cloud
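
For the curious, the core of that zgrep/sort/uniq pipeline can be
approximated in a few lines; a sketch, assuming a gzipped N-Triples
dump (the file name is illustrative):

# A sketch: count owl:sameAs links between pairs of hosts in a gzipped
# N-Triples dump, roughly what the shell pipeline above does.
import gzip
from collections import Counter
from urllib.parse import urlparse

SAME_AS = "<http://www.w3.org/2002/07/owl#sameAs>"
pairs = Counter()
with gzip.open("btc-dump.nt.gz", "rt", errors="replace") as f:
    for line in f:
        parts = line.split(" ", 3)
        if len(parts) >= 3 and parts[1] == SAME_AS and parts[2].startswith("<"):
            s_host = urlparse(parts[0].strip("<>")).netloc
            o_host = urlparse(parts[2].strip("<>")).netloc
            pairs[(s_host, o_host)] += 1

for (s, o), n in pairs.most_common(10):
    print(f"{s} -> {o}: {n}")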






-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: Best practice for permantently moved resources?

2010-08-12 Thread John Erickson
I realize this is a dangerous question, but...what is the cause of the change?

* Has the content actually changed (revision), or simply moved (URI change)?
* Must the content at the old URI go away, or can it be represented
as a previous version?

All of this is useful knowledge --- a form of provenance --- and
should not be lost. The question is, how to handle it systematically
and efficiently...

This brings to mind a number of conversations I've had in the
DOI/Handle System alternate reality, in which we've discussed a
vocabulary of types (think: predicates) that could serve as hints to a
HS proxy on what physical URI to return to a resolution request,
including and especially as a smart way to handle conneg. Not only
could these type assertions help the proxy distinguish between data
vs. document, but it could nicely handle versioning ala Memento
http://mementoweb.org/

My point is not to harp on the Handle System, instead to point out
that the HS (architecturally) builds-in the kind of encapsulation that
can facilitate this --- although the type-based conneg I'm referring
to has only been seriously considered in the past few months (AFAIK).

John

On Thu, Aug 12, 2010 at 5:12 AM, Kjetil Kjernsmo kje...@kjernsmo.net wrote:
 Hi all!

 Cool URIs don't change, but cool content does, so the problem surfaces that I
 need to permanently redirect now and then. I discussed this problem in a
 meetup yesterday, and it turns out that people have found dbpedia problematic
 to use because it is too much of a moving target, when a URI changes because
 the underlying concepts change, there's a need for more 301s.

 The problem is then that I need to record the relation between the old and the
 new URI somehow. As of now, it seems that the easiest way to do this would be
 to do something like:

 http://example.org/old ex:permanently_moved_to http://example.org/new

 and if the former is dereferenced, the server will 301 redirect to the latter.
 Has anyone done something like that, or have other useful experiences relevant
 to this problem?

 Cheers,

 Kjetil
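
For concreteness, a minimal sketch of the server side of Kjetil's
pattern, assuming Flask and an in-memory old-to-new map; in practice
the map would be read from the ex:permanently_moved_to triples:

# A sketch: serve 301s from a recorded old-URI -> new-URI mapping.
# Assumes Flask; the mapping would normally come from the triple store.
from flask import Flask, redirect

app = Flask(__name__)
MOVED = {
    "/old": "http://example.org/new",
}

@app.route("/<path:slug>")
def resolve(slug):
    target = MOVED.get("/" + slug)
    if target:
        return redirect(target, code=301)
    return ("Not here", 404)

if __name__ == "__main__":
    app.run()

The nice property is that the provenance triple and the HTTP behavior
are the same fact, stated once.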





-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
Skype: @olyerickson



Re: Show me the money - (was Subjects as Literals)

2010-07-05 Thread John Erickson
I greatly respect Jeremy's thoughts, and they may be spot-on in this
case, but I urge the community to be cautious about how much weight to
give this kind of pragmatic economics-driven argument generally as
the semantic technology industry grows.

Virtually every organization has -- should have! -- increasing vested
interests in their own unique approach. In many cases, their
stakeholders may be better-served by maintaining the status quo; many
others will be served by upsetting the collective apple cart. Progress
is made collectively by hearing out and sometimes acting on
well-reasoned arguments from the other side, even if the implications
are changing one's code base!

Industry consortia move things that look and smell like standards --
W3C recommendations -- ahead by appealing to the greater good. Thus
I interpret Jeremy's comments as not a call to halt progress; rather,
he's simply asking for a strong case be made that the proposed changes
would benefit the *community* in a compelling way. He's asking for
well-reasoned arguments for change that colleagues around the
ecosystem might present to their grumpy, grey-suited, money-grubbing,
cigar-smoking management ;)

John

On Sun, Jul 4, 2010 at 10:51 PM, Jeremy Carroll jer...@topquadrant.com wrote:
  On 7/1/2010 8:44 PM, Pat Hayes wrote:

 Jeremy, your argument is perfectly sound from your company's POV, but not
 from a broader perspective. Of course, any change will incur costs by those
 who have based their assumptions upon no change happening

 I was asking for the economic benefit of the change, as opposed to the
 elegance benefit.
 Personally, I am wholly convinced by the elegance argument - but it will not
 convince my management, nor should it.

 I suspect there are several other companies and other open source activities
 that have investments that assume literals do not occur in subject position.

 Elegance is not, IMO, a sufficient argument to negate those investments.
 (The sort of thing we are talking about, is what sort of display is
 appropriate for a subject of a triple - we know that it is not a literal, so
 certain code paths, and options are not considered).

 Of course, in an industrial consortium costs to one member maybe justified
 by benefits to another - but costs to any member do need to be offset by
 some benefit to some member ... I have yet to see much of an argument (Henry
 got a small bit of the way), that there are any such benefits (i.e. ones
 which have a dollar, euro or yuan value). I have pointed to dollar costs ...
 I expect to see some such benefit. I don't think that expectation is
 unreasonable, more a boundary that keeps people honest ... and not just
 indulging in an intellectual game (he says politely).

 Jeremy








-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Show me the money - (was Subjects as Literals)

2010-07-01 Thread John Erickson
RE getting a full list of the benefits, surely if it's being
discussed here, Literals as Subjects must be *somebody's* Real(tm)
Problem and the benefits are inherent in its solution?

And if it isn't, um, why is it being discussed here? ;)

On Thu, Jul 1, 2010 at 11:46 AM, Henry Story henry.st...@gmail.com wrote:
 Jeremy, the point is to start the process, but put it on a low burner,
 so that in 4-5 years time, you will be able to sell a whole new RDF+ suite to 
 your customers with this new benefit.  ;-)

 On 1 Jul 2010, at 17:38, Jeremy Carroll wrote:


 I am still not hearing any argument to justify the costs of literals as 
 subjects

 I have loads and loads of code, both open source and commercial that assumes 
 throughout that a node in a subject position is not a literal, and a node in 
 a predicate position is a URI node.

 but is that really correct? Because bnodes can be names for literals, and so 
 you really do have
 literals in subject positions. No?


 Of course, the correct thing to do is to allow all three node types in all 
 three positions. (Well four if we take the graph name as well!)

 But if we make a change,  all of my code base will need to be checked for 
 this issue.
 This costs my company maybe $100K (very roughly)
 No one has even showed me $1K of advantage for this change.

 I agree, it would be good to get a full list of the benefits.


 It is a no brainer not to do the fix even if it is technically correct

 Jeremy








-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Slightly off topic - content negotiation by language accept headers

2010-06-24 Thread John Erickson
Hmmm, interesting...

FYI, I'm flying Chrome 6.0.437.3 dev on Ubuntu...Weird... ;)

John

On Thu, Jun 24, 2010 at 11:46 AM, Michael Smethurst
michael.smethu...@bbc.co.uk wrote:
 Hi John

 Thanks for the tips. Seems our Chrome versions aren't quite the same. I'm on
 MacOS (5.0.375.70) and...


 On 23/06/2010 19:48, John Erickson olyerick...@gmail.com wrote:

 Here's how you specific your language preferences in Chrome:

 * In the Customize menu (the wrench) select Options

 I have no options under customise; only preferences

 * Select the Under the Hood tab

 Yup

 * Scroll down-down-down to Web Content area

 Yup

 * Select Change font and language settings

 Nope. I see only a change font settings button. No mention there (or
 elsewhere as far as I can see of language settings)

 * Select the Language tab
 * Add the languages of your choosing, in order of preference

 John

 Looks like support is fairly patchy even across different versions of the
 same (modern) browser :-(


 http://www.bbc.co.uk/
 This e-mail (and any attachments) is confidential and may contain personal 
 views which are not the views of the BBC unless specifically stated.
 If you have received it in error, please delete it from your system.
 Do not use, copy or disclose the information in any way nor act in reliance 
 on it and notify the sender immediately.
 Please note that the BBC monitors e-mails sent or received.
 Further communication will signify your consent to this.
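
Browser quirks aside, it is easy to probe a server's language
negotiation directly; a minimal sketch, assuming the requests library
(the URL is illustrative):

# A sketch: test how a server honours Accept-Language, bypassing
# browser settings entirely. Assumes requests; URL is illustrative.
import requests

for lang in ("en-GB,en;q=0.8", "fr", "de,en;q=0.5"):
    r = requests.get("http://example.org/page",
                     headers={"Accept-Language": lang})
    print(lang, "->", r.headers.get("Content-Language"), len(r.text))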





-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Linked data in packaged content (ePub)

2010-05-14 Thread John Erickson
One aspect of this thread concerns reasoning on limited-compute (i.e.
mobile) platforms. I'd like to point out Pychinko, the Python-based
reasoner (c.f. http://bit.ly/bPglu1 ), which was designed to run on
mobile platforms such as Nokia phones where Java-based reasoners were
considered too compute-intensive.
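
To give a feel for why forward chaining can stay cheap on small
devices, a toy fixed-point pass over a handful of triples in plain
Python (the data is illustrative, and this is not Pychinko's actual API):

# A toy forward-chaining pass: propagate rdf:type through
# rdfs:subClassOf until no new triples appear.
triples = {
    ("ex:fido", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
}

changed = True
while changed:
    changed = False
    for s, p, o in list(triples):
        if p == "rdf:type":
            for s2, p2, o2 in list(triples):
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new = (s, "rdf:type", o2)
                    if new not in triples:
                        triples.add(new)
                        changed = True

print(sorted(triples))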

Granted, given the current closed nature of Apple, we should probably
be looking for Cocoa/Objective-C based reasoners... :P

John

On Wed, Apr 28, 2010 at 2:39 PM, Stuart A. Yeates syea...@gmail.com wrote:
 On Wed, Apr 28, 2010 at 11:36 PM, John Erickson olyerick...@gmail.com wrote:
 Stuart, it's not clear to me what you're trying to accomplish...For
 whom are you trying to add value?

 We are funded to digitise teaching, learning and research materials
 for our staff and students. Value to anyone else is incidental, but
 indicative.

 Are you imagining creating some kind of meshup within the reading
 experience, perhaps meshing metadata and links bound to entities
 within the ePub'd document with external linked data?

 Ideally, I'd like a protocol such as Open URL (
 http://en.wikipedia.org/wiki/Open_URL ), linking books on the device
 up to the bibliographies of other books that also happen to be on the
 device. For low CPU devices the links might have to be pre-calculated
 when connected to a desktop PC. I understand that Open URL can't
 actaully do this because it assumes the web.

 cheers
 stuart




-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Linked data in packaged content (ePub)

2010-04-28 Thread John Erickson
Stuart, it's not clear to me what you're trying to accomplish...For
whom are you trying to add value?

Since ePub is XHTML, it makes sense to embed metadata as RDFa. But
why? Is the purpose to enhance the reading experience? Or perhaps the
local collection management experience?

Publishers should be, and most likely are, obtaining DOIs
http://doi.org for their materials through a registration agency
such as CrossRef http://crossref.org. As we have discussed elsewhere
(and for more than a decade...) the DOI enables multiple stakeholders
to manage and publish metadata about the item; linked data best
practices are a promising approach, but RDFa on an RA's landing page
for the item is also a possibility. The DOI is not about embedding
metadata in individual instances, however, which is what you seem to
be asking about.

Are you imagining creating some kind of meshup within the reading
experience, perhaps meshing metadata and links bound to entities
within the ePub'd document with external linked data?

John

On Tue, Apr 27, 2010 at 4:40 PM, Stuart A. Yeates syea...@gmail.com wrote:
 I'm interested in putting linked data into eBooks published in the (open 
 standard) ePub format (http://www.openebook.org/ ). The format is essentially 
 a relocatable zip file of XHTML, associated media files and a few metadata 
 files.

 The target platforms of this content impose some restrictions on what is 
 practical: e-ink devices (which are the only current eBook readers with the 
 battery life to last an entire novel) typically don't have an internet 
 connection (thus no resolving of links) and have very little in the way of 
 processing power (thus no full reasoning).

 We already have some data-interlinking between our collection 
 (http://www.nzetc.org/ ) and librarything (http://www.librarything.com/ ) at 
 the FRBR work level (http://vocab.org/frbr/core.html#Work ) and also some 
 links to wikipedia / dbpedia for named entities (principally authors and 
 places).  We believe we have quite good authority control over author names, 
 even those who published under multiple names (see, for example 
 http://www.nzetc.org/tm/scholarly/name-208662.html or 
 http://www.nzetc.org/tm/scholarly/name-208310.html ). We have ~1300 ePubs, 
 the largest of which exceed the size limits of most ePub tools.

 Does anyone know of any other attempts to put linked data into packages like 
 this?

 There are two main issues I can see: (a) how to self-identify the package 
 (naive hashing doesn't work, as some eBook readers open the package and add 
 custom metadata) and (b) how to package the linked data to get maximal use 
 when a paucity of CPU precludes a full reasoner.

 The traditional identifier used in this field, the ISBN, is essentially a 
 print-run identifier, and not of a whole lot of obvious use to us since: (a) 
 most of our books' original publishing predates ISBNs and (b) our digital 
 republishing of them doesn't qualify for an ISBN according to our local ISBN 
 issuer (the National Library of New Zealand).

 cheers
 stuart






-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Linked data in packaged content (ePub)

2010-04-28 Thread John Erickson
On Wed, Apr 28, 2010 at 2:39 PM, Stuart A. Yeates syea...@gmail.com wrote:
 Ideally, I'd like a protocol such as Open URL 
 (http://en.wikipedia.org/wiki/Open_URL ), linking books on the device up to 
 the bibliographies of other books that also happen to be on the device. For 
 low CPU devices the links might have to be pre-calculated when connected to a 
 desktop PC. I understand that Open URL can't actually do this because it 
 assumes the web.

So the zeroth answer is to provide such behavior only under
certain conditions, for example when explicit DOIs for the items in
the bibliographic record are provided and the other items on the
device have DOIs...this would allow for an easy registration when
books were put on the device. The browser of course would have to know
to intervene, however.

To actually do what OpenURL enables --- embed citation data ala COinS
http://ocoins.info/, use that as the basis for constructing OpenURL
references, and resolve those on the device --- would be a stretch on
a disconnected device, but not impossible. The citation metadata for
each installed ePub would have to be indexed, and then a proxy on the
device would have to check this before attempting to resolve
off-device. This sort of resolution doesn't require any advanced
reasoning, although it still might be intense.

Is this what you were thinking?
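
A rough sketch of that local-first lookup, assuming a prebuilt
DOI-to-file index on the device (all names are illustrative):

# A sketch: resolve a citation against the on-device index first, and
# fall back to the network only when connected. Assumes a prebuilt
# DOI -> local-path index; all names are illustrative.
LOCAL_INDEX = {
    "10.1000/example.123": "/books/some-title.epub",
}

def resolve(doi, online=False):
    path = LOCAL_INDEX.get(doi)
    if path:
        return "file://" + path          # open the copy already on device
    if online:
        return "https://doi.org/" + doi  # defer to the global resolver
    return None                          # disconnected and not held locally

print(resolve("10.1000/example.123"))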

-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Fwd: backronym proposal: Universal Resource Linker

2010-04-18 Thread John Erickson
+1 to Danbri's emphasis on LINKs, because at the end of the day
linking is what it's all about!

John

2010/4/18 Jiří Procházka oji...@gmail.com:
 Why 'URL' when it is pretty clearly defined and still a significant portion of
 web users don't understand it.

 I'd rather embrace 'web address' - even non-tech users would understand
 that.

 Best,
 Jiri Prochazka

 On 04/18/2010 12:18 PM, Dan Brickley wrote:
 So - I'm serious. The term 'URI' has never really worked as something
 most Web users encounter and understand.

 For RDF, SemWeb and linked data efforts, this is a problem as our data
 model is built around URIs.

 If 'URL' can be brought back from limbo as a credible technical term,
 and rebranded around the concept of 'linkage', I think it'll go a long
 way towards explaining what we're up to with RDF.

 Thoughts?

 Dan


 -- Forwarded message --
 From: Dan Brickley dan...@danbri.org
 Date: Sun, Apr 18, 2010 at 11:52 AM
 Subject: backronym proposal: Universal Resource Linker
 To: u...@w3.org
 Cc: Tim Berners-Lee ti...@w3.org


 I'll keep this short. The official term for Web identifiers, URI,
 isn't widely known or understood. The I18N-friendly variant IRI
 confuses many (are we all supposed to migrate to use it; or just in
 our specs?), while the most widely used, understood and (for many)
 easiest to pronounce, 'URL' (for Uniform Resource Locator) has been
 relegated to 'archaic form' status. At the slightest provocation this
 community dissapears down the rathole of URI-versus-URN, and until
 this all settles down we are left with an uncomfortable disconnect
 between how those in-the-know talk about Web identifiers, and those
 many others who merely use it.

 As of yesterday, I've been asked "but what is a URI?" one too many
 times. I propose a simple-minded fix: restore 'URL' as the most
 general term for Web identifiers, and re-interpret 'URL' as Universal
 Resource Linker. Most people won't care, but if they investigate,
 they'll find out about the re-naming. This approach avoids URN vs URI
 kinds of distinction, scores 2 out of 3 for use of intelligible words,
 and is equally appropriate to classic browser/HTML, SemWeb and other
 technical uses. What's not to like? The Web is all about links, and
 urls are how we make them...

 cheers,

 Dan






-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: Comments on Data 3.0 manifesto

2010-04-17 Thread John Erickson
Hi Kingsley!

Reading between the lines, I think I grok where you are trying to go
with your manifesto. For it to be an effective, stand-alone document
I think a few pieces are needed:

1. What is your GOAL? It should be clearly stated, something like, to
promote best-practices for standards-compliant access to structured
data object (or entity) descriptors by getting data architects to do X
instead of Y, etc.

2. What is your MOTIVATION? I think this is implicit in your current
text --- your argument seems to be that TBL's Four Principles are
not enough --- but you need to make your motivations explicit and
JUSTIFY them. If TBL's principles are too nebulous, explain concisely
why and what the implications are. Keep in mind that they seem to be
good enough for many practitioners today. ;)

3. Be SPECIFIC about what practitioners must do moving forward. I
think you've made a good start on this, to the extent that you have
lots of SHOULDS. I would argue that more specificity of a different
kind is needed; if data architects SHOULD be following more abstract
EAV conceptualizations, what exactly should they do in practice?

Finally, on the deeper question of motivation, I suggest that while a
historical argument can be made that RDF is likely a subset or special
case of EAV, the community has developed convenient and familiar
languages for expressing RDF (such as N3 and Turtle); practitioners
are much less familiar with EAV. Does the community really lose
anything by using RDF as its shorthand?

Perhaps you can suggest a pattern within current RDF practice that
more strongly enforces EAV principles?
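
To make the EAV/RDF correspondence concrete, a minimal sketch showing
one statement both ways, assuming rdflib (the URIs are illustrative):

# A sketch: one statement as a raw (Entity, Attribute, Value) tuple and
# as an rdflib triple. Assumes rdflib; the URIs are illustrative.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

# EAV view: Identifiers for Entity and Attribute, a Literal for the Value.
eav = (EX.book1, EX.title, Literal("Wikinomics"))

# RDF view: exactly the same 3-tuple, asserted into a graph.
g = Graph()
g.add(eav)
print(g.serialize(format="nt"))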

John

On Sat, Apr 17, 2010 at 12:37 PM, Kingsley Idehen
kide...@openlinksw.com wrote:
 Richard Cyganiak wrote:

 Hi Kingsley,

 Regarding your blog post at

 http://www.openlinksw.com/dataspace/kide...@openlinksw.com/weblog/kide...@openlinksw.com%27s%20blog%20%5b127%5d/1624

 Great job -- I like it a lot, it's not as fuzzy as Tim's four principles,
 not as mired in detail as most of the concrete literature around linked
 data, and on the right level of abstraction to explain why we need to do
 certain things in linked data in a certain way. It's also great for
 comparing the strengths and weaknesses of different data exchange stacks.

 Thanks, happy its resonating.

 RDF has inadvertently caused mass distraction away from the fact that a
 common Data Model is the key to meshing heterogeneous data sources. People
 just don't buy or grok the data model aspect of RDF, so why continue
 fighting this battle, when all we want is mass comprehension, however we get
 there.

 A few comments:

 1. I'd like to see mentioned that identifiers should have global scope.

 Yes, will add that emphasis for sure. I guess Network might not
 necessarily emphasize that strongly enough.

 2. I'd prefer a list of the parts of a 3-tuple that reads:

     - an Identifier that names an Entity
     - an Identifier that names an Attribute
     - an Attribute Value, which may be an Identifier or a Literal (typed
 or untyped).

   This avoids using the new terms “Entity Identifier” and “Attribute
 Identifier”.

 No problem.

 3. “Structured Descriptions SHOULD be borne by Descriptor Resources” -- I
 think this one is incomprehensible, because “to bear” is such an unusual
 verb and has no clear connotations in technical circles. I'd encourage a
 different phrasing.

 Will think about that, getting the right phrase here is is challenging, so I
 am naturally open to suggestions etc..


 3b. Any chance of talking you into using “Descriptor Document” rather than
 “Descriptor Resource”?

 No problem, Descriptor Document it is :-)

 4. One thing that's left unclear: Can a Descriptor Resource carry multiple
 Structured Entity Descriptions or just a single one?

 Descriptor Documents are compound in that they can describe a single Entity
 or a Collection.

 5. Putting each term in quotes when first introduced is a good idea and
 helps -- you did it for the first few terms but then stopped.

 Writers exhaustion I guess, will fix.


 6. I'm tempted to add somewhere, “Descriptor Resources are Entities
 themselves.” But this might be a purposeful omission on your part?

 Yes, this is deliberate because I am trying to say: Referent is the
 Thing you describe by giving it a Name, so anything can be a Referent,
 including a Document (which has always been problematic in general RDF
 realm work, e.g. the failure to make links between a .rdf Descriptor
 Document and the actual Entity Descriptions it contains via
 primarytopic, describedby, and other relations).

 7. The last point talks about a “Structured Representation” of the
 Referent's Structured Description. The term hasn't been introduced.
 Shouldn't this just read “Descriptor Resource carrying the Referent's
 Structured Description”?

 Yes, so basically this is: s/bear/carry/g  :-)

 What's your preferred name for the entire thing? I'm tempted to call it
 “Kingsley's networked EAV model” or 

Re: What would you build with a web of data?

2010-04-10 Thread John Erickson
I'm going to set aside the question of problems and consider
*possibilities* for a moment.

I think linked open data offers profound opportunities for
community-driven applications; I'll use citizen science as a specific
example.

I was reminded of this as I listened to this week's episode of
Material World, in which ...Quentin Cooper hears how records from
two and a half centuries of nature-watching reveal the gradual advance
of spring, and what this says about climate change... see
http://bit.ly/ccFVFM

Check out the applet at nature's calendar
http://www.naturescalendar.org.uk/, based on user-contributed data,
and imagine what is possible both in the citizen-driven science realm
and across domains, into other areas.

To me the magic of linked data will be revealed in the applications
that are created in a matter of hours (if that!) upon hearing about a
particular data source becoming available (bird sightings! flowers
blooming! real-time bike race or marathon results! celebrity
sightings!) and published as linked data, thus allowing enthusiastic
meshers and mashers to create and share applications which may
become popular.

To power this, I think we need a class of platform that is as accessible
as blogs, letting non-technical enthusiasts publish, mesh and mash. Such
platforms need to be more than data hosts; they must also allow users
to easily create shareable active behaviors that filter and
otherwise transform data the way Yahoo! Pipes diddles with feeds and
other sources --- think Google Desktop Gadgets for Linked Data.

-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: KIT releases 14 billion triples to the Linked Open Data cloud

2010-04-01 Thread John Erickson
RE Figure 1: *Finally* we have an update to the July 2009 Web of
Data diagram!!!

Great work!!

On Thu, Apr 1, 2010 at 8:43 AM, Denny Vrandecic denny.vrande...@kit.edu wrote:
 No, that is left for future work (as said in the paper).

 Cheers,
 denny


 On Apr 1, 2010, at 12:41, Dan Brickley wrote:

 But I love it :) Do the numbers include dates?

 Dan

 On Thu, Apr 1, 2010 at 12:30 PM, Matthias Samwald samw...@gmx.at wrote:
 Hi Denny,

 I am sorry, but I have to voice some criticism of this project. Over the
 past two years, I have become increasingly wary of the excitement over large
 numbers of triples in the LOD community. Large numbers of triples
 don't necessarily mean that a dataset enables us to do anything novel or
 significantly useful. I think there should be a shift from focusing on
 quantity to focusing on quality and usefulness.

 Now the project you describe seems to be well-made, but it also exemplifies
 this problem to a degree that I have not seen before. You basically
 published a huge dataset of numbers, for the sake of producing a large
 number of triples. Your announcement mainly emphasis on how huge the dataset
 is, and the corresponding paper does the same. The paper gives a few
 application scenarios, I quote

 The added value of the paradigm shift initiated by our work cannot be
 underestimated.
 By endowing numbers with an own identity, the linked open data cloud
 will become treasure trove for a variety of disciplines. By using elaborate
 data
 mining techniques, groundbreaking insights about deep mathematical
 correspondences
 can be obtained. As an example, using our sample dataset, we were able
 to discover that there are significantly more odd primes than even ones, and
 even more excitingly a number contains 2 as a prime factor exactly if its
 successor does not.

 I am sorry, but this sounds a bit overenthusiastic. I see no paradigm
 shift, and I also don't see why your findings about prime numbers required
 you to publish the dataset as linked data. I also have troubles seeing the
 practical value of looking at the resource pages for each number with a
 linked data browser, but I am also not a mathematician.

 I am sorry for being a bit antagonistic, but we as a community should really
 try not to be seduced too easily by publishing ever-larger numbers of
 triples.

 Cheers,
 Matthias Samwald




 --
 From: Denny Vrandecic denny.vrande...@kit.edu
 Sent: Thursday, April 01, 2010 12:01 PM
 To: public-lod@w3.org
 Subject: KIT releases 14 billion triples to the Linked Open Data cloud

 We are happy to announce that the Institute AIFB at the KIT is releasing
 the biggest dataset until now to the Linked Open Data cloud. The Linked 
 Open
 Numbers project offers billions of facts about natural numbers, all readily
 available as Linked Data.

 Our accompanying peer-reviewed paper [1] gives further details on the
 background and implementation. We have integrated with external data 
 sources
 (linking DBpedia to all their 335 number entities) and also directly link 
 to
 the best-known linked open data browsers from the page.

 You can visit the Linked Open Numbers project at:
 http://km.aifb.kit.edu/projects/numbers/

 Or point your linked open data browser directly at:
 http://km.aifb.kit.edu/projects/numbers/n1

 We are happy to have increased the amount of triples on the Web by more
 than 14 billion triples, roughly 87.5% of the size of linked data web 
 before
 this release (see paper for details). We hope that the data set will find
 its serendipitous use.

 The data set and the publication mechanism was checked pedantically, and
 we expect no errors in the triples. If you do find some, please let us 
 know.
 We intend to be compatible with all major linked open data publication
 standards.

 About the AIFB

 The Institute AIFB (Applied Informatics and Formal Description Methods) at
 KIT is one of the world-leading institutions in Semantic Web technology.
 Approximately 20 researchers of the knowledge management research group are
 establishing theoretical results and scalable implementations for the 
 field,
 closely collaborating with the sister institute KSRI (Karlsruhe Service
 Research Institute), the start-up company ontoprise GmbH, and the Knowledge
 Management group at the FZI Research Center for Information Technologies.
 Particular emphasis is given to areas such as logical foundations, Semantic
 Web mining, ontology creation engineering and management, RDF data
 management, semantic web search, and the implementation of interfaces and
 tools. The institute is involved in many industry-university co-operations,
 both on a European and a national level, including a number of intelligent
 Web systems case studies.

 Website: http://www.aifb.kit.edu

 About KIT

 The Karlsruhe Institute of Technology (KIT) is the merger of the former
 Universität Karlsruhe (TH) and the former Forschungszentrum Karlsruhe. With
 

Re: Visual exploration of linked data with the Information Workbench and Microsoft Pivot

2010-03-26 Thread John Erickson
A reminder that Pivot requires:

* Microsoft .NET Framework 3.5 SP1
* Microsoft Internet Explorer 8
* Microsoft Windows 7 (XP not supported)
* Microsoft Windows Aero
* Pivot only displays in English (US).

Just sayin'...

On Fri, Mar 26, 2010 at 8:43 AM, Peter Haase peter.ha...@fluidops.com wrote:
 Hi all,

 we would like to announce a demonstrator showing Microsoft's Pivot as an
 interface for the visual exploration of linked data.
 The application uses Pivot as a frontend to the Information Workbench,
 integrating DBpedia and several other LOD data sets.

 Try it at http://iwb.fluidops.com/pivot

 Some key features are:

 - support for structured (SPARQL) and unstructured (keyword) queries
 - visual exploration and analysis of query results
 - query refinement and data exploration using pivot operations

 The demonstrator requires Microsoft Pivot, http://getpivot.com/.

 Regards,
 Peter

-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson



Re: A question - use 301 instead of 406?

2010-03-26 Thread John Erickson
Thank you, Richard, for that email; I think in making this
observation/asking this question you've pointed out the subtle role
that the generic document plays, esp. (from
http://www.w3.org/TR/cooluris/#r303gendocument):

...This has the advantage that clients can bookmark and further work
with the generic document. A user having a RDF-capable client could
bookmark the document, and mail it to another user (or device) which
then dereferences it and gets the HTML or the RDF view...

John
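
For concreteness, this is what pattern 1 looks like on the wire. The host,
paths, and response details below are invented for illustration and assume a
server configured per the Cool URIs pattern; they are not from any real
deployment:

    GET /id/alice HTTP/1.1
    Host: example.org
    Accept: application/rdf+xml

    HTTP/1.1 303 See Other
    Location: http://example.org/doc/alice

    GET /doc/alice HTTP/1.1
    Host: example.org
    Accept: application/rdf+xml

    HTTP/1.1 200 OK
    Content-Type: application/rdf+xml

Whatever representation is negotiated, the URI in the browser bar stays the
generic document http://example.org/doc/alice, which is the one a client can
safely bookmark or mail to another user.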

On Fri, Mar 26, 2010 at 9:22 AM, Richard Cyganiak rich...@cyganiak.de wrote:
 Hi Hugh,

 Thinking more about this, I'm resetting to the start of the thread and I
 have a question for you.

 The Cool URIs for the Semantic Web document, which is perhaps the canonical
 reference for the 303/conneg style of linked data publishing, lists two
 different design patterns for using 303 redirects with slash URIs:

 1. 303 URIs forwarding to One Generic Document
 http://www.w3.org/TR/cooluris/#r303gendocument

 2. 303 URIs forwarding to Different Documents
 http://www.w3.org/TR/cooluris/#r303uri

 AFAIK, RKBexplorer, like DBpedia and many other linked data sites, implements
 the second approach. When we tried to get the document through the TAG,
 TimBL insisted that the first approach is better and should come first in
 the Cool URIs document. So we did that. But the second approach was already
 deployed and has captured most of the mindshare and still is the default
 today, and the first approach has never really caught on.

 My question for you: Does the first approach solve your problem? By always
 303ing to a generic document, you'd see a document URL in the browser bar
 that could respond with either HTML or RDF. The variant-specific URIs would
 still be there but not be used in typical HTTP interactions. Does this solve
 the issue that motivated you here?

 Best,
 Richard


 On 23 Mar 2010, at 22:50, Hugh Glaser wrote:

 I am not sure I even dare ask this, but here goes.
 (This is prompted by a real application implementation - it is not just a
 hypothetical.)

 Assuming that we are in the usual situation of http://foo/bar doing a 303 to
 http://foo/bar.rdf when it gets an Accept: application/rdf+xml request for
 http://foo/bar, what should a server do when it gets an
 Accept: application/rdf+xml request for http://foo/bar.html ?

 OK, the answer is 406.

 But is this compatible with the principle of being as forgiving as possible
 as a server?

 I think it is clear what the agent wanted:
 Accept: application/rdf+xml http://foo/bar
 it is just that somehow the wrong URI got into the system.

 I know I have made the mistake of, for example, copying a dbpedia URI from
 the address bar when I was looking for the LD URI, and ended up wondering
 for a moment why
 Accept: application/rdf+xml http://dbpedia.org/page/Tim_Berners-Lee
 gives me a 406 before I remember I need to right-click on the About and
 copy the link.

 That's OK if all that happens is I use the wrong URI straight away.
 But what happens if I then enter it into a form that requires an LD URI,
 and it then perhaps goes into a DB, and becomes a small part of a later
 process? Simply put, the process will fail maybe years later, and the
 possibility and knowledge to fix it will be long gone.

 Maybe the form validation is substandard, but I can see this as a situation
 that will recur a lot, because the root cause is that the address bar URI
 changes from the NIR URI. And most HTML pages do not have links to the NIR
 of the page you are on - I am even told that it is bad practice to make the
 main label of the page a link to itself - wikipedia certainly doesn't,
 although it is available as the Article tab, which is not the normal thing
 of a page. So in a world where wikipedia itself became LD, it would not be
 clear to someone who wanted the NIR URI where to find it.

 So that is some of the context and motivation.
 If we were to decide to be more forgiving, what might be done?
 How about using 301?
 *Ducks*
 To save you looking it up, I have appended the RFC2616 section to this
 email.
 That is,
 Accept: application/rdf+xml http://foo/bar.html
 should 301 to http://foo/bar.
 It seems to me that it is basically doing what is required - it gives the
 client the expected access, while telling it (if it wants to hear) that it
 should correct the mistake.
 One worry (as Danius Michaelides pointed out to me) is that the caching may
 need careful consideration - should the response indicate that it is not
 cacheable, or is that not necessary?

 So that's about it.
 I am unhappy that users doing the obvious thing might get frustrated trying
 to find the URIs for their Things, so I really want a solution that is not
 just 406.
 Are there other ways of being nice to users, without putting a serious
 burden on the client software?
 I look forward to the usual helpful and thoughtful responses!

 By the way, I see no need to 301 to http://foo/bar if you get an
 Accept: text/html http://foo/bar.rdf 
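
To make the proposal concrete, here is a minimal sketch of the forgiving
behaviour discussed above, written as a self-contained Python WSGI app. The
paths, the suffix handling, and the naive substring test on the Accept header
are all invented for illustration; a real server would parse q-values and
handle more media types:

    from wsgiref.simple_server import make_server

    RDF_TYPE = 'application/rdf+xml'

    def app(environ, start_response):
        path = environ.get('PATH_INFO', '')
        accept = environ.get('HTTP_ACCEPT', '')
        # The case motivated above: an RDF-wanting client has ended up
        # holding the .html variant URI. Rather than answering 406, send
        # a 301 to the generic URI so the client can renegotiate and, if
        # it listens, correct the URI it has stored.
        if path.endswith('.html') and RDF_TYPE in accept:
            generic = path[:-len('.html')]
            start_response('301 Moved Permanently', [
                ('Location', generic),
                # The caching worry raised in the thread: keep the
                # redirect out of caches while the behaviour is
                # experimental.
                ('Cache-Control', 'no-store'),
            ])
            return [b'']
        # Otherwise serve the variant as usual (stubbed here).
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'variant served as usual\n']

    if __name__ == '__main__':
        make_server('', 8000, app).serve_forever()

Note the sketch only redirects in the direction argued for above (an RDF
Accept header against the .html variant); per the last remark, the reverse
case is left to answer normally.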

Re: Conneg representation equivalence

2010-03-12 Thread John Erickson
Nathan nat...@webr3.org wrote:
 Is it correct that all representations must have consistent fragment 
 identifiers in order to be considered equivalent?
 A fragment identifier should not identify different things in different 
 representations. (Though it may be unrepresented in some or all of the 
 representations.)

From http://www.w3.org/TR/webarch/#frag-coneg , the fragment
identifier must be defined consistently by the representations; the
*provider* decides when definitions of fragment identifier semantics
are sufficiently consistent...

 If I recall correctly the URI RFC..., the semantics of fragment identifiers
 depend on the retrieved content-type. So why would they *have* to identify
 the same thing?

From http://www.w3.org/TR/webarch/#frag-coneg , Individual data
formats may define their own rules for use of the fragment identifier
syntax for specifying different types of subsets, views, or external
references that are identifiable as secondary resources by that media
type.

 That being said, I agree it sounds like a good practice. Especially if you 
 consider an RDF/XML and a Turtle representation of the same RDF graph... If 
 their fragment identifier were not consistent, that would be a serious 
 headache... But is this rule written somewhere?

The interpretation in http://www.w3.org/TR/webarch/#frag-coneg
suggests that it would be correct either for the fragment identifier
to be interpreted by the RDF/XML and Turtle representations
consistently --- referring to a semantically consistent secondary
representation --- or for an interpretation not to be defined at all.
It is not acceptable for representations with an interpretation
defined to interpret the fragment identifier such that semantically
inconsistent secondary representations are returned. Interpret them
consistently, or don't interpret them, but don't do it inconsistently!

So maybe an answer to Nathan's question needs qualification; IF for
each representation of a resource an interpretation for a given
fragment identifier has been defined, AND we assume that the server is
exhibiting correct behavior, THEN we must accept that the secondary
representations meet the provider's definition of consistent.

Consistency is not the same as equivalence; two representations might
return consistent secondary representations that cannot be
considered equal because they are of entirely different content
types.

-- 
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
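
As a concrete check of the consistency requirement discussed above, here is
a small sketch using rdflib; the example.org document and its #alice
fragment are invented for illustration. It parses the same description from
Turtle and from RDF/XML and verifies that the fragment URI carries the same
statements in both representations:

    from rdflib import Graph, URIRef

    TTL = """
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    <http://example.org/doc#alice> foaf:name "Alice" .
    """

    RDFXML = """
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <rdf:Description rdf:about="http://example.org/doc#alice">
        <foaf:name>Alice</foaf:name>
      </rdf:Description>
    </rdf:RDF>
    """

    alice = URIRef("http://example.org/doc#alice")
    g_ttl = Graph().parse(data=TTL, format="turtle")
    g_xml = Graph().parse(data=RDFXML, format="xml")

    # If both serializations say the same things about #alice, this pair
    # of representations meets the provider's consistency requirement for
    # that fragment identifier.
    assert set(g_ttl.predicate_objects(alice)) == set(g_xml.predicate_objects(alice))
    print("#alice is interpreted consistently across both representations")

Of course, passing for one hand-picked pair of serializations only
illustrates the requirement; it is the provider who decides when the
definitions are sufficiently consistent across all representations served.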