WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread William Waites
What does WebID have to do with JSON? They're somehow representative
of two competing trends.

The RDF/JSON, JSON-LD, etc. work is supposed to be about making it
easier to work with RDF for your average programmer, to remove the
need for complex parsers, etc. and generally to lower the barriers.

The WebID arrangement is about raising barriers. Not intended to be
the same kind of barriers, certainly the intent isn't to make
programmers' lives more difficult, rather to provide a good way to do
distributed authentication without falling into the traps of PKI and
such.

While I like WebID, and I think it is very elegant, the fact is that I
can use just about any HTTP client to retrieve a document whereas to
get rdf processing clients, agents, whatever, to do it will require
quite a lot of work [1]. This is one reason why, for example, 4store's
arrangement of /sparql/ for read operations and /data/ and /update/
for write operations is *so* much easier to work with than Virtuoso's
OAuth and WebID arrangement - I can just restrict access using all of
the normal tools like apache, nginx, squid, etc..
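Concretely, the sort of thing I mean is a few lines of ordinary nginx config (the paths follow 4store's convention; the backend port and allowed address are illustrative, not verified defaults):

```nginx
# Public reads: anyone may query the SPARQL endpoint
location /sparql/ {
    proxy_pass http://127.0.0.1:8000;
}

# Writes: restrict /data/ and /update/ to trusted hosts
location ~ ^/(data|update)/ {
    allow 192.0.2.10;    # e.g. the application server
    deny  all;
    proxy_pass http://127.0.0.1:8000;
}
```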

So in the end we have some work being done to address the perception
that RDF is difficult to work with and on the other hand a suggestion
of widespread putting in place of authentication infrastructure which,
whilst obviously filling a need, stands to make working with the data
behind it more difficult.

How do we balance these two tendencies?

[1] examples of non-WebID aware clients: rapper / rasqal, python
rdflib, curl, the javascript engine in my web browser that doesn't
properly support client certificates, etc.
-- 
William Waites  mailto:w...@styx.org
http://river.styx.org/ww/  sip:w...@styx.org
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45



Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Leigh Dodds
Hi,

On 22 June 2011 15:41, William Waites w...@styx.org wrote:
 What does WebID have to do with JSON? They're somehow representative
 of two competing trends.

 The RDF/JSON, JSON-LD, etc. work is supposed to be about making it
 easier to work with RDF for your average programmer, to remove the
 need for complex parsers, etc. and generally to lower the barriers.

 The WebID arrangement is about raising barriers. Not intended to be
 the same kind of barriers, certainly the intent isn't to make
 programmer's lives more difficult, rather to provide a good way to do
 distributed authentication without falling into the traps of PKI and
 such.

 While I like WebID, and I think it is very elegant, the fact is that I
 can use just about any HTTP client to retrieve a document whereas to
 get rdf processing clients, agents, whatever, to do it will require
 quite a lot of work [1]. This is one reason why, for example, 4store's
 arrangement of /sparql/ for read operations and /data/ and /update/
 for write operations is *so* much easier to work with than Virtuoso's
 OAuth and WebID arrangement - I can just restrict access using all of
 the normal tools like apache, nginx, squid, etc..

 So in the end we have some work being done to address the perception
 that RDF is difficult to work with and on the other hand a suggestion
 of widespread putting in place of authentication infrastructure which,
 whilst obviously filling a need, stands to make working with the data
 behind it more difficult.

 How do we balance these two tendencies?

By recognising that often we just need to use existing technologies
more effectively and more widely, rather than throw more technology at
a problem, thereby creating an even greater education and adoption
problem?

Cheers,

L.

-- 
Leigh Dodds
Programme Manager, Talis Platform
Mobile: 07850 928381
http://kasabi.com
http://talis.com

Talis Systems Ltd
43 Temple Row
Birmingham
B2 5LS



Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Kingsley Idehen

On 6/22/11 3:41 PM, William Waites wrote:

While I like WebID, and I think it is very elegant, the fact is that I
can use just about any HTTP client to retrieve a document whereas to
get rdf processing clients, agents, whatever, to do it will require
quite a lot of work [1]. This is one reason why, for example, 4store's
arrangement of /sparql/ for read operations and /data/ and /update/
for write operations is *so* much easier to work with than Virtuoso's
OAuth and WebID arrangement - I can just restrict access using all of
the normal tools like apache, nginx, squid, etc..

Huh?

WebID plus SPARQL is about making an Endpoint with ACLs. ACL membership 
is driven by WebID for people, organizations, or groups (of either).


Don't really want to get into a Virtuoso vs 4-Store argument, but do 
explain to me how the convention you espouse enables me to confine access 
to a SPARQL endpoint for:


A person identified by a URI based Name (WebID) that is a member of a 
foaf:Group (which also has its own WebID).


How does this approach leave ACL membership management to designated 
members of the foaf:Group?


Again, don't wanna do a 4-Store vs Virtuoso, but I really don't get your 
point re. WebID and the fidelity it brings to data access in general. 
Also note, SPARQL endpoints are but one type of data access address. 
WebID protects access to data accessible via Addresses by implicitly 
understanding the difference between a generic Name and a Name 
specifically used as a Data Source Address or Location.




--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen







Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Dave Reynolds
On Wed, 2011-06-22 at 15:52 +0100, Leigh Dodds wrote: 
 Hi,
 
 On 22 June 2011 15:41, William Waites w...@styx.org wrote:
  What does WebID have to do with JSON? They're somehow representative
  of two competing trends.
 
  The RDF/JSON, JSON-LD, etc. work is supposed to be about making it
  easier to work with RDF for your average programmer, to remove the
  need for complex parsers, etc. and generally to lower the barriers.
 
  The WebID arrangement is about raising barriers. Not intended to be
  the same kind of barriers, certainly the intent isn't to make
  programmer's lives more difficult, rather to provide a good way to do
  distributed authentication without falling into the traps of PKI and
  such.
 
  While I like WebID, and I think it is very elegant, the fact is that I
  can use just about any HTTP client to retrieve a document whereas to
  get rdf processing clients, agents, whatever, to do it will require
  quite a lot of work [1]. This is one reason why, for example, 4store's
  arrangement of /sparql/ for read operations and /data/ and /update/
  for write operations is *so* much easier to work with than Virtuoso's
  OAuth and WebID arrangement - I can just restrict access using all of
  the normal tools like apache, nginx, squid, etc..
 
  So in the end we have some work being done to address the perception
  that RDF is difficult to work with and on the other hand a suggestion
  of widespread putting in place of authentication infrastructure which,
  whilst obviously filling a need, stands to make working with the data
  behind it more difficult.
 
  How do we balance these two tendencies?
 
 By recognising that often we just need to use existing technologies
 more effectively and more widely, rather than throw more technology at
 a problem, thereby creating an even greater education and adoption
 problem?

+1

Don't raise barriers to linked data use/publication by tying it to
widespread adoption and support for WebID.

Dave





Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Kingsley Idehen

On 6/22/11 4:08 PM, Dave Reynolds wrote:

On Wed, 2011-06-22 at 15:52 +0100, Leigh Dodds wrote:

Hi,

On 22 June 2011 15:41, William Waites w...@styx.org wrote:

What does WebID have to do with JSON? They're somehow representative
of two competing trends.

The RDF/JSON, JSON-LD, etc. work is supposed to be about making it
easier to work with RDF for your average programmer, to remove the
need for complex parsers, etc. and generally to lower the barriers.

The WebID arrangement is about raising barriers. Not intended to be
the same kind of barriers, certainly the intent isn't to make
programmer's lives more difficult, rather to provide a good way to do
distributed authentication without falling into the traps of PKI and
such.

While I like WebID, and I think it is very elegant, the fact is that I
can use just about any HTTP client to retrieve a document whereas to
get rdf processing clients, agents, whatever, to do it will require
quite a lot of work [1]. This is one reason why, for example, 4store's
arrangement of /sparql/ for read operations and /data/ and /update/
for write operations is *so* much easier to work with than Virtuoso's
OAuth and WebID arrangement - I can just restrict access using all of
the normal tools like apache, nginx, squid, etc..

So in the end we have some work being done to address the perception
that RDF is difficult to work with and on the other hand a suggestion
of widespread putting in place of authentication infrastructure which,
whilst obviously filling a need, stands to make working with the data
behind it more difficult.

How do we balance these two tendencies?

By recognising that often we just need to use existing technologies
more effectively and more widely, rather than throw more technology at
a problem, thereby creating an even greater education and adoption
problem?

+1

Don't raise barriers to linked data use/publication by tying it to
widespread adoption and support for WebID.


-1

You are misunderstanding WebID and what it delivers.

I am popping out, but I expect a response. Should Henry not put this 
misconception to REST, I'll certainly reply.


Got to go do some walking for now :-)

Dave







--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen








Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread William Waites
* [2011-06-22 16:00:49 +0100] Kingsley Idehen kide...@openlinksw.com wrote:

] explain to me how the convention you espouse enables me confine access 
] to a SPARQL endpoint for:
] 
] A person identified by URI based Name (WebID) that a member of a 
] foaf:Group (which also has its own WebID).

This is not a use case I encounter much. Usually I have some
application code that needs write access to the store and some public
code (maybe javascript in a browser, maybe some program run by a third
party) that needs read access.

If the answer is to teach my application code about WebID, it's going
to be a hard sell because really I want to be working on other things
than protocol plumbing.

If you then go further and say that *all* access to the endpoint needs
to use WebID because of resource-management issues, then every client
now needs to do a bunch of things that end with shaving a yak before
they can even start on working on whatever they were meant to be
working on.

On the other hand, arranging things so that access control can be done
by existing tools without burdening the clients is a lot easier, if
less general. And easier is what we want working with RDF to be.

Cheers,
-w

-- 
William Waites  mailto:w...@styx.org
http://river.styx.org/ww/  sip:w...@styx.org
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45



WebID and client tools - was: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Henry Story

On 22 Jun 2011, at 16:41, William Waites wrote:

 
 [1] examples of non-WebID aware clients: rapper / rasqal, python
 rdflib, curl, the javascript engine in my web browser that doesn't
 properly support client certificates, etc.

curl is WebID aware. You just need to get yourself a certificate for your 
crawler, and then use

  -E/--cert  certificate[:password] 

arguments to pass that certificate if the server requests it. 

The specs for HTTPS client certs are so old and well established that 
support for them is built by default into most libraries. So it would not 
take a lot to expose it, if it is not already exposed in the libs you mention.
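For example, Python's standard library already has what's needed; a minimal sketch (the cert/key file names and URL are made up, not a real deployment) could be:

```python
import ssl
import urllib.request

def webid_opener(certfile, keyfile):
    """Return a urllib opener that presents the given client
    certificate (the WebID cert) during the TLS handshake."""
    ctx = ssl.create_default_context()
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=ctx))

# Hypothetical usage -- supply your own cert/key and endpoint:
# opener = webid_opener("webid-cert.pem", "webid-key.pem")
# data = opener.open("https://data.example.org/protected/foaf.rdf").read()
```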

But thanks for this new FAQ [1]. We'll try to fill in the details on how to 
work with the libs above using webid.

There is a Javascript layer for https too, but what is the point of doing that 
there? Let the browser do the https for you.

Henry

[1] http://www.w3.org/wiki/Foaf%2Bssl/FAQ

Social Web Architect
http://bblfish.net/




Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Henry Story

On 22 Jun 2011, at 17:14, William Waites wrote:

 * [2011-06-22 16:00:49 +0100] Kingsley Idehen kide...@openlinksw.com wrote:
 
 ] explain to me how the convention you espouse enables me confine access 
 ] to a SPARQL endpoint for:
 ] 
 ] A person identified by URI based Name (WebID) that a member of a 
 ] foaf:Group (which also has its own WebID).
 
 This is not a use case I encounter much. Usually I have some
 application code that needs write access to the store and some public
 code (maybe javascript in a browser, maybe some program run by a third
 party) that needs read access.
 
 If the answer is to teach my application code about WebID, it's going
 to be a hard sell because really I want to be working on other things
 than protocol plumbing.

So you're in luck: https is shipped in all client libraries, so you just need
to get your application a WebID certificate. That should be as easy as one POST
request. At least for browsers it's a one-click affair for the end user, as
shown here:

   http://bblfish.net/blog/2011/05/25/

It would be easy to do the same for robots. In fact, that is why Bruno
Harbulot and Mike Jones at the University of Manchester are using WebID for
their Grid computing work: it makes access control to the grid so much easier
than any of the other top-heavy technologies available.

 If you then go further and say that *all* access to the endpoint needs
 to use WebID because of resource-management issues, then every client
 now needs to do a bunch of things that end with shaving a yak before
 they can even start on working on whatever they were meant to be
 working on.

You can be very flexible there. If users have a WebID you give them a better
service; that seems a fair deal. You don't need your whole site to be WebID
enabled. You could use cookie auth on http endpoints and, for clients that
don't have a cookie, redirect them to an https endpoint where they can auth
with WebID. If they don't have that, ask them to auth with something like
OpenID. I'd say pretty soon your crawlers and users will be a lot happier
with WebID.
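That fallback chain can be sketched in a few lines (the function name and return values are illustrative, not any real API):

```python
def pick_auth(headers, has_client_cert):
    """Decide how to authenticate a request, preferring the
    cheapest mechanism the client already supports."""
    if "Cookie" in headers:
        return "cookie"              # existing http session
    if has_client_cert:
        return "webid"               # TLS client cert -> WebID verification
    return "redirect-to-openid"      # cert-less clients fall back to OpenID

# e.g. a crawler presenting a client certificate:
# pick_auth({}, has_client_cert=True)
```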

 On the other hand, arranging things so that access control can be done
 by existing tools without burdening the clients is a lot easier, if
 less general. And easier is what we want working with RDF to be.

All your tools are probably already WebID enabled. It's now just a matter of
giving a foaf profile to yourself and your robots, getting a cert with the
WebID in there, and getting going. That seems a lot easier than building
crawlers, or semweb clients, or semweb servers, or pretty much anything.

Henry

 
 Cheers,
 -w
 
 -- 
 William Waites  mailto:w...@styx.org
 http://river.styx.org/ww/  sip:w...@styx.org
 F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45
 

Social Web Architect
http://bblfish.net/




Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Kingsley Idehen

On 6/22/11 4:14 PM, William Waites wrote:

* [2011-06-22 16:00:49 +0100] Kingsley Idehen kide...@openlinksw.com wrote:

] explain to me how the convention you espouse enables me confine access
] to a SPARQL endpoint for:
]
] A person identified by URI based Name (WebID) that a member of a
] foaf:Group (which also has its own WebID).

This is not a use case I encounter much. Usually I have some
application code that needs write access to the store and some public
code (maybe javascript in a browser, maybe some program run by a third
party) that needs read access.


I am assuming you seek multiple users of your end product (the 
application), right?


I assume all users aren't equal i.e., they have varying profiles, right?

If the answer is to teach my application code about WebID, it's going
to be a hard sell because really I want to be working on other things
than protocol plumbing.


Remember, I like to take the problem solving approach to technology. 
Never technology for the sake of it, never.


There is a fundamental problem: you seek more than one user for your apps, 
and all users aren't the same, profile wise.


A simple case in point, right here and right now: this thread is about a 
critical challenge (always there, btw) that Linked Data propagation 
unveils. The very same problems hit us in the early '90s re. ODBC, i.e., 
how do we control access to data bearing in mind ODBC application user 
profile variations? Should just anyone be able to access pensions and 
payroll data, to take a very obvious example?


The gaping security hole that ODBC introduced to the enterprise is still 
doing damage to this very day. I won't mention names, but as you hear 
about security breaches, do a little digging about what's behind many 
of these systems. Hint: a relational database, and free ODBC, JDBC, 
OLE-DB, ADO.NET providers, in many cases. With one of those libraries 
on a system, you can get into the RDBMS via social engineering or, in 
the absolute worst case, by throwing CPUs at password cracking.


Way back then we used the Windows INI structure to construct a graph based 
data representation format that we called the session rules book. Via 
these rules we enabled organizations to say: Kingsley can only access 
records in certain ODBC/JDBC/OLE-DB/ADO.NET accessible databases if he 
meets certain criteria that include the IP address he logs in from, his 
username, client application name, arbitrary identifiers that the system 
owner could conjure up, etc. The only drag for us was that it was little 
OpenLink rather than a behemoth like Microsoft.


When we encountered RDF and the whole Semantic Web vision, we realized 
there was a standardized route for addressing these fundamental issues. 
This is why WebID is simply a major deal. It is inherently quite 
contradictory to push Linked Data and push back at WebID; that's second 
only to rejecting the essence of URI abstraction by conflating Names and 
Addresses re. the fine grained data access that addresses troubling 
problems of yore.




If you then go further and say that *all* access to the endpoint needs
to use WebID because of resource-management issues, then every client
now needs to do a bunch of things that end with shaving a yak before
they can even start on working on whatever they were meant to be
working on.



No.

This is what we (WebID implementers) are saying:

1. Publish Linked Data
2. Apply Linked Data prowess to the critical issue of controlled access 
to Linked Data Spaces.


Use Linked Data to solve a real problem. In doing so we'll achieve the 
critical mass we all seek because the early adopters of Linked Data will 
be associated with:


1. Showing how Linked Data solves a real problem
2. Using Linked Data to make its use and consumption easier for others 
who seek justification and use case examples en route to full investment.



On the other hand, arranging things so that access control can be done
by existing tools without burdening the clients is a lot easier, if
less general. And easier is what we want working with RDF to be.


It has nothing to do with RDF. It has everything to do with Linked Data 
i.e., Data Objects endowed with Names that resolve to their 
Representations. Said representations take the form of EAV/SPO based 
graphs. RDF is one of the options for achieving this goal via a syntax 
with high semantic fidelity (most of that comes from granularity 
covering datatypes and locale issues).


What people want, and have always sought, is open access to relevant 
data from platforms and tools of their choice without any performance or 
security compromises. HTTP, URIs, and exploitation of the full URI 
abstraction as a mechanism for graph based whole data representation, 
without graph format/syntax distractions, is the beachhead we need right 
now. The semantic fidelity benefits of RDF re. datatypes and locale 
issues come after that. Thus, the first goal is to actually simplify Linked 
Data, and make its use and exploitation practical, starting with 
appreciation of