Sorry for not being able to reply earlier. The Federated Social Web Conference in Berlin coincided with the WebID get-together, so I was extremely busy. There were a lot of discussions on the Social Web, and I even presented Clerezza.
http://d-cent.org/fsw2011/

I read through the first part of Reto's reply and answered that here. The issues all cluster around the theme of trust: the complexity of determining it, and how that clashes with the idea of being able to determine trust by default for other people. It turns out, I argue, that adding the content graph to a remote graph can reduce trust in the result rather than enhance it. It can furthermore lead to information leakage. It also makes many use cases impossible to implement, and it makes choices that will make efficient reasoning difficult in the future.

On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:

>>
>> Also there will be contradictions in the information on the web. Some
>> people may trust some graphs, other trust others.
>
> Right, that's why the GraphNodeProvider trusts only the content-graph, which
> is trusted qua being a platform service) and the graph resulting from
> dereferencing the resource (trusted by conventional web-trust)

As we saw, the content graph returned by the Provider is a complex union of other graphs, among them a documentation, config, web-resources, and enrichment graph, plus some logic regarding WebIdGraphService and other services. Combine that with the fact that judgements of trust are context dependent, user dependent and task dependent, among other things, and one can predict that making simplified trust decisions for others will lead to security holes and many other issues.

As I understand it, Clerezza is built to make it possible for users to upload new packages. These may contain documentation, which could contain relations that are perhaps out of date, or are just hypothetical, and so start interfering in odd ways with other applications... It would be a pity to lose the ability to give one's friends the right to add limited new features to Clerezza, for fear that they may interfere with one's work in other unrelated areas.
Consider that in the previous post you wrote:

> Modules can write to the content graph or add temporary additions to it.
> Actually writing to the content graph should happen when public and trusted
> information is added.

Who decides when information is public and trusted? This is not something that comes all at once. Nor is it something that is determined forever. Marriages would not break up so frequently were it otherwise. And most companies have more than two people working for them, with constantly changing trust relations, roles, etc.

> An information is considered trusted when added by a
> user with respective permission or verified by privileged code (e.g. that
> allows the public to add see-also references).

Yes, this is one potential method for determining trust. It will work for some apps, but for many others it will capture a woefully inadequate notion of trust. Consider that the above means that the aggregate trust one can put in the content graph will depend on the verification ability of the weakest code installed in Clerezza. So now the trust one can have in the result returned by Clerezza when asking for

  <http://www.w3.org/People/Berners-Lee/card>

is not the trust one has in the above resource, but the trust one has in the weakest link of a specific installation of Clerezza.

>> Graphs can be merged easily in RDF - IF they are believed both to be true.
>> But what is believed to be true will depend on what possible world you
>> believe yourself to be in. I argued this in "Beatnik: change your mind"
>> in more detail, if that helps for people following this discussion
>>
>> http://blogs.oracle.com/bblfish/entry/beatnik_change_your_mind
>>
>> From the point of view of WebID and security I want to be able to tell WHO
>> said what. In many applications being able to be very clear about where
>> something was said is going to be essential to giving good feedback. Some
>> example coming from the field I am working on below.
>>
>
>> So for a foaf-browser, I want to know when TimBl declares someone to be a
>> friend, and differentiate that from when someone declares himself to be a
>> friend of TimBL, which is a very different thing.
>
> With the current service you have what TimBL says plus the platform-wide
> truths of the content-graph, this may contain things like a link back to you
> (the owner of the platform instance) or a statement like : TimBL rdf:type
> ex: Spammer which might not be published in TimBL's profile

Yes, that is great. I would love to be able to have that information when I need it. But is the content graph the right place to put information about spammers? Is a simple ssp application that publishes a graph of relations on TimBL now also going to publish that he is a spammer?

>
>> When I get Dan Brickley's graph I may want to know all the people he
>> mentions in his foaf profile - even if he does not mention them as
>> foaf:knows related to him.
>
> does this provide a new point?

Yes. It shows that your merger of a graph with a whole bunch of other information makes certain types of queries impossible.

Use Case: When people type or drag and drop URLs in a form they will usually not be precise with their URLs. So it will be important to find the minimum published context, and the people in that context, to be able to help them select a person. Presumably they meant some person in that page. So one needs to search the people in that page to help them select the right one.

We have similar issues with reasoning. Reasoning over small graphs is going to be a lot more efficient than over the whole database. So one may also be interested in just minimal reasoning, typing all foaf:knows subjects and objects in a remote foaf file as foaf:Person, in order to locate the person the user is interested in.
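
To make the use case concrete, here is a minimal sketch in plain Python (not Clerezza's API) of what keeping graphs separate buys you. Everything in it is an assumption made up for illustration: the SourcedStore class, the people_in helper, the example.org URIs and the "urn:x-example:content-graph" name. The only reasoning applied is the minimal one described above: both ends of a foaf:knows statement are treated as people.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class SourcedStore:
    """Hypothetical store that files every triple under the document it came from."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, source, subject, predicate, obj):
        self._graphs[source].add((subject, predicate, obj))

    def graph(self, source):
        """Only the triples published in one document."""
        return set(self._graphs.get(source, ()))

    def union(self):
        """Everything merged together; provenance is lost."""
        return set().union(*self._graphs.values())


def people_in(triples):
    """Minimal reasoning: explicit foaf:Person nodes plus both ends of foaf:knows."""
    people = set()
    for s, p, o in triples:
        if p == RDF_TYPE and o == FOAF + "Person":
            people.add(s)
        elif p == FOAF + "knows":
            people.update((s, o))
    return people


# Made-up data: one remote profile document and a local content graph.
profile_doc = "http://example.org/alice/foaf"
content_graph = "urn:x-example:content-graph"

store = SourcedStore()
store.add(profile_doc, "http://example.org/alice#me", FOAF + "knows",
          "http://example.org/bob#me")
store.add(profile_doc, "http://example.org/carol#me", RDF_TYPE, FOAF + "Person")
store.add(content_graph, "http://example.org/docwriter#me", RDF_TYPE,
          FOAF + "Person")

# People the user could have meant when dropping the profile URL in a form.
candidates = people_in(store.graph(profile_doc))

# The same question asked of the merged graph also returns the documentation writer.
everyone = people_in(store.union())
```

With the graphs kept apart, the candidate list stays limited to the people in that one page, and the reasoning only touches that page's triples. Merging can still be done on demand via union(), when the caller decides it is appropriate.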
When receiving a new graph one may also first want to check that it does not by itself contain any contradictions, before merging it - even if only virtually - with the rest of the DB.

>
>> If the GraphNodeProvider returns a union graph of the documentation graph,
>
> Again no, we're not returning a union graph we're returning a GraphNode, the
> underlying graph is an implementation detail (was think if the
> getGraph-method could be made less visible (protected or private) to avoid
> this confusion)
>
>
>> content graph,... and his foaf profile then when searching for all the
>> foaf:Person
>
> You don't search a GraphNode for all foaf:Person but the GraphNode
> represents the foaf:Person you asked for.

Well, I could also ask for the GraphNode of the foaf:Person class, and then people/-RDF.`type` would return all instances.

>
>> I will get the documentation writers too, the writers of content in the
>> content graph, and who knows what else...
>
> you will have properties pointing from that persons to all the comments he
> left on the local instance, which can be quite handy (and which are from the
> underlying content graph as they are probably not also contained in the
> remote foaf:profile)

That can be quite handy in SOME circumstances, and not handy at all in others.

So in summary, the "trust" decisions made by the GraphNodeProvider do not increase trust but may well reduce it; they could end up creating information leakage, they reduce the number of things that can be done, they make some very arbitrary trust decisions, and they make reasoning more difficult.

The decision to return this union of graphs by default is thus unintuitive and unhelpful. It packs in a huge number of decisions that are not evident when asking for a GraphNode. These decisions are of course not bad in all circumstances. There are many cases where they may be very good.
But it is better to have those decisions be clear and not tie them into the core of Clerezza, where the returning of a simple UriRef requires one to be aware of all these decisions.

Henry
