Re: Issues of trust -- Re: [VOTE] Accept the proposed patch of CLEREZZA-540

Reto Bachmann-Gmuer Mon, 06 Jun 2011 03:51:14 -0700

On Mon, Jun 6, 2011 at 11:55 AM, Henry Story <[email protected]> wrote:


> Sorry for not being able to reply earlier. The Federated Social Web
> Conference in Berlin co-occurred with the WebId get together and so I was
> extremely busy. There were a lot of discussions on the Social Web, and I
> even presented Clerezza.
>
>   http://d-cent.org/fsw2011/
>
> I read through the first part of Reto's reply and answered that here. The
> issues were grouped together all around the theme of trust, the complexity
> of determining it, and how that clashes with the idea of being able to
> determine trust by default for other people. It turns out, I argue, that
> adding the content graph to a remote graph can reduce trust in the result
> rather than enhance it. It furthermore can lead to information leakage. It
> also makes many use cases impossible to implement as well as making choices
> that will make efficient reasoning in the future difficult.
>
> On 3 Jun 2011, at 13:29, Reto Bachmann-Gmuer wrote:
> >>
> >> Also there will be contradictions in the information on the web. Some
> >> people may trust some graphs, other trust others.
> >
> > Right, that's why the GraphNodeProvider trusts only the content-graph
> > (which is trusted qua being a platform service) and the graph resulting
> > from dereferencing the resource (trusted by conventional web-trust)
>
> As we saw the content graph returned by the Provider is a complex union of
> other graphs, of which a documentation, config, web-resources, and
> enrichment graph, plus some logic regarding WebIdGraphService, and other
> services.
>
> Combine that with the fact that judgements of trust are context dependent,
> user dependent and task dependent among other things, and one can predict
> that making simplified trust decisions for others will lead to security
> holes, and many other issues.
>
> So as I understood Clerezza is built so as to make it possible for users to
> upload new packages. These may contain documentation, which could contain
> relations that are perhaps out of date, or are just hypothetical, and so
> start interfering in odd ways with other applications... It would be a pity
> to lose the ability to give rights to one's friends to add limited new
> features to Clerezza for fear that they may interfere with one's work in
> other unrelated areas.
>
The ability for users to upload bundles running with their permission is a
feature that should be used with the greatest care. In fact there's no
protection against DoS attacks; also, the user bundles might start services
other than JAX-RS resources scoped to their bundle prefix. You correctly
note that via the documentation graph they are able to add content to the
virtual content graph without having explicit permission to do so. This
is certainly an issue.

However, this is not related to CLEREZZA-540. The content graph is a feature
that exists independently of CLEREZZA-540. It is a fact that, as you point
out, users who have the (in any case very high) privilege to upload bundles
can add content to the content graph; this means that they can, for example,
add arbitrary content to the start page of the local Clerezza instance.
Annotating remote resources doesn't seem to be something that should be more
restricted than annotating resources served by the local instance.



>
> Consider that in the previous post you wrote:
>
> > Modules can write to the content graph or add temporary additions to it.
> > Actually writing to the content graph should happen when public and
> > trusted information is added.
>
> Who decides when information is public and trusted? This is not something
> that comes all at once. It is not furthermore something that is determined
> for ever. Marriages would not break up so frequently were it otherwise. And
> most companies have more than 2 people working for them, with constantly
> changing trust relations, roles, etc...
>
You pointed out an issue above, but basically it is whoever has read-write
permission on the content graph who decides which triples are public and
trusted.


>
> > Information is considered trusted when added by a user with the
> > respective permission or verified by privileged code (e.g. code that
> > allows the public to add see-also references).
>
> Yes, this is one potential method for determining trust. It will work for
> some apps, but for many others this will capture a woefully inadequate
> notion of trust. Consider that the above will mean that the aggregate trust
> one can put in the content graph, will depend on the verification ability of
> the weakest code installed in Clerezza.
>
Yes, as of now the weakest code can do bad things. Clerezza does not (yet)
have a powerful sandboxing mechanism. We should add warnings about this, but
this doesn't alter the soundness of having a place for instance-wide trusted
and public data. This place (the content graph) is not introduced by the
issue you are vetoing.



>
> So now the trust one can have in the result returned by Clerezza when
> asking for
>
>   <http://www.w3.org/People/Berners-Lee/card>
>
> Is not the trust one has in the above resource,

You mean in the infrastructure used for getting a representation of that
resource, i.e. the server and the network.

> but the trust one has in the weakest link of a specific installation of
> Clerezza.
>

Which makes perfect sense: otherwise you wouldn't ask a specific Clerezza
instance but would dereference the resource directly.

[...]

> >
> > With the current service you have what TimBL says plus the platform-wide
> > truths of the content-graph; this may contain things like a link back to
> > you (the owner of the platform instance) or a statement like: TimBL
> > rdf:type ex:Spammer, which might not be published in TimBL's profile
>
> Yes, that is great. I would love to be able to have that information when I
> need it. But is the content graph the right place to put information about
> spammers?
>
It seems like non-classified and trusted information (trusted at the medium
operational trust level of the content graph).

> Is a simple ssp application that publishes a graph of relations on TimBL
> also going to now publish that he is a spammer?
>
ssp is a rendering mechanism; here we are talking about a service that gives
descriptions of resources using the content graph and, for remote resources,
the web. This service is not remotely accessible by default.


>
> >
> >> When I get Dan Brickley's graph I may want to know all the people he
> >> mentions in his foaf profile - even if he does not mention them as
> >> foaf:knows related to him.
> >
> > does this provide a new point?
>
> yes. It shows that your merger of a graph with a whole bunch of other
> information  makes certain types of queries impossible.
>
Ok, I think you illustrated this with the TimBL example too.


>        Use Case: When people type or drag and drop URLs in a form  they
> will usually not be precise with their URLs. So it will be important to find
> the minimum published context, find the people in that context, to be able
> to help them select a person. Presumably they meant some person in that
> page. So one needs to search the people in that page to help them select the
> right one.
>
Yes, apart from the fact that (as Tommaso pointed out) nobody is forced to
use the new service, it seems that the service could be used here, and that
it might even be useful to give the platform owner the ability to add triples.


>
> We have similar issues with reasoning. Reasoning over small graphs is going
> to be a lot more efficient than over the whole database. So one may be also
> interested just in minimal reasoning, adding all foaf:Person to all
> foaf:knows subjects and objects in a remote foaf - in order to locate that
> person the user is interested in.
>
Yes, and reasoning can also be done when handling the query; the described
use case is only a problem with a big graph when you do forward chaining.


> When receiving a new graph one may also be first interested to see if it by
> itself does not contain any contradictions,
> before merging it - even if only virtually - with the rest of the DB.
>
and when installing a bundle one might be interested to see that all methods
it contains actually terminate...

We don't know the meaning of all the terms used, so checking for
contradictions would necessarily be limited. It would be expensive to do and
I don't exactly see the benefit. If there is a contradiction between local
knowledge and what the web says, you might have a contradiction in the
triples accessible from the GraphNode. This happens; if for some application
this is a catastrophe, it shouldn't use the service (and probably not the web
either).



>
> >
> >> If the GraphNodeProvider returns a union graph of the documentation
> >> graph,
> >
> > Again no, we're not returning a union graph, we're returning a GraphNode;
> > the underlying graph is an implementation detail (I was thinking about
> > whether the getGraph method could be made less visible (protected or
> > private) to avoid this confusion)
> >
> >
> >> content graph,... and his foaf profile then when searching for all the
> >> foaf:Person
> >
> > You don't search a GraphNode for all foaf:Person but the GraphNode
> > represents the foaf:Person you asked for.
>
> Well I could also ask for the GraphNode of the foaf:Person class and then
>
>    people/-RDF.`type`
>
> would return all instances.
>
The resource you get "contains" the description it gets from the foaf
ontology as well as any person the platform knows about. This seems to be
useful in many situations, but you were talking about getting persons, not
classes; your usage of "the" in your sentence indicates you refer to an
instance of foaf:Person.



>
>
> >
> >> I will get the documentation writers too, the writers of content in the
> >> content graph, and who knows what else...
> >
> > you will have properties pointing from that person to all the comments
> > he left on the local instance, which can be quite handy (and which are
> > from the underlying content graph as they are probably not also contained
> > in the remote foaf:profile)
>
> That can be quite handy in SOME circumstances, and not handy at all in
> others.
>
A truism: every service offered by the platform is useful for some purposes
and useless for others.


>
> So in summary, the "trust" decisions made by the GraphNodeProvider do not
> increase
> trust but may well reduce it,

The trust decision is not made by the GNP but by its client. I wouldn't ask
you to tell me about xy if I could ask xy directly, unless it is to give you
the opportunity to tell me something that xy might not tell me.


> they could end up creating information leakage,

not seeing this.


> they reduce
> the number of things that can be done,

How can an added feature reduce the number of things that can be done?


> make some very arbitrary trust decisions,

The "trust decisions" (referent of "they") "make some very arbitrary trust
decisions" ?


> and make reasoning more difficult.
>
With the current blocking of features and length of discussions it seems
unlikely Clerezza will ever be close to doing reasoning :(



>
> The decision to return these union of graphs by default is thus unintuitive
> and unhelpful.

Which seems somewhat at odds with the fact that we have 5 explicit +1s for
the feature.


> It packs a huge amount of decisions that are not evident when asking for a
> GraphNode.  These decisions are not of course bad in all circumstances.
> There  are many cases where it may be very good. But it is better to have
> those decisions be clear and not tie them into the core of Clerezza

As having a common trust basis is precisely what distinguishes the
clerezza.platform bundles, this seems like an argument against the whole
platform.



> where the returning of a simple UriRef requires one to be aware of all
> these decisions.
>
No, please see my previous mail on the UriRef/GraphNode distinction.


Reto

Current mood: :(
