Re: How to name things with URIs

Henry Story Sun, 15 May 2011 02:28:58 -0700

On 15 May 2011, at 03:41, Reto Bachmann-Gmuer wrote:

> I apologize for having wasted (not just my) time in engaging in
> this argument in such a non-constructive way.


Well, we did get to cover a lot of important issues in using URIs, many of which
this discussion has helped bring back to the fore of my mind.

And we did boil things down to the question of why we need 

urn:x-localinstance:/cache/<remote-uri>

rather than just

<remote-uri>

We both agree that either is an improvement over <remote-uri>.cache
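
To make the comparison concrete, here is a minimal Scala sketch of the naming
options. It is only an illustration: the object CacheGraphNames and its methods
are made up for this mail and are not part of Clerezza. Only the
urn:x-localinstance:/cache/ prefix and the .cache suffix come from the thread.

```scala
// Hypothetical helper, not part of Clerezza: compares the naming schemes discussed here.
object CacheGraphNames {
  // Prefix taken from the proposal in this thread.
  val LocalPrefix = "urn:x-localinstance:/cache/"

  // Proposed scheme: wrap the remote URI in a local-instance URN.
  def localInstanceName(remoteUri: String): String = LocalPrefix + remoteUri

  // Inverse mapping: recover the remote URI if the name is a cache name.
  def remoteUriOf(graphName: String): Option[String] =
    if (graphName.startsWith(LocalPrefix)) Some(graphName.substring(LocalPrefix.length))
    else None

  // Older scheme: append ".cache" to the remote URI.
  def dotCacheName(remoteUri: String): String = remoteUri + ".cache"

  def main(args: Array[String]): Unit = {
    val profile = "http://some.example/resource"
    val clashing = "http://some.example/resource.cache" // a legitimate remote URI

    // The ".cache" name of one resource equals the plain name of another.
    println(dotCacheName(profile) == clashing)
    // The URN scheme round-trips, and an http URI is never mistaken for a cache name.
    println(remoteUriOf(localInstanceName(profile)))
    println(remoteUriOf(clashing))
  }
}
```

Run as is, main prints true, Some(http://some.example/resource) and None: the
.cache suffix can collide with a real remote URI, while the URN prefix cannot
collide with any http URI and can be mapped back to the remote name.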

> 
> Let's talk code. Please have a look at parent/rdf.storage.web I just 
> committed.
> 
> What this thread is about is implemented in line 158 in
> http://svn.apache.org/viewvc/incubator/clerezza/trunk/parent/rdf.storage.web/src/main/scala/WebProxy.scala?revision=1103264&view=markup
> 
> i.e.: val cacheGraphName = new UriRef("urn:x-localinstance:/cache/" +
> name.getUnicodeString)
> 
> The code wouldn't work with cacheGraphName = name because in this
> case, once created the cache would always be provided by a higher
> priority WeightedTcProvider so that the caching Provider (WebProxy)
> never gets considered again and thus cannot update when the cache copy
> is considered outdated.

Ok, so this issue is occurring because you are refactoring WebProxy to be a
TcProvider, which it was not originally. One can of course see the pull towards
making it a TcProvider, though perhaps the delete methods are not so useful
there.

> So you're welcome to make suggestions on how it
> should be different,

I have not really had time to study all your local TcProviders, nor
work out how a number can help anything distinguish between one TcProvider
and another. I have to go right now - my sister is calling...

But here is a quick question: why not simply give the ProxyTcProvider a higher
priority than the pure local one?
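
As a rough illustration of the question, here is a toy Scala model of weighted
lookup. Everything in it is hypothetical (Provider, resolve, the weights); it is
not the Clerezza TcManager or WeightedTcProvider API. It assumes, following
Reto's description, that providers are asked in descending weight order and that
the first one that knows the graph name answers.

```scala
// Toy model only: these types are hypothetical and are not the Clerezza API.
object WeightedLookup {
  final case class Provider(label: String, weight: Int, graphs: Set[String]) {
    def has(name: String): Boolean = graphs.contains(name)
  }

  // Assumption: providers are asked in descending weight order and the first
  // one that knows the graph name answers.
  def resolve(providers: Seq[Provider], name: String): Option[String] =
    providers.sortBy(p => -p.weight).find(_.has(name)).map(_.label)

  def main(args: Array[String]): Unit = {
    val remote = "http://example.org/profile"
    // The local store already holds a cached copy under the remote name.
    val local = Provider("local", weight = 100, graphs = Set(remote))
    val proxyLow = Provider("proxy", weight = 50, graphs = Set(remote))
    val proxyHigh = proxyLow.copy(weight = 150)

    println(resolve(Seq(local, proxyLow), remote))  // the local store shadows the proxy
    println(resolve(Seq(local, proxyHigh), remote)) // the proxy is consulted first
  }
}
```

Under these assumptions main prints Some(local) and then Some(proxy): with the
lower weight the proxy is never asked once the cached graph exists under the
remote name, and with the higher weight it is asked first and could decide
whether the cached copy is still fresh.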




> but without another proposal and without you
> withdrawing the -1 I have to change it to
> name.getUnicodeString+".cache" which was the last (silently) accepted
> name. I think we both agree that localinstance is better than the
> .cache proposal, so I urge you to revoke your vote.
> 
> Cheers,
> Reto
> 
> On Sun, May 15, 2011 at 12:17 AM, Henry Story <[email protected]> wrote:
>> I would like you first to read through the extensive mail I wrote, which took
>> me some time to write, and think things through.
>> 
>> 
>> Henry
>> 
>> On 14 May 2011, at 22:37, Reto Bachmann-Gmuer wrote:
>> 
>>> On Sat, May 14, 2011 at 7:54 PM, Henry Story <[email protected]> wrote:
>>>> Btw, I suppose I should say that I am not massively against the suggestion
>>>> you started this thread with. It is more that I am trying to explore this
>>>> more carefully, because it is an important discussion that deserves careful
>>>> thought.
>>> The careful procedure is to have tiny little issues which when
>>> resolved bring a tiny but undisputed improvement. Now with your
>>> resolution of CLEREZZA-463 I'm having massive problems and even if you
>>> think the status quo ante was fundamentally wrong I believe the
>>> graph-renaming you did makes things worse.
>>> 
>>> I know that CLEREZZA-463 contains many real improvements. But it also
>>> introduces problems. And not just what you might consider a
>>> philosophical problem that names denote extensionally different things
>>> but also very practical ones.
>>> 
>>> One major problem is permissions. We introduced
>>> WebIdBasedPermissionProvider and one implementation
>>> (UserGraphAcessPermissionProvider) used to provide readwrite access to
>>> the profile graph. Now this no longer works because you changed the
>>> names of graphs. It is because of this, and not because of a fundamentally
>>> broken architecture before your patch, that applications that used to work
>>> no longer do.
>>> 
>>> Your -1 was against urn:x-localinstance:/cache/<remote-uri>
>>> 
>>> The status quo ante was
>>> 
>>> cache graph: <web-profile-uri>.cache
>>> profile-graph: <web-profile-uri>
>>> 
>>> with the resolution of  CLEREZZA-463 we have
>>> 
>>> cache graph <web-profile-uri>
>>> profile graphs for local users: <web-profile-uri>
>>> profile graphs for remote users: <default-base-uri>/<web-profile-uri>
>>> 
>>> you did change some names, probably just because of inconsistent
>>> changes things broke (UserGraphAcessPermissionProvider seems pointless
>>> right now). I don't want to
>>> 
>>> and  such that because of the renaming of graphs the
>>> UserGraphAcessPermissionProvider
>>> 
>>> - The user has no longer the right to write to its own graph
>>> - Because the user graphs that is now (with your resolution of
>>> CLEREZZA-463) named like
>>> <http://localhost:8080/user/https://farewellutopia.com/user/me/profile>
>>> 
>>> In my opinion you exchanged a suboptimal solution for quite a mess;
>>> now you argue against my solution to tidy things up because you are
>>> afraid of having a mess in one year.
>>> 
>>> So please either accept my proposal which started this thread as
>>> something that is better than the status quo (i.e. retract your -1 so
>>> I can finally go back coding) or make a concrete proposal on how to
>>> name the different entities I've been suggesting names for or else
>>> revert the changes for CLEREZZA-463 (so that applications that used to
>>> work work again and we can start a proper development with little
>>> issues and patches that represent undisputed improvements).
>>> 
>>> 
>>> ==== what I consider important and relevant to current development
>>> ends here ====
>>> 
>>>> 
>>>> On 14 May 2011, at 17:09, Reto Bachmann-Gmuer wrote:
>>>> 
>>>>> On Fri, May 13, 2011 at 5:46 PM, Henry Story <[email protected]> wrote:
>>>>>> Reto wrote:
>>>>>>> Clerezza-489 and you also quote my statement of 463. okay, you might say
>>>>>>> that I'm stating rather than arguing.
>>>>>> :-)
>>>>>>> The argument: they are different things, both intensionally (cache and
>>>>>>> source) and in many cases extensionally (triples may differ).
>>>>>> 
>>>>>> in that sense I agree.
>>>>>> But then the other point I made is also true, and that is that different
>>>>>> users may get different graphs back for the same remote resource. In fact
>>>>>> those users may be the same user at different times. Since those are all
>>>>>> different graphs by your definition above one should also give them
>>>>>> different names.
>>>>> We do not have support for this yet and I think it's a feature
>>>>> increasing complexity massively.
>>>> 
>>>> You are dealing with an architectural problem which cannot just be dealt
>>>> with in stages. You need to look at the problem as a whole, or you will
>>>> just end up with the problem we are having right now. It is better to get
>>>> this issue cleared up now, than have a mess of graph names in one year,
>>>> when a lot of applications depend on this.
>>> This is kind of against agile mantras and it seems to contrast very
>>> strongly to what you just did: you changed the names and now want a
>>> scientific study to change them again to solve the problems your
>>> namechange introduced.
>>> 
>>>> 
>>>> In any case it's not increasing anything massively, it is the logical
>>>> continuation of your point above.
>>> If you propose a patch which changes names and deliver good arguments
>>> why the new names are massively better and support future usecases
>>> without any disadvantage for addressing the current usecases then I'm
>>> sure this gets accepted. What you did is mix this namechange into a
>>> whole bunch of patches.
>>> 
>>>> 
>>>> Your argument was:
>>>> 
>>>> "they [the remote and the locally fetched graph] are different things, both
>>>> intensionally (cache and source) and in many cases extensionally (triples
>>>> may differ)."
>>>> 
>>>> And so it follows that graphs sent at different times may also differ
>>>> extensionally and should have different names too.
>>> 
>>> No, we are talking about MGraphs here. I know transtemporal identity
>>> is a hard problem philosophically yet in practice we have quite strong
>>> intuition on what we consider to be the same thing over time. the
>>> google website remains the google website even if they change the
>>> design, same goes for the wikipedia page about google it remains the
>>> wikipedia site about google (with the same URI) even after it was
>>> changed, one never becomes the other.
>>> 
>>>> 
>>>> You can't have it both ways, argue on intentionality for different names
>>>> and then refuse to see that temporally different graphs would also then
>>>> need different names.
>>> I was talking about intensionality. Two terms have the same intension
>>> only if, in the same universe of evaluation and at the same point in
>>> time, they have the same extension.
>>> 
>>>> 
>>>> ( Btw. there are good arguments that intentionally the local graph if it
>>>> is a cache does not differ from the remote one. In any case if you pursue
>>>> this too far you will find that you can never name any remote thing. )
>>>> 
>>>>> I don't think that clerezza-490 needs to be resolved urgently, but anyway
>>>>> we should proceed issue by issue, and the best resolution of an issue is
>>>>> a minimal resolution, not one that tries to foresee any future issues.
>>>> 
>>>> I tend to see logical consequences of an argument as being contained in
>>>> the argument, and not being future issues that can be looked at later as
>>>> somehow being distinct.
>>> yes, but:
>>> 1. analysing till the very bottom inevitably leads to paralysis.
>>> 2. this is inconsistent with your intuition-based name change without
>>> discussion
>>> 3. We have problems needing a fix (only to be as good as before your
>>> patches) and you're not making a concrete proposal
>>> 
>>>> 
>>>> Clerezza-490 that deals with different ways the server can present itself
>>>> to other servers, is not of course something that needs to be implemented
>>>> immediately. But it would be good that the naming solution we come up with
>>>> can be extended to that case and to the temporal case.
>>>>
>>>> So I am invoking Clerezza-490 as something to help test the naming ideas
>>>> being put forward here. This is a logical test if you will.
>>> see above
>>> 
>>>> 
>>>>>> So local graph naming schemes should take that into account, which is
>>>>>> why I suggest that we have an API that can allow for extensibility here.
>>>>> We have currently things and we are naming them badly.
>>>>> 
>>>>> Prior to your webproxy we had:
>>>>> <webid-profile-url>.cache as name for the cache of the webprofile
>>>>> and
>>>>> <webid-profile-url> as uri for triples the user generated locally,
>>>>> this can be seen as extensions to the remote profile with information
>>>>> (like preferred language) that happen not to be in the remote profile
>>>>> 
>>>>> which was consistent with local users who only had
>>>>> <webid-profile-url> for the triples they control which include both
>>>>> the regular profile as well
>>>> 
>>>> yes, and both of those were not good solutions.
>>>> The .cache solution is bound to create a problem if someone remotely has
>>>> a URI named http://some.example/resource.cache
>>>> 
>>>> It is bound to lead to nasty name clashes, with the same URI naming two
>>>> different things.
>>> right, I'm admitting it wasn't ideal - but I prefer the rare
>>> clashes to the ambiguity by design.
>>> 
>>>> 
>>>> Remote URIs are named by remote resources, so it makes more sense to use
>>>> the URI of the remote resource to name the graph of the remote resource.
>>>> The remote resource was named by the owner of the resource. We should
>>>> respect that.
>>> <sarcasm>so we should not do caching, as the uri prefix http implies
>>> a preferred method for retrieving the resource which is definitively
>>> different than getting it out of a local tdb store</sarcasm>
>>> 
>>>> 
>>>> If there are local additions to a remote graph, they should be given a
>>>> local URI. There is nothing simpler than this solution it seems to me.
>>>> 
>>>>> 
>>>>> Now <webid-profile-url> is the cache,
>>>> 
>>>> You can look at it that way, or you can think of it as the name of the
>>>> remote graph, with the contents being the cache of the remote graph.
>>>>
>>>> If you were to make the local graph available publicly, it would then of
>>>> course need to have a local url tied into your namespace. Perhaps this is
>>>> a good way to think of the distinction.
>>> 
>>> I'm not saying your proposal is absurd, but you introduced it in a way
>>> that breaks things and without discussion. Now that I want to clean up the
>>> mess you start writing socio-philosophical essays.
>>> 
>>>> 
>>>> 
>>>>> not sure where additional
>>>>> triples added locally get stored, i.e. where triples added to
>>>>> webIdGraphsService.getWebIDInfo(webId).publicUserGraph are stored.
>>>> 
>>>> 
>>>> They should be stored in graph names with a local URL clearly since these
>>>> are being stored by a local agent. And I think it will be application
>>>> specific what the names of those graphs should be.
>>>> 
>>>> So currently as an initial proposal I put them in
>>> as a proposal ok, but you changed something that was working without
>>> discussing the consequences of this, e.g. for permissions.
>>> 
>>> <snip/>
>>>> Now imagine there are 2 or 3 applications on a clerezza instance, that a
>>>> remote user with his WebID uses. There is no reason these applications
>>>> should be putting all the information they generate for that user in the
>>>> same local graph.
>>>>
>>>> A banking graph should put banking info in its graph and a blogging graph
>>>> into its graph. The way to do this is to give applications - like users -
>>>> access to namespaces. Perhaps the bank application that was given control
>>>> of the /bank namespace could coin graphs for remote users in that space,
>>>> eg /bank/id/{remoteWebID} and the blogging one in /blog/id?{remoteWebID} .
>>>> 
>>>> By giving apps access to name spaces you can also make sure that there 
>>>> won't be any clashes.
>>> there is nothing that prevents applications from making their own graphs
>>> for user information.
>>> 
>>>> 
>>>> now, that could be a reason for having URIs like
>>>> 
>>>> mvn:/dev.net/application1/?user=webid...
>>>> 
>>>> But then you see that applications on different servers will have name
>>>> clashes too if they ever merge their databases.
>>>>
>>>> The advantage of using the local published name is that this then would
>>>> allow simple dumps of databases and their merging in remote databases
>>>> without clashes.
>>>> 
>>>>> I'm not saying the old naming was perfect but it worked in a somehow
>>>>> consistent fashion for local and remote users.
>>>> 
>>>> It was very confusing to me at least, as I point out in CLEREZZA-489.
>>>> 
>>>> And it furthermore is inconsistent with your point above that remote
>>>> graphs are intentionally different from the local version.
>>>> 
>>>>> Now my application that used this feature is no longer working.
>>>> 
>>>> Well that is the problem of having an initial system that is broken.
>>>> It will be easy to fix this, and we should fix it well, not do a half job
>>>> of it, because this is a distributed naming problem.
>>> I'm tired. I've nothing against a concrete counter proposal against
>>> the one that started the thread, e.g. saying: "we must give every
>>> instance a unique-id and this should be part of the
>>> x-localinstance-uri"
>>> 
>>> 
>>>> 
>>>>> 
>>>>>> in Clerezza-489 I wrote that one could describe each graph like this in a
>>>>>> special Cache graph perhaps.
>>>>>> :g202323 a :Graph;
>>>>>>     = { ... };
>>>>>>     :fetchedFrom <https://remote.com/>;
>>>>>>     :fetchedBy <http://bblfish.net/people/henry/card#me>;
>>>>>>     :representation <file:/tmp/repr/202323>;
>>>>>>     :httpMeta [ etag "sdfsdfsddfs";
>>>>>>                      validTo "2012...."^^xsd:dateTime;
>>>>>>                     ... redirected info?
>>>>>>                     ] .
>>>>>> 
>>>>>> :g202324 a :Graph;
>>>>>>     = { ... };
>>>>>>     :fetchedFrom <https://remote.com/>;
>>>>>>     :fetchedBy <http://farewellutopia.com/reto#me>;
>>>>>>     :representation <file:/tmp/repr/202324>;
>>>>>>     :httpMeta [ etag "ddfsdfsddfd";
>>>>>>                      validTo "2012...."^^xsd:dateTime;
>>>>>>                     ... redirected info?
>>>>>>                     ] .
>>>>> 
>>>>> If we had bracketing in RDF and our tooling would support it then the
>>>>> above might be somehow topical, answer to the question "how to name
>>>>> this?" "don't name it".
>>>> 
>>>> The above is just a way of writing the contents of the graph and the
>>>> metadata in the same file.  That is what the
>>>> 
>>>>  :g202323 = { ... }
>>>> 
>>>> is about. You don't need any special tools for that. If you use Jena to
>>>> get the graph named above you would get the content of the brackets. The
>>>> point is that the content from
>>> Also in jena the graphs have a name, the very profane sequence of
>>> characters this discussion was about. So in clerezza or in jena in the
>>> metadata graph you have a name instead of {...} and for this name you
>>> will get {...} from the named graph store.
>>> 
>>>> 
>>>>  :fetchedFrom ..
>>>>  :fetchedBy ...
>>>> 
>>>> is not in the g202323 graph, but in a graph metadata graph.
>>> obviously
>>>> 
>>>>> Please lets proceed issue by issue and make
>>>>> sure every brick we place is really solid and separate this from
>>>>> visionary long term stuff.
>>>> 
>>>> Ok, I hope you see that I introduced nothing new there. It's just an
>>>> n3 notation that makes it easy to write things out in an e-mail.
>>> an n3 notation that omits exactly what this discussion is about,
>>> namely my naming proposal and your -1 against it.
>>> 
>>>> 
>>>> So please consider that point again in that light.
>>>> 
>>>>>> 
>>>>>> Then this API could use information from this graph and information
>>>>>> from the user's request to find the correct local graph he wants.
>>>>> Still the local graph would have a name, probably - but as I said it's
>>>>> irrelevant. Let's deal with the issues at hand: you changed the names
>>>>> of graphs (which I agree didn't have the best possible names) to
>>>>> names that I think are worse, let's find something we can agree upon.
>>>>> (otherwise, please roll back to the version with the original names
>>>>> till we find a consensus).
>>>> 
>>>> Well I don't think rolling back would improve anything. I think clearly
>>>> this was an improvement. But I do think we can do better.
>>> It is a mixture of improvements and deteriorations. Following the
>>> right process avoids the deteriorations.
>>> 
>>> 
>>>> 
>>>> So my thinking is that to reach consensus we can do this with an API,
>>>> without deciding what precisely the names should be.
>>> Stop: I disagree with your new names and we have problems because of
>>> your name changes and now you don't want to decide about names?!
>>> 
>>>> The best is just to lay out the
>>>> requirements:
>>>> 
>>>>  1. mapping from a remote URI to the URI understood by the local triple
>>>>     store and back. There should be no name clashes. It should be possible
>>>>     to easily extend to have agent views and temporal views.
>>>>
>>>>  2. method for applications to take hold of legitimate namespaces in such
>>>>     a way that a clash of names is not possible.
>>> 
>>> If any proposal for changing names satisfies one of your criteria less
>>> than the status before the proposal, your applying the argument to the
>>> concrete proposal is welcome.
>>> 
>>> Reto
>>> 
>>> 
>>>> 
>>>> 
>>>> Henry
>>>> 
>>>> 
>>>>> 
>>>>> Reto
>>>>> 
>>>>>> Henry
>>>>>> PS. Having said that one then may just wonder why local graphs should
>>>>>> ever have anything other than local URLs, since every time someone made
>>>>>> a copy of a local graph it would be different.
>>>> 
>>>> Social Web Architect
>>>> http://bblfish.net/
>>>> 
>>>> 
>> 
>> Social Web Architect
>> http://bblfish.net/
>> 
>> 

Social Web Architect
http://bblfish.net/
