Re: Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted

2005-05-26 Thread Henry Story


Part of the reason I went into the detailed explanation I did is that
not everyone here may be familiar with mathematical logic. What I was
explaining is a heavy condensation of what I learned in my BA in
philosophy. I am not plucking these thoughts out of a hat.


So let me apply these a little more directly to your proposal.

I very much agree with the spirit of what you are saying. But I think  
that the danger is that by looking at the problem the way you are, we  
may end up confusing things a lot more than we are clarifying them.


On 25 May 2005, at 21:06, Antone Roundy wrote:

== Rationale ==

 * The accepted language for allowing duplicate IDs in a feed
document speaks only of multiple atom:entry elements with the same
atom:id describing the same entry if they exist in the same
document--of course, we intend for them to describe the same entry
whether they're simultaneously in the feed document or not.


If we put this in terms of ontology, what we want is to state simply
that an ID refers to one and only one entry. There is not much to add,
by the way, other than that the ID is a URI, since the whole role of
URIs is to identify one and only one thing.

 * The accepted language does not speak of the origin feed of the
entries. Ideally, an atom:id should be universally unique to one
entry resource, and we rightly require publishers to mint them with
that goal. However, in reality, malicious or undereducated
publishers might duplicate the IDs of others. Therefore, it is
proposed to modify the specification to state that the atom:entry
elements describe the same entry (resource) if they originate in
the same feed.


When you say 'ideally', I say ontologically. Let us imagine that
instead of creating an ontology for an Atom entry, we were creating one
for a gun. We could say that guns have a number of properties, one of
them being that they are physical objects. Physical objects can only be
in one place at a time. So a gun can only be in one place at a time.

Now you can see how it won't help the definition of a gun if someone
were to introduce the following reasoning. When a policeman asks
someone where they put the gun, some people are honest and tell him
where, but many people (the most interesting ones) will lie to him. The
policeman has to report the facts as he sees and hears them. He may
hear two conflicting reports on where the gun is. Are guns therefore
weird objects that have a position relative to the person who speaks
about them?

This is confusing semantics with the way people speak about the world.
It is confusing our concepts about the world with the context in which
sentences containing those concepts appear.


 * Aggregators wishing to protect against DOS attacks are not  
unlikely to perform some sort of safety checks to detect malicious  
atom:id duplication, regardless of whether the specification  
authorizes them to or not.


The policeman is not unlikely to do a lot of safety checks to find out
where the gun is. He may, for example, scan the person who told him he
has no gun anyway, just to make sure.



On 25 May 2005, at 22:57, Antone Roundy wrote:

On Wednesday, May 25, 2005, at 02:26  PM, Henry Story wrote:
Since the referents of Superman and Clark Kent are the same, what is
true of the one, is true of the other. When speaking directly about the
world, we can replace any occurrence of Superman with Clark Kent, and
still say something true.

Clark Kent is the secret identity of Superman. - Superman is the
secret identity of Superman.  Whether they're perfectly
interchangeable or not depends on whether the name is referring to
the object or some facet of the object.  The second sentence
actually works if the first Superman refers to the persona, and
the second to the person.  But getting back to Atom...


There are a lot of corner cases. I was trying to give an overview of a
complicated field. I was restricting myself to sentences where words
clearly identify objects. If you use Superman to refer to a persona
then we are speaking about something else. Other cases are sentences
such as "Superman is so called because he is strong". These are well
known, but they do not undermine the distinction I was making.



[snip]

So to prevent a DOS attack, best is to have aggregator feeds such as:

<feed>
   <!-- aggregator feed -->
   <feed src="http://true.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
   <feed src="http://false.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
</feed>

Here all the aggregator feed is claiming is that he has seen entries
inside other feeds.


...

It will be up to the consumer of such aggregated feeds to decide  
which to trust.




From the end user's point of view, it's not much 

Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted

2005-05-25 Thread Henry Story


On 25 May 2005, at 21:06, Antone Roundy wrote:


 * The accepted language does not speak of the origin feed of the
entries. Ideally, an atom:id should be universally unique to one
entry resource, and we rightly require publishers to mint them with
that goal. However, in reality, malicious or undereducated
publishers might duplicate the IDs of others. Therefore, it is
proposed to modify the specification to state that the atom:entry
elements describe the same entry (resource) if they originate in
the same feed.
 * Aggregators wishing to protect against DOS attacks are not  
unlikely to perform some sort of safety checks to detect malicious  
atom:id duplication, regardless of whether the specification  
authorizes them to or not.



I understand your motivation, but I think it is misguided. I only  
recently understood why myself [1].


Let me explain a little how I came to this conclusion. An easy way to
understand semantics is to think of it as being about the objects out
there. Take the sentences:


(a) Superman can fly
(b) Superman is Clark Kent

From these we can immediately and validly deduce that

(c) Clark Kent can fly.

Since the referents of Superman and Clark Kent are the same, what is
true of the one, is true of the other. When speaking directly about the
world, we can replace any occurrence of Superman with Clark Kent, and
still say something true.

When we are speaking about what others believe, this is no longer
true. Lois Lane may believe (a) without believing (c). She may think
Superman is a hero, but not think that Clark Kent is one. There is in
logic therefore a fundamental distinction between sentences used in a
direct semantic way, and sentences used in this indirect way, when the
sentence is in a belief context. This distinction is so fundamental
that difficulty in making it is associated with a well-known
developmental condition: autism. Autistic children have great
difficulty understanding the difference between what is and how people
perceive things to be.

In RDF this distinction shows up when moving from triples to 4-tuples.
RDF/XML is a language that works best in the semantic realm. With
triples we can describe objects and their relationships. If we want to
speak about consistent ways of seeing the world, we need to group
statements with formulae, as is done in N3 [2] and TriX for example.
This then allows us to name consistent sets of statements. It also
allows one to refer simultaneously to sets that are inconsistent with
one another. I can for example consistently hold the following:


Lois Lane believes that Superman is different from Clark Kent
Clark Kent believes that he is Superman.

without contradiction.
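
A minimal sketch of the triple versus 4-tuple distinction, in Python
rather than in any RDF syntax. The predicate names "sameAs" and
"differentFrom", the context names "lois" and "clark", and the
contradiction test are all assumptions made for illustration; they are
not taken from N3, TriX or the Atom drafts.

```python
from collections import defaultdict

# Each statement is a 4-tuple: (context, subject, predicate, object).
# The context names whose view of the world the triple belongs to.
# All names below are hypothetical, chosen only for this illustration.
QUADS = [
    ("lois", "Superman", "differentFrom", "ClarkKent"),
    ("clark", "Superman", "sameAs", "ClarkKent"),
]


def contradicts(triples):
    """Return True if the triples assert both sameAs and differentFrom
    for the same pair of names (in either order)."""
    same = {frozenset((s, o)) for s, p, o in triples if p == "sameAs"}
    different = {frozenset((s, o)) for s, p, o in triples if p == "differentFrom"}
    return bool(same & different)


def by_context(quads):
    """Group quads into one set of triples per context."""
    graphs = defaultdict(set)
    for context, s, p, o in quads:
        graphs[context].add((s, p, o))
    return dict(graphs)


graphs = by_context(QUADS)

# Checked one believer at a time, neither set of statements contradicts itself.
per_context = {name: contradicts(triples) for name, triples in graphs.items()}

# Dropping the context and asserting everything directly does contradict.
flattened = contradicts({(s, p, o) for _, s, p, o in QUADS})

print(per_context, flattened)
```

Keeping the fourth element is what lets the two beliefs be reported
side by side; throwing it away turns a report about beliefs into a
contradictory claim about the world.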

So how does this relate to Atom? Well, we need to be clear that
semantically an entryId and a feedId each point to one thing and one
thing only. But this does not mean that there cannot be erroneous,
false, corrupted, ... feeds out there. Aggregators wishing to protect
against DOS attacks should simply do what we humans do in such
circumstances, namely quote what others are saying and not assert the
things others are saying. This is why the proposal by Roy Fielding to
allow feeds inside of feeds was probably the best way to do things (I
just came to this conclusion yesterday; before this I had no idea what
he was going on about).


So to prevent a DOS attack, best is to have aggregator feeds such as:

<feed>
   <!-- aggregator feed -->
   <feed src="http://true.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
   <feed src="http://false.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
</feed>

Here all the aggregator feed is claiming is that he has seen entries
inside other feeds. He never need claim to agree with any of their
content. And so the content of the first internal feed and the second
internal feed can be contradictory. They can for example have the same
id with the same updated timestamp and with different content.

It will be up to the consumer of such aggregated feeds to decide  
which to trust.


The good thing about this way of doing things is that one can define a
first level feed in a simple semantic vocabulary, without needing to
create all kinds of exceptional clauses all over the place. When
dealing with feeds inside a feed one can then simply mention that this
indirection is equivalent to the belief context indirection.

Statements can be contradictory across such internal feeds.
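
To make the quoting idea concrete, here is a rough Python sketch of an
aggregator that records what each source feed said rather than
asserting it. The names QuotedEntry, conflicting_ids and select, and the
trust policy itself, are hypothetical and not part of any Atom draft;
the URLs and the id are the ones from the example feed above.

```python
from collections import defaultdict
from typing import NamedTuple


class QuotedEntry(NamedTuple):
    """Something the aggregator saw in a source feed, not something it asserts."""
    source: str      # URL of the feed the aggregator fetched this from
    claimed_id: str  # the id that the source feed claims
    title: str


def conflicting_ids(quoted):
    """Map each id to the set of sources claiming it, keeping only ids
    that more than one source claims."""
    sources = defaultdict(set)
    for entry in quoted:
        sources[entry.claimed_id].add(entry.source)
    return {cid: srcs for cid, srcs in sources.items() if len(srcs) > 1}


def select(quoted, trusted_sources):
    """Consumer-side policy (an assumption): keep only what trusted sources said."""
    return [entry for entry in quoted if entry.source in trusted_sources]


quoted = [
    QuotedEntry("http://true.org", "tag://true.org,2005/feed1",
                "Enter your credit card number here"),
    QuotedEntry("http://false.org", "tag://true.org,2005/feed1",
                "Enter your credit card number here"),
]

# The aggregator can report the conflict without resolving it.
conflicts = conflicting_ids(quoted)

# The consumer then decides which source to believe.
accepted = select(quoted, trusted_sources={"http://true.org"})
```

Because entries are stored per source, two feeds claiming the same id
never overwrite each other; the conflict stays visible until a consumer
applies whatever trust policy it prefers.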

Taking this into account should help make the spec a lot cleaner,
easier to write and easier to understand. The problems are fundamental,
so they cannot be swept under the carpet. They will keep popping up.

Henry Story

[1] http://www.imc.org/atom-syntax/mail-archive/msg15608.html
[2] 

Re: Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted

2005-05-25 Thread Antone Roundy


On Wednesday, May 25, 2005, at 02:26  PM, Henry Story wrote:
Since the referents of Superman and Clark Kent are the same, what is
true of the one, is true of the other. When speaking directly about the
world, we can replace any occurrence of Superman with Clark Kent, and
still say something true.

Clark Kent is the secret identity of Superman. - Superman is the
secret identity of Superman.  Whether they're perfectly
interchangeable or not depends on whether the name is referring to the
object or some facet of the object.  The second sentence actually
works if the first Superman refers to the persona, and the second to
the person.  But getting back to Atom...



Autistic children have great difficulty understanding the difference
between what is and how people perceive things to be.

They sure don't have a monopoly on this!  Really getting back to
Atom!...



So to prevent a DOS attack, best is to have aggregator feeds such as:

<feed>
   <!-- aggregator feed -->
   <feed src="http://true.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
   <feed src="http://false.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
         <title>Enter your credit card number here</title>
         ...
      </entry>
   </feed>
</feed>

Here all the aggregator feed is claiming is that he has seen entries
inside other feeds.

...
It will be up to the consumer of such aggregated feeds to decide which 
to trust.


From the end user's point of view, it's not much different.  Somebody 
still has to make the decision, and the end user doesn't want to be the 
one doing it--they want the super aggregator or their feed reader or 
somebody else to do it for them.  The feed reader should be doing it 
anyway, since they won't be getting all of their data through a super 
aggregator.  But the super aggregator is likely to want to do it too, 
both to reduce how much data they forward to their clients, and because 
many feed readers aren't going to do it very well, so handling part of 
the job for them will improve the end user's experience.


I'm not a fan of feeds of feeds (though I have proposed and still like
a one-level embedding of feeds in a different top-level element).
Plus, I think it's inconceivable that the WG would make this drastic a
change at this point.  Let's focus on doing what's actually possible,
given the WG schedule and temperament, to mitigate this problem.