date:20050506

At 04:32 05/05/04, Graham wrote:
>
>On 28 Apr 2005, at 7:33 pm, Alexey Melnikov wrote:
>
>> Ok, maybe it is just me, but what does it mean to "collapse
>> white-space"? Does this mean to replace FWS (in RFC 2822 sense) with a
>> single space?
>
>Since the statement is a MAY, I don't think any exact meaning is
>necessary. It's simply a hint to publishers that whitespace may not
>be preserved.
>
>
>On 29 Apr 2005, at 10:17 am, Martin Duerst wrote:
>
>> Making this more precise is definitely desirable. But there is also
>> an i18n issue: This works fine for languages that use spaces between
>> words. It doesn't work for languages that don't have spaces between
>> words (Chinese, Japanese, Thai,...). If Text elements are only used
>> for short things such as names or titles, that's not a big issue,
>> the text in question can just be put on a single line. However,
>> when the texts in question are long, it's a serious issue, and
>> should be fixed.
>
>A consumer may do anything that can reasonably be described as
>"collapsing whitespace", but is not required to. How does this cause
>problems in Asian languages?

If the consumer does the right thing, it won't cause problems.
But the chance that the consumer does the right thing without
being told what this is (or, without being told that this is
different for different languages or scripts) is rather low.
So we had better improve the spec to help consumers do a better
job.

Regards, Martin.

Re: Last Call: 'The Atom Syndication Format' to Proposed Standard

At 02:27 05/05/04, Thomas Broyer wrote:
>
>Martin Duerst wrote:
>> At 03:33 05/04/29, Alexey Melnikov wrote:
>>  > >   If the value is "text", the content of the Text construct MUST NOT
>>  > >   contain child elements.  Such text is intended to be presented to
>>  > >   humans in a readable fashion.  Thus, Atom Processors MAY collapse
>>  > >   white-space (including line-breaks),
>>  >
>>  >Ok, maybe it is just me, but what does it mean to "collapse
>>  >white-space"? Does this mean to replace FWS (in RFC 2822 sense) with a
>>  >single space?
>> Making this more precise is definitely desirable. But there is also
>> an i18n issue: This works fine for languages that use spaces between
>> words. It doesn't work for languages that don't have spaces between
>> words (Chinese, Japanese, Thai,...). If Text elements are only used
>> for short things such as names or titles, that's not a big issue,
>> the text in question can just be put on a single line. However,
>> when the texts in question are long, it's a serious issue, and
>> should be fixed.
>
>My understanding of type="text" is that this is "just text" without any
>"formatting".

That's my understanding, too.

>Hence, it is not meant to be "preformatted text" such as text/plain or
>inside an (X)HTML "pre".

Yes. But that's exactly where the spacing problems with Chinese/Japanese/Thai
are. There are no such problems for preformatted text, because the line breaking
in the source (as sent) is the same as the line breaking when displayed.
For free-flowing text, however, the line breaks in the source and those in
the display are not (necessarily) the same, and so linebreaks have to be
changed to spaces for Western languages, but to nothing for Chinese/Japanese
(and most possibly to a zero-width non-breaking space for Thai), and the spec
has to say something about this.
Regards, Martin.

>This means type="text" content is a single paragraph of text. If you
>need paragraphs, lists or any other "structural formatting", you have to
>use type="html" or type="xhtml" with the appropriate content.
>
>I was about to write a Pace about white-space handling in type="text"
>(either using xml:space or an attribute that would have mimicked the
>"white-space" CSS property) when I understood/recalled that Text
>Constructs have accessibility in mind (hence their limitation to textual
>content) and preformatted text is not accessible enough.
>
>--
>Thomas Broyer
>
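
The script-dependent handling Martin describes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not wording from any Atom draft: the function name, the language table, and the choice of U+200B (ZERO WIDTH SPACE) for Thai are all hypothetical.

```python
import re

# Hypothetical table: primary language subtag -> string that replaces a
# line break. Chinese and Japanese get nothing; Thai gets a zero-width
# space (U+200B is an assumption); everything else gets one space.
JOINERS = {"zh": "", "ja": "", "th": "\u200b"}


def collapse_whitespace(text: str, lang: str = "en") -> str:
    """Collapse line breaks in free-flowing text according to its language."""
    primary = lang.split("-")[0].lower()
    joiner = JOINERS.get(primary, " ")
    # Replace each line break, plus any spaces or tabs around it, by the joiner.
    text = re.sub(r"[ \t]*\r?\n[ \t]*", joiner, text.strip())
    # Collapse any remaining runs of spaces or tabs to a single space.
    return re.sub(r"[ \t]+", " ", text)
```

A consumer would take the language from the xml:lang in scope for the Text construct; when none is known, the sketch falls back to the space-joining behaviour.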

Re: hreflang attribute should be allowed to be empty as well

At 02:07 05/05/04, Thomas Broyer wrote:
>
>Anne van Kesteren wrote:
>> I think that the HREFLANG attribute[1] should be allowed to be empty as
>> well, just like xml:lang is allowed to be empty.

-1. Thomas explains it all very well below. Regards, Martin.

>xml:lang is allowed to be empty to override a previous, inherited,
>xml:lang declaration without specifying a new language.
>
>As xml:lang doesn't apply to the content of the linked resource, hreflang
>does not need to allow an empty value. If you don't want to or can't
>specify the language of the linked resource, just omit the hreflang attribute.
>We might want to distinguish those two cases: not providing a language
>hint vs. explicitly telling the linked resource can't be "language
>qualified" but I'm not sure why this would be needed and what would be the
>difference (interop, accessibility, etc.)
>
>However, we may need to make clearer that an xml:lang doesn't apply to
>the linked resource.
>
>--
>Thomas Broyer
>

Re: PaceOptionalFeedLink

At 11:50 05/05/06, Sam Ruby wrote:
>
>Tim Bray wrote:
>> +1
>> There are people who want to publish feeds without rel="alternate"
>> links.  I'm against telling people they can't do something they want to do
>> without strong reasons, as in loss of interoperability.  I don't see the
>> reasons here as strong enough.  -Tim

+1 here, too, since long ago.

>FYI: we have an instance proof of this requiring an existing tool to do
>additional work:
>
>   http://www.imc.org/atom-syntax/mail-archive/msg13983.html

Well, yes, but how much work can that possibly be?

Regards, Martin.

Re: Autodiscovery

2005-05-06 Thread Roger B.


> The problem is fortunately mitigated because user agents usually
> only offer copying @href ("copy link location"). I'm under the
> impression that they do something similar with rich-text copying.

Nikolas: Your impression is mistaken. If I copy a chunk from Zeldman's
XFN-friendly links page and paste it into my WYSIWYG editor, I get all
of his @rel="whatevers" in my post. This is the same behavior I've
noted in every IE- and Mozilla-based WYSIWYG editor I've tried.

--
Roger Benningfield

Re: Autodiscovery in <a> as well as <link>

2005-05-06 Thread Roger B.


> Is there something wrong with the HTML parsers?

Nikolas: Are they installed by default on most servers? If not, can
those running in sandboxes install them?

From the perspective of my niche, I can tell you that ColdFusion can
use jTidy to make sense of random HTML, but it is (a) installed in
virtually zero CF hosting environments, and (b) cannot normally be
added by an individual developer working in a sandbox. (It's also
riddled with bugs, but I'm just grateful to have it at all... I steer
clear of gift horses' mouths whenever possible.)

--
Roger Benningfield

Re: Autodiscovery in <a> as well as <link>

Phil Ringnalda wrote:

>
> Nikolas 'Atrus' Coukouma wrote:
>
>> Using @rel with any linking element is perfectly valid and has been for
>> years.
>> @rel not being supported for anything other than the link element itself
>> has also been an outstanding bug for just as long. There's a lot of debate
>> attached to at least one Mozilla bug (#57399 [1] - filed on 2000-10-20).
>>
>> Can we agree that this should be supported, but currently isn't? Unless
>> there's a compelling reason not to, I think we might as well allow
>> autodiscovery via either element. Any implementation guide should
>> recommend duplicating the information in the interest of autodiscovery
>> actually working.
>>
>> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=57399
>
>
> -1 to saying in the spec that you can use either element, and in the
> guide saying to use both if you want it to work, not just look pretty.

You're absolutely right. I was thinking in more immediate terms before, but
if we're going to make this part of the Atom working group, land of
well-defined and reasonable specs, everything should work.

>
> As I remember it, when RSS autodiscovery started this cowpath,
> aggregator developers generally didn't have an SGML parser handy, and
> weren't especially happy about the idea of having to write their own
> HTML parser. Finding one (or a few) of relatively few <link>s in the
> first bit of the document feels a lot easier than having to look at
> every <a> in the whole document.
>
> Now? I'd say most don't have an SGML parser handy, and won't be
> especially happy about writing their own HTML parser. It's fairly rare
> for someone to comment out bits of their <head>, and quite common for
> them to comment out huge swaths of their <body>, including things a
> template came with, like an <a> link labeled "Atom
> feed", with no thought that something will be seeing and using that
> invisible link with an incorrect path. I added Atom autodiscovery to
> my current aggregator, Feed on Feeds, with a ten second
> copy/paste/change mime-type of the results of it using a regular
> expression on the HTML. If instead I had to correctly parse the entire
> HTML document, I'd... switch to something in Python, I guess.

Is there something wrong with the HTML parsers?
Perl has HTML::Parser
Python has htmllib.py
Ruby has ymHTML and a port of the Python library called html-parser
PHP has PHP-HTML
Common Lisp has phtml
The W3C provides a simple parser written in C

I'm sure I can find more, but I think the above is a sufficiently long
list to illustrate my point.

> Then, since I foolishly took the Firefox bug for better autodiscovery,
> I'll also need to do it where I do have an excellent HTML parser, but
> I have to do it on every single page that every single Firefox user
> loads, whether or not they have any interest in feeds, or subscribed
> to the feed ten thousand loads of that particular page ago. <link> is
> easy, we've got a DOMLinkAdded event and most pages have very few of
> them. <a>? Well, the performance hit probably won't be noticeable on
> most pages.

This is a single XPath query.  Gecko has native support for it. I'm not
sure about the others, but Sarissa is a fine library for DOM
manipulation (including XSLT and XPath) from Javascript and it works
with IE, KHTML, Opera ...

>
> Phil Ringnalda
>
Of course, if your XML library copes with all the errors present in
normal HTML, it's probably nicer to use than any HTML parser.

The point here is that most developers have access to an HTML parser. I
admit that they might need patching, but at least 90% of the work is done.

I'll try to find time to examine each of these libraries and make any
changes needed. Hopefully they're already in good shape or the author is
open to this sort of thing. If all else fails, there's forking.

If the problem is ignorance, I'll happily maintain a list. I'm also
willing to write some sample implementations in all of the languages I
listed before and more.

I don't think this is terribly difficult. In fact, I just took a shot at
altering Feed on Feeds to support this and found it incredibly easy.

patch: http://zaphod.student.umd.edu/~atrus/FoF_mod/a-support.patch
There's other stuff in the same directory there if you want to poke at
it. The changes just use PHP-HTML, which I mentioned earlier.

Cheers,
-Nikolas 'Atrus' Coukouma
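
As a rough illustration of the parser-based approach argued for above, here is a minimal sketch using Python's standard html.parser module (a present-day stand-in for the htmllib.py mentioned in the list; the swap is an assumption of this example). The class name, the helper name, and the set of accepted MIME types are hypothetical.

```python
from html.parser import HTMLParser

# Assumed set of feed MIME types; adjust to taste.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (tag, href) pairs for <link> and <a> elements that carry
    rel="alternate" and a feed MIME type."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        # rel is a space-separated list of link types.
        rels = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and mime in FEED_TYPES and href:
            self.feeds.append((tag, href))


def find_feeds(html):
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

Because the parser reports comments separately from start tags, a link that sits inside a commented-out part of a template is not returned, which is the case a bare regular expression over the raw markup tends to get wrong.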

Re: entry definition

2005-05-06 Thread James Tauber


In the case where there is an "alternate" I think this is correct,  
but it:

1. highlights the inaccuracy of the word "alternate"
2. doesn't address the case where there is no link alternate

James

On 06/05/2005, at 12:46 PM, Henry Story wrote:
> Some have been clamoring for a good definition of an entry.
> Here is one I have thought of recently.
>
> An Atom Entry is a resource (identified by atom:id) whose
> representations (atom:entry) describe the state of a web resource
> at a time (the link alternate)

Re: Autodiscovery in <a> as well as <link>

2005-05-06 Thread Phil Ringnalda

Nikolas 'Atrus' Coukouma wrote:
> Using @rel with any linking element is perfectly valid and has been for
> years.
> @rel not being supported for anything other than the link element itself
> has also been an outstanding bug for just as long. There's a lot of debate
> attached to at least one Mozilla bug (#57399 [1] - filed on 2000-10-20).
>
> Can we agree that this should be supported, but currently isn't? Unless
> there's a compelling reason not to, I think we might as well allow
> autodiscovery via either element. Any implementation guide should
> recommend duplicating the information in the interest of autodiscovery
> actually working.
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=57399

-1 to saying in the spec that you can use either element, and in the
guide saying to use both if you want it to work, not just look pretty.

As I remember it, when RSS autodiscovery started this cowpath, 
aggregator developers generally didn't have an SGML parser handy, and 
weren't especially happy about the idea of having to write their own 
HTML parser. Finding one (or a few) of relatively few <link>s in the
first bit of the document feels a lot easier than having to look at
every <a> in the whole document.

Now? I'd say most don't have an SGML parser handy, and won't be 
especially happy about writing their own HTML parser. It's fairly rare 
for someone to comment out bits of their <head>, and quite common for
them to comment out huge swaths of their <body>, including things a
template came with, like an <a> link labeled "Atom
feed", with no thought that something will be seeing and using that
invisible link with an incorrect path. I added Atom autodiscovery to my 
current aggregator, Feed on Feeds, with a ten second copy/paste/change 
mime-type of the results of it using a regular expression on the HTML. 
If instead I had to correctly parse the entire HTML document, I'd... 
switch to something in Python, I guess.

Then, since I foolishly took the Firefox bug for better autodiscovery, 
I'll also need to do it where I do have an excellent HTML parser, but I 
have to do it on every single page that every single Firefox user loads, 
whether or not they have any interest in feeds, or subscribed to the 
feed ten thousand loads of that particular page ago. <link> is easy,
we've got a DOMLinkAdded event and most pages have very few of them.
<a>? Well, the performance hit probably won't be noticeable on most pages.

Phil Ringnalda

Re: entry definition

2005-05-06 Thread David Powell



Friday, May 6, 2005, 5:46:59 PM, you wrote:


> Some have been clamoring for a good definition of an entry.
> Here is one I have thought of recently.

> An Atom Entry is a resource (identified by atom:id) whose  
> representations
> (atom:entry) describe the state of a web resource at a time
> (the link alternate)

> Any thoughts?

The entry's alternate link is optional isn't it?


-- 
Dave

Re: PaceCaching

2005-05-06 Thread Paul Hoffman

-1. Having two mechanisms in two different layers is a recipe for 
disaster. If HTTP headers are good enough for everything else on the 
web, they're good enough for Atom.

--Paul Hoffman, Director
--Internet Mail Consortium

Re: Autodiscovery

Bob Wyman wrote:

>Sjoerd Visscher wrote:
>  
>
>>[HTML 4.01 says:] This attribute describes the relationship from
>>the current document to the anchor specified by the href attribute.
>>The value of this attribute is a space-separated list of link types.
>>
>>
>   But, if you copy HTML from one document to another, or you construct
>an HTML document from parts, you risk carrying <a> tags with rel attributes
>from one document to another. If I quote some HTML in a new HTML document
>and the quoted HTML includes rel="alternate" in an <a> tag, are we really
>saying that the presence of rel="alternate" in the quoted text establishes a
>relation of the new HTML document as a whole?
>   Personally, I think there is a serious scoping problem here. We've
>got attributes of separable components of a page establishing metadata for
>the page as a whole. Not good.
>
>   bob wyman
>  
>
Since the HTML spec is where this originates, I'm inclined to say that
this is something that should be handled by whatever (user or program)
is managing the content and copying the link. Clearly, @rel should be copied
with caution or simply left behind entirely.

I think calling this "components of a page establishing metadata for the
page as a whole" is a bit misleading. The metadata (@rel) says nothing
about the document it's in. It's metadata about the link between the two
documents. This metadata is context-dependent because it depends on an
implicit @from counterpart to the explicit @href (@to, in XLink).

<a> and <link> both suffer from this weakness. I can't pick up a link
element from one element and move it to another. I can have a pile of
them in the template I use to make my pages.

It's considered a usability problem in the case of <a> because <a>
appears in a place where it's a bit more likely to be copied: the body.
The problem is fortunately mitigated because user agents usually only
offer copying @href ("copy link location"). I'm under the impression
that they do something similar with rich-text copying. If someone's
copying HTML "source" by hand, then they should know to be wary of @rel.
The best we can do is add a note.

It's also not likely to bother programs that put content into
syndication feeds because links to feeds generally appear in the
periphery of presentation and not the main content of a feed. If people
do include @rel in content that's included in syndicated content, then
it should be stripped.

Again, no one uses @rel in <a> because it's unsupported. Coincidentally,
most programs do the right thing. If we push for @rel support in <a> in
the autodiscovery spec, then I think there may be an increase in usage,
at which time users and authors can educate themselves and programs can
make changes. They should have done this years ago. We're not creating
@rel, just encouraging its use.

All of that said, it seems sensible to include a warning or a note or
somesuch.

-Nikolas 'Atrus' Coukouma

Re: PaceOriginalAttribute

2005-05-06 Thread Antone Roundy

On Thursday, May 5, 2005, at 05:21  PM, Robert Sayre wrote:
On 5/5/05, Antone Roundy <[EMAIL PROTECTED]> wrote:
Yeah, they think they are, or at least claim to think so.  But isn't
that the same thing that is stated if you see the following in two
feeds?

bar:bar

foo:bar

foo:foo

I may be an imposter



This says that this feed is (or at least claims it is) forwarding the
entry with the id "foo:bar" from the feed "foo:foo".
OK, now let's say you're subscribed to "imposter" in PubSub.

  bar:bar
  
 quux:quux1
 
 foo:foo
 
 I may be an imposter

  
 quux:quux2
 
 baz:baz
 
 I may be an imposter

 
Sorry, I don't understand the point of this example.  I read it as 
saying that these two entries from different feeds claimed the same id. 
 I don't know whether they really are the same entry, or whether one or 
both of their original ids were minted erroneously (accidentally 
duplicating another entry's id) or maliciously (attempting to DOS 
someone else's entry).  I can't see any way of resolving that question 
that I wouldn't have if PubSub had given me this instead:


  bar:bar
  
 foo:bar
 
 foo:foo
 
 I may be an imposter

  
 foo:bar
 
 baz:baz
 
 I may be an imposter

Re: Selfish Feeds...

On 6 May 2005, at 9:37 pm, Bob Wyman wrote:
> Relying on a GUID alone only works if you implement a policy that
> says that you are only interested in seeing content with "new" GUIDs and you
> are willing to ignore any "updates" to previously seen entries/items.
> Similarly, relying on atom:id + atom:updated implies a policy of only being
> interested in content changes which are explicitly flagged by the publisher
> as being "worthy of notice."

Well if you won't trust info in the feed, there's nothing any format
can do for you. Get thee some venture capital and hire a few
linguistics/statistics undergrads to work out a real "significant
update" algorithm for yourself. If you're complaining because people
are putting tracking info in links, you're obviously not doing a very
good job at this at the moment.

Graham

RE: Selfish Feeds...

2005-05-06 Thread Walter Underwood


--On May 6, 2005 4:37:23 PM -0400 Bob Wyman <[EMAIL PROTECTED]> wrote:

>   Frankly, I really wish that we had done the "blog architecture" work
> many months ago so that we would all have a shared understanding of the
> system-wide issues and components rather than the widely divergent personal
> and partial views that are obvious in many of our conversations today...

Agreed. "A conceptual model of a resource" is up there at the front of
our charter, and if we don't have that, it doesn't seem like the WG is done.

wunder
--
Walter Underwood
Principal Architect, Verity

Re: PaceAllowDuplicateIDs alteration

2005-05-06 Thread Antone Roundy

On Friday, May 6, 2005, at 09:16  AM, Bob Wyman wrote:
> Graham wrote:
>> "If an Atom Feed Document contains multiple entries with the
>> same atom:id, software MUST treat them as multiple versions of
>> the same entry"
> Are they still the same entry if they have different source elements
> that identify their source as being different feeds?

In a perfect world with no malicious, undereducated, misinformed,
intellectually challenged or other people who don't mint ids
appropriately, yes, they're the same entry.  In the real world, I have 
no idea.  A human looking at them could probably determine whether 
they're the same if they're different enough, but if they're 
substantially similar, then even a human wouldn't necessarily be able 
to determine whether they're the same or whether one is a malicious 
alteration.  There's no automated way to decide (unless their contents 
are identical).

Authors of consuming applications will have to decide whether or not to 
obey the commandment from the spec (if adopted) to treat them as being 
the same, or whether to give their users the option of making that 
decision.  Specifying that publishers who publish the same entry in 
multiple feeds MUST choose one to be the "original" source and express 
the rest as aggregated entries from that feed would make it much easier 
to justify treating them as different entries if they claimed to 
originate in different feeds.
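
One way a consuming application might act on this is sketched below: entries are treated as versions of the same entry only when both the atom:id and the claimed source feed match, and the newest version of each is kept. The dictionary keys ("id", "updated", "source") and the function name are hypothetical, and nothing here is taken from the Pace text.

```python
from collections import defaultdict


def latest_versions(entries):
    """Group entries by (claimed source feed id, entry id); keep the newest.

    Each entry is a dict with hypothetical keys "id", "updated" (an ISO 8601
    string, all in the same timezone so that string comparison orders them)
    and an optional "source" naming the feed the entry claims to come from.
    """
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.get("source"), entry["id"])].append(entry)
    return {
        key: max(versions, key=lambda e: e["updated"])
        for key, versions in groups.items()
    }
```

Entries that share an id but claim different source feeds end up under different keys, so the application (or its user) can decide separately whether to believe they are the same entry.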

Re: Autodiscovery discussion & editorship

2005-05-06 Thread Paul Hoffman

At 8:29 PM -0700 5/5/05, Tim Bray wrote:
>Paul, is there any reason Mark or Phil shouldn't submit the most
>recent autodiscovery-01 as a committee draft?

No reason at all. We can poke at it, regardless of the -nn.txt numbering.

--Paul Hoffman, Director
--Internet Mail Consortium

RE: PaceSourceRecs


Antone Roundy wrote:
> Start by calculating the language of the atom:feed and
> the atom:entry.  Second, if the language of atom:entry isn't
> the same as the aggregate feed, set it. Third, if the
> language of atom:feed isn't the same as the atom:entry,
> set it.
I'm curious about this "calculation" word... The Pace says that
"Language values should be calculated, according to the rules of
[W3C.REC-xml-20040204], by processing the xml:lang values of the element in
question and its ancestors."
This raises an interesting point... What should we do if we know
that the language that results from the calculation is not the actual
language of the entry? For instance, language recognition technology is
relatively well known and works with reasonable accuracy these days. The
fact is that the language fields in both RSS and existing Atom files are
wrong in a very large percentage of cases. (i.e. folk writing Chinese blogs
on US-based systems are getting feeds that claim "en-us" as the language and
often have no way to correct the language tag since US developers hardly
ever think about I18N issues...)
We've had folk suggest to us that we should run Language Recognition
algorithms to calculate the actual language of entries and then "fix" their
tags in cases where the tags are missing or clearly wrong. I've resisted
such suggestions if only because I can't figure out where to write the
"correct" language tags. What would you suggest we do? Or, should we do
nothing? What would you suggest we do when trying to insert an entry into an
aggregate feed if the original entry had no language tag yet we are very
confident that our Language Recognition code has determined the actual
language of the entry and it is not the same as the language of the
enclosing feed?

bob wyman
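
For readers unsure what the "calculation" amounts to, here is a minimal sketch of the xml:lang inheritance rule from the XML Recommendation: walk from the element up through its ancestors and take the nearest xml:lang value. The function name is hypothetical; the parent map is needed only because Python's ElementTree does not keep parent pointers.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the xml:lang in scope for target, or None if none is declared.

    An empty string means the language was explicitly reset to "unknown".
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if XML_LANG in node.attrib:
            return node.attrib[XML_LANG]
        node = parents.get(node)
    return None
```

This computes only what the markup declares. Whether the declared value matches the language the text is actually written in is a separate question, and is the one raised above.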

Re: PaceOriginalAttribute

Tim Bray wrote:
> +1
> I'm not 100% convinced it solves the problems Rob says it does, but it
> seems cheap, lightweight, and unlikely to cause any harm. -Tim

I'm growing increasingly comfortable with the concept of allowing 
redistributors to assign new ids as long as they track the original 
(i.e.: not immediate predecessor!) entry.

That being said, I have two things I want to think more about w.r.t. how 
this Pace is currently worded.  (Note: the first is actually only a nit 
concerning the current draft, not the Pace itself):

1) "MAY be preserved"
   I would prefer if this were recast not so much as an RFC 2119
   interoperability statement, but rather as a definitional one.
   "original attributes in atom:source elements are used to
   indicate..."  Something along those lines.
2) "Atom feeds MUST NOT contain duplicate atom:id values"
   How did that get in there?  Depending on how you read it, this Pace
   is incompatible with PaceAllowDuplicateIDs, implying that Tim's
   +1 above can be construed as voting against a Pace he authored!

Whether or not the work group comes to consensus about allowing multiple 
entries to share the same ID in a feed, it still is true that entries 
may change over time.  Perhaps atom:source elements could also define an 
@updated attribute.  As atom:updated is a required element for 
atom:entry, it would not be an unreasonable burden to require @updated 
attributes on atom:source elements.

- Sam Ruby

RE: Selfish Feeds...


Sam Ruby wrote:
> It seems to me that instead of adding a dynamic content flag, having
> a separate id element (or in RSS 2.0's case, utilizing the guid
> element) would be more to the point.
Relying on a GUID alone only works if you implement a policy that
says that you are only interested in seeing content with "new" GUIDs and you
are willing to ignore any "updates" to previously seen entries/items.
Similarly, relying on atom:id + atom:updated implies a policy of only being
interested in content changes which are explicitly flagged by the publisher
as being "worthy of notice." These are certainly appropriate policies for
*some* aggregators addressing *some* user needs. However, other aggregators
implement different policies which address other user needs. For instance,
many aggregators will update their content stores whenever *any* change
occurs in an item whether or not the GUID or Atom:id has changed. Some of
these aggregators will flag any change as a "new or unread" entry. (which I
think is a really stupid policy...) Others will, like Gush, distinguish
between "new" items and "updated" items. (I think this is much more
sensible, others will say it is overly complex and unnecessary.)
Conceivably, once Atom is released, some aggregators will wish to record
three states for an entry: "new", "major update" and "minor update." (I
would support anyone doing this, others would not.)
To understand this issue and many other syndication issues, it is
vital that you try to consider the full range of policies that are
implemented by aggregators and that you try to look beyond your personal
preferences. Please try to understand that this isn't a simple issue -- at
least not from the point of view of a channel intermediary like PubSub. As
was recently pointed out, a very large percentage of the HTTP specification
covers issues related to proxies (which is very much the role that PubSub
plays.) The same is true of the State Management (Cookie) RFC. I remember
that when we were working on that RFC, proxy issues were just about the
*only* thing we discussed... Problems which are "simple" in point to point
networks become much more complex when you introduce intermediaries.
Frankly, I really wish that we had done the "blog architecture" work
many months ago so that we would all have a shared understanding of the
system-wide issues and components rather than the widely divergent personal
and partial views that are obvious in many of our conversations today...

bob wyman
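
The three aggregator policies described above can be contrasted in a small sketch. The policy names, the dictionary keys, and the use of a SHA-256 digest to detect content changes are all assumptions made for illustration; they are not taken from any particular aggregator.

```python
import hashlib


def classify(store, entry, policy):
    """Return "new", "updated" or "unchanged" for an incoming entry.

    store maps entry id -> (updated, content digest) and is modified in place.
    entry is a dict with hypothetical keys "id", "updated" and "content".
    """
    digest = hashlib.sha256(entry["content"].encode("utf-8")).hexdigest()
    seen = store.get(entry["id"])
    if seen is None:
        store[entry["id"]] = (entry["updated"], digest)
        return "new"
    old_updated, old_digest = seen
    if policy == "id-only":
        # Only never-seen ids matter; later changes are ignored.
        changed = False
    elif policy == "id-and-updated":
        # Trust the publisher's flag: a newer timestamp means an update.
        changed = entry["updated"] > old_updated
    elif policy == "any-change":
        # Any difference in content counts, flagged or not.
        changed = digest != old_digest
    else:
        raise ValueError(f"unknown policy: {policy}")
    if changed:
        store[entry["id"]] = (entry["updated"], digest)
        return "updated"
    return "unchanged"
```

The same incoming entry can therefore be "unchanged" for one aggregator and "updated" for another, which is why an intermediary cannot assume a single policy on behalf of its subscribers.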

Re: FeedId

Robert Sayre wrote:
> On 5/6/05, Sam Ruby <[EMAIL PROTECTED]> wrote:
>> At the moment, alternate link is the element of record.
>
> Do any applications use it as such? In my experience, applications use
> the URI they retrieved the feed from as the feed identifier. For
> example, I believe every single pubsub.com match feed uses the same
> alternate link.

My thoughts mirror this discussion from just barely over a month ago:

  http://www.imc.org/atom-syntax/mail-archive/msg14008.html

Would people be happy with the following?  If so, I would gladly write
up a Pace:

  Alternate Link: OPTIONAL
  Self Link: SHOULD (rationale: existing pain point)
  ID: MUST

- Sam Ruby
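
To make the proposed requirement levels concrete, a feed checker following them might look roughly like the sketch below. It is an illustration only: the namespace URI is an assumption (the namespace was not yet settled in the drafts under discussion), and the function name and message strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, used here only for illustration.
ATOM = "{http://www.w3.org/2005/Atom}"


def check_feed(xml_text):
    """Return (errors, warnings) for the feed-level requirements proposed above."""
    feed = ET.fromstring(xml_text)
    errors, warnings = [], []
    # ID: MUST
    id_element = feed.find(ATOM + "id")
    if id_element is None or not (id_element.text or "").strip():
        errors.append("missing atom:id (MUST)")
    # Self Link: SHOULD. A link without rel is treated as "alternate" here.
    rels = [link.get("rel", "alternate") for link in feed.findall(ATOM + "link")]
    if "self" not in rels:
        warnings.append("missing self link (SHOULD)")
    # Alternate Link: OPTIONAL, so its absence is not reported.
    return errors, warnings
```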

Re: Selfish Feeds...

Bob Wyman wrote:
> I've got another example of a selfish feed which is producing dynamic
> content which will cause many duplicate entries to float around the
> blogosphere. The feed in question here is an RSS feed. Nonetheless, I think
> we must expect people will do the same stupid tricks with Atom feeds. Check
> out:
>
> http://www.b-eye-network.com/xml/articles.php
>
> What you'll get is a feed with entries that look something like the one at
> the bottom of this page. The interesting thing to note is that the item has
> a <link> element with the url:
>
>   http://www.b-eye-network.com/view/index.php?
>    cid=836&fc=0&frss=1&ua=Mozilla/4.0 (compatible;
> MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; Alexa Toolbar)
>
> What's happened here is that the site has appended my User Agent to the URL
> in the link. I assume that this is to allow some kind of tracking. However,
> the impact is that the contents of the feed depend on what tool you use to
> read the feed. If you access the feed, you will undoubtedly get different
> content than I did... For instance, if PubSub's crawler had read the feed,
> the value of the "ua" attribute in the URL would have been different and the
> URL would have read:
>
>    http://www.b-eye-network.com/view/index.php?
> cid=836&fc=0&frss=1&ua=PubSub.com RSS reader -
> http://www.pubsub.com/
>
> If this feed is read by more than one synthetic feed generator or if items
> from the feed are copied from this feed to another, it is inevitable that
> we'll have multiple copies of the item floating around and we'll have very
> little means for determining which one is authoritative -- essentially they
> all are. It would be handy to have a "dynamic content flag" that allows us
> to ignore this stuff...

It seems to me that instead of adding a dynamic content flag, having a 
separate id element (or in RSS 2.0's case, utilizing the guid element) 
would be more to the point.

- Sam Ruby
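
A minimal sketch of that suggestion: when an item carries an id (or guid), use it as the duplicate-detection key and fall back to the link only when no id is present. The dictionary keys and the function name are hypothetical.

```python
def entry_key(item):
    """Return a key for duplicate detection.

    item is a dict with hypothetical keys "guid" and "link". The id/guid is
    preferred because the link may vary with who fetched the feed.
    """
    guid = (item.get("guid") or "").strip()
    if guid:
        return ("guid", guid)
    return ("link", item.get("link", ""))
```

Two copies of an item fetched by different user agents then map to the same key as long as the publisher supplies a stable id; without one, the per-reader links still make them look like distinct items.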

Re: PaceSourceRecs


On 5/6/05, David Powell <[EMAIL PROTECTED]> wrote:

> I'd be less suspicious of being fobbed off, if our charter was updated
> to include the Implementor's Guide as a deliverable.

Indeed. This sounds like BCP material.

Robert Sayre

Re: FeedId

On 5/6/05, Sam Ruby <[EMAIL PROTECTED]> wrote:

> 
> At the moment, alternate link is the element of record.

Do any applications use it as such? In my experience, applications use
the URI they retrieved the feed from as the feed identifier. For
example, I believe every single pubsub.com match feed uses the same
alternate link.

> -1 to PaceOptionalFeedLink if it is to be accepted in isolation or to be
> portrayed as "inconsistent" with any other feed id related paces (hey, I
> might be dumb, but I learn quick)
> 
> +1 to any Pace that moves this responsibility to a more appropriate home.

I'd like to point out that writing a proposal doesn't necessarily
imply support on the part of the author.

Robert Sayre

Re: PaceSourceRecs

2005-05-06 Thread David Powell

Thursday, May 5, 2005, 11:42:22 PM, Tim Bray wrote:

> -1

> Irrespective of whether I agree with this or not, I think the material
> belongs in an implementor's guide, not the specification.  -Tim

I'm a bit uncomfortable with punting yet another issue into the
Implementor's Guide when the WG has made no commitment to produce an
Implementor's Guide.

I'd be less suspicious of being fobbed off, if our charter was updated
to include the Implementor's Guide as a deliverable.

-- 
Dave

Re: Selfish Feeds...


On 6 May 2005, at 8:41 pm, Bob Wyman wrote:
> I've got another example of a selfish feed which is producing dynamic
> content which will cause many duplicate entries to float around the
> blogosphere. The feed in question here is an RSS feed. Nonetheless, I think
> we must expect people will do the same stupid tricks with Atom feeds. Check
> out:

Bob, what the fuck do you think ids are for?
Graham

Re: FeedId

Graham wrote:
> On 6 May 2005, at 4:28 pm, Bill de hÓra wrote:
>> That there is consensus we'll want to identify a feed, even if we
>> can't provide a link.
>
> I'd +1 an ordinary "make feed:id required" Pace.

I'd +1 any Pace that has a chance of achieving consensus that makes at 
least one element that can be used as a feed identifier mandatory.

At the moment, alternate link is the element of record.

-1 to PaceOptionalFeedLink if it is to be accepted in isolation or to be 
portrayed as "inconsistent" with any other feed id related paces (hey, I 
might be dumb, but I learn quick)

+1 to any Pace that moves this responsibility to a more appropriate home.

- Sam Ruby

Selfish Feeds...


I've got another example of a selfish feed which is producing dynamic
content which will cause many duplicate entries to float around the
blogosphere. The feed in question here is an RSS feed. Nonetheless, I think
we must expect people will do the same stupid tricks with Atom feeds. Check
out:

http://www.b-eye-network.com/xml/articles.php

What you'll get is a feed with entries that look something like the one at
the bottom of this page. The interesting thing to note is that the item has
a <link> element with the URL:

  http://www.b-eye-network.com/view/index.php?cid=836&fc=0&frss=1&ua=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; Alexa Toolbar)

What's happened here is that the site has appended my User Agent to the URL
in the link. I assume that this is to allow some kind of tracking. However,
the impact is that the contents of the feed depend on what tool you use to
read the feed. If you access the feed, you will undoubtedly get different
content than I did... For instance, if PubSub's crawler had read the feed,
the value of the "ua" attribute in the URL would have been different and the
URL would have read:
   http://www.b-eye-network.com/view/index.php?cid=836&fc=0&frss=1&ua=PubSub.com RSS reader - http://www.pubsub.com/

If this feed is read by more than one synthetic feed generator or if items
from the feed are copied from this feed to another, it is inevitable that
we'll have multiple copies of the item floating around and we'll have very
little means for determining which one is authoritative -- essentially they
all are. It would be handy to have a "dynamic content flag" that allows us
to ignore this stuff...

This business of people including dynamic content in their feeds for selfish
purposes is making it very difficult to build a decent infrastructure for
distributing and caching RSS/Atom entries... We've got a "tragedy of the
commons" situation going on here. The much-too-respectable SEO crowd is
trying to seek profit at the expense of the network at large... Because they
can.

bob wyman

== Full example entry from the feed ==

  Nanotechnology Basics Defined
  http://www.b-eye-network.com/view/index.php?cid=836&fc=0&frss=1&ua=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; Alexa Toolbar)
  Thu, 5 May 2005 00:00:00 MST

RE: Autodiscovery


Sjoerd Visscher wrote:
> [HTML 4.01 says:] This attribute describes the relationship from
> the current document to the anchor specified by the href attribute.
> The value of this attribute is a space-separated list of link types.
But, if you copy HTML from one document to another, or you construct
an HTML document from parts, you risk carrying <a> tags with rel attributes
from one document to another. If I quote some HTML in a new HTML document
and the quoted HTML includes rel="alternate" in an <a> tag, are we really
saying that the presence of rel="alternate" in the quoted text establishes a
relation of the new HTML document as a whole?
Personally, I think there is a serious scoping problem here. We've
got attributes of separable components of a page establishing metadata for
the page as a whole. Not good.

bob wyman

Re: FeedId

On 6 May 2005, at 4:28 pm, Bill de hÓra wrote:
> That there is consensus we'll want to identify a feed, even if we
> can't provide a link.

I'd +1 an ordinary "make feed:id required" Pace.

PaceFeedIdOrSelf is too awful an idea for words.

Graham

RE: entry definition

Henry Story wrote:
> An Atom Entry is a resource (identified by atom:id) whose
> representations (atom:entry) describe the state of a web resource
> at a time (the link alternate).

I think that if this is not 100% "correct" then it is at least very
close to whatever correct actually is. 

bob wyman

Re: Atom feed refresh rates

2005-05-06 Thread Walter Underwood

--On May 5, 2005 10:53:48 AM -0700 John Panzer <[EMAIL PROTECTED]> wrote:
> 
> I assume an HTTP Expires header for Atom content will work and play well with
> caches such as the Google Accelerator (http://webaccelerator.google.com/). 
> I'd also guess that a syntax-level tag won't.  Is this important? 

The syntax-level tag is useful inside a client program with a cache.
It can reduce the number of requests at the source, rather than 
reducing them in the middle of the network at an HTTP cache.

There is extra benefit from putting that info into the HTTP headers,
because the HTTP cache is shared between multiple clients. The source
webserver sees one GET per HTTP cache instead of one GET per Atom client.

The syntax-level tag also provides a way for the feed author to specify the
info without depending on webserver-specific controls. It does depend on
some extra bit of software to take that info and put it in the HTTP
Expires or Cache-control headers.
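
As a rough illustration of that "extra bit of software", the sketch below turns a feed-level refresh interval into HTTP caching headers. The function name and the idea that the interval arrives as a number of seconds are assumptions for the example; no such element is defined in the thread above.

```python
from email.utils import formatdate


def cache_headers(ttl_seconds, now):
    """Build HTTP caching headers from a feed-level time-to-live.

    ttl_seconds: refresh interval read from the feed (hypothetical source).
    now: current time as seconds since the epoch.
    """
    if ttl_seconds < 0:
        raise ValueError("ttl_seconds must be non-negative")
    return {
        # Relative lifetime, understood by HTTP/1.1 caches
        "Cache-Control": f"max-age={int(ttl_seconds)}",
        # Absolute expiry time in HTTP date format (always GMT)
        "Expires": formatdate(now + ttl_seconds, usegmt=True),
    }
```

For example, `cache_headers(3600, 0)` yields `Cache-Control: max-age=3600` together with an `Expires` date one hour after the epoch. A shared HTTP cache can then answer many Atom clients from one GET, which is the benefit described above.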

wunder
--
Walter Underwood
Principal Architect, Verity

Re: Autodiscovery

2005-05-06 Thread Sjoerd Visscher

Nikolas 'Atrus' Coukouma wrote:
> Using @rel with any linking element is perfectly valid and has been for
> years.
>
> @rel not being supported for anything other than the link element itself
> has also been an outstanding bug for just as long. There's lot of debate
> attached to at least one Mozilla bug (#57399 [1] - filed on 2000-10-20).
>
> Can we agree that this should be supported, but currently isn't? Unless
> there's a compelling reason not to, I think we might as well allow
> autodiscovery via either element. Any implementation guide should
> recommend duplicating the information in the interest of autodiscovery
> actually working.
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=57399

Yes, absolutely, that was my point. As David Baron says in Bugzilla: 
"The spec was designed with the idea that any application that looked at 
rel/rev on LINK elements should do the same for A elements."
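
A minimal sketch of what looking at rel on both element types could mean for an autodiscovery client, using Python's standard `html.parser`. The class name is hypothetical, and treating LINK and A identically is the assumption under discussion here, not something the autodiscovery draft is known to require:

```python
from html.parser import HTMLParser


class AlternateFinder(HTMLParser):
    """Collect href values from LINK and A elements whose rel includes 'alternate'."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        # HTML 4.01: rel is a space-separated list of link types
        rel_types = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "alternate" in rel_types and href:
            self.found.append((tag, href))
```

Feeding a page to `AlternateFinder().feed(html_text)` and reading `found` gives candidates from both elements. The sketch does not check the `type` attribute, and it makes no attempt to tell a page's own links from links inside quoted HTML.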

--
Sjoerd Visscher
http://w3future.com/weblog/

Re: PaceAllowDuplicateIDs alteration

On 6 May 2005, at 4:16 pm, Bob Wyman wrote:
> Graham wrote:
>> "If an Atom Feed Document contains multiple entries with the
>> same atom:id, software MUST treat them as multiple versions of
>> the same entry"
> Are they still the same entry if they have different source elements
> that identify their source as being different feeds?

Why wouldn't they be? It would mean "Here's how the entry looked when
it was published in feed A" and "Here's how the entry looked when it
was in feed B". But if the publisher has assigned them the same ID,
that's a fairly clear expression that they're versions of the same
thing.

(obviously there's the danger of spoofing, but that's a general  
problem with IDs and not something that needs to be noted in every  
sentence)

Graham

Re: PaceAllowDuplicateIDs alteration

2005-05-06 Thread Tim Bray

On May 6, 2005, at 8:49 AM, Eric Scheid wrote:
>> Are they still the same entry if they have different source elements
>> that identify their source as being different feeds?
>
> I don't see why. I subscribe to a Local News feed, a National News feed, and
> a Science News feed. All from the same publisher. The same story may appear
> in one, two, or three of those feeds. I don't believe each of those feeds
> would have the same feed/source values.

Right but the story's <entry> would have the same atom:id in 
each of those feeds, right?  So they are the same entry, right? -Tim

Re: FeedId

Bill de hÓra wrote:
> Sam Ruby wrote:
>> Bill de hÓra wrote:
>>> PaceFeedIdOrAlternate, withdrawn, no comment
>>> PaceFeedIdOrSelf  0
>>> PaceOptionalFeedLink +1
>>>
>>> I agree with the rationale; no point making people things they can't
>>> do. I'm assuming that if PaceOptionalFeedLink goes through feed:id is
>>> a MUST.
>>
>> On what do you base that assumption?
>
> That there is consensus we'll want to identify a feed, even if we can't
> provide a link.

At the moment, there is not such a proposal on the table for the chairs 
to even determine if there is consensus around it.

My take (as a non-chair) is that after a lengthy discussion about this, 
people did want *something* that they could use to identify the feed. 
Some initially rallied around feed link, some around id, some around 
feed rel=self.  A number of proposals surfaced.

The one proposal on the table that has a chance is PaceFeedIdOrSelf, but 
it is in trouble.

So, at current course and speed, I see us coming to a conclusion that 
feed link be made optional with nothing to replace it.

- Sam Ruby

Re: PaceAllowDuplicateIDs alteration


On 5/6/05, Bob Wyman <[EMAIL PROTECTED]> wrote:
> 
> Graham wrote:
> > "If an Atom Feed Document contains multiple entries with the
> > same atom:id, software MUST treat them as multiple versions of
> > the same entry"
> Are they still the same entry if they have different source elements
> that identify their source as being different feeds?

I would say yes. Entries IDs are globally unique. If they aren't the
same entry, it's a collision, right?

Robert Sayre

entry definition

2005-05-06 Thread Henry Story

Some have been clamoring for a good definition of an entry.
Here is one I have thought of recently.
An Atom Entry is a resource (identified by atom:id) whose representations
(atom:entry) describe the state of a web resource at a time
(the link alternate).

Any thoughts?
Henry

Re: A quick note on process

On 5/6/05, Tim Bray <[EMAIL PROTECTED]> wrote:
> So if you have a strong feeling, pro or contra, about any of the Paces
> currently in play, please make sure you've put it on the record since
> then.

PaceOptionalSummary addresses a last call issue. I expect every single
last call comment will be taken into account.

Robert Sayre

Re: PaceAllowDuplicateIDs alteration

On 7/5/05 1:16 AM, "Bob Wyman" <[EMAIL PROTECTED]> wrote:

> Graham wrote:
>> "If an Atom Feed Document contains multiple entries with the
>> same atom:id, software MUST treat them as multiple versions of
>> the same entry"
> Are they still the same entry if they have different source elements
> that identify their source as being different feeds?

I don't see why. I subscribe to a Local News feed, a National News feed, and
a Science News feed. All from the same publisher. The same story may appear
in one, two, or three of those feeds. I don't believe each of those feeds
would have the same feed/source values.

e.

Re: FeedId

2005-05-06 Thread Bill de hÓra

Sam Ruby wrote:
> Bill de hÓra wrote:
>> PaceFeedIdOrAlternate, withdrawn, no comment
>> PaceFeedIdOrSelf  0
>> PaceOptionalFeedLink +1
>>
>> I agree with the rationale; no point making people things they can't
>> do. I'm assuming that if PaceOptionalFeedLink goes through feed:id is
>> a MUST.
>
> On what do you base that assumption?

That there is consensus we'll want to identify a feed, even if we can't 
provide a link.

cheers
Bill

Re: FeedId

Bill de hÓra wrote:
> PaceFeedIdOrAlternate, withdrawn, no comment
> PaceFeedIdOrSelf  0
> PaceOptionalFeedLink +1
>
> I agree with the rationale; no point making people things they can't do.
> I'm assuming that if PaceOptionalFeedLink goes through feed:id is a MUST.

On what do you base that assumption?

- Sam Ruby

Re: PaceOptionalFeedLink

Sam Ruby wrote:
Something that WOULD attract my attention is somebody saying "here is a 
set of feeds that I would like to provide that I can't provide in a 
valid way according to any of the available RSS specifications."
Sam,
would:
 
work for you? This could make/encourage those who don't have alternates 
to say so and might allow downstream code to make better assumptions. If 
you like the idea, and we're not past the point, I'll write a pace.

cheers
Bill

Re: PaceAllowDuplicateIDs alteration

Graham wrote:
> As the WG may have noticed, I have some serious problems with the Pace.
> One small change would eliminate about 75% of them:
>
> Replace the line:
> "If an Atom Feed Document contains multiple entries with the same
> atom:id, software MAY choose to display all of them or some subset of
> them"
>
> with:
> "If an Atom Feed Document contains multiple entries with the same
> atom:id, software MUST treat them as multiple versions of the same entry"
>
> I don't think this changes the technical meaning of the proposal, but
> does make it very explicit.
>
> Would anyone object to this change?

Looks promising. Would you replace "versions" with "representations"?

cheers
Bill

RE: PaceAllowDuplicateIDs alteration


Graham wrote:
> "If an Atom Feed Document contains multiple entries with the 
> same atom:id, software MUST treat them as multiple versions of
> the same entry"
Are they still the same entry if they have different source elements
that identify their source as being different feeds? 

bob wyman

RE: PaceAllowDuplicateIDs


Eric Scheid wrote:
> a feed is a stream of *instantiations* of an entry.
> Put another way, the map is not the territory, the <entry> is not
> the 'entry'.
Right. We have abstract feeds and entries and we have concrete feeds
and entries. The abstract feed is the actual stream of entries and updates
to entries as they are created over time. Feed documents are "concrete"
snapshots of this stream or abstract feed of entries. An abstract entry is
made "concrete" in entry documents or entry elements. An abstract entry may
change over time and may have one or more concrete instantiations.
Some applications are only interested in being exposed to those
concrete entries that reflect the "current" or "most recent" state of the
abstract entries -- these apps would prefer to see no duplicate ids in
concrete feed documents even though these duplicates *will* occur in the
abstract feed. Other applications will require visibility to the entire
stream of changes to abstract entries -- these applications will wish to see
concrete feeds that may contain multiple, differing concrete instantiations
of abstract entries. i.e. they will want the concrete feed to be an accurate
representation of the abstract feed. Two needs, two views...

bob wyman

Status

PaceEntryState -1
PacePubControl -1, strongly
PacePubStatusResource 0
I agree very much with Tim's ob about decoupling state from format for 
PaceEntryState. I think PacePubControl has a similar looking problem, so 
I don't believe in PacePubControl, as Tim does. That's because it puts 
the control code into the format and creates a new processing directive 
as a consequence of whether we're in protocol mode or not. That couples 
protocol with format in a messy way. The pub:control directive should 
wrap the entry not be embedded in it.

I'm 0 on PacePubStatusResource, as I think it's the right approach, but 
I'm holding for someone with WebDAV/REST clout, who isn't Robert, to 
confirm whether they think it's ok.

cheers
Bill

Re: Atom Implementation Guide: seeking co-editor


On 6/5/05 11:51 PM, "Mark Pilgrim" <[EMAIL PROTECTED]> wrote:

> In case this got buried in the previous thread, last fall I wrote but
> never publicized a minimal draft of an Atom Implementation Guide.
> 
> http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html
> 
> Due to my commitments elsewhere, I can not promise that I will ever
> finish this alone.  If the WG wants to publish an implementation guide
> as an RFC, I need a co-editor who is willing to write -- not provide
> feedback, not sit in the peanut gallery and poke at it with sharp
> pointy objects, but actually *write* -- a substantial portion of it.
> You would have complete control over its direction and be listed as
> the primary author; I may be able to contribute specific sections as
> time permits.

I believe I can commit to this.

You've provided a good outline, which is 80% of the work (writing the
content is the other 80% ;-)

e.

Re: Atom Implementation Guide: seeking co-editor

2005-05-06 Thread Svgdeveloper

Hi Mark,

Like you, I am not short of things to do but would be willing to contribute to this process.

I guess the first thing to be clarified is whether the WG wants an Implementation Guide or not. Then, assuming they do, an idea of hoped-for timetable and scope would be good.

Since I have, on the whole, sat back and let the discussion/war flow on this list I guess I should briefly mention some relevant credentials. Since writing text seems to be the main task which needs to be done I will focus comments in that direction.

I guess the most recent relevant writing is as co-author with Danny Ayers of Beginning RSS & Atom Programming which should be published by Wrox any day now.

I have also written all or part of various XML or XML-associated books including Pro XML (2nd Edition), Beginning XML (3rd Edition), Pro XSL, Teach Yourself XML in 10 Minutes, XPath Essentials, XML Schema Essentials and Designing SVG Web Graphics.

I have also, in times past, poked hard with the pointy stick at assorted W3C working drafts including XSLT 2.0, XPath 2.0, XQuery 1.0, SVG 1.x and XForms 1.0.

If anyone well-qualified has lots of time on their hands I would be happy to let them proceed with the task. :)

Andrew Watt

In a message dated 06/05/2005 14:52:28 GMT Daylight Time, [EMAIL PROTECTED] writes:

In case this got buried in the previous thread, last fall I wrote but
never publicized a minimal draft of an Atom Implementation Guide.

http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html

Due to my commitments elsewhere, I can not promise that I will ever
finish this alone. If the WG wants to publish an implementation guide
as an RFC, I need a co-editor who is willing to write -- not provide
feedback, not sit in the peanut gallery and poke at it with sharp
pointy objects, but actually *write* -- a substantial portion of it.
You would have complete control over its direction and be listed as
the primary author; I may be able to contribute specific sections as
time permits.

Please reply on-list.

--
Cheers,
-Mark

Re: PaceAllowDuplicateIDs

Graham wrote:
> On 6 May 2005, at 2:10 pm, Dave Johnson wrote:
>> Yes, I think both of my arguments fail to hold and I no longer have a
>> real objection to duplicates. Allowing duplicates lets feed producers
>> model events or other objects (versioned documents in a wiki) as
>> they wish. Like you, I wonder "Does anyone remember why having the
>> same id in a feed is a bad idea?"
>
> Because instead of a fixed model where a feed is a stream of entries
> each with their own id, it is now a stream of entries each of which
> does not have its own id, but shares it with similar entries. This is
> bullshit.

No, it's not, not yet. You can't reasonably call bullshit on this either 
way until we know what's being identified. When I boil it down, this is 
what I get:

"the technical problem we have is that we can't distinguish between a 
buggy feed with the same ids and an aggregate feed with the same ids 
under the current spec"

I have no good answer to that until I know what an id stands for. 
The answer "an entry" isn't sufficient.

cheers
Bill

Re: PaceAllowDuplicateIDs alteration

On 7/5/05 12:09 AM, "Graham" <[EMAIL PROTECTED]> wrote:

> "If an Atom Feed Document contains multiple entries with the same
> atom:id, software MUST treat them as multiple versions of the same
> entry"
> 
> I don't think this changes the technical meaning of the proposal, but
> does make it very explicit.

+1, with one minor amendment.

s/versions/instantiations/

the spec uses that word elsewhere, and 'versions' might suggest a media
adaptation, language variant, etc.

e.

FeedId

PaceFeedIdOrAlternate, withdrawn, no comment
PaceFeedIdOrSelf  0
PaceOptionalFeedLink +1
I agree with the rationale; no point making people things they can't do. 
I'm assuming that if PaceOptionalFeedLink goes through feed:id is a MUST.

cheers
Bill

Re: Atom Implementation Guide: seeking co-editor


On 7/5/05 12:39 AM, "Bill de hÓra" <[EMAIL PROTECTED]> wrote:

>> http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html
>> 
>> Due to my commitments elsewhere, I can not promise that I will ever
>> finish this alone.  If the WG wants to publish an implementation guide
>> as an RFC, I need a co-editor who is willing to write -- not provide
>> feedback, not sit in the peanut gallery and poke at it with sharp
>> pointy objects, but actually *write* -- a substantial portion of it.
>> You would have complete control over its direction and be listed as
>> the primary author; I may be able to contribute specific sections as
>> time permits.
> 
> Tim at a time into my inbox earlier than the above:
> 
> "Given the volume of debate, it's obvious there may be more work to do
> here.  Paul and I have asked Phil Ringnalda to co-edit the autodiscovery
> spec, and he's accepted.  "
> 
> Crossed wires?

auto discovery spec != implementation guide

e.

Re: Atom Implementation Guide: seeking co-editor

Anne van Kesteren wrote:
> Bill de hÓra wrote:
>> Tim at a time into my inbox earlier than the above:
>>
>> "Given the volume of debate, it's obvious there may be more work to do
>> here.  Paul and I have asked Phil Ringnalda to co-edit the
>> autodiscovery spec, and he's accepted.  "
>>
>> Crossed wires?
>
> Euh, the autodiscovery specification is not really equal to the
> implementation guide.

To tell you the truth, in all this excitement, I've kinda lost track 
myself. My crossed wires then, never mind.

cheers
Bill

Re: Atom Implementation Guide: seeking co-editor

2005-05-06 Thread Anne van Kesteren

Bill de hÓra wrote:
> Tim at a time into my inbox earlier than the above:
>
> "Given the volume of debate, it's obvious there may be more work to do
> here.  Paul and I have asked Phil Ringnalda to co-edit the autodiscovery
> spec, and he's accepted.  "
>
> Crossed wires?

Euh, the autodiscovery specification is not really equal to the 
implementation guide.

--
 Anne van Kesteren

EntryId

PaceAllowDuplicateIDs +1.
PaceDuplicateIDWithSource2 -1
PaceDuplicateIDWithSource -1
PaceExplainDuplicateIds +1
We have no clear reason to disallow the use-case, until we put our cards 
on the table re what's being identified. So either PaceAllowDuplicateIDs 
 or PaceExplainDuplicateIds  are fine by me.

An alternative option that might get consensus is to pace a profile that 
switches the duplicate per feed restriction off; the security question 
there is whether anyone will pay attention to such a directive.

cheers
Bill

A quick note on process

2005-05-06 Thread Tim Bray

Your co-chairs would like to ask a favor of the group.  When doing our 
last consensus call, it's going to be prohibitively difficult to go 
back to the beginning of time.  Thus, we'd like to start with Sam's 
announcement of the issues list revision at 
http://www.imc.org/atom-syntax/mail-archive/msg14742.html

So if you have a strong feeling, pro or contra, about any of the Paces 
currently in play, please make sure you've put it on the record since 
then.

Thanks in advance. -Tim

Re: PaceAllowDuplicateIDs

On 6/5/05 11:37 PM, "Graham" <[EMAIL PROTECTED]> wrote:

> Because instead of a fixed model where a feed is a stream of entries
> each with their own id, it is now a stream of entries each of which
> does not have its own id, but shares it with similar entries. This is
> bullshit.

See the spec:

[...] Put another way, an atom:id element pertains to all
instantiations of a particular Atom entry or feed; revisions
retain the same content in their atom:id elements. [...]

This clearly implies that a feed is a stream of *instantiations* of an
entry.

Put another way, the map is not the territory, the <entry> is not the
'entry'.

e.

Re: Atom Implementation Guide: seeking co-editor

Mark Pilgrim wrote:
> In case this got buried in the previous thread, last fall I wrote but
> never publicized a minimal draft of an Atom Implementation Guide.
>
> http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html
>
> Due to my commitments elsewhere, I can not promise that I will ever
> finish this alone.  If the WG wants to publish an implementation guide
> as an RFC, I need a co-editor who is willing to write -- not provide
> feedback, not sit in the peanut gallery and poke at it with sharp
> pointy objects, but actually *write* -- a substantial portion of it.
> You would have complete control over its direction and be listed as
> the primary author; I may be able to contribute specific sections as
> time permits.

Tim at a time into my inbox earlier than the above:
"Given the volume of debate, it's obvious there may be more work to do 
here.  Paul and I have asked Phil Ringnalda to co-edit the autodiscovery 
spec, and he's accepted.  "

Crossed wires?
cheers
Bill

Re: PaceAllowDuplicateIDs alteration

2005-05-06 Thread Tim Bray

On May 6, 2005, at 7:09 AM, Graham wrote:
> Replace the line:
> "If an Atom Feed Document contains multiple entries with the same
> atom:id, software MAY choose to display all of them or some subset of
> them"
>
> with:
> "If an Atom Feed Document contains multiple entries with the same
> atom:id, software MUST treat them as multiple versions of the same
> entry"

Hmmm; the Pace already says "If multiple atom:entry elements with the 
same atom:id value appear in an Atom Feed document, they represent the 
same entry."  So what you want is almost there.

> I don't think this changes the technical meaning of the proposal, but
> does make it very explicit.

My problem is that when you say "software MUST", I think you should 
follow up with something specific and testable.  This assertion feels 
fairly vague and leaves room for lots of argument.  Maybe a first step 
to tightening it up would be to provide some specific examples of 
software behaviors that would be forbidden/allowed by this MUST-clause.

-Tim [who has yet to express a +1 or any other final opinion about this 
Pace]

PaceTextShouldBeProvided

-1.

I agree with Robert; it conflicts with PaceOptionalSummary and I doubt 
it would exist if PaceOptionalSummary had not made the cut. At best this 
level of specification belongs in the Fabled Implementor's Guide, not 
the specification.

cheers
Bill

Re: PaceAllowDuplicateIDs alteration

2005-05-06 Thread Brett Lindsley


What determines a "version"? If we have multiple entries with identical
information, these are copies, not versions.

The real issue is feed state. Given there are identical IDs, can we
determine if the entries are identical or different? The end client can
then deal with it any way it wants (e.g.):
- Keep only the most recent.
- Display them all.
- Combine them all into a single entry with a history.
- Show changes.
- Drop duplicates.
- etc.
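
Two of the options listed above ("keep only the most recent" and "combine them all into a single entry with a history") can be sketched as follows. This is only an illustration: entries are assumed to be plain dicts with `id` and `updated` keys, and `updated` is assumed to be a timestamp already normalized so that plain comparison orders it correctly (for example, UTC strings in one fixed format). The function names are hypothetical.

```python
from collections import defaultdict


def group_versions(entries):
    """Group entry dicts by 'id', each group sorted oldest to newest by 'updated'."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["id"]].append(entry)
    for versions in groups.values():
        # sort is stable, so entries with equal timestamps keep feed order
        versions.sort(key=lambda e: e["updated"])
    return dict(groups)


def latest_only(entries):
    """'Keep only the most recent': one entry per id, in first-seen id order."""
    return [versions[-1] for versions in group_versions(entries).values()]
```

`group_versions` gives the per-entry history view and `latest_only` the collapsed view. Neither can tell a true copy from a changed version when two instances carry the same `updated` value, which is the open question raised above.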

Sorry I don't have an answer here, just an observation of the exact problem.
Brett

Graham wrote:

"If an Atom Feed Document contains multiple entries with the same  
atom:id, software MUST treat them as multiple versions of the same  
entry"

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

2005-05-06 Thread Bill de hÓra

Sam Ruby wrote:
> Bill de hÓra wrote:
>> 3. It's the kind of spec text we have rejected in the past as
>> implementation-specific and/or "best current practice" guidance:
>>
>>  "those that follow these suggestions will find that their feeds are
>> useful to a wider audience than they would be otherwise."
>
> Um, that text is not part of the proposal.  It is part of the rationale.

I know. I don't see the point of considering PaceTextShouldBeProvided if 
the rationale doesn't hold up.

cheers
Bill

RE: PaceAllowDuplicateIDs


Graham wrote:
>>"Does anyone remember why having the same id in a feed is a bad idea?"
> Because instead of a fixed model where a feed is a stream of
> entries each with their own id, it is now a stream of entries each
> of which does not have its own id, but shares it with similar
> entries. This is bullshit.
I completely disagree on this.
I think the problem here is people focusing too much on
characteristics of the feed when the real issue here is Entries. Like I've
said in the past, "It's about the Entries, Stupid!" (don't take offense...)
As long as we allow entries to be updated, it is inevitable that the
stream of entries that is created over time will contain instances of
entries that share common atom:id values. 
The only question here is whether or not we're willing to allow a
feed document to *accurately* represent the stream of entries -- as they
were created -- or whether we insist that the feed document "censor" the
history of the stream by removing old instances of updated entries before
allowing updates to be inserted.
The reality is that no matter which decision we make in this case,
any useful aggregator must have code to deal with multiple instances of
entries that share the same atom:id. This is the case since even if we don't
permit duplicate IDs in a single instance of a feed document, we would still
permit duplicate IDs *over time*. Because duplicate ids appear, over time,
whenever you update an entry, the aggregator has to have all the logic
needed to handle them in the *stream* of entries that it reads -- over time.
This issue only becomes interesting if we try to provide special
rules for the handling of data within a single instance of a feed document.
The reality is, however, that any aggregator that actually pays attention to
these special case rules is going to either get more complex (since it can't
simply treat everything as a stream of entries) or it will get confused
(since folk will intentionally or unintentionally create duplicate ids).
This ban on duplicate ids provides no benefit for aggregators, it
makes feed producers more complex, it tempts aggregator or client writers to
do dangerous things, it forces deletion of data that is useful to some
people for some applications, it puts too much emphasis on "feeds" when we
should be working on "entries", etc... It is a really bad thing to do.
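
To make the "stream of entries" argument concrete, here is a small sketch of an aggregator store that handles repeated ids the same way whether they arrive in one feed document or across many fetches. The class and status names are hypothetical, and entries are assumed to be dicts whose `updated` values compare correctly as given.

```python
class EntryStore:
    """Track the newest known instance of each entry across many feed fetches."""

    def __init__(self):
        self.current = {}   # atom:id -> newest entry seen so far
        self.history = []   # accepted instances, in arrival order

    def ingest(self, entry):
        """Return 'new', 'updated', or 'stale-or-duplicate' for one entry."""
        known = self.current.get(entry["id"])
        if known is None:
            status = "new"
        elif entry["updated"] > known["updated"]:
            status = "updated"
        else:
            # Same or older timestamp: nothing newer to record
            return "stale-or-duplicate"
        self.current[entry["id"]] = entry
        self.history.append(entry)
        return status
```

The same `ingest` call serves both cases, so a per-document ban on duplicate ids would not remove any of this logic from the aggregator.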

bob wyman

PaceAllowDuplicateIDs alteration

As the WG may have noticed, I have some serious problems with the  
Pace. One small change would eliminate about 75% of them:

Replace the line:
"If an Atom Feed Document contains multiple entries with the same  
atom:id, software MAY choose to display all of them or some subset of  
them"

with:
"If an Atom Feed Document contains multiple entries with the same  
atom:id, software MUST treat them as multiple versions of the same  
entry"

I don't think this changes the technical meaning of the proposal, but  
does make it very explicit.

Would anyone object to this change?
Graham

Re: PaceAllowDuplicateIDs

2005-05-06 Thread Brett Lindsley


Unique IDs allow clients to determine the state of the feed.  If entry ids are
not unique, then we still need some other way to determine the unique
state of the feed. If we allow duplicate IDs but *require* something else
to be different (e.g. update time), then we can still determine the unique
state of a feed and repeated IDs are OK. We would need to properly
document which elements make an entry unique in the event of a duplicated
ID. Brett.
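
A sketch of the check this would imply, assuming the distinguishing element is the update time: within one feed document, every (id, updated) pair should occur at most once. The function name and the dict-based entry shape are hypothetical.

```python
from collections import Counter


def repeated_instances(entries):
    """Return (id, updated) pairs that occur more than once in a feed document."""
    counts = Counter((e["id"], e["updated"]) for e in entries)
    return [key for key, n in counts.items() if n > 1]
```

An empty result means the feed state is unambiguous under this rule even if ids repeat. A non-empty result flags instances that cannot be told apart by id and update time alone.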

On 6 May 2005, at 2:10 pm, Dave Johnson wrote:
Yes, I think both of my arguments fail to hold and I no longer have  
a real objection to duplicates. Allowing duplicates lets feed
producers model events or other objects (versioned documents in a
wiki) as they wish. Like you, I wonder "Does anyone remember why  
having the same id in a feed is a bad idea?"

Atom Implementation Guide: seeking co-editor

2005-05-06 Thread Mark Pilgrim


In case this got buried in the previous thread, last fall I wrote but
never publicized a minimal draft of an Atom Implementation Guide.

http://diveintomark.org/rfc/draft-ietf-atompub-impl-guide-00.html

Due to my commitments elsewhere, I can not promise that I will ever
finish this alone.  If the WG wants to publish an implementation guide
as an RFC, I need a co-editor who is willing to write -- not provide
feedback, not sit in the peanut gallery and poke at it with sharp
pointy objects, but actually *write* -- a substantial portion of it. 
You would have complete control over its direction and be listed as
the primary author; I may be able to contribute specific sections as
time permits.

Please reply on-list.

-- 
Cheers,
-Mark

Re: Autodiscovery discussion & editorship

2005-05-06 Thread Mark Pilgrim

On 5/5/05, Tim Bray <[EMAIL PROTECTED]> wrote:
> The discussion in recent days has been lively but unstructured.  If I
> were forced to make a consensus call right now, I'm pretty sure I
> wouldn't be able to pick out any one spec change that I could say
> clearly has consensus.

The one suggestion I did see, which should be acted on immediately, is
to update the references section to point to the newest versions of
the XML and URI specs (and associated link changes throughout the
text).

-- 
Cheers,
-Mark

Re: PaceAllowDuplicateIDs

On 6 May 2005, at 2:10 pm, Dave Johnson wrote:
Yes, I think both of my arguments fail to hold and I no longer have  
a real objection to duplicates. Allowing duplicates lets feed
producers model events or other objects (versioned documents in a
wiki) as they wish. Like you, I wonder "Does anyone remember why
having the same id in a feed is a bad idea?"

Because instead of a fixed model where a feed is a stream of entries
each with their own id, it is now a stream of entries each of which  
does not have its own id, but shares it with similar entries. This is  
bullshit.

Graham

Re: PaceOptionalFeedLink

2005-05-06 Thread Julian Reschke

Sam Ruby wrote:
Graham wrote:
On 6 May 2005, at 3:50 am, Sam Ruby wrote:
FYI: we have an instance proof of this requiring an existing tool  to 
do additional work:

  http://www.imc.org/atom-syntax/mail-archive/msg13983.html

Tools will have to be updated to work with Atom? Scandalous.
+1 to the Pace

This Pace is not one that I plan to lie down in the road over.  However, 
it continues to puzzle the bejeebers out of me.

The channel link element is required in every version of RSS from 0.91 
to 1.0 to 2.0.  And as a co-author of the feedvalidator, I have seen a 
lot of broken feeds where people have either inadvertently or 
deliberately ignored the specification, but I don't recall ever seeing 
one where this element was not present.
Which shows that those who don't have a useful link to report just make 
something up. If Atom requires the link, the situation will likely be 
the same. I just don't get how this is better than just stating "there's 
no alternative representation" instead.

Out of the three feeds I currently generate/author, only one has 
meaningful "alternate" version.

My concern is not that tools will need to be updated.  My concern is 
that tools won't know that they need to update.  How will they know that 
they need to update to handle a set of feeds that nobody is currently 
providing?
Which tools are you talking about? Tools that consume RSS variants, or 
tools that consume earlier Atom format versions? The latter will have to 
be upgraded anyway because of the changing XML namespace.

Something that WOULD attract my attention is somebody saying "here is a 
set of feeds that I would like to provide that I can't provide in a 
valid way according to any of the available RSS specifications."
As stated earlier, this is the case here. I had to make up the 
"alternate" link value just to satisfy the validator.

Best regards,
Julian

Re: PaceAllowDuplicateIDs

2005-05-06 Thread Dave Johnson


On May 6, 2005, at 5:17 AM, Bill de hÓra wrote:
Dave Johnson wrote:
Immediately after sending this message, I had a rush  of second 
thoughts.
My point #2 is not very well thought out. I think it applies for 
things like earthquake data, but when Atom feeds represent blog 
entries or articles (in an archive or an Atom Protocol feed) the  ID 
represents the article not an event in the blog entry's life.  So, 
you can discount my second reason against the pace.
Good, because not everyone would agree that's what's being modeled. 
Now I also think your 1) starts to fall away as well because 2) no 
longer holds. That is, some of the argument for "Current best 
practice" you're talking about (you sure about "best" in there? ;) is 
predicated on entries being like events. Aggregators tend, or would 
like, to treat entries as singular happenings. Finally some events are 
not easily modeled in the discrete way you're talking about (in turn 
some of that comes down to how you model time), I don't think we have 
to worry about those here.
Yes, I think both of my arguments fail to hold and I no longer have a 
real objection to duplicates. Allowing duplicates lets feed producers
model events or other objects (versioned documents in a wiki) as
they wish. Like you, I wonder "Does anyone remember why having the same 
id in a feed is a bad idea?"

- Dave

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

Bill de hÓra wrote:
3. It's the kind of spec text we have rejected in the past as 
implementation-specific and/or "best current practice" guidance:

 "those that follow these suggestions will find that their feeds are 
useful to a wider audience than they would be otherwise."
Um, that text is not part of the proposal.  It is part of the rationale.
- Sam Ruby

Re: PaceOptionalFeedLink

On 6 May 2005, at 1:26 pm, Sam Ruby wrote:
My concern is not that tools will need to be updated.  My concern  
is that tools won't know that they need to update.  How will they  
know that they need to update to handle a set of feeds that nobody  
is currently providing?
How is this different to any of the other new features in Atom? No  
one agrees with you on this point; If you don't have anything else,  
please stop making everyone else's life harder by labouring a point  
that doesn't affect you in any way.

Something that WOULD attract my attention is somebody saying "here  
is a set of feeds that I would like to provide that I can't provide  
in a valid way according to any of the available RSS specifications."
I have private RSS feeds showing new referrals for my websites. They  
do not have corresponding web pages, and don't have feed-level links.  
I think these kind of feeds make up a significant chunk of the demand  
for dropping the requirement.
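
For what it is worth, a consumer can cope with a missing feed-level link in a few lines. The sketch below assumes the namespace and the default rel of "alternate" from the Atom format as eventually published; drafts under discussion at this point may differ, and the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the published Atom format, not of the 2005 drafts.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def feed_alternate_link(feed_xml):
    """Return the feed-level alternate link href, or None if there is none."""
    root = ET.fromstring(feed_xml)
    # Only direct children of the feed element count as feed-level links.
    for link in root.findall(ATOM_NS + "link"):
        if link.get("rel", "alternate") == "alternate":
            return link.get("href")
    return None


# Hypothetical private feed with no corresponding web page.
private_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New referrals</title>
</feed>"""

public_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <link href="http://www.example.com/"/>
</feed>"""
```

Returning None lets the caller decide what to show, instead of forcing the publisher to make a link up.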

Graham

Re: PaceTextShouldBeProvided vs PaceOptionalSummary

Tim Bray wrote:
Speaking not as the chair but as an interested WG member,  I read them 
about eight times and I do not understand why they are in conflict.  
Someone please explain, as simply as possible, what the problem is, 
because I just don't get it.  On the face of it, I am inclined to be +1 
to both PaceOptionalSummary and PaceTextShouldBeProvided.

1. The pace's rationale has claims which have already been refuted by 
Robert and others in the discussion on optional summaries:

 "Encourage interoperability and accessibility"
this rationale has no merit, imo.
2. It has a bias that is squarely aimed at title only feeds, which is 
the outcome of PaceOptionalSummary

 "Unfortunately, there are also existence proofs of title-only feeds"
it clearly takes a shot across the bows of PaceOptionalSummary.
3. It's the kind of spec text we have rejected in the past as 
implementation-specific and/or "best current practice" guidance:

 "those that follow these suggestions will find that their feeds are 
useful to a wider audience than they would be otherwise."

We have a decision-making legacy that speaks for itself; this has not 
been demonstrated to be a special case we ought to cater for.

4. It would appear to contradict PaceOptionalSummary by highlighting that 
legal usage as bad practice. That's contradictory in spirit, and 
personally speaking it's the kind of wording and deliberate vagueness 
that infuriates me about software specs. Futzing about like this is 
showing poor form to the users of the spec.

2 alone should be enough for you. Technically these things are not in 
contradiction; in spirit they are.

I'm on the record already here:
http://www.imc.org/atom-syntax/mail-archive/msg14535.html
cheers
Bill

Re: PaceOptionalFeedLink

Graham wrote:
On 6 May 2005, at 3:50 am, Sam Ruby wrote:
FYI: we have an instance proof of this requiring an existing tool  to 
do additional work:

  http://www.imc.org/atom-syntax/mail-archive/msg13983.html
Tools will have to be updated to work with Atom? Scandalous.
+1 to the Pace
This Pace is not one that I plan to lie down in the road over.  However, 
it continues to puzzle the bejeebers out of me.

The channel link element is required in every version of RSS from 0.91 
to 1.0 to 2.0.  And as a co-author of the feedvalidator, I have seen a 
lot of broken feeds where people have either inadvertently or 
deliberately ignored the specification, but I don't recall ever seeing 
one where this element was not present.

My concern is not that tools will need to be updated.  My concern is 
that tools won't know that they need to update.  How will they know that 
they need to update to handle a set of feeds that nobody is currently 
providing?

Something that WOULD attract my attention is somebody saying "here is a 
set of feeds that I would like to provide that I can't provide in a 
valid way according to any of the available RSS specifications."

- Sam Ruby

PaceOptionalSummary

+1, in case it got lost in the earlier threads.
cheers
Bill

Re: Autodiscovery

2005-05-06 Thread Nikolas Coukouma

Sjoerd Visscher wrote:
Like I wrote before, this is not how HTML 4.01 (or XHTML 2.0 for that 
matter) defines the rel attribute on a hyperlink:

This attribute describes the relationship from the current document to 
the anchor specified by the href attribute. The value of this 
attribute is a space-separated list of link types.

Completely separately, to the anchor versus link debate:
"space-separated list of link types."
is a find worthy of points (redeemable at your local ego shop).
This means we can have our pie and eat it too.
If we define rel="feed", there's no reason it can't be combined with 
other rel types :) For example, we could allow
http://www.example.com/xml/index.atom";>
http://www.example.com/xml/index.atom";>

Using @rel for type-ish hinting is an ugly hack, but this allows us to 
at least let @rel function the way it was intended to in addition to the 
hack.
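
To make the "space-separated list" point concrete, here is a small sketch of how a consumer might test for a link type. It assumes a rel value of "feed", which is only a proposal in this thread, and the helper names are invented for the example. Link types are compared case-insensitively, as HTML 4.01 specifies.

```python
def rel_types(rel_value):
    """Split a rel attribute value into a set of lowercase link types."""
    # str.split() with no argument handles runs of spaces, tabs and newlines.
    return {token.lower() for token in rel_value.split()}


def is_feed_link(rel_value):
    """True if the proposed 'feed' link type is among the rel values."""
    return "feed" in rel_types(rel_value)
```

With this, rel="feed alternate" and rel="alternate" both keep their existing meaning, and only the first is treated as a feed.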

It's that or choose another attribute to be our victim
-Nikolas 'Atrus' Coukouma

Re: Atom on portable wireless device (was: RE: Atom feed refresh rates)

2005-05-06 Thread Janne Jalkanen


> You've written on your blog that you want to see more "304"
> responses. Well, I would suggest that what you *really* should want is more
> "226" responses -- 226 is the success code for an RFC3229+feed GET
> operation.

I like so agree.  226 support would be highly commendable for
everyone, who wants to serve feeds for mobiles...  But considering the
status these days, even 304 would be good. *sigh*
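
A sketch of the client side, for anyone who has not looked at this yet. It only builds request headers and interprets status codes, with no network access; the function names and return labels are assumptions for the example, and the A-IM value follows my understanding of the RFC3229+feed convention.

```python
def conditional_headers(etag=None, last_modified=None, want_delta=False):
    """Build request headers for a polite feed poll."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if want_delta:
        # Ask the server for only the entries added since the given ETag.
        headers["A-IM"] = "feed"
    return headers


def interpret_status(status):
    """Map an HTTP status code to what the client should do with the body."""
    if status == 304:
        return "unchanged"  # nothing to download, keep the cached copy
    if status == 226:
        return "delta"      # body holds only the new entries
    if status == 200:
        return "full"       # body holds the complete feed
    return "other"
```

A server that understands neither header simply answers 200 with the full feed, so sending them costs a mobile client nothing.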

Though, the real solution probably lies in notification protocols,
such as SIP.  Reduce the need of polling, do a proper SIP
subscribe-notify...  Of course, these are not solutions for current
devices.

/Janne

Re: PaceOptionalFeedLink

2005-05-06 Thread Julian Reschke

Graham wrote:

On 6 May 2005, at 3:50 am, Sam Ruby wrote:
FYI: we have an instance proof of this requiring an existing tool  to 
do additional work:

  http://www.imc.org/atom-syntax/mail-archive/msg13983.html

Tools will have to be updated to work with Atom? Scandalous.
+1 to the Pace
+1 as well. That something which has been developed against a previous 
draft will not work with a change in the format seems to be quite natural.

On the other hand, we also heard of feeds that need to "make up" links 
(which doesn't seem very useful to me).

Best regards,
Julian

Re: PaceAllowDuplicateIDs

Dave Johnson wrote:
Immediately after sending this message, I had a rush  of second thoughts.
My point #2 is not very well thought out. I think it applies for things 
like earthquake data, but when Atom feeds represent blog entries or 
articles (in an archive or an Atom Protocol feed) the  ID represents the 
article not an event in the blog entry's life.  So, you can discount my 
second reason against the pace.
Good, because not everyone would agree that's what's being modeled. Now 
I also think your 1) starts to fall away as well because 2) no longer 
holds. That is, some of the argument for "Current best practice" you're 
talking about (you sure about "best" in there? ;) is predicated on 
entries being like events. Aggregators tend, or would like, to treat 
entries as singular happenings. Finally some events are not easily 
modeled in the discrete way you're talking about (in turn some of that 
comes down to how you model time), I don't think we have to worry about 
those here.

Put it this way, under either "this is an event stream" or "this is a 
stream of entries", having multiple entries in a single feed is not an 
unreasonable request. We came up with the id approach to allow people to 
zero in on duplicates; that's the primary case. We did that without 
really articulating what an entry stands for, some effort was done 
post-hoc, but it doesn't seem to have made it as spec text. Consider. Is 
the XML entry in a feed a representation of an entry, a la REST? If so, 
does the id identify the representation or the entry? If the id 
identifies the entry representation, how (or should) we identify the 
entry? If the id identifies the entry, how (or need) we identify the 
representation?

Those are just some of the questions we could ask.  We could then ask 
the whole set over regarding what a feed is, as that has a bearing too.

I've said this before - the technical problem we have is that we can't 
distinguish between a buggy feed with the same ids and an aggregate feed 
with the same ids under the current spec. Because you can't have it both 
ways, that rationale should have been provided in the spec. It's an 
architectural constraint - you can not say this with your feed because 
we can't make sense of it -  done, not to preserve some notion of 
identity we have here, but to allow people to drop duplicates and 
normalize their streams. The downside is that 1) some people do want to 
aggregate versions of an entry in a single feed, which presumably have 
the same id, so "they or others can say, these are all of the same 
thing", 2) some people do re-edit their entries or edit their entry 
dates around so the entry reappears with updated content. Bray does this 
with his Sunday server logs and no-one convinced me it's not equally as 
questionable an approach at some level as allowing duplicate ids, or 
munging URLs the way ad people do - who cares as long as it gets into 
people's clients?

And it's clear now, that ids don't solve "the get rid of all those 
duplicates problem", dates are required also, to cater for cases where 
someone updates the entry, but we don't want people to miss that because 
of over-eager reaping on the clients.
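
A minimal sketch of the point about dates, assuming a client keeps the latest update time it has seen for each id. The function name, the status labels, and the in-memory dict are all hypothetical.

```python
def classify(seen, entry_id, updated):
    """Classify an incoming entry against the ids and dates already seen.

    `seen` maps entry id to the latest update time stored for it.
    """
    previous = seen.get(entry_id)
    if previous is None:
        status = "new"
    elif updated > previous:
        status = "updated"    # same id, newer date: do not reap it
    else:
        status = "duplicate"  # same id, same or older date: safe to drop
    if status != "duplicate":
        seen[entry_id] = updated
    return status


seen = {}
first = classify(seen, "tag:example.org,2005:1", "2005-05-06T10:00:00Z")
again = classify(seen, "tag:example.org,2005:1", "2005-05-06T10:00:00Z")
edited = classify(seen, "tag:example.org,2005:1", "2005-05-06T12:00:00Z")
```

Comparing on the id alone would have thrown the edited entry away along with the true duplicate.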

It's a mess, that's our fault. As a first step we need to be able to say 
what's being identified. If we decide we don't care about that, we just 
want to illegalise duplicates, then maybe ids were not the right idea to 
begin with.

cheers
Bill

Re: PaceAllowDuplicateIDs

Robert Sayre wrote:
I'm much more sympathetic to the aggregate feed problem of multiple
IDs. People advocating this type of thing seem to think the default
action should be grouping, so they want to use the same ID. I think
that's a bad idea, and there are plenty of other ways to indicate the
fundamental sameness of entries. For example, NewsML URNs have a
NewsItemID and a RevisionID, which would allow smart aggregators to
group the entries without violating Atom's constraint.
Then you have two ways of indicating fundamental sameness of entries, 
one for when the same entry appears multiple times in a feed, and one 
for everything else.

Back to basics then. Does anyone remember why having the same id in a 
feed is a bad idea?

cheers
Bill

Re: Autodiscovery

Sjoerd Visscher wrote:

>
> fantasai wrote:
>
>>
>> The difference between <link> and <a> is that
>>   - <link> applies to the document as a whole: it indicates a
>> relationship between this document and the href destination.
>>   - <a> is a contextual link: it indicates a relationship between the
>> linking context and the href destination.
>>
>> They have different purposes. It is imho perfectly reasonable to limit
>> autodiscovery to <link>s only. It is also perfectly reasonable to link
>> to feeds with <a>, and expect that the UA will recognize it as a feed
>> rather than a generic XML document.
>>
>
> Like I wrote before, this is not how HTML 4.01 (or XHTML 2.0 for that
> matter) defines the rel attribute on a hyperlink:
>
> This attribute describes the relationship from the current document to
> the anchor specified by the href attribute. The value of this
> attribute is a space-separated list of link types.

Using @rel with any linking element is perfectly valid and has been for
years.
@rel not being supported for anything other than the link element itself
has also been an outstanding bug for just as long. There's a lot of debate
attached to at least one Mozilla bug (#57399 [1] - filed on 2000-10-20).

Can we agree that this should be supported, but currently isn't? Unless
there's a compelling reason not to, I think we might as well allow
autodiscovery via either element. Any implementation guide should
recommend duplicating the information in the interest of autodiscovery
actually working.
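
Roughly what "autodiscovery via either element" could look like in a consumer, as a sketch only. It assumes the proposed rel value "feed" and uses Python's standard html.parser; the class name, function name, and sample page are invented for the example.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect (tag, href) pairs for link and a elements with rel 'feed'."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        # rel is a space-separated, case-insensitive list of link types.
        rel = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "feed" in rel and href:
            self.feeds.append((tag, href))


def find_feeds(html):
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page that duplicates the information in both elements.
page = """<html><head>
<link rel="feed alternate" href="http://www.example.com/xml/index.atom">
</head><body>
<a rel="feed" href="http://www.example.com/xml/index.atom">Subscribe</a>
<a href="/about">About</a>
</body></html>"""

found = find_feeds(page)
```

A real client would probably also want to collapse the two hits into one, since they point at the same URL.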

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=57399

-Nikolas 'Atrus' Coukouma

Re: Autodiscovery - different cases should use different rel

2005-05-06 Thread fantasai

Nikolas 'Atrus' Coukouma wrote:
fantasai wrote:
Actually, I think "start" is the best fit. The main feed is often not a
table of contents to the entire weblog, but something partial. It is,
however, the "starting point of the collection".
Actually, I disagree with start because of the first sentence in the
HTML spec:
"Refers to the first document in a collection of documents."
This indicates that start should point to the first post in a weblog.
end would be the most recent (not that end exists in the HTML spec)
I think "first" here should not be taken as chronological, but as
foremost. This would make it consistent with the second sentence:
"This link type tells search engines which document is considered by the
author to be the starting point of the collection."
This is a completely different meaning and I'm not sure why it's bundled
with the first. According to this, start pointing to the homepage is fine.

BTW, you might want to take a look at
 http://fantasai.tripod.com/qref/Appendix/LinkTypes/ltdef.html
 http://fantasai.tripod.com/qref/Appendix/LinkTypes/alphindex.html
No offense, but with all the tripod ads, I would have much preferred a
link to the "Hypertext links in HTML" draft [1].
Sorry.
Section four is what I want. It's not indexed alphabetically and
doesn't combine other documents, but it covers everything
pretty well.
[1] http://www.w3.org/MarkUp/draft-ietf-html-relrev-00.txt
Not quite everything. Some of the values (like "start") are not
covered in that draft.
~fantasai

Re: Autodiscovery - different cases should use different rel