Re: Latest IE7 release 'AtomicRSS' output comparison results

2006-03-22 Thread David Powell


Wednesday, March 22, 2006, 5:13:05 AM, M. David Peterson wrote:

 Hey Folks,
   
 With yesterday's build release of IE7, it seemed appropriate to run
 a quick inventory check to see where things stand in regards to the
 derived MS/RSS conversion of a fairly element/attribute usage
 heavy Atom feed.  Here's the overall breakdown.
 [...]
 Beyond this, it seems that everything else *SHOULD*  be able to map
 back fairly well.

There haven't been many changes to the transformation process in this
build, so all of the 15 issues with the Atom transformation in the old
build are still issues with this one.

http://www.imc.org/atom-syntax/mail-archive/msg17898.html

[Quick summary of actual data-loss:

loss of person extensions, loss of timezones/corruption of times, loss
of [EMAIL PROTECTED], titles are flattened to text without preserving
HTML version, loss of category label, xml:base/xml:lang loss,
inheritance of atom:author and atom:rights not handled
]

The last issue perhaps needs some more explanation:

In Atom, the following two entries are equivalent:

a)

<atom:feed>
  <atom:author><atom:name>David Powell</atom:name></atom:author>
  [...]
  <atom:entry>
    [...]
  </atom:entry>
</atom:feed>

b)

<atom:feed>
  [...]
  <atom:entry>
    <atom:author><atom:name>David Powell</atom:name></atom:author>
    [...]
  </atom:entry>
</atom:feed>

The same inheritance also applies to some other elements such as
atom:rights, and xml:base/lang suffer similar issues.

Superficially it seems that there is a problem with IE7's rendering, in
that http://www.tbray.org/ongoing/ongoing.atom doesn't display Tim
Bray's name next to the entries.

But actually the problem is deeper than that. Because you only
preserve the latest instance of the feed metadata, if it were up to the
client of the API to examine the feed author and manually inherit it
every time it wanted to display the author of an entry, then the
entry would inherit the wrong author whenever the feed author had
changed since the entry was polled.

E.g. feed-producing code may put atom:author only on the feed, unless
the feed contains entries with different authors, in which case it
would add it to the entries too.

Basically you can't require the client of the API (e.g. IE7) to perform
the inheritance, because it needs to inherit the author from the feed
document as it was when the entry appeared in it, not as it is now.

The solution, I expect, is to copy any elements that should inherit
down into the entry during the normalisation process. That way the
display of Tim's feed gets fixed, clients don't need to worry about
inheritance, and author and rights attributions of old entries don't
get mangled by future modifications to the feed document.

Ideally you should perform inheritance from atom:source too, as
described in the RFC.
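
A minimal sketch of that normalisation step, in Python with the standard xml.etree.ElementTree module. It is an illustration only, not how IE7 or the MS feed platform does it. The function name push_down_inherited and the sample feed are hypothetical. The lookup order for authors (entry, then atom:source, then feed) follows the RFC's description; rights are copied from the feed only.

```python
import copy
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}


def push_down_inherited(feed):
    """Copy inherited atom:author / atom:rights into each entry, in place.

    Hypothetical helper: after this runs, every entry carries its own
    author and rights, so later changes to the feed metadata cannot
    alter the attribution of entries that were stored earlier.
    """
    feed_authors = feed.findall("atom:author", NS)
    feed_rights = feed.find("atom:rights", NS)
    for entry in feed.findall("atom:entry", NS):
        if not entry.findall("atom:author", NS):
            # Prefer authors from atom:source, then fall back to the feed.
            source = entry.find("atom:source", NS)
            source_authors = (
                source.findall("atom:author", NS) if source is not None else []
            )
            for author in source_authors or feed_authors:
                entry.append(copy.deepcopy(author))
        if entry.find("atom:rights", NS) is None and feed_rights is not None:
            entry.append(copy.deepcopy(feed_rights))
    return feed


# Hypothetical sample feed, made up for this sketch.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>David Powell</name></author>
  <rights>Copyright 2006</rights>
  <entry><title>Inherits from feed</title></entry>
  <entry><title>Has own author</title><author><name>Someone Else</name></author></entry>
</feed>"""

feed = push_down_inherited(ET.fromstring(SAMPLE))
for entry in feed.findall("atom:entry", NS):
    names = [n.text for n in entry.findall("atom:author/atom:name", NS)]
    print(entry.findtext("atom:title", namespaces=NS), names)
```

The copy is made once, when the entry is stored, so the entry keeps the author it had in the feed document it arrived in.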

  The areas that are currently untested, and potentially a point of
 concern (that I can think of off the top of my head, anyway)
   
  * undefinedContent of element atom:category

I think it is perfectly reasonable to discard undefined content
(such as namespaced attributes on Atom elements). If you want an
extension, use an extension element. If you want to sprinkle
attributes everywhere on the assumption that implementations are going
to preserve whatever document you pass to them verbatim - well, don't
be too disappointed.

  * any extended usage of xml:base and xml:lang

Proper preservation of these two is essential. In particular the
baseURI for each feed needs to be preserved after resolving it
relative to the document URI (and Content-Location if you're after
extra credit); the baseURI for each entry needs to be preserved after
resolving it relative to the baseURI for the feed.

The baseURI for entries needs to be stored independently of the feed
metadata, otherwise redirecting the feed, or changing its base, will
retroactively break the baseURIs of older entries.
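
The two-stage resolution described above can be sketched in Python with urllib.parse.urljoin. This is only an illustration: the function name absolute_bases, the example.org URIs and the sample feed are made up, and Content-Location handling is left out.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom"}
# ElementTree exposes the xml: prefix under its reserved namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def absolute_bases(feed, document_uri):
    """Return (feed_base, [entry_base, ...]) as absolute URIs.

    Hypothetical helper: the feed's xml:base is resolved against the URI
    the document was fetched from, and each entry's xml:base is resolved
    against the feed's base. The entry bases are returned as absolute
    values so they can be stored with the entries themselves.
    """
    feed_base = urljoin(document_uri, feed.get(XML_BASE, ""))
    entry_bases = [
        urljoin(feed_base, entry.get(XML_BASE, ""))
        for entry in feed.findall("atom:entry", NS)
    ]
    return feed_base, entry_bases


# Hypothetical sample feed and document URI.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom" xml:base="/blog/">
  <entry xml:base="2006/03/"><title>A</title></entry>
  <entry><title>B</title></entry>
</feed>"""

feed_base, entry_bases = absolute_bases(
    ET.fromstring(SAMPLE), "http://example.org/feeds/atom.xml"
)
print(feed_base)
print(entry_bases)
```

Because each entry keeps its own absolute base, a later redirect of the feed or a change to the feed's xml:base does not touch entries that were stored earlier.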

 atom:published

atom:published isn't preserved as an exact copy. It is converted to an
RSS style date (with the time as-is, and the timezone set to GMT even
if it wasn't).
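
To make the difference concrete, here is a small Python sketch using only the standard library. The timestamp and its -07:00 offset are made up for illustration, and both function names are hypothetical. The first function converts the instant to GMT before formatting; the second reproduces the behaviour described above, keeping the clock time and relabelling it as GMT.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def atom_to_rss_date(atom_timestamp):
    """Convert an RFC 3339 timestamp with a numeric offset to an RSS-style GMT date."""
    moment = datetime.fromisoformat(atom_timestamp)
    # astimezone() shifts the clock time so the instant stays the same.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def mislabelled_rss_date(atom_timestamp):
    """Keep the clock time as-is and just call it GMT (the lossy behaviour)."""
    moment = datetime.fromisoformat(atom_timestamp)
    # replace() swaps the zone label without adjusting the clock time.
    return format_datetime(moment.replace(tzinfo=timezone.utc), usegmt=True)


sample = "2006-03-22T05:13:05-07:00"  # hypothetical atom:published value
print(atom_to_rss_date(sample))
print(mislabelled_rss_date(sample))
```

For this sample the two results differ by seven hours. Even the correct conversion discards the original offset, so an exact round trip back to Atom would still need the original atom:published value kept alongside.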

 Overall it seems to me the MS/RSS team has done a pretty fantastic
 job of ensuring a fairly high quality conversion, making exact
 copies of those elements and their associated attributes that did
 not allow for a clean conversion to the MS/RSS format and still
 maintain any hope whatsoever of making the round trip back to its
 original Atom format.

Hear, hear! With all of the high-profile coverage of RSS in the
publicity, I was expecting Atom to be either neglected, or supported
with much lower fidelity than it is now.

Most of the data-loss problems are minor (and therefore easy to fix
:). The serious problems are with inheritance (xml:base, xml:lang,
atom:author, and atom:rights).


-- 
Dave




Re: Latest IE7 release 'AtomicRSS' output comparison results

2006-03-22 Thread M. David Peterson
Hey David,

Wow! This is really cool :)

Since you have obviously gone to great effort to both test and extract this info, I will take the time to add these as tickets into the Trac database. For now I have added three categories:

defect:ms-related:internal-ms-fix-required


defect:ms-related:externally-fixable:no-ms-intervention-required


defect:ms-related:external-fix-possible:temporary


These should all link to a new ticket, with the specified ticket type, so if anybody finds additional issues, please don't hesitate to simply post a quick explanation, and either attach the related files or link directly to them.


David, beyond adding these into the system, is there anything else I can do that you feel might be helpful to ensure we can see a resolution for each of these issues, tracking the process from start to finish as we go?


I'll get started on this now.

Thanks again!



Re: Latest IE7 release 'AtomicRSS' output comparison results

2006-03-22 Thread M. David Peterson
Hmm... guess I was a bit off on the links... you need to be logged in as an admin for the links to take you anywhere useful, and even then it's not to where I had hoped.

http://trac.understandingatom.com/newticket

This link, however, will take you to the correct location, where you can select the issue from the drop-down. You do not need an account, and can post bugs as anonymous if you would prefer. I am happy to create accounts for those interested, but that's not something that needs to happen first, or ever, to be able to interact with the system.


Thanks again!


Re: Latest IE7 release 'AtomicRSS' output comparison results

2006-03-22 Thread M. David Peterson

 I'll get started on this now.

 Thanks again!

Okay, these are all entered:

http://trac.understandingatom.com/report/1

Thanks for snagging all of these, David! Please feel free to make updates/changes as you see fit, or any of the rest of you just the same... At this stage I think it seems appropriate to chat with the XML MVPs and get them involved with this process such that more can be accomplished via the proper, legally-blessed channels. I will start that process moving forward today.

I will continue updating the wiki interface as well as the tickets themselves. If something seems significant enough to broadcast to the list in general, I will make sure to do this. Otherwise the following links will gain you access to the various web feeds:

NOTE: Unfortunately, at present, Trac doesn't seem to support Atom -- I haven't noticed any plugins, but if anybody knows of one, can you let me know so we can utilize the format we all obviously prefer? If time allows, I will attempt to start tapping into my developing Python skills and develop the module for this -- although wait... never mind... I will just create a transformation file and convert the RSS 2.0 format into an Atom format, posting the link to the transformation process on the wiki when it's ready.

Active Tickets: http://trac.understandingatom.com/report/1?format=rss&USER=m.david
Wiki:RecentChanges: http://trac.understandingatom.com/wiki/RecentChanges
Timeline (collection of everything into one feed): http://trac.understandingatom.com/timeline?milestone=on&ticket=on&changeset=on&wiki=on&max=50&daysback=90&format=rss

Enjoy your day :)

-- 
M:D/M. David Peterson
http://www.xsltblog.com/


Latest IE7 release 'AtomicRSS' output comparison results

2006-03-21 Thread M. David Peterson
Hey Folks,

With yesterday's build release of IE7, it seemed appropriate to run a quick inventory check to see where things stand in regards to the derived MS/RSS conversion of a fairly element/attribute usage heavy Atom feed. Here's the overall breakdown.


Process:
I took a quick snapshot of the Atom feed from my personal blog -- put it on a medium dosage of Atom-RNC approved steroids (using Uche's latest RNC update http://copia.ogbuji.net/blog/2006-02-06/Small_fix_ ), and ran the result through the official live instance of the feedvalidator -- minus the incorrect encoding being reported (obviously a simple fix -- will do that after I finish the current mapping and related transformation file) it validated.

I then subscribed to this feed in the latest (March 20th) build of IE7, extracting the transformed result to compare against the original, looking for areas of potential data loss.

The two docs:

Original:
http://m.david-2.xsltransformations.com/atom-test.xml

Derived:
http://m.david-2.xsltransformations.com/xsltblog.atomicrss-sample.xml

Initial analysis: From what I can tell, it seems that there is only one significant loss that cannot be extracted, interpolated, or otherwise derived from somewhere within the transformed result doc:


Original:
<category label="foobar" scheme="http://www.xsltblog.com/tags" term="Internet Explorer 7.0"/>


Derived:
<category domain="http://www.xsltblog.com/tags">Internet Explorer 7.0</category>

Obviously the derived category element is missing: label="foobar"

I should note that it's only the first entry where I added @label and @scheme to the category element. The rest contain only the required @term.

I'm not sure if this, in and of itself, caused the loss. I guess it would depend on how they go about the conversion (e.g. placing weight on the number of occurrences of an attribute, disregarding that which falls below a certain determined criterion? I don't know... just throwing something out there for the sake of throwing something out there :).
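
The comparison above can be automated along these lines. This is a rough Python sketch, not part of the actual transformation files. The function name lost_attribute_values is hypothetical, and the two category elements are the ones quoted above, written as well-formed XML on the assumption that the archive stripped the angle brackets and quotes.

```python
import xml.etree.ElementTree as ET


def lost_attribute_values(original_xml, derived_xml):
    """Return attributes of the original element whose values appear
    nowhere in the derived element (neither as an attribute value nor
    as its text content)."""
    original = ET.fromstring(original_xml)
    derived = ET.fromstring(derived_xml)
    survived = set(derived.attrib.values())
    if derived.text:
        survived.add(derived.text)
    return {
        name: value
        for name, value in original.attrib.items()
        if value not in survived
    }


ORIGINAL = (
    '<category label="foobar" scheme="http://www.xsltblog.com/tags" '
    'term="Internet Explorer 7.0"/>'
)
DERIVED = (
    '<category domain="http://www.xsltblog.com/tags">'
    "Internet Explorer 7.0</category>"
)

print(lost_attribute_values(ORIGINAL, DERIVED))
```

This only checks whether each value survives somewhere in the derived element. It says nothing about why @label was dropped, which is the open question for the MS/RSS side.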


Sean, can you verify this and determine if it's something on your side of the lake, or on mine?

Beyond this, it seems that everything else *SHOULD* be able to map back fairly well.

The areas that are currently untested, and potentially a point of concern (that I can think of off the top of my head, anyway)

* undefinedContent of element atom:category
* any extended usage of xml:base and xml:lang

Some areas worth noting:

The following elements seem to be exact copies (including attributes) of the originals:

atom:link
atom:author
 atom:name
 atom:email
 atom:uri
atom:contributor
 atom:name
 atom:email
 atom:uri
atom:published
atom:summary

Overall it seems to me the MS/RSS team has done a pretty fantastic job of ensuring a fairly high quality conversion, making exact copies of those elements and their associated attributes that did not allow for a clean conversion to the MS/RSS format and still maintain any hope whatsoever of making the round trip back to its original Atom format.


A BIG PHAT thanks to the MS/RSS team for this! There's a significant amount of conversion work to get back to the original Atom format that is no longer required because of their efforts. So again, thanks to Sean and the rest of the MS/RSS folks for their extended efforts on this.


I will be posting this report to the http://trac.understandingatom.com/wiki/AtomicRSS.NET site, as well as adding the transformation files to the repository (available from this same interface) as soon as I can finish it up and verify to some level of certainty that the original file and the conversion back appear to be the same file. Of course, if this is simply not possible I will post what I have, and report the problem areas back to this thread.

Enjoy your dev-day! :)

-- 
M:D/M. David Peterson
http://www.xsltblog.com/