Mark Nottingham wrote:
Also, if a client doesn't visit for a long time, it will see
http://journals.aol.com/panzerjohn/abstractioneer/atom.xml?page=2&count=10 and assume it already has all of the entries in it, because it's fetched that URI before.

Yeah. That's what I was worried about too. The couple of test feeds I've subscribed to haven't had any new entries yet, so I can't be sure, but with URLs like that I don't see how it can possibly work.

Did you find that algorithm wrong, too hard to understand/implement, or did you just do a different take on it? Does the approach that you took end up having the same result?

The problem I had with the algorithm was that it required two passes: a first pass to gather all the links, starting with the current feed document and moving back in time through the archives, and a second pass to actually process the documents, starting with the oldest and moving forward in time. That meant either retrieving everything twice or caching every document retrieved. Neither option sounded particularly appealing to me.
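
In rough Python, the two-pass approach works out to something like the following. fetch_document(), history_link(), and process() are hypothetical placeholders for the actual HTTP and Atom plumbing, not anything from a real library:

    def two_pass(feed_uri):
        # Pass 1: walk back through the archives, collecting document
        # URIs from newest to oldest.
        uris = []
        uri = feed_uri
        while uri is not None:
            uris.append(uri)
            doc = fetch_document(uri)   # first retrieval of each document
            uri = history_link(doc)     # next-oldest document, or None

        # Pass 2: process oldest-first, which means either fetching
        # everything a second time or having cached it all in pass 1.
        for uri in reversed(uris):
            process(fetch_document(uri))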

My implementation does everything in one pass. I start by processing the current feed document. If it contains a history link that I haven't seen before, I retrieve and process that document next, and repeat until there are no more links or I hit a link I've already seen. There are subtle differences in the results you'd get from my algorithm, and technically what you're suggesting is more accurate, but I don't think the differences are significant enough to care about.
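
With the same hypothetical helpers as above, the one-pass version is roughly:

    def one_pass(feed_uri):
        seen = set()
        uri = feed_uri
        while uri is not None and uri not in seen:
            seen.add(uri)
            doc = fetch_document(uri)
            process(doc)                # processed as retrieved
            uri = history_link(doc)     # stops on a repeat or a missing link

The seen set is what lets it stop early instead of re-walking archives it already has; the trade-off is that documents end up being processed newest-first rather than oldest-first, which is where the subtle differences come from.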

Other than that, I skip steps 1 and 2, and I default to using the "next" link relation (with a fallback to "previous" and "prev"). I may consider adding support for fh:complete at some point, but for now I'm sticking with Microsoft's cf:treatAs.
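
The fallback order amounts to a lookup along these lines, assuming the links have already been pulled out of the feed document as (rel, href) pairs:

    def find_history_link(links):
        # Try each relation in order of preference.
        for rel in ("next", "previous", "prev"):
            for link_rel, href in links:
                if link_rel == rel:
                    return href
        return None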

Regards
James
