Chris DeSalvo wrote:
> As the author of an aggregator app for a portable wireless device I 
> can tell you that this is a serious problem for this class of products.
        You didn't list support for RFC3229+feed[1,2] as one of the things
you are doing. This would help you drastically reduce the bandwidth needed
when you find a feed that actually has new content. If you use RFC3229+feed
to pull a feed, then you will only get the new entries in the feed -- not
ones that you've copied over before. It's one step beyond If-None-Match,
etc.
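        As a rough sketch of what such a request looks like (this assumes
Python's standard urllib; the function name and the feed URL are made up for
illustration), the client sends "A-IM: feed" together with the ETag it saved
from its previous fetch:

```python
import urllib.request


def build_delta_request(url, etag=None):
    """Build (but do not send) a GET asking only for entries newer than `etag`.

    `url` and `etag` are whatever the client stored from its last fetch;
    both are hypothetical values in this sketch.
    """
    req = urllib.request.Request(url)
    # A-IM advertises the instance manipulations the client accepts;
    # "feed" is the one used by RFC3229+feed.
    req.add_header("A-IM", "feed")
    if etag:
        # The ETag tells the server which version the client already holds.
        req.add_header("If-None-Match", etag)
    return req


req = build_delta_request("http://example.org/feed.xml", etag='"abc123"')
```

A server that doesn't understand "A-IM: feed" should simply ignore the header
and answer with an ordinary 200 or 304, so sending it costs nothing.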
        But, the real problem with your approach is that you have apparently
coded the device so that it goes out and polls large numbers of feeds. This
doesn't make sense. For a portable wireless device with limited bandwidth
and limited connectivity, you should be accessing feeds via an intermediary
"proxy" that gathers up all your updates into a *single* feed. That feed
should be served using RFC3229+feed to ensure that you only copy from it the
updated entries since you last pulled from it. Of course, it would also make
sense to support compression on the results. There is no more efficient
mechanism for polling for feeds from the kind of device you describe.
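        To make the proxy idea concrete, here is a minimal sketch of the
merge step, assuming each entry carries a feed name, an id, and an update
timestamp (the Entry type and its field names are hypothetical, not part of
any spec). The proxy merges everything it has gathered and hands the device
only what is newer than the device's last pull:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    feed: str       # which source feed the entry came from
    entry_id: str   # the entry's own id
    updated: int    # update time, seconds since the epoch (assumed field)


def merge_since(feeds, last_pull):
    """Merge entries from all feeds, keeping only those updated after `last_pull`."""
    fresh = [
        entry
        for entries in feeds.values()
        for entry in entries
        if entry.updated > last_pull
    ]
    # Oldest first, so the device can append in order.
    return sorted(fresh, key=lambda e: e.updated)


feeds = {
    "a": [Entry("a", "a1", 100), Entry("a", "a2", 300)],
    "b": [Entry("b", "b1", 200), Entry("b", "b2", 400)],
}
single_feed = merge_since(feeds, last_pull=150)
```

The device then makes one request to the proxy instead of one per feed.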
        You say that you're reading about 20MB per day but you're only able
to harvest 2MB of "fresh" data from it? This 1/10 harvesting yield is
actually pretty normal when polling RSS/Atom feeds served without
RFC3229+feed. If you used RFC3229+feed, you would find that your yield would
start to approach 100% rather than the 10% you are at now. Additionally,
given the efficiencies here, you would be able to increase your polling
frequency almost arbitrarily without significantly increasing the bandwidth
consumption of your system. Thus, you could cut latency below the average of
30 minutes which is implied by a polling frequency of 1 hour.
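        The arithmetic behind those two figures can be sketched as follows
(function names are mine; the latency figure assumes new entries appear at
times spread evenly across the polling interval):

```python
def harvest_yield(fresh_mb, transferred_mb):
    """Fraction of the transferred data that was actually new."""
    return fresh_mb / transferred_mb


def average_latency_minutes(poll_interval_minutes):
    """Mean delay before a new entry is seen, assuming entries are
    published uniformly at random within the polling interval."""
    return poll_interval_minutes / 2


current_yield = harvest_yield(2, 20)           # 2 MB fresh out of 20 MB pulled
hourly_latency = average_latency_minutes(60)   # polling once per hour
```

With the numbers from your message, this gives a yield of 0.1 and an average
latency of 30 minutes.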
        You've written on your blog that you want to see more "304"
responses. Well, I would suggest that what you *really* should want is more
"226" responses -- 226 is the success code for an RFC3229+feed GET
operation.
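        A client that supports both mechanisms ends up handling three
success paths. A minimal sketch, assuming the action names below are just
labels your own code would act on (they are not defined anywhere):

```python
from http import HTTPStatus


def plan_for_status(status):
    """Map an HTTP status code to what the aggregator should do with the body."""
    if status == HTTPStatus.NOT_MODIFIED:   # 304: nothing new, no body sent
        return "keep-cache"
    if status == HTTPStatus.IM_USED:        # 226: body holds only the new entries
        return "append-new-entries"
    if status == HTTPStatus.OK:             # 200: body is the whole feed again
        return "replace-cache"
    return "retry-later"                    # anything else: leave the cache alone
```

Only the 200 path forces you to download and parse the full feed.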

                bob wyman

[1] http://bobwyman.pubsub.com/main/2004/10/massive_bandwid.html
[2] http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html

==== Original Message ======
In my app I've implemented every trick in the book to try and reduce the 
amount of data that I have to pull through the radio and parse.  I use 
If-None-Match and If-Modified-Since headers in my requests, I support 
compression, I respect caching hints from the servers.  It doesn't help 
in all cases.  I have 112 feeds loaded up in my aggregator and only 74 of the 
servers hosting those feeds ever return a 304.  The rest give me a 200 
and gladly hand me everything regardless of whether it has changed or 
not.  17 of the servers don't bother supplying an ETag header.

My feed list amounts to about 20 MB of data per day when polling once 
per hour.  That is a lot of air time for a small radio, and a lot of time 
spent grinding in an XML parser for a small CPU.  This is especially 
upsetting because by my measurements only about 2 MB of data is fresh 
for any given day.  The main hit is in battery life - the above stats 
can trivially knock HOURS off of the life of a small battery.

I've written extensively about this problem here:
<http://www.desalvo.org/blog/?p=230>
with a real-world example studied here:
<http://www.desalvo.org/blog/?p=232>

So, I guess I'd like to see an optional update-frequency hint element.

Thanks,
Chris


