Chris DeSalvo wrote:
> As the author of an aggregator app for a portable wireless device I
> can tell you that this is a serious problem for this class of products.

You didn't list support for RFC3229+feed[1,2] as one of the things you are doing. This would help you drastically reduce the bandwidth needed when you find a feed that actually has new content. If you use RFC3229+feed to pull a feed, then you will only get the new entries in the feed -- not ones that you've copied over before. It's one step beyond If-None-Match, etc.

But the real problem with your approach is that you have apparently coded the device so that it goes out and polls large numbers of feeds. This doesn't make sense. For a portable wireless device with limited bandwidth and limited connectivity, you should be accessing feeds via an intermediary "proxy" that gathers up all your updates into a *single* feed. That feed should be served using RFC3229+feed to ensure that you only copy from it the entries updated since you last pulled from it. Of course, it would also make sense to support compression on the results. There is no more efficient mechanism for polling for feeds from the kind of device you describe.

You say that you're reading about 20MB per day but you're only able to harvest 2MB of "fresh" data from it? This 1/10 harvesting yield is actually pretty normal when polling RSS/Atom feeds served without RFC3229+feed. If you used RFC3229+feed, you would find that your yield would start to approach 100% rather than the 10% you are at now. Additionally, given the efficiencies here, you would be able to increase your polling frequency almost arbitrarily without significantly increasing the bandwidth consumption of your system. Thus, you could cut latency below the average of 30 minutes which is implied by polling once per hour.

You've written on your blog that you want to see more "304" responses. Well, I would suggest that what you *really* should want is more "226" responses -- 226 is the success code for an RFC3229+feed GET operation.
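
A rough sketch of the client side of such a request may make the 304-versus-226 distinction concrete. This is only an illustration in Python, not code from any particular aggregator: the function names build_delta_request_headers and classify_response are hypothetical, and sending the request is left to whatever HTTP client the device uses. The "A-IM: feed" request header, the If-None-Match validator, and the 226 (IM Used) status are the pieces described in RFC 3229 and in the RFC3229+feed write-ups [1,2].

```python
from typing import Dict, Optional


def build_delta_request_headers(last_etag: Optional[str]) -> Dict[str, str]:
    """Build headers for a poll that asks for only the new feed entries.

    Hypothetical helper: `last_etag` is the ETag saved from the previous
    successful poll of this feed, or None on the very first poll.
    """
    headers = {
        "A-IM": "feed",             # RFC 3229 request header: accept a "feed" delta
        "Accept-Encoding": "gzip",  # also ask for a compressed transfer
    }
    if last_etag is not None:
        # Tells the server which version of the feed we already hold.
        headers["If-None-Match"] = last_etag
    return headers


def classify_response(status: int) -> str:
    """Map the HTTP status of a feed poll to what the body contains."""
    if status == 226:
        return "delta"      # IM Used: only entries newer than our ETag
    if status == 304:
        return "unchanged"  # Not Modified: nothing to download or parse
    if status == 200:
        return "full"       # server ignored A-IM and sent the whole feed
    return "error"


if __name__ == "__main__":
    print(build_delta_request_headers('"abc123"'))
    print(classify_response(226))
```

A server that does not understand A-IM should simply ignore it and answer 200 with the full feed, so a client loses nothing by asking for the delta on every poll.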

bob wyman

[1] http://bobwyman.pubsub.com/main/2004/10/massive_bandwid.html
[2] http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html

==== Original Message ======

In my app I've implemented every trick in the book to try and reduce the amount of data that I have to pull through the radio and parse. I use If-None-Match and If-Modified-Since headers in my requests, I support compression, I respect caching hints from the servers.

It doesn't help in all cases. I have 112 feeds loaded up in my aggregator and only 74 of the servers hosting those feeds ever return a 304. The rest give me a 200 and gladly hand me everything regardless of whether it has changed or not. 17 of the servers don't bother supplying an ETag header.

My feed list amounts to about 20 MB of data per day when polling once per hour. That is a lot of air time for a small radio, and a lot of time spent grinding in an XML parser for a small CPU. This is especially upsetting because by my measurements only about 2 MB of data is fresh for any given day. The main hit is in battery life - the above stats can trivially knock HOURS off of the life of a small battery.

I've written extensively about this problem here:
<http://www.desalvo.org/blog/?p=230>
with a real-world example studied here:
<http://www.desalvo.org/blog/?p=232>

So, I guess I'd like to see an optional update-frequency hint element.

Thanks,
Chris