Barry Finkel wrote:
>I have a question about zipped list archives; the question arose from
>a subscriber to one of our lists. I am running Mailman 2.1.11 on
>Ubuntu from a package I built from the SourceForge source.
>
>mailman# pwd
>/var/lib/mailman/archives/private/LISTNAME
>mailman# ls -ald 2009-August*
>drwxrwsr-x 2 list list  4096 2009-08-31 11:34 2009-August
>-rw-rw-r-- 1 list list 91577 2009-08-31 11:34 2009-August.txt
>-rw-rw-r-- 1 list list 20708 2009-09-01 03:27 2009-August.txt.gz
>mailman#
>
>The .txt file looks fine, as does the .gz file.
>When I go to the list admin web interface and look at the archives,
>I see
>
>   August 2009: [Thread] [Subject] [Author] [Date] [GZip'd text 20KB]
>
>That value (20KB) seems to be correct. When I click on the "[Gzip...]"
>link, Firefox/Solaris gives me a text file, not a .gz file. Maybe
>Firefox knows how to unzip the file, as vim does. When I click on
>the same link using IE8/XP, IE8 sees the .gz suffix and asks me what
>to do with the file. I save it on my desktop, and when I look at the
>file, I see that it is a plain text file. It is not a gzip'd file.
>Why? Thanks.
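One way to confirm what a saved download really contains, regardless of its .gz suffix, is to look at its first two bytes: a gzip stream always starts with 0x1f 0x8b. Below is a minimal Python 3 sketch of that check. The helper name is_gzip and the demo file names are illustrative only and are not part of Mailman.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes.

    Hypothetical helper for illustration; it only inspects the header and
    does not verify that the rest of the file is a valid gzip stream.
    """
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    # Self-contained demo: one genuinely gzipped file, and one plain-text
    # file that merely carries a .gz suffix (like the file saved from IE8).
    with tempfile.TemporaryDirectory() as tmp:
        real_gz = os.path.join(tmp, "2009-August.txt.gz")
        fake_gz = os.path.join(tmp, "saved-from-browser.txt.gz")
        with gzip.open(real_gz, "wb") as fh:
            fh.write(b"From a list post\n")
        with open(fake_gz, "wb") as fh:
            fh.write(b"From a list post\n")
        print(real_gz, is_gzip(real_gz))
        print(fake_gz, is_gzip(fake_gz))
```

Run against the copy on the server and the copy saved from the browser, this should show whether the file changed in transit or was already plain text on disk.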
Your web server is converting the gzipped file and serving it as plain
text, but MSIE sees the .gz extension and thinks it can't display the
content.

However, I recommend you don't gzip the files at all. As you can see,
doing so doesn't save space; it requires more space because the .txt
files are kept even after gzipping. The old ones that will have no
more messages added can be removed, but you have to do that manually.

Keeping a gzipped file can save some bandwidth when accessing the file
on the web, but not if your web server converts and serves it as plain
text, which appears to be the case. Also, unless you set
GZIP_ARCHIVE_TXT_FILES = Yes in mm_cfg.py (don't do it; see below), the
current day's posts are not in the .txt.gz file until cron runs
Mailman's cron/nightly_gzip.

Thus, I recommend not gzipping the archive .txt files at all. I.e., do
not put GZIP_ARCHIVE_TXT_FILES = Yes in mm_cfg.py, and remove or
comment out the cron/nightly_gzip entry in Mailman's crontab.

This can be a bit tricky to do right because you have links on the
archive TOC page that point to the .txt.gz files, and if you just
comment out the cron/nightly_gzip entry, the current period's .txt.gz
file will quickly be out of date. You can remove all the .txt.gz
files, and the next archived post will rebuild the TOC with links to
the .txt files, but until that next post is archived, the archive TOC
will have links pointing to the removed .txt.gz files.

One way around this is just to run bin/arch --wipe on a list or lists.
This will remove all the list's .txt.gz files and build an archive TOC
with correct links to the .txt files. The .txt.gz files will only be
regenerated if cron/nightly_gzip is run.

The usual caveats about running bin/arch --wipe, especially on older
lists, apply.
Namely, it's a good idea to first check the
archives/private/LIST.mbox/LIST.mbox file with bin/cleanarch, and
there is a possibility that messages can get renumbered, which
invalidates externally saved links to existing messages.

Another way around it is to remove the .txt.gz files manually and then
run 'bin/arch LISTNAME /dev/null' to rebuild the archive TOC. Note no
--wipe option and no input redirection - just /dev/null as a filename
argument.

-- 
Mark Sapiro <m...@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-users/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9