Re: [Gossip] status report + look and feel questions
On Wed, 24 Nov 2004, Earl Hood wrote:

> > On my lists I still find that requiring posts to come from subscribed
> > addresses keeps virtually all spam from being distributed. I've had
> > very few if any instances of spammers subscribing to a list to spam it.
> > Does mail-archive.com archive lists to which anyone can post?
>
> List administration is handled by the list owners not mail-archive.com.
> Therefore, if the list owner allows anyone to post to the list, then
> the messages will get archived (unless mail-archive.com spam filters
> believe such messages are spam).

I would think that mail-archive is a useful enough service that it could set requirements for list owners, or at least strongly urge them to adopt reasonable policies. For example, surely you would not archive a list that encourages spam, would you? [I just realized I should look at your site's policies - later, I guess.] Enforcing such requirements would be difficult, but stating them would give you a tool to deal with the worst cases and might influence some list owners.

It would be pretty easy to test lists for subscriber-only posting: "This is a test message being sent from a special address or to all lists archived at..."

> > With my browser (Mozilla 1.4.1) the ads occasionally prevent the last few
> > characters of a message line from being displayed. Example, in:
> > http://www.mail-archive.com/mpls%40mnforum.org/msg32125.html
> > The end of the third line on my display reads
>
> What operating system? Message looks fine to me, but I'm using a
> later version of Mozilla.

Linux, Fedora Core 1, using Gnome

On Wed, 24 Nov 2004, Jeff Breidenbach wrote:

> Fred, thanks for the feedback. Keep it coming if you have more.

Sorry if I came across as overly critical; mail-archive is a big improvement for many lists.

> Does the problem fix itself if you make the browser wider? Can I get
> a screenshot? Send it to my gmail account please <[EMAIL PROTECTED]>
>

I'll try to get to this after Thanksgiving - gotta run now.

> Sorry - the name "gossip" was the first thing I thought of in
> 1998. I've changed the footer as you suggested.

Thanks. Someday I'll learn procmail so I can manipulate such things to my preferences...

Fred

--
Fred H. Olson  Minneapolis, MN 55411 USA (near north Mpls)
Communications for Justice - My new listserv org.  UU, Linux
My Link Page: http://fholson.cohousing.org  Ham radio: WB0YQM
fholson at cohousing.org  612-588-9532 (7am-10pm Central time)
___
Discussion list for The Mail Archive
[EMAIL PROTECTED]
http://jab.org/cgi-bin/mailman/listinfo/gossip
Re: [Gossip] status report + look and feel questions
Fred, thanks for the feedback. Keep it coming if you have more.

>Does mail-archive.com archive lists to which anyone can post?

Because the service is so big, we probably have the full range: lists without spam problems, lists with spam problems, and (unfortunately) lists that ARE spam problems. The main reason for basic spam filtering on the archival inbox is to avoid wasting resources archiving spam. That, and the fact that we detest spammers.

>the ads occasionally prevent the last few characters of a message

That may be something we can address. Does the problem fix itself if you make the browser wider? Can I get a screenshot? Send it to my gmail account please <[EMAIL PROTECTED]>

>I did see one ad for [junk]

Yeah, at least to my eye the Google ads seem to be declining in quality, with far too many being either irrelevant or sketchy. Since we need the ads to cover costs, I'm not really sure what to do about it. But I agree that it is a growing concern and I would like to find a way to address the problem.

>The list name link in the upper left corner of a message page and of
>index pages bring up an index page. Such a link on index pages is
>pretty useless [...]

Completely agree. We should either try to make that link more useful, or (perhaps more likely) unlinkify the listname on index pages.

>How about changing the tag to be something like "mailarc" and putting
>something in message footers like "Mail Archive talk"

Sorry - the name "gossip" was the first thing I thought of in 1998. I've changed the footer as you suggested. I'm leaving the tag as [gossip] to match the listname, but in the long, long term will think about whether or not to change the name of the list.
Re: [Gossip] status report + look and feel questions
On November 24, 2004 at 10:29, Fred H Olson wrote:

> On my lists I still find that requiring posts to come from subscribed
> addresses keeps virtually all spam from being distributed. I've had
> very few if any instances of spammers subscribing to a list to spam it.
> Does mail-archive.com archive lists to which anyone can post?

List administration is handled by the list owners, not mail-archive.com. Therefore, if the list owner allows anyone to post to the list, then the messages will get archived (unless mail-archive.com spam filters believe such messages are spam).

> As one last precaution I have new subscribers first messages moderated
> (sent to the reject page) so I'd catch a subscribed spammer's first
> message. This has the added advantage of catching some "please
> unsubscribe me" messages from people who never post anything else.

That may be a good thing for list administrators to do. Mail-archive.com does not perform any list administration functions.

> -- Advertising on mail-archive.com --
> Regrettable that you have to have it but it's more tolerable than yahoo's.
> With my browser (Mozilla 1.4.1) the ads occasionally prevent the last few
> characters of a message line from being displayed. Example, in:
> http://www.mail-archive.com/mpls%40mnforum.org/msg32125.html
> The end of the third line on my display reads

What operating system? Message looks fine to me, but I'm using a later version of Mozilla.

> The list name link in the upper left corner of a message page and of index
> pages bring up an index page. Such a link on index pages is pretty
> useless, it would be much better to link to the lists "info page" (I think
> all lists should and most do have these) which in turn has description of
> list, subscription info etc. Are there links somewhere to contact info
> for archived lists?

Mail-archive.com is as automated as possible, including the detection of new lists to archive. This helps keep operational costs down.

Right now, there are no facilities for list administrators to register list info, and such capabilities would require human-based review for content. I believe the folks at mail-archive.com have considered additional features similar to this, but such things will probably not get added unless they can be automated and done in a secure fashion.

--ewh
___
Gossip mailing list
[EMAIL PROTECTED]
http://jab.org/cgi-bin/mailman/listinfo/gossip
Re: [Gossip] status report + look and feel questions
On Mon, 22 Nov 2004, Jeff Breidenbach wrote wrt http://www.mail-archive.com/ :

> Recent changes have been mostly behind the scenes. Here's some of
> the highlights that haven't been mentioned yet on gossip:
>
> a) Improved spam filtering on the archives. Unfortunately there's
> so much junk flying around the internet that we had to get
> a little more serious at filtering the archival inbox.

On my lists I still find that requiring posts to come from subscribed addresses keeps virtually all spam from being distributed. I've had very few if any instances of spammers subscribing to a list to spam it. Does mail-archive.com archive lists to which anyone can post?

The ISP that hosts my lists has filtering (greylisting) that keeps most spam sent to my big list ** from getting through to where I have to look at it on the reject page (where messages from non-subscribed addresses go).

As one last precaution I have new subscribers' first messages moderated (sent to the reject page) so I'd catch a subscribed spammer's first message. This has the added advantage of catching some "please unsubscribe me" messages from people who never post anything else.

** http://www.cohousing.org/cohousing-L/ (~150 msg/month; ~500 subs - admittedly higher than average on civility scale due to topic)

-- Advertising on mail-archive.com --

Regrettable that you have to have it but it's more tolerable than yahoo's.

With my browser (Mozilla 1.4.1) the ads occasionally prevent the last few characters of a message line from being displayed. Example, in:
http://www.mail-archive.com/mpls%40mnforum.org/msg32125.html
The end of the third line on my display reads

Gurban to be described as "not as qualifie

Curiously when I copied and pasted the line into this message the d" were there...

I did see one ad for what appeared to be lists of "b2b" addresses to be solicited (spammed) - sorry I did not keep track of specifics.

-- misc --

The list name link in the upper left corner of a message page and of index pages brings up an index page. Such a link on index pages is pretty useless; it would be much better to link to the list's "info page" (I think all lists should and most do have these) which in turn has a description of the list, subscription info etc. Are there links somewhere to contact info for archived lists?

Lastly, why does this list have the meaningless name and subject line tag "gossip"? How about changing the tag to be something like "mailarc" and putting something in message footers like "Mail Archive talk"

Fred

--
Fred H. Olson  Minneapolis, MN 55411 USA (near north Mpls)
Communications for Justice - My new listserv org.  UU, Linux
My Link Page: http://fholson.cohousing.org  Ham radio: WB0YQM
fholson at cohousing.org  612-588-9532 (7am-10pm Central time)
___
Gossip mailing list
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip
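The posting policy Fred describes (subscriber-only posting, plus holding each new subscriber's first message for review) can be sketched roughly as below. This is only an illustration in Python; the class name, method names, and outcome labels are hypothetical and are not taken from any list server mentioned in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class ListPolicy:
    """Hypothetical model of the policy described in the message above."""
    subscribers: set = field(default_factory=set)
    approved_posters: set = field(default_factory=set)

    def route(self, sender: str) -> str:
        """Decide what happens to a message from `sender`."""
        addr = sender.strip().lower()
        if addr not in self.subscribers:
            # Non-subscribed addresses go to the reject page for review.
            return "hold_nonsubscriber"
        if addr not in self.approved_posters:
            # First message from a new subscriber is moderated.
            return "hold_first_post"
        return "distribute"

    def approve(self, sender: str) -> None:
        """Record that a moderator accepted this subscriber's first post."""
        self.approved_posters.add(sender.strip().lower())
```

Once a moderator calls `approve`, later messages from that address are distributed without review, which matches the "one last precaution" framing: only the first post costs moderator time.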
[Gossip] status report + look and feel questions
Recent changes have been mostly behind the scenes. Here's some of the highlights that haven't been mentioned yet on gossip:

a) Improved spam filtering on the archives. Unfortunately there's so much junk flying around the internet that we had to get a little more serious at filtering the archival inbox.

b) YahooGroups lists are no longer banned. This is in part because we now have better processing capacity, and in part because I expect fewer YahooGroup related headaches.

c) Improved network infrastructure. This is mostly behind the scenes in terms of number of redundant mail servers, management of dns servers, network monitoring, backups, etc. You probably won't notice anything but it makes Jeff and my lives a little easier.

Also, now that people have had a chance to play with the new look and feel for a few weeks, I'd like to get some comments.

- Do the pages load fast enough for everyone?
- Anyone unhappy due to browser compatibility problems?
- Is anyone actually honest-to-god using the date navigation links at the bottom of message pages now that we have keyboard shortcuts? Speak up or my minimalist sensibilities will take over and I'll get rid of them.
- Does anyone prefer the old layout?

Cheers,
Jeff
[Gossip] status
Archiving system is still offline, and the queue size is 815 MB at the moment. Made some progress this weekend in behind the scenes data copying. When we are up and processing (this coming week sometime!?!) it will be out-of-order: newest first, while feeding from the backlog slowly.

-Jeff
[Gossip] status, right now
Ok -- Zamboni (the old server) will never update again. Poet (the new server) is processing current mail and also getting some older archive messages recopied into it as we speak. You'll know things are done when http://216.218.240.194 and http://www.mail-archive.com resolve to the same machine.

Poet's mail exchanger (MX) has stronger spam filtering and should become active as DNS changes propagate, so let me know if any of you experience unusual bounces.

Cheers,
Jeff
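The "resolve to the same machine" check mentioned above could be scripted along these lines. This is a sketch only: the function name is hypothetical, it looks at IPv4 addresses only, and the result depends entirely on whatever DNS returns at the time it is run.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if expected_ip is among the IPv4 addresses for hostname.

    `resolver` defaults to the standard library lookup; it is a parameter
    so the function can be exercised without touching the network.
    """
    _canonical, _aliases, addresses = resolver(hostname)
    return expected_ip in addresses


if __name__ == "__main__":
    # Live lookup; the answer reflects current DNS, not the 2002-era setup.
    print(resolves_to("www.mail-archive.com", "216.218.240.194"))
```

Because DNS changes propagate at different speeds to different resolvers, a single True or False from one machine says little about what other users see.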
Re: [Gossip] status, mail-archive
Jeff Breidenbach wrote:

> 5) Regarding the common "phrase search" feature request, it looks
> like htdig 3.2 is nowhere near ready to go, so that's not
> happening any time soon.

Again, what about giving ASPseek a try? I'm one of the developers ;)

--
[EMAIL PROTECTED] ICQ7551596 [EMAIL PROTECTED]
-- Guinness a Day Keeps a Doctor Away (people's wisdom)
[Gossip] status, mail-archive
Lots of news on the next generation system:

1) Configuration work {exim, htdig, analog, bind, mailman ...} is done. System is running great in shadow mode. Many thanks to my padawan.

2) TODO: insertion into final network and switchover.

3) HTML page count is about 10 million; I'm consulting with the reiserfs team on how to get really fast file counts. The 10 million number is from precise but slow measurement; currently estimating deltas based on df.

4) Exim will use sender_verify_hosts_callback on switchover. Rumored to be a good spam shield, but we may have some false positives. We'll see.

5) Regarding the common "phrase search" feature request, it looks like htdig 3.2 is nowhere near ready to go, so that's not happening any time soon.

6) Regarding the common "address hide" feature request, I've bumped up the obfuscation slightly by custom hacking MHonArc. Do people prefer that I mangle email addresses in message bodies to something completely indecipherable like GeoCrawler's?

Cheers,
Jeff
[Gossip] status
Hi all,

Next generation hardware finally arrived, and is looking very good. So far, I'm very happy with the vendor, who seems to have done a great job testing. Data transfer will begin shortly; not sure how long it will take. Expect periodic queuing of mail during this time (for example, mail-archive will be in queuing mode for all of tonight).

Cheers,
Jeff
Re: [Gossip] status
hi jeff

im the list manager for [EMAIL PROTECTED] and [EMAIL PROTECTED]

we've moved the archives to our own site, and no longer need them in the mail-archive. ive unsubbed archive@jab and you may delete the folders at your leisure from your site if you wish.

thanks for the service. its a pleasure to get something for nothing in this overly capitalistic world.

--
Shawn Bega
http://www.begaservices.com
http://www.dccourier.com
please ride safely
[Gossip] status
Hi all,

I am now back from a long vacation (which included chatting with riot police in Salt Lake City a few days ago!) and have tried to work my way through various mail-archive questions, comments, etc. I think I've caught up at this point, so if some issue is not taken care of, bring it up again.

On the front burner:

* Gossip subscription needs to be fixed. No idea what is wrong or how to proceed, but this is problematic.

* I'm increasingly getting (a) spam (b) requests for spamblocks. For example, I think my recent signal to spam ratio in personal mail was 6:80 when I got back. Thus I'm finally going to admit defeat and do address obfuscation. Surprisingly, apache doesn't seem to have an obfuscate-address module, so I'm probably going to put the obfuscator in MHonArc and permanently burn the obfuscation into the HTML.

Anyway, that's the news.

-Jeff
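As a rough idea of what "burning the obfuscation into the HTML" might involve, here is a minimal Python sketch that replaces anything resembling an email address with a fixed placeholder. It is not the MHonArc change Jeff describes; the regular expression is a deliberately loose assumption and will not match every valid address.

```python
import re

# Loose, illustrative pattern: local part, "@", then a dotted domain.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def obfuscate(text, placeholder="[EMAIL PROTECTED]"):
    """Replace every address-like substring in text with placeholder."""
    return ADDRESS.sub(placeholder, text)
```

Replacing the address outright is irreversible, which is the point of doing it at archive time: a crawler reading the stored page has nothing to harvest, but neither does a human reader who wanted to reply off-list.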
[Gossip] Status of archives?
In checking this morning on a mail list i help administer which is archived at mail-archive, i noticed a slew of problems. am wondering what's up with the status of the archive's overall health. in particular, things i noticed:

1.) no new gossip messages since 10/29. perhaps this is a combo of delay in archiving plus low posting rate.

2.) when i search the archive i administer
http://www.mail-archive.com/marxism%40lists.panix.com/
the main index shows no new posts since Oct 31. Is the archiving now delayed by 5 days?

3.) When i look at the date index for the archive:
http://www.mail-archive.com/marxism%40lists.panix.com/maillist.html
the output page seems very broken. the last post shown is 10/17, and after the index reaches back to 09/30, the listing repeats (many) multiple times.

4.) When i search the archive for marxism list using my name as search keys, i get only about 24 posts, when i know there are many more than that in the archives. when i earlier checked the same search i got 19 posts, so the number changed within a half hour of checking. when i searched using the name of the list moderator, i found only about 34 posts, and he truly should have hundreds of posts in the archive.

so something seems amiss.

thanks

les schaffer
[Gossip] status: users, attachments, bounces, spam, disk
Users
=====
I'm very pleased to see organizations like The Apache Group using mail-archive to help further their projects. It's an honor.

Attachments
===========
With MHonArc 2.4.8 + http://www.oac.uci.edu/indiv/ehood/tmp/mhnull.pl I still am not successfully blocking .jpg attachments. Not sure if I made a mistake or if something isn't working. Configuration follows.

  image/jpeg;m2h_null::filter;mhnull.pl
  image/pjpeg;m2h_null::filter;mhnull.pl

  text/plain; maxwidth=87 asis=windows-1252:iso-8859-15
  m2h_external::filter; excludeexts="src,vbs,jpg,JPG,jpeg,JPEG" usename useicon subdir iconurl="../attachment.gif"

Bounces
=======
I'm bouncing (not dropping) all incoming messages over 100KB. This was a conscious decision to actively discourage large messages and the lists that carry them. Working very well.

Spam
====
My personal inbox spam continues to increase. The telltale addresses (including [EMAIL PROTECTED]) don't seem to get corresponding spams. So it doesn't _appear_ to be from spambots crawling mail-archive, but I'm keeping wary. We're a pretty sizable target.

Disk
====
Disk continues to fill despite software tweaks. I've asked VA about the possibility of loaning or donating a large disk array. If that doesn't pan out, I will approach other storage vendors as well. I think there is a reasonable chance a company might be willing to donate excess inventory, especially if they can get a tax writeoff.

By the way, the message count on the homepage is bogus - it's just a linear estimator tied to df, and doesn't take into account any of the space efficiency tweaks of the last year or so. I guesstimate mail-archive holds somewhere on the order of 10+ million emails right now. I don't have a computationally cheap way to get a real file count.

jeff@zamboni:~$ df
Filesystem      1k-blocks       Used  Available  Use%  Mounted on
/dev/rd/c0d0p3     482093     196255     260938   43%  /
/dev/rd/c0d0p5     964476     496200     419280   54%  /usr
/dev/rd/c0d0p6     964476     667964     247516   73%  /var
/dev/rd/c0d0p1      23300       3715      18382   17%  /boot
/dev/rd/c0d0p7  189746780  170726132    9382052   95%  /data
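A "linear estimator tied to df" of the kind mentioned above might look something like the following Python sketch. The function names and the kilobytes-per-message figure are hypothetical; the message does not say what constant the homepage counter actually used, and as Jeff notes, any such constant drifts once storage efficiency changes.

```python
def used_kblocks(df_output, mount_point):
    """Extract the 'Used' column (in 1k-blocks) for mount_point from df text."""
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[5] == mount_point:
            return int(fields[2])
    raise KeyError(mount_point)


def estimate_messages(used_kb, kb_per_message):
    """Linear estimate: used space divided by an assumed average message size."""
    if kb_per_message <= 0:
        raise ValueError("kb_per_message must be positive")
    return used_kb // kb_per_message


SAMPLE = """Filesystem 1k-blocks Used Available Use% Mounted on
/dev/rd/c0d0p1 23300 3715 18382 17% /boot
/dev/rd/c0d0p7 189746780 170726132 9382052 95% /data
"""

if __name__ == "__main__":
    used = used_kblocks(SAMPLE, "/data")
    # 16 KB per message is an illustrative guess, not a measured value.
    print(estimate_messages(used, kb_per_message=16))
```

The estimate is cheap because it never walks the directory tree, which is exactly why it goes stale: it measures bytes, not files.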
Re: [Gossip] status, mail-archive
>This is then put in a tag at the top of the page like this

Thanks, went ahead and implemented this solution. All non-iso-8859-1 localizations will now have a tag denoting the character set. Polish is the first such localization I've encountered. :)

Ok, now I'm going to grumble a bit:

* it's the 21st century and we still don't have unicode everywhere
* it's the 21st century and we still don't have email headers specifying language, making automatic localization of email archives highly improbable.

On the bright side, the two extra localizations are now live. German and Polish users can now submit names of lists that should be localized.

As for the monthly granularity with MHonArc:

Pro:
  Immediate speedup of MHonArc
  Already implemented on per-list basis (see http://www.mail-archive.com/linux-kernel@vger.rutgers.edu)
  Secondary benefits from increased disk caching due to lower memory use by MHonArc

Con:
  Some contortion required if I don't want to break existing URLs
  Adds human interface complexity
  Adds program level complexity
  Lose any lobbying power I might have had towards increasing MHonArc scalability through windowing
  It feels kludgy and requires me to reverse years of stubbornness. :)

The other obvious software wins include ignoring "cold" lists better, and switching to ReiserFS (although maybe that's not so easy and I should wait for either the next disk upgrade, and/or perhaps a 2.4.x kernel that incorporates it natively).

Jeff

$ uptime
7:57pm up 224 days, 3:03, 2 users, load average: 1.22, 1.26, 1.27
Re: [Gossip] status, mail-archive
On October 29, 2000 at 12:28, Jeff Breidenbach wrote:

>Bottom line is I need to put another round of attention into
>software efficiency, relatively soon.

If you break up a list into a set of archives (broken down by month), efficiency should no longer be a problem. I know you have had some reluctance about this, mainly for low traffic lists, but I believe in general it will be the best approach.

An idea is to do the regular monthly-based archives for all lists, but have a complementary "latest messages" archive for the last X number of messages. This way the monthly archives serve a true "archiving" function, while the latest messages archive is geared to more current discussions and works like a newsgroup where older messages eventually expire (but are still present in the monthly archives).

--ewh
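The two-archive idea above (permanent monthly buckets plus a rolling "latest messages" window) can be illustrated with a short Python sketch. It is not how MHonArc organizes archives; the function name, the tuple layout, and the window size are assumptions made for the example.

```python
from collections import defaultdict, deque
from datetime import datetime


def build_archives(messages, latest_size=3):
    """Split (sent_datetime, subject) pairs into monthly and latest archives.

    Returns (monthly, latest): monthly maps "YYYY-MM" to subjects in date
    order; latest holds only the newest `latest_size` subjects.
    """
    monthly = defaultdict(list)
    latest = deque(maxlen=latest_size)  # old entries fall off automatically
    for sent, subject in sorted(messages):
        monthly[sent.strftime("%Y-%m")].append(subject)
        latest.append(subject)
    return dict(monthly), list(latest)
```

A message that has expired from the latest window is still present in its monthly bucket, so each update only has to touch the current month and the small window rather than the whole list history.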
Re: [Gossip] status, mail-archive
On Sun, 29 Oct 2000, Jeff Breidenbach wrote:

>I'm a bit concerned with the character sets -- the provided
>translations don't use HTML escape characters, like &eacute.
>The polish translation appears to use the iso-8859-2 character set,
>while the German translation seems to be in straight ASCII --
>I wonder if that is actually ok? I'd prefer using HTML escape
>characters, but I'm not sure I have the ability / knowledge to go
>ahead and put them into the translations. Any help is appreciated.

Nit pick: &eacute; not &eacute -- this is a common HTML bug. Unfortunately, MSIE renders the latter as an e-acute too, so people who only test their pages on MSIE never realise that it will render in an unintended way on any conformant browser.

Anyway, Polish can't use the &; things because there isn't any &lslashed; etc. So what I do with analog is have one extra field at the top of the language file declaring the character set. This is then put in a tag at the top of the page like this:

==
Statystyki WWW
Statystyki WWW
Program uruchomiony: Pon, 30 Nie 2000 13:54.
...
==

You might find this idea useful.

--
Stephen Turner http://www.statslab.cam.ac.uk/~sret1/
Statistical Laboratory, Wilberforce Road, Cambridge, CB3 0WB, England
"The new operating system will recover more easily from system crashes." (Microsoft, aiming high with Windows Millennium)
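For illustration, the two options in play here (declaring the page's character set, or escaping non-ASCII characters) can be sketched in Python as below. Both helper names are hypothetical. The exact tag analog emits is not shown in the message, so the meta element here is only the common HTML form of a charset declaration, and numeric character references are an alternative the message itself does not propose.

```python
def charset_meta(charset):
    """Build a typical HTML charset declaration for the page head (assumed form)."""
    return f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'


def to_char_refs(text):
    """Replace each non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Numeric references cover characters that have no named entity, which is the gap Stephen points out for Polish, at the cost of making the translation source much harder to read and edit by hand.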
[Gossip] status, mail-archive
Hi all,

Here's what's up with mail-archive.

1) I've received localizations for German and Polish and will put them into the next point release, maybe tomorrow. I'm a bit concerned with the character sets -- the provided translations don't use HTML escape characters, like &eacute. The Polish translation appears to use the iso-8859-2 character set, while the German translation seems to be in straight ASCII -- I wonder if that is actually ok? I'd prefer using HTML escape characters, but I'm not sure I have the ability / knowledge to go ahead and put them into the translations. Any help is appreciated.

2) We're getting too big again. How do I know? People are complaining about archiving latency. I also learned that grep (which is used inside the guts, specifically grep -F) doesn't like receiving more than 5000 arguments. Note that mail-archive very recently exceeded 5000 lists :) Bottom line is I need to put another round of attention into software efficiency, relatively soon.

Jeff
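One common way around a limit on command-line arguments is to stop passing each list name as an argument and instead read the fixed strings from a file or an in-memory collection; grep itself supports this with -f FILE alongside -F. The Python sketch below shows the same fixed-string matching done in-process. It is not taken from the mail-archive code, and the function name is hypothetical.

```python
def matching_lines(lines, patterns):
    """Yield lines containing at least one fixed-string pattern.

    Mirrors the spirit of `grep -F -f patterns_file`: patterns are plain
    substrings, not regular expressions, and there is no argument-count limit.
    """
    wanted = [p.strip() for p in patterns if p.strip()]
    for line in lines:
        if any(p in line for p in wanted):
            yield line
```

With thousands of patterns this naive scan does one substring test per pattern per line; if the patterns are always whole addresses, extracting the address from each line and checking membership in a set would scale better.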