> From: Jesse Pelton [mailto:[email protected]]
> Sent: Thursday, February 04, 2010 9:29 AM
> To: Ocean; [email protected]
> Subject: RE: [Spambayes] Problems with classifying as spam
>
> SpamBayes doesn't follow links (see
> http://spambayes.sourceforge.net/faq.html#will-show-spam-clues-notify-a-spammer-that-i-opened-their-message
> for a tangentially related discussion), but it does process message headers.
> Lots of good information in there that you might think came from a Web site.
>
> Unless you're willing to dive into the code and the math, I'd caution
> against trying to second-guess SpamBayes. You're going to want it to
> behave rationally, and it doesn't (at least at the level you're looking
> at); it behaves statistically. That's why the FAQ
> (http://spambayes.svn.sourceforge.net/viewvc/spambayes/trunk/spambayes/Outlook2000/docs/troubleshooting.html#Messages_have_incorrect_or_unexpected)
> suggests sending all the Spam clues to the list when trying to understand
> why a given message isn't classified as expected.
>

Okay, then here's the nitty gritty:

------------------------------
Subject: ***Discount_Viagra_VXPL_Percocet*_Adderall****

Body:
<URL Link>***Discount_Viagra_VXPL_Percocet*_Adderall****!
<Links to:> http://kashertqdum17.com/
------------------------------

As I said before, that's all that's in the message, nothing else. And here
are the spam clues:

--------------------------------------------
Combined Score: 0% (3.88578e-16)
Internal ham score (*H*): 1
Internal spam score (*S*): 7.77156e-16

# ham trained on: 11546
# spam trained on: 2608

The last time this message was classified or trained:
This message had not been filtered.
This message had not been trained.

150 Significant Tokens
token spamprob #ham #spam
'subject:****' 0.000567537 396 0
'submitting' 0.00233281 96 0
'default' 0.00251537 89 0
'lab' 0.00286807 78 0
'(from' 0.00360288 62 0
'skip:* 40' 0.00391645 57 0
'9:00' 0.0042898 52 0
'to:addr:ocean' 0.00511949 904 1
'listings' 0.00542823 41 0
'report:' 0.00542823 41 0
'engineers' 0.00819672 27 0
'lately,' 0.00920245 24 0
'daniel' 0.00964836 3204 7
'flaw' 0.0104895 21 0
'geek' 0.0104895 21 0
'liquid' 0.0104895 21 0
'detect' 0.0115681 19 0
'tech.' 0.0115681 19 0
'viruses' 0.0115681 19 0
'binary' 0.0121951 18 0
'textbook' 0.012894 17 0
'finalizing' 0.0136778 16 0
'greg' 0.0180723 12 0
'yahoo' 0.0180723 12 0
'providers.' 0.0196507 11 0
'adults' 0.0238095 9 0
"amazon's" 0.0238095 9 0
'2012' 0.0266272 8 0
'music.' 0.0266272 8 0
'textbooks' 0.0266272 8 0
'11:33' 0.0302013 7 0
'1:01' 0.0302013 7 0
'9:53' 0.0302013 7 0
'buys' 0.0302013 7 0
'suggests' 0.0302013 7 0
'html' 0.0343728 255 2
'2:05' 0.0348837 6 0
'compression' 0.0348837 6 0
'formats' 0.0348837 6 0
'teenagers' 0.0348837 6 0
'adapt' 0.0412844 5 0
"apple's" 0.0412844 5 0
'h.264' 0.0412844 5 0
'kingdom,' 0.0412844 5 0
'times)' 0.0412844 5 0
'page.' 0.0444174 481 5
'hours,' 0.0480985 92 1
'to:addr:cobaltnight.com' 0.0496998 1274 15
'11:37' 0.0505618 4 0
'explorer' 0.0507201 87 1
'indicated' 0.0512533 168 2
'amazon' 0.0524349 84 1
'tech' 0.0543494 389 5
'sign' 0.0553702 608 8
'payment' 0.0573002 1169 16
'1:12' 0.0652174 3 0
'globally.' 0.0652174 3 0
'u.s.,' 0.0652174 3 0
'interface' 0.0657778 66 1
'2009' 0.0678561 672 11
'center' 0.0698918 651 11
'electronic' 0.0758594 542 10
"aren't" 0.0871721 95 2
'microscopic' 0.0918367 2 0
'start-up' 0.0918367 2 0
'surveillance' 0.0918367 2 0
'teens,' 0.0918367 2 0
'turf' 0.0918367 2 0
'sales' 0.0930142 736 17
'still' 0.0972529 1728 42
'relevant' 0.0974632 84 2
'charge' 0.0981746 368 9
'test' 0.0985371 407 10
'from:no real name:2**0' 0.104008 2519 66
'information' 0.110498 3031 85
"that's" 0.112004 809 23
'requests' 0.11383 174 5
'100' 0.11434 276 8
'post' 0.11434 276 8
'support' 0.114415 1201 35
"doesn't" 0.116645 471 14
'data' 0.11763 699 21
'pst' 0.119211 67 2
'be?' 0.120093 34 1
'touch' 0.12132 226 7
'service' 0.121935 2010 63
'deal.' 0.126626 32 1
'job' 0.130755 178 6
'actual' 0.13257 349 12
'web' 0.133212 952 33
'tom' 0.136878 113 4
'needs' 0.137534 390 14
'team' 0.140032 681 25
'rather' 0.141804 296 11
'leaving' 0.150599 151 6
'licenses' 0.151311 26 1
'working' 0.152874 639 26
'wednesday' 0.154298 171 7
'"protected' 0.155172 1 0
'787' 0.155172 1 0
'armstrong' 0.155172 1 0
'blogging' 0.155172 1 0
'declines' 0.155172 1 0
'ina' 0.155172 1 0
'mode"' 0.155172 1 0
'nsa' 0.155172 1 0
'pew' 0.155172 1 0
'reprieve' 0.155172 1 0
'rumor' 0.155172 1 0
'another' 0.155232 748 31
'relatively' 0.158807 48 2
'technology' 0.161453 277 12
'going' 0.163466 1519 67
'shows' 0.165458 269 12
'agreement' 0.165465 202 9
'ago' 0.166308 112 5
'using' 0.170597 1357 63
'provides' 0.178445 164 8
'adding' 0.179077 143 7
'running' 0.179577 365 18
'digital' 0.180445 202 10
'helped' 0.181042 61 3
'onto' 0.182937 80 4
'survey' 0.186745 78 4
'ever.' 0.187926 20 1
'touch.' 0.187926 20 1
'software' 0.191964 504 27
'skip:n 10' 0.19449 1321 72
'decides' 0.195819 19 1
'group' 0.197292 271 15
'sure,' 0.199182 72 4
'improving' 0.200943 36 2
'has' 0.202357 3997 229
'research' 0.20242 245 14
'figures' 0.796501 12 11
'crowd' 0.828161 5 6
'war' 0.829187 15 17
'police' 0.834856 17 20
'briefed' 0.844828 0 1
'chambers,' 0.844828 0 1
'investigates' 0.844828 0 1
'kindle' 0.844828 0 1
'nash' 0.844828 0 1
'upfront' 0.844828 0 1
'boeing' 0.84654 1 2
'ward' 0.84654 1 2
'from:addr:message.myspace.com' 0.908163 0 2
'message-id:@message.myspace.com' 0.908163 0 2
'chemists' 0.934783 0 3
'reportedly' 0.949438 0 4

Message Stream
Received: from vlan195-30.azeronline.com ([88.151.195.30]) by mail.cobaltnight.com (CW Mail Server) with SMTP id MPR19711 for <[email protected]>; Thu, 04 Feb 2010 08:39:11 -0500
Received: from localhost (127.0.0.1) by vlan195-30.azeronline.com (88.151.195.30) with Microsoft SMTP Server id 8.0.685.24; Thu, 4 Feb 2010 17:38:57 +0400
From: < [email protected]>
To: [email protected]
Subject: ***Discount_Viagra_VXPL_Percocet*_Adderall****
Date: Thu, 4 Feb 2010 17:38:57 +0400
MIME-Version: 1.0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
Message-ID: <[email protected]>

***Discount_Viagra_VXPL_Percocet*_Adderall****! <http://kashertqdum17.com>
<a href="http://kashertqdum17.com">***Discount_Viagra_VXPL_Percocet*_Adderall** **!</a>
<style>
Web video gets H.264 royalty reprieve
14 hours, 21 minutes ago
The group that licenses the widely used H.264 video compression technology decides against adding a Web-streaming royalty charge that could have helped rival formats such as Ogg Theora.
Read full story

HTML vs. Flash: Can a turf war be avoided?
February 3, 2010 9:00 AM PST
After Apple said its iPad doesn't support Flash, Adobe sticks up for its tech. The anti-Flash crowd has increasing clout--but still plenty of chaos, too.
Read full story

Report: Google, NSA talk defense partnership
14 hours, 44 minutes ago
The electronic surveillance organization is finalizing an agreement to help the search giant ward off cyberattacks like the ones that originated in China, according to a Washington Post report.
(Posted in Security by Steven Musil)

Amazon said to buy touch start-up
23 hours, 18 minutes ago
The company has acquired Touchco, a New York-based start-up, according to a person briefed on the deal.
(From The New York Times)

Police survey provides glimpse of Net-surveillance figures
16 hours, 52 minutes ago
A relatively small group of 100 police working on Internet investigations reports submitting as many as 22,800 legal requests for information a year to Internet providers.
. Police want backdoor to Web users' private data
(Posted in Politics and Law by Declan McCullagh)

Spotify has much to do before U.S. launch
17 hours, 29 minutes ago
Lately, Spotify is a rumor magnet, and one going around suggests the company is for sale. One thing is for sure, before it rolls out in U.S., service needs to license some music.
(Posted in Media Maverick by Greg Sandoval)

Monster buys Yahoo's HotJobs for $225 million
23 hours, 38 minutes ago
Another divestiture for Yahoo will result in an upfront cash payment and three-year exclusive deal with Monster to supply job search listings on Yahoo's home page.
(Posted in Relevant Results by Tom Krazit)

Microsoft investigates new Internet Explorer flaw
23 hours, 31 minutes ago
Software maker says flaw could affect those running Windows XP who aren't using a "protected mode" that's turned on by default in Windows Vista and Windows 7.
(Posted in Beyond Binary by Ina Fried)

Cisco results signal economic recovery underway
February 3, 2010 2:05 PM PST
John Chambers, CEO for the networking giant, said he saw positive results across the board and indicated the economic environment is improving globally.
(Posted in Signal Strength by Marguerite Reardon)

New virus-detecting lab on a chip gets even better
21 hours, 37 minutes ago
A team of BYU engineers and chemists figures out how to interface macro levels of liquid onto their microscopic silicon chip to detect viruses at even very low concentrations.
(Posted in Health Tech by Elizabeth Armstrong Moore)

Boeing unveils 787 Dreamliner interior
February 3, 2010 1:12 PM PST
The aviation giant on Wednesday released a picture of an actual 787 interior, rather than a mock-up. The picture is of the third test flight 787, known as ZA003.
(Posted in Geek Gestalt by Daniel Terdiman)

Windows veteran Mike Nash leaving Microsoft
February 3, 2010 11:33 AM PST
After nearly two decades with the software maker, executive is reportedly leaving to take a role on Amazon's Kindle team.
(Posted in Beyond Binary by Ina Fried)

Textbook publishers heading to iPad
February 3, 2010 11:37 AM PST
Major publishers sign deal with ScrollMotion to adapt their textbooks and study guides for Apple's iPad, iPhone, and iPod Touch.
. Analyst: Apple will sell 8 million iPads by 2012
(Posted in Apple by Lance Whitney)

Global video game sales down 7 percent in 2009
February 3, 2010 9:53 AM PST
According to a report on software sales in the U.S., Japan, and the United Kingdom, 2009 was a poor year. But how could it not be? The year before was the best ever.
(Posted in Geek Gestalt by Daniel Terdiman)

Blogging declines among teens, young adults
February 3, 2010 1:01 PM PST
A recent survey from the Pew Research Center shows that only 14 percent of teenagers are bloggers, down from 28 percent in 2006.
(Posted in Digital Media by Dong Ngo)

All Message Tokens
401 unique tokens
'"protected' '$225' '(from' '(posted' '100' '11:33' '11:37' '1:01' '1:12' '2006.' '2009' '2010' '2012' '22,800' '2:05' '787' '787,' '9:00' '9:53' 'according' 'acquired' 'across' 'actual' 'adapt' 'adding' 'adobe' 'adults' 'affect' 'after' 'against' 'ago' 'agreement' 'amazon' "amazon's" 'among' 'analyst:' 'and' 'another' 'anti-flash' 'apple' "apple's" 'are' "aren't" 'armstrong' 'around' 'aviation' 'avoided?' 'backdoor' 'be?' 'before' 'best' 'better' 'beyond' 'binary' 'bloggers,' 'blogging' 'board' 'boeing' 'briefed' 'but' 'buy' 'buys' 'byu' 'can' 'cash' 'cc:none' 'center' 'ceo' 'chambers,' 'chaos,' 'charge' 'chemists' 'china,' 'chip' 'cisco' 'clout--but' 'company' 'compression' 'content-type:text/plain' 'could' 'crowd' 'cyberattacks' 'daniel' 'data' 'deal' 'deal.' 'decades' 'decides' 'declan' 'declines' 'default' 'defense' 'detect' 'digital' 'divestiture' "doesn't" 'dong' 'down' 'dreamliner' 'economic' 'electronic' 'elizabeth' 'engineers' 'environment' 'even' 'ever.' 'exclusive' 'executive' 'explorer' 'february' 'figures' 'finalizing' 'flash,' 'flash:' 'flaw' 'flight' 'for' 'formats' 'fried)' 'from' 'from:addr:message.myspace.com' 'from:addr:noreply' 'from:no real name:2**0' 'full' 'game' 'geek' 'gestalt' 'gets' 'giant' 'giant,' 'glimpse' 'global' 'globally.'
'going' 'google,' 'greg' 'group' 'guides' 'h.264' 'has' 'have' 'header:Date:1' 'header:From:1' 'header:MIME-Version:1' 'header:Message-ID:1' 'header:Received:2' 'header:Subject:1' 'header:To:1' 'heading' 'health' 'help' 'helped' 'home' 'hotjobs' 'hours,' 'how' 'html' 'improving' 'ina' 'increasing' 'indicated' 'information' 'interface' 'interior' 'interior,' 'internet' 'investigates' 'ipad' 'ipad,' 'ipads' 'iphone,' 'ipod' 'its' 'japan,' 'job' 'john' 'kindle' 'kingdom,' 'known' 'krazit)' 'lab' 'lance' 'lately,' 'launch' 'law' 'leaving' 'legal' 'levels' 'license' 'licenses' 'like' 'liquid' 'listings' 'low' 'macro' 'magnet,' 'major' 'maker' 'maker,' 'many' 'marguerite' 'maverick' 'mccullagh)' 'media' 'message-id:@message.myspace.com' 'microscopic' 'microsoft' 'mike' 'million' 'minutes' 'mock-up.' 'mode"' 'monster' 'moore)' 'much' 'music.' 'musil)' 'nash' 'nearly' 'needs' 'networking' 'new' 'ngo)' 'not' 'nsa' 'off' 'ogg' 'one' 'ones' 'only' 'onto' 'organization' 'originated' 'out' 'page.' 'partnership' 'payment' 'percent' 'person' 'pew' 'picture' 'plenty' 'police' 'politics' 'poor' 'positive' 'post' 'private' 'proto:http' 'providers.' 'provides' 'pst' 'publishers' 'rather' 'read' 'reardon)' 'recent' 'recovery' 'relatively' 'released' 'relevant' 'reply-to:none' 'report' 'report.' 'report:' 'reportedly' 'reports' 'reprieve' 'requests' 'research' 'result' 'results' 'rival' 'role' 'rolls' 'royalty' 'rumor' 'running' 'said' 'sale.' 'sales' 'sandoval)' 'saw' 'says' 'scrollmotion' 'search' 'security' 'sell' 'sender:none' 'service' 'shows' 'sign' 'signal' 'silicon' 'skip:* 40' 'skip:c 10' 'skip:i 10' 'skip:n 10' 'skip:v 10' 'skip:w 10' 'small' 'software' 'some' 'spotify' 'start-up' 'start-up,' 'steven' 'sticks' 'still' 'story' 'strength' 'study' 'subject:*' 'subject:***' 'subject:****' 'subject:_Adderall' 'subject:skip:D 20' 'submitting' 'such' 'suggests' 'supply' 'support' 'sure,' 'surveillance' 'survey' 'take' 'talk' 'team' 'team.' 'tech' 'tech.' 
'technology' 'teenagers' 'teens,' 'terdiman)' 'test' 'textbook' 'textbooks' 'than' 'that' "that's" 'the' 'their' 'theora.' 'thing' 'third' 'those' 'three-year' 'times)' 'to:2**0' 'to:addr:cobaltnight.com' 'to:addr:ocean' 'to:no real name:2**0' 'tom' 'too.' 'touch' 'touch.' 'touchco,' 'turf' 'turned' 'two' 'u.s.' 'u.s.,' 'underway' 'united' 'unveils' 'upfront' 'url:com' 'url:kashertqdum17' 'used' "users'" 'using' 'very' 'veteran' 'video' 'viruses' 'vista' 'vs.' 'want' 'war' 'ward' 'was' 'washington' 'web' 'wednesday' 'whitney)' 'who' 'widely' 'will' 'windows' 'with' 'working' 'x-mailer:none' 'yahoo' "yahoo's" 'year' 'year.' 'york' 'york-based' 'young' 'za003.'
--------------------------------------------
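
For anyone who wants to check the arithmetic in the clue report above, here is a rough sketch. It is a reconstruction, not code copied from SpamBayes. It assumes each token's spamprob is a Robinson-style smoothed estimate with a prior strength of 0.45 and a prior probability of 0.5, and that the combined score is (S - H + 1) / 2. Those constants and that formula are assumptions, chosen because they reproduce the figures in the report, and the function names are made up for illustration.

```python
# Sketch only: constants and formulas are assumptions that happen to
# reproduce the numbers in the clue report; names are hypothetical.

PRIOR_STRENGTH = 0.45  # assumed weight given to the prior
PRIOR_PROB = 0.5       # assumed spam probability for an unseen token


def token_spamprob(ham_count, spam_count, n_ham, n_spam):
    """Smoothed spam probability for one token.

    ham_count / spam_count: trained messages of each kind containing the token.
    n_ham / n_spam: total trained ham and spam messages.
    """
    ham_ratio = ham_count / n_ham
    spam_ratio = spam_count / n_spam
    if ham_ratio + spam_ratio == 0:
        # Token never seen in training: fall back to the prior.
        return PRIOR_PROB
    raw = spam_ratio / (ham_ratio + spam_ratio)
    n = ham_count + spam_count
    # Pull the raw estimate toward the prior; the pull weakens as n grows.
    return (PRIOR_STRENGTH * PRIOR_PROB + n * raw) / (PRIOR_STRENGTH + n)


def combined_score(ham_score, spam_score):
    """Map the internal *H* and *S* scores onto a single 0..1 score."""
    return (spam_score - ham_score + 1.0) / 2.0


if __name__ == "__main__":
    N_HAM, N_SPAM = 11546, 2608  # training counts from the report

    # (token, #ham, #spam, spamprob shown in the report)
    rows = [
        ("subject:****", 396, 0, 0.000567537),
        ("daniel", 3204, 7, 0.00964836),
        ("briefed", 0, 1, 0.844828),
    ]
    for token, ham, spam, reported in rows:
        computed = token_spamprob(ham, spam, N_HAM, N_SPAM)
        print(f"{token!r}: computed {computed:.6g}, report shows {reported}")

    # *H* = 1 and *S* = 7.77156e-16 in the report
    print("combined:", combined_score(1.0, 7.77156e-16))
```

If these assumptions hold, a token seen only in ham, such as 'subject:****' (396 ham, 0 spam), lands very close to 0. That would explain why the news text appended to the message pulls the combined score down to 0% despite the subject line.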
