> 
> From: Jesse Pelton [mailto:[email protected]] 
> Sent: Thursday, February 04, 2010 9:29 AM
> To: Ocean; [email protected]
> Subject: RE: [Spambayes] Problems with classifying as spam
> 
> SpamBayes doesn't follow links (see
> http://spambayes.sourceforge.net/faq.html#will-show-spam-clues-notify-a-spammer-that-i-opened-their-message
> for a tangentially related discussion), but it does process message headers.
> Lots of good information in there that you might think came from a Web
> site.
> 
> Unless you're willing to dive into the code and the math, I'd caution
> against trying to second-guess SpamBayes.  You're going to want it to
> behave rationally, and it doesn't (at least at the level you're looking
> at); it behaves statistically.  That's why the FAQ
> (http://spambayes.svn.sourceforge.net/viewvc/spambayes/trunk/spambayes/Outlook2000/docs/troubleshooting.html#Messages_have_incorrect_or_unexpected)
> suggests sending all the Spam clues to the list when trying to understand
> why a given message isn't classified as expected.
> 



Okay, then here's the nitty gritty:



------------------------------

Subject: ***Discount_Viagra_VXPL_Percocet*_Adderall****

Body:

<URL Link>***Discount_Viagra_VXPL_Percocet*_Adderall****!
<Links to:> http://kashertqdum17.com/

------------------------------

As I said before, that's all that's in the message, nothing else.


And here are the spam clues:


--------------------------------------------

Combined Score: 0% (3.88578e-16)
Internal ham score (*H*): 1
Internal spam score (*S*): 7.77156e-16

# ham trained on: 11546
# spam trained on: 2608

The last time this message was classified or trained:
This message had not been filtered.
This message had not been trained.

150 Significant Tokens

token                               spamprob         #ham  #spam
'subject:****'                      0.000567537       396      0
'submitting'                        0.00233281         96      0
'default'                           0.00251537         89      0
'lab'                               0.00286807         78      0
'(from'                             0.00360288         62      0
'skip:* 40'                         0.00391645         57      0
'9:00'                              0.0042898          52      0
'to:addr:ocean'                     0.00511949        904      1
'listings'                          0.00542823         41      0
'report:'                           0.00542823         41      0
'engineers'                         0.00819672         27      0
'lately,'                           0.00920245         24      0
'daniel'                            0.00964836       3204      7
'flaw'                              0.0104895          21      0
'geek'                              0.0104895          21      0
'liquid'                            0.0104895          21      0
'detect'                            0.0115681          19      0
'tech.'                             0.0115681          19      0
'viruses'                           0.0115681          19      0
'binary'                            0.0121951          18      0
'textbook'                          0.012894           17      0
'finalizing'                        0.0136778          16      0
'greg'                              0.0180723          12      0
'yahoo'                             0.0180723          12      0
'providers.'                        0.0196507          11      0
'adults'                            0.0238095           9      0
"amazon's"                          0.0238095           9      0
'2012'                              0.0266272           8      0
'music.'                            0.0266272           8      0
'textbooks'                         0.0266272           8      0
'11:33'                             0.0302013           7      0
'1:01'                              0.0302013           7      0
'9:53'                              0.0302013           7      0
'buys'                              0.0302013           7      0
'suggests'                          0.0302013           7      0
'html'                              0.0343728         255      2
'2:05'                              0.0348837           6      0
'compression'                       0.0348837           6      0
'formats'                           0.0348837           6      0
'teenagers'                         0.0348837           6      0
'adapt'                             0.0412844           5      0
"apple's"                           0.0412844           5      0
'h.264'                             0.0412844           5      0
'kingdom,'                          0.0412844           5      0
'times)'                            0.0412844           5      0
'page.'                             0.0444174         481      5
'hours,'                            0.0480985          92      1
'to:addr:cobaltnight.com'           0.0496998        1274     15
'11:37'                             0.0505618           4      0
'explorer'                          0.0507201          87      1
'indicated'                         0.0512533         168      2
'amazon'                            0.0524349          84      1
'tech'                              0.0543494         389      5
'sign'                              0.0553702         608      8
'payment'                           0.0573002        1169     16
'1:12'                              0.0652174           3      0
'globally.'                         0.0652174           3      0
'u.s.,'                             0.0652174           3      0
'interface'                         0.0657778          66      1
'2009'                              0.0678561         672     11
'center'                            0.0698918         651     11
'electronic'                        0.0758594         542     10
"aren't"                            0.0871721          95      2
'microscopic'                       0.0918367           2      0
'start-up'                          0.0918367           2      0
'surveillance'                      0.0918367           2      0
'teens,'                            0.0918367           2      0
'turf'                              0.0918367           2      0
'sales'                             0.0930142         736     17
'still'                             0.0972529        1728     42
'relevant'                          0.0974632          84      2
'charge'                            0.0981746         368      9
'test'                              0.0985371         407     10
'from:no real name:2**0'            0.104008         2519     66
'information'                       0.110498         3031     85
"that's"                            0.112004          809     23
'requests'                          0.11383           174      5
'100'                               0.11434           276      8
'post'                              0.11434           276      8
'support'                           0.114415         1201     35
"doesn't"                           0.116645          471     14
'data'                              0.11763           699     21
'pst'                               0.119211           67      2
'be?'                               0.120093           34      1
'touch'                             0.12132           226      7
'service'                           0.121935         2010     63
'deal.'                             0.126626           32      1
'job'                               0.130755          178      6
'actual'                            0.13257           349     12
'web'                               0.133212          952     33
'tom'                               0.136878          113      4
'needs'                             0.137534          390     14
'team'                              0.140032          681     25
'rather'                            0.141804          296     11
'leaving'                           0.150599          151      6
'licenses'                          0.151311           26      1
'working'                           0.152874          639     26
'wednesday'                         0.154298          171      7
'"protected'                        0.155172            1      0
'787'                               0.155172            1      0
'armstrong'                         0.155172            1      0
'blogging'                          0.155172            1      0
'declines'                          0.155172            1      0
'ina'                               0.155172            1      0
'mode"'                             0.155172            1      0
'nsa'                               0.155172            1      0
'pew'                               0.155172            1      0
'reprieve'                          0.155172            1      0
'rumor'                             0.155172            1      0
'another'                           0.155232          748     31
'relatively'                        0.158807           48      2
'technology'                        0.161453          277     12
'going'                             0.163466         1519     67
'shows'                             0.165458          269     12
'agreement'                         0.165465          202      9
'ago'                               0.166308          112      5
'using'                             0.170597         1357     63
'provides'                          0.178445          164      8
'adding'                            0.179077          143      7
'running'                           0.179577          365     18
'digital'                           0.180445          202     10
'helped'                            0.181042           61      3
'onto'                              0.182937           80      4
'survey'                            0.186745           78      4
'ever.'                             0.187926           20      1
'touch.'                            0.187926           20      1
'software'                          0.191964          504     27
'skip:n 10'                         0.19449          1321     72
'decides'                           0.195819           19      1
'group'                             0.197292          271     15
'sure,'                             0.199182           72      4
'improving'                         0.200943           36      2
'has'                               0.202357         3997    229
'research'                          0.20242           245     14
'figures'                           0.796501           12     11
'crowd'                             0.828161            5      6
'war'                               0.829187           15     17
'police'                            0.834856           17     20
'briefed'                           0.844828            0      1
'chambers,'                         0.844828            0      1
'investigates'                      0.844828            0      1
'kindle'                            0.844828            0      1
'nash'                              0.844828            0      1
'upfront'                           0.844828            0      1
'boeing'                            0.84654             1      2
'ward'                              0.84654             1      2
'from:addr:message.myspace.com'     0.908163            0      2
'message-id:@message.myspace.com'   0.908163            0      2
'chemists'                          0.934783            0      3
'reportedly'                        0.949438            0      4

Message Stream

Received: from vlan195-30.azeronline.com ([88.151.195.30])
        by mail.cobaltnight.com (CW Mail Server) with SMTP id MPR19711
        for <[email protected]>; Thu, 04 Feb 2010 08:39:11 -0500
Received: from localhost (127.0.0.1) by vlan195-30.azeronline.com
        (88.151.195.30) with Microsoft SMTP Server id 8.0.685.24;
        Thu, 4 Feb 2010 17:38:57 +0400
From: < [email protected]>
To: [email protected]
Subject: ***Discount_Viagra_VXPL_Percocet*_Adderall****
Date: Thu, 4 Feb 2010 17:38:57 +0400
MIME-Version: 1.0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
Message-ID: <[email protected]>

***Discount_Viagra_VXPL_Percocet*_Adderall****! <http://kashertqdum17.com>  

<a href="http://kashertqdum17.com">***Discount_Viagra_VXPL_Percocet*_Adderall****!</a>
<style>
Web video gets 
H.264 royalty reprieve

14 hours, 21 minutes ago

The group that licenses the widely used H.264 video compression technology
decides against adding a Web-streaming royalty charge that could have helped
rival formats such as Ogg Theora.
Read full story 
 
HTML vs. Flash: Can 
a turf war be avoided?

February 3, 2010 9:00 AM PST

After Apple said its iPad doesn't support Flash, Adobe sticks up for its
tech. The anti-Flash crowd has increasing clout--but still plenty of chaos,
too.
Read full story 
Report: Google, NSA talk defense partnership

14 hours, 44 minutes ago

The electronic surveillance organization is finalizing an agreement to help
the search giant ward off cyberattacks like the ones that originated in
China, according to a Washington Post report.
(Posted in Security by Steven Musil) 
Amazon said to buy touch start-up

23 hours, 18 minutes ago

The company has acquired Touchco, a New York-based start-up, according to a
person briefed on the deal. 
(From The New York Times) 
Police survey provides glimpse of Net-surveillance figures

16 hours, 52 minutes ago

A relatively small group of 100 police working on Internet investigations
reports submitting as many as 22,800 legal requests for information a year
to Internet providers. 
. Police want backdoor to Web users' private data 
(Posted in Politics and Law by Declan McCullagh) 
Spotify has much to do before U.S. launch

17 hours, 29 minutes ago

Lately, Spotify is a rumor magnet, and one going around suggests the company
is for sale. One thing is for sure, before it rolls out in U.S., service
needs to license some music.
(Posted in Media Maverick by Greg Sandoval) 
Monster buys Yahoo's HotJobs for $225 million

23 hours, 38 minutes ago

Another divestiture for Yahoo will result in an upfront cash payment and
three-year exclusive deal with Monster to supply job search listings on
Yahoo's home page.
(Posted in Relevant Results by Tom Krazit) 
Microsoft investigates new Internet Explorer flaw

23 hours, 31 minutes ago

Software maker says flaw could affect those running Windows XP who aren't
using a "protected mode" that's turned on by default in Windows Vista and
Windows 7.
(Posted in Beyond Binary by Ina Fried) 
Cisco results signal economic recovery underway

February 3, 2010 2:05 PM PST

John Chambers, CEO for the networking giant, said he saw positive results
across the board and indicated the economic environment is improving
globally.
(Posted in Signal Strength by Marguerite Reardon) 
New virus-detecting lab on a chip gets even better

21 hours, 37 minutes ago

A team of BYU engineers and chemists figures out how to interface macro
levels of liquid onto their microscopic silicon chip to detect viruses at
even very low concentrations.
(Posted in Health Tech by Elizabeth Armstrong Moore) 
Boeing unveils 787 Dreamliner interior

February 3, 2010 1:12 PM PST

The aviation giant on Wednesday released a picture of an actual 787
interior, rather than a mock-up. The picture is of the third test flight
787, known as ZA003.
(Posted in Geek Gestalt by Daniel Terdiman) 
Windows veteran Mike Nash leaving Microsoft

February 3, 2010 11:33 AM PST

After nearly two decades with the software maker, executive is reportedly
leaving to take a role on Amazon's Kindle team.
(Posted in Beyond Binary by Ina Fried) 
Textbook publishers heading to iPad

February 3, 2010 11:37 AM PST

Major publishers sign deal with ScrollMotion to adapt their textbooks and
study guides for Apple's iPad, iPhone, and iPod Touch. 
. Analyst: Apple will sell 8 million iPads by 2012 
(Posted in Apple by Lance Whitney) 
Global video game sales down 7 percent in 2009

February 3, 2010 9:53 AM PST

According to a report on software sales in the U.S., Japan, and the United
Kingdom, 2009 was a poor year. But how could it not be? The year before was
the best ever.
(Posted in Geek Gestalt by Daniel Terdiman) 
Blogging declines among teens, young adults

February 3, 2010 1:01 PM PST

A recent survey from the Pew Research Center shows that only 14 percent of
teenagers are bloggers, down from 28 percent in 2006.
(Posted in Digital Media by Dong Ngo)

All Message Tokens
401 unique tokens

'"protected'
'$225'
'(from'
'(posted'
'100'
'11:33'
'11:37'
'1:01'
'1:12'
'2006.'
'2009'
'2010'
'2012'
'22,800'
'2:05'
'787'
'787,'
'9:00'
'9:53'
'according'
'acquired'
'across'
'actual'
'adapt'
'adding'
'adobe'
'adults'
'affect'
'after'
'against'
'ago'
'agreement'
'amazon'
"amazon's"
'among'
'analyst:'
'and'
'another'
'anti-flash'
'apple'
"apple's"
'are'
"aren't"
'armstrong'
'around'
'aviation'
'avoided?'
'backdoor'
'be?'
'before'
'best'
'better'
'beyond'
'binary'
'bloggers,'
'blogging'
'board'
'boeing'
'briefed'
'but'
'buy'
'buys'
'byu'
'can'
'cash'
'cc:none'
'center'
'ceo'
'chambers,'
'chaos,'
'charge'
'chemists'
'china,'
'chip'
'cisco'
'clout--but'
'company'
'compression'
'content-type:text/plain'
'could'
'crowd'
'cyberattacks'
'daniel'
'data'
'deal'
'deal.'
'decades'
'decides'
'declan'
'declines'
'default'
'defense'
'detect'
'digital'
'divestiture'
"doesn't"
'dong'
'down'
'dreamliner'
'economic'
'electronic'
'elizabeth'
'engineers'
'environment'
'even'
'ever.'
'exclusive'
'executive'
'explorer'
'february'
'figures'
'finalizing'
'flash,'
'flash:'
'flaw'
'flight'
'for'
'formats'
'fried)'
'from'
'from:addr:message.myspace.com'
'from:addr:noreply'
'from:no real name:2**0'
'full'
'game'
'geek'
'gestalt'
'gets'
'giant'
'giant,'
'glimpse'
'global'
'globally.'
'going'
'google,'
'greg'
'group'
'guides'
'h.264'
'has'
'have'
'header:Date:1'
'header:From:1'
'header:MIME-Version:1'
'header:Message-ID:1'
'header:Received:2'
'header:Subject:1'
'header:To:1'
'heading'
'health'
'help'
'helped'
'home'
'hotjobs'
'hours,'
'how'
'html'
'improving'
'ina'
'increasing'
'indicated'
'information'
'interface'
'interior'
'interior,'
'internet'
'investigates'
'ipad'
'ipad,'
'ipads'
'iphone,'
'ipod'
'its'
'japan,'
'job'
'john'
'kindle'
'kingdom,'
'known'
'krazit)'
'lab'
'lance'
'lately,'
'launch'
'law'
'leaving'
'legal'
'levels'
'license'
'licenses'
'like'
'liquid'
'listings'
'low'
'macro'
'magnet,'
'major'
'maker'
'maker,'
'many'
'marguerite'
'maverick'
'mccullagh)'
'media'
'message-id:@message.myspace.com'
'microscopic'
'microsoft'
'mike'
'million'
'minutes'
'mock-up.'
'mode"'
'monster'
'moore)'
'much'
'music.'
'musil)'
'nash'
'nearly'
'needs'
'networking'
'new'
'ngo)'
'not'
'nsa'
'off'
'ogg'
'one'
'ones'
'only'
'onto'
'organization'
'originated'
'out'
'page.'
'partnership'
'payment'
'percent'
'person'
'pew'
'picture'
'plenty'
'police'
'politics'
'poor'
'positive'
'post'
'private'
'proto:http'
'providers.'
'provides'
'pst'
'publishers'
'rather'
'read'
'reardon)'
'recent'
'recovery'
'relatively'
'released'
'relevant'
'reply-to:none'
'report'
'report.'
'report:'
'reportedly'
'reports'
'reprieve'
'requests'
'research'
'result'
'results'
'rival'
'role'
'rolls'
'royalty'
'rumor'
'running'
'said'
'sale.'
'sales'
'sandoval)'
'saw'
'says'
'scrollmotion'
'search'
'security'
'sell'
'sender:none'
'service'
'shows'
'sign'
'signal'
'silicon'
'skip:* 40'
'skip:c 10'
'skip:i 10'
'skip:n 10'
'skip:v 10'
'skip:w 10'
'small'
'software'
'some'
'spotify'
'start-up'
'start-up,'
'steven'
'sticks'
'still'
'story'
'strength'
'study'
'subject:*'
'subject:***'
'subject:****'
'subject:_Adderall'
'subject:skip:D 20'
'submitting'
'such'
'suggests'
'supply'
'support'
'sure,'
'surveillance'
'survey'
'take'
'talk'
'team'
'team.'
'tech'
'tech.'
'technology'
'teenagers'
'teens,'
'terdiman)'
'test'
'textbook'
'textbooks'
'than'
'that'
"that's"
'the'
'their'
'theora.'
'thing'
'third'
'those'
'three-year'
'times)'
'to:2**0'
'to:addr:cobaltnight.com'
'to:addr:ocean'
'to:no real name:2**0'
'tom'
'too.'
'touch'
'touch.'
'touchco,'
'turf'
'turned'
'two'
'u.s.'
'u.s.,'
'underway'
'united'
'unveils'
'upfront'
'url:com'
'url:kashertqdum17'
'used'
"users'"
'using'
'very'
'veteran'
'video'
'viruses'
'vista'
'vs.'
'want'
'war'
'ward'
'was'
'washington'
'web'
'wednesday'
'whitney)'
'who'
'widely'
'will'
'windows'
'with'
'working'
'x-mailer:none'
'yahoo'
"yahoo's"
'year'
'year.'
'york'
'york-based'
'young'
'za003.'

--------------------------------------------
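
For anyone trying to read the numbers in the clues above, here is a rough
sketch of how the per-token spamprob column and the combined score appear to
be derived. It is not taken from the SpamBayes source. It assumes the usual
chi-squared combining and an unknown-word prior of 0.5 with strength 0.45
(neither value is stated anywhere in this thread), and the function names are
made up for illustration.

```python
# Sketch only: the constants below are assumed defaults, not values
# quoted in this thread.
UNKNOWN_WORD_PROB = 0.5       # assumed prior spam probability for an unseen token
UNKNOWN_WORD_STRENGTH = 0.45  # assumed weight given to that prior


def token_spamprob(ham_count, spam_count, n_ham, n_spam):
    """Estimate a token's spam probability from training counts.

    ham_count / spam_count: trained messages of each kind containing the token.
    n_ham / n_spam: total trained ham and spam messages.
    """
    n = ham_count + spam_count
    if n == 0:
        # Never seen in training: fall back to the prior.
        return UNKNOWN_WORD_PROB
    ham_ratio = ham_count / n_ham
    spam_ratio = spam_count / n_spam
    raw = spam_ratio / (ham_ratio + spam_ratio)
    # Pull the raw estimate toward the prior; the pull fades as n grows.
    return (UNKNOWN_WORD_STRENGTH * UNKNOWN_WORD_PROB + n * raw) / (
        UNKNOWN_WORD_STRENGTH + n
    )


def combined_score(ham_score, spam_score):
    """Fold the internal *H* and *S* scores into one 0..1 spam score."""
    return (spam_score - ham_score + 1.0) / 2.0


if __name__ == "__main__":
    # Training totals and counts copied from the clues above.
    print(token_spamprob(396, 0, 11546, 2608))  # 'subject:****'
    print(token_spamprob(0, 1, 11546, 2608))    # 'briefed'
    print(combined_score(1.0, 7.77156e-16))     # *H* and *S* from the report
```

Under those assumptions the sketch agrees with the report: 'subject:****'
(396 ham, 0 spam) comes out near 0.000567537, 'briefed' (0 ham, 1 spam) near
0.844828, and H = 1 with S = 7.77156e-16 gives about 3.88578e-16, the 0%
combined score shown. Tokens never seen in training sit at 0.5 under this
model, which would explain why 'url:kashertqdum17' and 'subject:_Adderall'
appear in the full token list but not among the significant tokens.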


_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Info/Unsubscribe: http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
