> ------- Additional Comments From [EMAIL PROTECTED] 2005-07-28 18:15 -------
> btw, more hits that look very iffy, from the freqs file:
> 0.333 0.0546 0.8887 0.058 0.26 -4.30 RCVD_IN_BSP_TRUSTED
> 0.051 0.0130 0.1267 0.093 0.19 -0.10 RCVD_IN_BSP_OTHER
> 0.036 0.0053 0.0961 0.053 0.29 -8.00 HABEAS_ACCREDITED_COI
> that seems like a *LOT* of Bonded Sender spam hits -- 809 messages hitting
> RCVD_IN_BSP_TRUSTED! could we get those spam hits verified? (Bob, in
> particular, most seem to be coming from your corpus)
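For anyone reading the quoted freqs lines, here is a minimal sketch of how they can be parsed. It assumes the usual hit-frequencies column order (OVERALL%, SPAM%, HAM%, S/O, RANK, SCORE, NAME); the quoted excerpt has no header, so that order is an assumption, though it is consistent with S/O equalling SPAM% / (SPAM% + HAM%) on all three lines. The names `FreqLine`, `parse_freq_line` and `recomputed_s_o` are made up for illustration and are not part of SpamAssassin.

```python
from typing import NamedTuple


class FreqLine(NamedTuple):
    """One rule line from a freqs file (column order is an assumption)."""
    overall_pct: float
    spam_pct: float
    ham_pct: float
    s_o: float
    rank: float
    score: float
    rule: str


def parse_freq_line(line: str) -> FreqLine:
    """Split a whitespace-separated freqs line into typed fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    *numbers, rule = fields
    return FreqLine(*(float(n) for n in numbers), rule)


def recomputed_s_o(entry: FreqLine) -> float:
    """Recompute S/O as the spam share of all hits, from the two percentages."""
    return entry.spam_pct / (entry.spam_pct + entry.ham_pct)


# The three lines quoted above, copied verbatim.
QUOTED = [
    "0.333 0.0546 0.8887 0.058 0.26 -4.30 RCVD_IN_BSP_TRUSTED",
    "0.051 0.0130 0.1267 0.093 0.19 -0.10 RCVD_IN_BSP_OTHER",
    "0.036 0.0053 0.0961 0.053 0.29 -8.00 HABEAS_ACCREDITED_COI",
]

if __name__ == "__main__":
    for raw in QUOTED:
        entry = parse_freq_line(raw)
        print(f"{entry.rule}: listed S/O {entry.s_o:.3f}, "
              f"recomputed {recomputed_s_o(entry):.3f}")
```

An S/O near zero means the rule hits almost exclusively ham, which is what a whitelist-style rule with a negative score should do; the concern above is the absolute number of spam hits, not the ratio.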
Summary:
Misclassified ham: 28
Bounce/outscatter of spam: 1
Possibly misclassified ham: 34
Constant Contact questionable: 3099 (ham and spam)
The remainder are IMO spam.
Note: In the following discussions where I say "flagged spam", I mean
fully encapsulated, with full SA report and score presented as the
primary email to the user.
> Misclassified ham:
From: [EMAIL PROTECTED] (count: 7)
From: "American Express" <[EMAIL PROTECTED]>
count: 10, fed to sa-learn as spam by multiple users, primarily
because the emails were marketing ("take a look at our special
offers", "plan the perfect holiday", "upgrade to a card with
premium service", etc.) rather than official notifications,
statements, alerts, etc. Only one of the sa-learned "spam" was
what I'd consider a ham, though none of them are spam.
From: <[EMAIL PROTECTED]> (count: 2, 1 to each of 2 users)
From: PayPal <[EMAIL PROTECTED]> (count: 6)
From: Tikkun <[EMAIL PROTECTED]> (count: 1)
From: HeartCenterOnline <[EMAIL PROTECTED]> (count: 2)
> Possibly misclassified ham:
From: "CNET Help.com Online Courses "
<[EMAIL PROTECTED]>
Count: 9
User CR declared it to be spam via sa-learn. Probably old subscription.
Several others not fed to sa-learn, but flagged as spam by our system
(and not corrected by the users via sa-learn).
Willing to consider these ham.
From: "The Home Depot" <[EMAIL PROTECTED]>
Subject: Great Last-Minute Gifts for Dad
Count 4: Various users, flagged as spam by our system, not fed through
sa-learn. Looked like spam during validation. Also have nine emails
from same source, three with low positive scores, six with negative
scores, also not fed through sa-learn.
Willing to consider these ham.
From: Godiva.com <[EMAIL PROTECTED]>
Count 3: User CR declared it to be spam via sa-learn. Might be old
subscription.
Count 1: User SV, flagged as spam by our system, no sa-learn correction.
Note: my unverified corpus also has two more emails from same source,
not flagged as spam (low positive score), not fed to sa-learn.
From: "eBay" <[EMAIL PROTECTED]> Count: 7
Subject: Preview eBay's Summer Sizzlers & Save Big!
Subject: B-52's Live, BBQ at Great America--register now for eBay Live and save!
Subject: feralcanning, check these amazing eBay deals--all under $10
User CR declared it to be spam via sa-learn. Maybe old subscription,
very likely not the type of email the user wanted from eBay.
From: "Movies Unlimited Video E-Flash" <[EMAIL PROTECTED]>
Count 3: User SA, system flagged as spam, no sa-learn, look like spam,
but all to single user. Could be ham.
From: "DVD Talk" <[EMAIL PROTECTED]> count: 2
To: [EMAIL PROTECTED]
Subject: DVD Talk: It's Back - The Huge DeepDiscountDVD.com Sale
User MM, system flagged as spam, no sa-learn, look like spam, all to
single user, count 2, many others not flagged as spam (some low
positive, some negative), none through sa-learn. Could be ham.
From: "Planet DVD Now" <[EMAIL PROTECTED]> count: 3
To: [EMAIL PROTECTED]
Subject: Planet DVD Now Insider News for Saturday June 18, 2005
User NP, system flagged as spam, no sa-learn, look like spam, all to
single user, count 3, many others not flagged as spam (some low
positive, some negative), none through sa-learn. Could be ham.
From: [EMAIL PROTECTED] count: 3
Subject: SexSearch Shown Interest
User JB, flagged spam, no sa-learn. Only user receiving these emails.
> Constant Contact
Per earlier email, several other Constant Contact "newsletters"
flagged by our system as spam, variety of newsletters, variety of
users, spam classification not corrected by users, including technical
users who regularly and reliably sa-learn their misclassified emails.
Messages fed through sa-learn as spam by users: 17
Messages flagged as spam and not sa-learned as ham: 1586
Messages not flagged as spam: 1496
IMO, if we discard the 1603 flagged as spam, we should also discard
the 1496 treated as ham.
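
A quick sketch of the arithmetic behind those Constant Contact numbers, using only the three counts listed above (the variable names are just for illustration):

```python
# Counts taken directly from the Constant Contact breakdown above.
sa_learned_as_spam = 17        # fed through sa-learn as spam by users
flagged_not_corrected = 1586   # flagged as spam, never sa-learned as ham
not_flagged = 1496             # not flagged as spam (treated as ham)

# Messages that ended up on the spam side, one way or the other.
treated_as_spam = sa_learned_as_spam + flagged_not_corrected  # 1603

# Every Constant Contact message in the corpus, ham and spam together.
total_constant_contact = treated_as_spam + not_flagged  # 3099

if __name__ == "__main__":
    print(f"treated as spam: {treated_as_spam}")
    print(f"treated as ham:  {not_flagged}")
    print(f"total:           {total_constant_contact}")
```

The two sides are close to an even split, which is why discarding only one side would skew the corpus for this sender.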
> Sure looks like spam:
From: "Entertainment Update" <[EMAIL PROTECTED]>
Subject: New Promotional Partner Opportunities
User CR declared it to be spam via sa-learn. Sure looks to me like spam.
From: The Motley Fool <[EMAIL PROTECTED]>
Subject: Urgent Stock Buy/Sell Alert...from Motley Fool Stock Advisor
User CR declared it to be spam via sa-learn. Sure looks to me like spam.
Plus another copy flagged as spam by our system, same user, not fed to
sa-learn. Quite a few others, all look like spam.
From: "Entertainment Insider" <[EMAIL PROTECTED]>
Subject: New Marketing Opportunities from The b EQUAL Company
Subject: New Promotional Opportunities Available from Nickelodeon
Subject: New Marketing Opportunities from Buena Vista Home Entertainment
User CR declared it to be spam via sa-learn. Sure looks to me like spam.
Count: 5
From: Rabbi Michael Lerner <[EMAIL PROTECTED]>
Subject: Science and Spirit--a work group at the Network of Spiritual
Progressives Founding Conferences
User RI declared it to be spam via sa-learn. Maybe old subscription,
very likely not the type of email the user wanted from this source.
From: "ArcaMax" <[EMAIL PROTECTED]>
Subject: Congratulations - You Won
User NP declared it to be spam via sa-learn. Sure looks to me like spam.
Two copies, same recipient, different message ids
Third email, also user NP, no sa-learn, flagged as spam by our system,
sure looks like spam to me.
Other emails, various users, no sa-learn, flagged as spam by our
system, look like spam to me.
From: South Beach Diet Online <[EMAIL PROTECTED]>
Subject: why this diet WORKS!
User AM, no sa-learn, flagged as spam by our system.
> You are receiving this message because you subscribed to or visited
> a Waterfront Media newsletter or product.
Visited a newsletter or product = looks like spam to me.
From: DGI Line - asi/50910 <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: 2005 Magnetic Football Schedules! All Pro Teams Available
User JA, no sa-learn, flagged as spam by our system, roving Constant
Contact, contents look like spam to me.
From: "NewsMax.com" <[EMAIL PROTECTED]>
Subject: Ken Blackwell and New Republicans: Inside Story
User GI, no sa-learn, flagged as spam by our system, only one email in
corpus, including unclassified. If "newsmax.com" were a real service,
I'd expect repeated emails. Therefore I believe this to be spam.
From: Health Insurance Solutions <[EMAIL PROTECTED]>
Subject: Health and happiness go hand in hand.
User JC, system flagged as spam, no sa-learn, five separate emails,
all look like spam (including no MID from sender), all to single user,
an insurance agent. Could be ham. But...
From: Medical Insurance <[EMAIL PROTECTED]>
Subject: Take care with medical insurance.
From: US Immigration Help <[EMAIL PROTECTED]>
Subject: Make the dream of citizenship a reality.
User JC, system flagged as spam, no sa-learn, multiple emails,
all look like spam (including no MID from sender), all to single user,
an insurance agent. Content is very much aimed at consumers, not agents,
strongly suggesting to me that all email from @focalex2.com is indeed
spam. Then ...
From: Posters And Wall Art <[EMAIL PROTECTED]>
Subject: What your walls want to wear.
Same user (insurance agent), same source, nothing at all to do with
insurance or anything similar to any other email received by this
user. Other spam samples abound in more recent email.
From: "SmartBargains" <[EMAIL PROTECTED]>
Reply-To: "SmartBargains" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: 320TC Sheet Set, Duvet & More Just $29.95
User SC, system flagged as spam, no sa-learn, all look like spam.
User DT, same as above.
Emails do refer to users by a first name which matches first letter of
email address.
> You are receiving this email because you subscribed to it through
> SmartBargains.com or one of our partners.
From: AIU Online <[EMAIL PROTECTED]>
Subject: Nights. Weekends. We're here when it's convenient for YOU!
Consistent spam, repeated sa-learn as spam, 2 users, plus one
unclassified to third user. Confident this is spam.
From: "International Living" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: IL Postcards - Tax Breaks in the Cloud Forest
User JC, many emails flagged spam, many emails not flagged, no
sa-learn. May or may not be spam. Certainly looks like a scam.
From: "Martin D. Weiss, Ph.D." <[EMAIL PROTECTED]>
Subject: A Personal Invitation from Martin Weiss
User JC, all emails flagged spam, no sa-learn, emails certainly do
look like spam/scam. Sent to only this user.
From: Hersheys Kisses <[EMAIL PROTECTED]>
Subject: Complimentary 10 lbs of Hershey's Chocolate
User BQ, clear spam, even in SURBL blacklist.
From: "TopButton" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: TOP BUTTON VIP - Prada Price Cuts: 4-Days Only
User ND, among the most technically oriented and skilled of our users,
email flagged as spam, no sa-learn, only email from this source in the
entire corpus, looks unquestionably spam.
From: eDiets Extra <[EMAIL PROTECTED]>
Subject: Miami Mediterranean Diet: It's Hot!
Users ST and KG, several emails flagged spam, many emails not flagged,
no sa-learn. May or may not be spam. Certainly looks like spam.
Bob Menschel