[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation

2019-08-08 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6707

Henrik Krohns  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
 CC||apa...@hege.li

--- Comment #4 from Henrik Krohns  ---
Was even slightly more clarified in revision 1864686. Closing.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2019-07-31 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054

--- Comment #3 from Henrik Krohns  ---
(In reply to RW from comment #2)
> It's about caching host A-record look-ups, for plugins to use.
> 
> It not very well put as there doesn't look to be any plugin that uses
> get_uri_detail_list and needs A-records.

Uh yes, but what is there to cache? If some DNS A query is made, SA only does
it once and caches it already. Dunno if it was the case back in 2014. All this
has very little relevance to uri hashes.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2019-07-31 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054

RW  changed:

   What|Removed |Added

 CC||rwmailli...@googlemail.com

--- Comment #2 from RW  ---
It's about caching host A-record look-ups, for plugins to use.

It not very well put as there doesn't look to be any plugin that uses
get_uri_detail_list and needs A-records.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2019-07-30 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054

Henrik Krohns  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX
 CC||apa...@hege.li

--- Comment #1 from Henrik Krohns  ---
I'm not sure what the actual point here is. Plugins query all sorts of stuff, I
don't see anything inherently bad in calling get_uri_detail_list(), it's not
expensive, and it's been defacto for ages. I don't see any reason to
change/break things.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7719] (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478)

2019-06-08 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7719

Steadramon  changed:

   What|Removed |Added

 CC||paul.st...@gmail.com
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Steadramon  ---
Fixed in commit below by adding ifplugin, this will be picked up during next
rule build

Sending20_urlshort.cf
Transmitting file data .done
Committing transaction...
Committed revision 1860821.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7719] New: (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478)

2019-06-08 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7719

Bug ID: 7719
   Summary: (Can't locate object method "check_uri_host_listed"
via package "Mail: [...]:SpamAssassin::PerMsgStatus"
at (eval 1478)
   Product: Spamassassin
   Version: 3.4.2
  Hardware: PC
OS: Linux
Status: NEW
  Severity: critical
  Priority: P2
 Component: Rules
  Assignee: dev@spamassassin.apache.org
  Reporter: h.rei...@thelounge.net
  Target Milestone: Undefined

given that nothing was touched on that machine for days now this is pretty sure
once againa bad rules update - and yes since yesterday i receive that as email
beause i have running my tests

Jun  8 12:30:03.449 [1517663] warn: rules: failed to run __PDS_URL_SHORTENER
test, skipping:
Jun  8 12:30:03.449 [1517663] warn:  (Can't locate object method
"check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at
(eval 1478) line 19.
Jun  8 12:30:03.449 [1517663] warn: )

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2018-01-11 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Bill Cole  changed:

   What|Removed |Added

 CC||walt...@pobox.com

--- Comment #8 from Bill Cole  ---
*** Bug 7480 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation

2017-12-27 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6707

Giovanni Bechis  changed:

   What|Removed |Added

 CC||giova...@paclan.it

--- Comment #3 from Giovanni Bechis  ---
In trunk now it is documented as:

   Returns the message body, with base64 or quoted-printable encodings
   decoded, and non-text parts or non-inline attachments stripped.

   It is returned as an array of strings, with each string
   representing one newline-separated line of the body.


I think bz is no more current.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-08-09 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Kevin A. McGrail  changed:

   What|Removed |Added

 CC||christophe.troestler@umons.
   ||ac.be

--- Comment #7 from Kevin A. McGrail  ---
*** Bug 7448 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-07-05 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Bill Cole  changed:

   What|Removed |Added

 CC||trond.endres...@ximalas.inf
   ||o

--- Comment #6 from Bill Cole  ---
*** Bug 7444 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-07-04 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Bill Cole  changed:

   What|Removed |Added

 CC||alexcho...@gmail.com

--- Comment #5 from Bill Cole  ---
*** Bug 7369 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2017-04-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

Sidney Markowitz  changed:

   What|Removed |Added

 CC||sid...@sidney.com

--- Comment #5 from Sidney Markowitz  ---
Ported the change from comment 4 to branch 3.4
Committed revision 1791580

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7196] PerMsgStatus Warning

2017-04-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

Sidney Markowitz  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||sid...@sidney.com
 Resolution|--- |FIXED

--- Comment #7 from Sidney Markowitz  ---
The patch in comment 5 is already in the commits. It looks like Mark was
posting corrections to the proposed patch that reflect what he had already
committed. I'm closing this as fixed.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-04-12 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Mark Martinec  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Mark Martinec  ---
closing, resolved

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Bug 7404] New: MS::PerMsgStatus::get_content_preview

2017-04-11 Thread John Hardin

On Tue, 11 Apr 2017, bugzilla-dae...@issues.apache.org wrote:


https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

   Bug ID: 7404
  Summary: MS::PerMsgStatus::get_content_preview

t/autolearn.t . Unescaped left brace in regex is deprecated
here (and will be fatal in Perl 5.30), passed through in regex; marked by <--
HERE in m/^(.{ <-- HERE ,200}).*$/ at
../blib/lib/Mail/SpamAssassin/PerMsgStatus.pm line 923.

perl 5.25.11


I remember discussing that (and embarrassing myself because that syntax IS 
valid in sed but not perl) a while back. That bug may be a dupe...


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  If you are "fighting for social justice," then you are defining
  yourself as someone who considers regular old everyday
  *equal* justice to be something you don't want.   -- GOF at TSM
---
 2 days until Thomas Jefferson's 274th Birthday


[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-04-11 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

--- Comment #3 from Mark Martinec  ---
3.4:
  Sending PerMsgStatus.pm
Committed revision 1791013.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-04-11 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

--- Comment #2 from Mark Martinec  ---
Created attachment 5441
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5441=edit
Fixes inappropriate regexp (and surrounding logic)

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview

2017-04-11 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Mark Martinec <mark.marti...@ijs.si> changed:

   What|Removed |Added

Summary|MS::PerMsgStatus::get_conte |Bad regexp (and logic) in
   |nt_preview  |MS::PerMsgStatus::get_conte
   ||nt_preview
   Target Milestone|Undefined   |3.4.2

--- Comment #1 from Mark Martinec <mark.marti...@ijs.si> ---
(sorry for hitting a submit prematurely on an unfinished PR)

Testing with 5.25.11 (soon to be be released as 5.26) fills the log
with warnings on a suspicious regexp, as shown above - an x{,n}
is not interpreted as x{0,n}, and even if it were, the logic is wrong
in that code.

fixing in trunk:
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
Committed revision 1791010.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7404] New: MS::PerMsgStatus::get_content_preview

2017-04-11 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404

Bug ID: 7404
   Summary: MS::PerMsgStatus::get_content_preview
   Product: Spamassassin
   Version: 3.4.1
  Hardware: PC
OS: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: Libraries
  Assignee: dev@spamassassin.apache.org
  Reporter: mark.marti...@ijs.si
  Target Milestone: Undefined

t/autolearn.t . Unescaped left brace in regex is deprecated
here (and will be fatal in Perl 5.30), passed through in regex; marked by <--
HERE in m/^(.{ <-- HERE ,200}).*$/ at
../blib/lib/Mail/SpamAssassin/PerMsgStatus.pm line 923.

perl 5.25.11

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2015-06-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

Kevin A. McGrail kmcgr...@pccc.com changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
 CC||kmcgr...@pccc.com

--- Comment #1 from Kevin A. McGrail kmcgr...@pccc.com ---
Fixed on trunk and 3.4 branch, thanks.

svn commit -m 'Fixed uninitialized error with $line - Thanks to Franz Schwartau
bug 7212 - trunk'
Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data .
Committed revision 1685842.

svn commit -m 'Fixed uninitialized error with $line - Thanks to Franz Schwartau
bug 7212 - 3.4' lib/
Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data .
Committed revision 1685843.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2015-06-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

--- Comment #3 from Kevin A. McGrail kmcgr...@pccc.com ---
I'm +1 if you want to apply.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2015-06-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

--- Comment #2 from Mark Martinec mark.marti...@ijs.si ---
Created attachment 5309
  -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5309action=edit
A more compact alternative

A more radical coding approach (attached) reduces the number of
generated opcodes in affected code sections by 40 % (132  79 opcodes)
according to B::Concise.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2015-06-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

Mark Martinec mark.marti...@ijs.si changed:

   What|Removed |Added

   Target Milestone|Undefined   |3.4.2

--- Comment #4 from Mark Martinec mark.marti...@ijs.si ---
 I'm +1 if you want to apply.

trunk:
Bug 7212: code compaction (was: Warning of uninitialized value)
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
Committed revision 1685857.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7212] New: Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()

2015-06-16 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212

Bug ID: 7212
   Summary: Warning of uninitialized value in
Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_h
it_with_scores()
   Product: Spamassassin
   Version: 3.4.1
  Hardware: PC
OS: FreeBSD
Status: NEW
  Severity: minor
  Priority: P2
 Component: Libraries
  Assignee: dev@spamassassin.apache.org
  Reporter: fr...@electromail.org

Created attachment 5308
  -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5308action=edit
patch for PerMsgStatus::get_names_of_tests_hit_with_scores()

Using Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
results in:

Use of uninitialized value $line in string ne at
/usr/local/lib/perl5/site_perl/Mail/SpamAssassin/PerMsgStatus.pm line 770.

$line should be initialized with ''. Patch attached.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-26 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

--- Comment #6 from Kevin A. McGrail kmcgr...@pccc.com ---
Can this be marked resolved? I'm not clear if the patches are committed or is
there a fix needed still?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

--- Comment #3 from Kevin A. McGrail kmcgr...@pccc.com ---
(In reply to Mark Martinec from comment #2)
 Created attachment 5300 [details]
 proposed patch
 
  Is this patch what you mean or is my checking for defined sweeping
  something bigger under the table?
 
 Seems fine to me.
 
 Attached is my version (sorry for duplicating your work, unintentionally).
 Treats domains case-insensitively, retains original order instead of random.

Your fix looks good too with the extra benefit of case insensitivity (not sure
I care about the order that much).  And definitely no worries about the
overlap.  Please commit when ready.

Regards,
KAM

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

Mark Martinec mark.marti...@ijs.si changed:

   What|Removed |Added

   Target Milestone|Undefined   |3.4.2

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

--- Comment #4 from Mark Martinec mark.marti...@ijs.si ---
trunk:
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
Committed revision 1679652.

3.4:
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
Committed revision 1679653.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-15 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

Mark Martinec mark.marti...@ijs.si changed:

   What|Removed |Added

   Attachment #5300|0   |1
is obsolete||

--- Comment #5 from Mark Martinec mark.marti...@ijs.si ---
Created attachment 5301
  -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5301action=edit
proposed patch (fixes previous patch)

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] New: PerMsgStatus Warning

2015-05-14 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

Bug ID: 7196
   Summary: PerMsgStatus Warning
   Product: Spamassassin
   Version: 3.4.1
  Hardware: PC
OS: Windows 7
Status: NEW
  Severity: normal
  Priority: P2
 Component: spamassassin
  Assignee: dev@spamassassin.apache.org
  Reporter: kmcgr...@pccc.com

This spanned from bug 7195:

 May 13 23:35:43 mail01 amavis[17386]: (17386-12) _WARN: Use of
 uninitialized value in concatenation (.) or string at
 /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line
 3082.

Mark: It's because the RegistryBoundaries::uri_to_domain returns undef
when it sees %hh encoding in an URL, and the caller then tries to
concatenate it with 'dummy@', which produces this warning.
Seems harmless but annoying, should be fixed.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7195] New: PerMsgStatus Util warnings

2015-05-14 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7195

Bug ID: 7195
   Summary: PerMsgStatus  Util warnings
   Product: Spamassassin
   Version: 3.4.1
  Hardware: PC
OS: Linux
Status: NEW
  Severity: normal
  Priority: P2
 Component: spamassassin
  Assignee: dev@spamassassin.apache.org
  Reporter: mysqlstud...@gmail.com

I reported the following on users@ and Mark determined the two warning messages
below were independent bugs that should be addressed:

 I have v3.4.1 with amavisd v2.9.1 on fedora20 and receiving the
 following warnings:

 May 13 23:32:31 mail01 amavis[17306]: (17306-10) _WARN: plugin: eval
 failed: Undefined subroutine
 Mail::SpamAssassin::Util::RegistrarBoundaries::trim_domain called at
 /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Util.pm line 1236.

 May 13 23:35:43 mail01 amavis[17386]: (17386-12) _WARN: Use of
 uninitialized value in concatenation (.) or string at
 /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line
 3082.

The second warning is independent from the first, I'm seeing these too.

It's because the RegistryBoundaries::uri_to_domain returns undef
when it sees %hh encoding in an URL, and the caller then tries to
concatenate it with 'dummy@', which produces this warning.
Seems harmless but annoying, should be fixed.

Please open a bug report (one should do, even though these are two
independent issues).

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-14 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

Kevin A. McGrail kmcgr...@pccc.com changed:

   What|Removed |Added

 CC||kmcgr...@pccc.com

--- Comment #1 from Kevin A. McGrail kmcgr...@pccc.com ---
Mark,

Is this patch what you mean or is my checking for defined sweeping something
bigger under the table?

Index: lib/Mail/SpamAssassin/PerMsgStatus.pm
===
--- lib/Mail/SpamAssassin/PerMsgStatus.pm   (revision 1679280)
+++ lib/Mail/SpamAssassin/PerMsgStatus.pm   (working copy)
@@ -3078,12 +3078,17 @@
   dbg(eval: all '*From' addrs domains (before):  . join( , @addrs));

   #loop through and limit to just the domain with a dummy address
+  my @tmp_addrs;
   for (my $i = 0; $i  scalar(@addrs); $i++) {
-$addrs[$i] =
'dummy@'.$self-{main}-{registryboundaries}-uri_to_domain($addrs[$i]);
+#Checking if the return from uri_to_domain is defined - bug 7196 KAM 
+my $tmp_addrs =
$self-{main}-{registryboundaries}-uri_to_domain($addrs[$i]);
+if (defined $tmp_addrs) {
+  push (@tmp_addrs, 'dummy@'.$tmp_addrs);
+}
   }

   #Remove duplicate domains
-  my %addrs = map { $_ = 1 } @addrs;
+  my %addrs = map { $_ = 1 } @tmp_addrs;
   @addrs = keys %addrs;

   dbg(eval: all '*From' addrs domains (after uri to domain):  . join( ,
@addrs));

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7196] PerMsgStatus Warning

2015-05-14 Thread bugzilla-daemon
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196

--- Comment #2 from Mark Martinec mark.marti...@ijs.si ---
Created attachment 5300
  -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5300action=edit
proposed patch

 Is this patch what you mean or is my checking for defined sweeping
 something bigger under the table?

Seems fine to me.

Attached is my version (sorry for duplicating your work, unintentionally).
Treats domains case-insensitively, retains original order instead of random.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: Uninitialized value in PerMsgStatus

2015-01-16 Thread Kevin A. McGrail

On 1/15/2015 2:43 PM, Mark Martinec wrote:

What is a PR?  A bugzilla ticket?


Yes, a problem report.  Sorry for using terminology from another context.
Thanks.  I wasn't sure if you meant a press release about the bug but 
that seemed overkill...



If there are other perl critic warnings, they are likely not enabled
in that test.


Opened a Bug 7119.
Probably depends on a version of Perl::Critic.
I upgraded to $VERSION = '1.121'; of Perl::Critic so we can try and be 
on the same page.




That statement deserves the  '## no critic'  annotation,
something like (untested) :

# see Bug 7120
return undef if !defined $_[0];  ## no critic 
(ProhibitExplicitReturnUndef)
Exactly.   ## no critic (ProhibitExplicitReturnUndef)  - See Bug 7120  
added and passes xt test.


Will commit as soon as I have a full make test completed.



For other cases I didn't check.
The other cases are all returns if a $self checker is not defined so 
much less of a concern anyway.



Regards,
KAM


Re: Uninitialized value in PerMsgStatus

2015-01-15 Thread Kevin A. McGrail

On 1/15/2015 11:09 AM, Mark Martinec wrote:

:-)  Well I don't want to change the vetting process for a release and
xt/60_perlcritic.t has been used for years on the code base.
Suggestions what we can do to resolve the issue that also passes that
test so we don't have to go down that rabbithole?


There are some other perlcritic warnings about modifying $_ in list 
functions

(in sa-update, spamassassin, spamd).  Opening a PR would be warranted.

What is a PR?  A bugzilla ticket?

Also, running prove -v xt/60_perlcritic.t shows just the one issue and I 
committed the change which reverses some fixes and the original return 
undef change in Util.pm.


not ok 38 - Test::Perl::Critic for ../blib/lib/Mail/SpamAssassin/Util.pm

#   Failed test 'Test::Perl::Critic for 
../blib/lib/Mail/SpamAssassin/Util.pm'

#   at /usr/local/lib/perl5/site_perl/5.8.6/Test/Perl/Critic.pm line 110.
#
# Perl::Critic found these violations in 
../blib/lib/Mail/SpamAssassin/Util.pm:

# return statement with explicit undef at line 287, column 5.

If there are other perl critic warnings, they are likely not enabled in 
that test.


Looking back, this return undef was added after 3.4.0's release and 
appears to have been added during the profiling work on 
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6854


Do you feel confident enough about your change that I should modify or 
notate the build process notes to ignore that issue for that specific 
return undef?



The MS::Util::untaint_var() is still broken...
I'm reverting some recent changes and reverting back to the one undef in 
there.  Sorry about that.  Best of intentions.

Btw, common use of 'return 0' in TxRep to indicate a boolean response
falls into the same category as 'return undef', although 
Test::Perl::Critic
does not check for that. 
I use no warnings 'uninitialized', 'once'; for a lot of my day to day 
work so I can see that but my key concern is passing the tests required 
of a release without trying to change those tests right now.


For a decade or so, we've been able to make releases that pass that 
60_perlcritic test and don't cause real-world issues.  I'm confident 
that we can do so at least in this case.  I want to get the release out 
by Jan 30.  Then I think you are correct we need to look at if this 
makes sense longer-term.


Regards,
KAM




Re: Uninitialized value in PerMsgStatus

2015-01-15 Thread Mark Martinec

:-)  Well I don't want to change the vetting process for a release and
xt/60_perlcritic.t has been used for years on the code base.
Suggestions what we can do to resolve the issue that also passes that
test so we don't have to go down that rabbithole?


There are some other perlcritic warnings about modifying $_ in list 
functions

(in sa-update, spamassassin, spamd).  Opening a PR would be warranted.

The MS::Util::untaint_var() is still broken...

Btw, common use of 'return 0' in TxRep to indicate a boolean response
falls into the same category as 'return undef', although 
Test::Perl::Critic

does not check for that.

  Mark



Re: Uninitialized value in PerMsgStatus

2015-01-15 Thread Mark Martinec
There are some other perlcritic warnings about modifying $_ in list 
functions

(in sa-update, spamassassin, spamd).  Opening a PR would be warranted.



What is a PR?  A bugzilla ticket?


Yes, a problem report.  Sorry for using terminology from another 
context.



If there are other perl critic warnings, they are likely not enabled
in that test.


Opened a Bug 7119.
Probably depends on a version of Perl::Critic.



# Perl::Critic found these violations in
../blib/lib/Mail/SpamAssassin/Util.pm:
# return statement with explicit undef at line 287, column 5.


Opened a Bug 7120. It can be used as a reference in a comment
when adding ## no critic annotations in code.



Looking back, this return undef was added after 3.4.0's release and
appears to have been added during the profiling work on
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6854


I probably fixed that potential or actual problem in passing.



The MS::Util::untaint_var() is still broken...

I'm reverting some recent changes and reverting back to the one undef


Thanks!


Do you feel confident enough about your change that I should modify or
notate the build process notes to ignore that issue for that specific
return undef?


For the 'return undef' in untaint_var() most definitely so.
That function can be used when assembling arguments in a
call to some other function, just as described in Bug 7120.

That statement deserves the  '## no critic'  annotation,
something like (untested) :

# see Bug 7120
return undef if !defined $_[0];  ## no critic 
(ProhibitExplicitReturnUndef)



For other cases I didn't check.



Btw, common use of 'return 0' in TxRep to indicate a boolean response
falls into the same category as 'return undef', although 
Test::Perl::Critic

does not check for that.


I use no warnings 'uninitialized', 'once'; for a lot of my day to day 
work

so I can see that but my key concern is passing the tests required of a
release without trying to change those tests right now.


Testing in a boolean context does not issue a warning when
an expression is undefined. There is no reason to return
a zero (instead of undef or an empty list) when a subroutine
result is considered a boolean value.

  Mark




Re: Uninitialized value in PerMsgStatus

2015-01-15 Thread Quanah Gibson-Mount
--On Thursday, January 15, 2015 11:48 AM -0500 Kevin A. McGrail 
kmcgr...@pccc.com wrote:



(in sa-update, spamassassin, spamd).  Opening a PR would be warranted.

What is a PR?  A bugzilla ticket?


PR is generally short for problem report.  So yes, I'd guess a bugzilla 
ticket.


--Quanah




--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.

Zimbra ::  the leader in open source messaging and collaboration


Re: Uninitialized value in PerMsgStatus

2015-01-14 Thread Mark Martinec

Yep that was done to pass the XT tests for a release.  Joe can you
look at those returns again?


That advice from perlcritic needs to be taken with a large grain of 
salt.

In the past I have been bitten by this several times. It is generally
safer to leave 'return undef' unless one carefully analyzes in what
context a subroutine is being called. I'd suggest to revert that
changeset entirely.

  Mark


Re: Uninitialized value in PerMsgStatus

2015-01-14 Thread Kevin A. McGrail

On 1/14/2015 3:11 PM, Mark Martinec wrote:

Yep that was done to pass the XT tests for a release.  Joe can you
look at those returns again?


That advice from perlcritic needs to be taken with a large grain of 
salt.

In the past I have been bitten by this several times. It is generally
safer to leave 'return undef' unless one carefully analyzes in what
context a subroutine is being called. I'd suggest to revert that
changeset entirely.
  Mark



If a subroutine is supposed to return exactly one scalar
and is using a return, that return must be a 'return undef'.

Consider the surprising behaviour:

perl -le '
  sub foo {$a=shift; return if !$a; sin($a)};
  ($x, $y, $z) = (foo(3), foo(0), 9);
  print x=$x, y=$y, z=$z'

x=0.141120008059867, y=9, z= 


:-)  Well I don't want to change the vetting process for a release and 
xt/60_perlcritic.t has been used for years on the code base. Suggestions 
what we can do to resolve the issue that also passes that test so we 
don't have to go down that rabbithole?


Regards,
KAM


Re: Uninitialized value in PerMsgStatus

2015-01-14 Thread Mark Martinec

Yep that was done to pass the XT tests for a release.  Joe can you
look at those returns again?


That advice from perlcritic needs to be taken with a large grain of 
salt.

In the past I have been bitten by this several times. It is generally
safer to leave 'return undef' unless one carefully analyzes in what
context a subroutine is being called. I'd suggest to revert that
changeset entirely.
  Mark



If a subroutine is supposed to return exactly one scalar
and is using a return, that return must be a 'return undef'.

Consider the surprising behaviour:

perl -le '
  sub foo {$a=shift; return if !$a; sin($a)};
  ($x, $y, $z) = (foo(3), foo(0), 9);
  print x=$x, y=$y, z=$z'

x=0.141120008059867, y=9, z=


  Mark


Re: Uninitialized value in PerMsgStatus

2015-01-14 Thread Kevin A. McGrail
Yep that was done to pass the XT tests for a release.  Joe can you look at 
those returns again?
Regards,
KAM

On January 14, 2015 1:35:20 PM EST, Mark Martinec mark.martinec...@ijs.si 
wrote:
2015-01-14 18:06, je Alex Regan napisal
 Hi,
 
 I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn
 spamassassin snapshot from today, and receive the following message:
 
 Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of
 uninitialized value in concatenation (.) or string at
 /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line
 3072, GEN19 line 16875.
 
 3071#loop through and limit to just the domain with a dummy
address
 3072for (my $i = 0; $i  scalar(@addrs); $i++) {
 3073  $addrs[$i] =
 'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]);
 3074}
 
 This has apparently been going on for some time, not just with
today's
 update. I don't see any references to others having this problem, so
 hoped someone could help.
 
 How can I troubleshoot this? Could it be an amavis issue?
 
 Thanks,
 Alex

Looks like someone just recently blindly changed 'return undef'
to a 'return', which breaks calls to such subroutine in a list
context, e.g. foo( untaint($a), bla,  untaint($b) )

   Mark


Uninitialized value in PerMsgStatus

2015-01-14 Thread Alex Regan

Hi,

I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn 
spamassassin snapshot from today, and receive the following message:


Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of 
uninitialized value in concatenation (.) or string at 
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 
3072, GEN19 line 16875.


3071#loop through and limit to just the domain with a dummy address
3072for (my $i = 0; $i  scalar(@addrs); $i++) {
3073  $addrs[$i] = 
'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]);

3074}

This has apparently been going on for some time, not just with today's 
update. I don't see any references to others having this problem, so 
hoped someone could help.


How can I troubleshoot this? Could it be an amavis issue?

Thanks,
Alex


Re: Uninitialized value in PerMsgStatus

2015-01-14 Thread Mark Martinec

2015-01-14 18:06, je Alex Regan napisal

Hi,

I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn
spamassassin snapshot from today, and receive the following message:

Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of
uninitialized value in concatenation (.) or string at
/usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line
3072, GEN19 line 16875.

3071#loop through and limit to just the domain with a dummy address
3072for (my $i = 0; $i  scalar(@addrs); $i++) {
3073  $addrs[$i] =
'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]);
3074}

This has apparently been going on for some time, not just with today's
update. I don't see any references to others having this problem, so
hoped someone could help.

How can I troubleshoot this? Could it be an amavis issue?

Thanks,
Alex


Looks like someone just recently blindly changed 'return undef'
to a 'return', which breaks calls to such subroutine in a list
context, e.g. foo( untaint($a), bla,  untaint($b) )

  Mark


[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2014-09-10 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054

Philip Prindeville phil...@redfish-solutions.com changed:

   What|Removed |Added

 CC||kmcgr...@pccc.com,
   ||mark.marti...@ijs.si

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2014-06-16 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054

Philip Prindeville phil...@redfish-solutions.com changed:

   What|Removed |Added

 CC||philipp@redfish-solutions.c
   ||om

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 7054] New: PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()

2014-06-16 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054

Bug ID: 7054
   Summary: PerMsgStatus could simplify plugins by caching A
record lookups for get_uri_detail_list()
   Product: Spamassassin
   Version: 3.4.0
  Hardware: PC
OS: Linux
Status: NEW
  Severity: normal
  Priority: P2
 Component: Libraries
  Assignee: dev@spamassassin.apache.org
  Reporter: phil...@redfish-solutions.com

There are several plugins which call PerMsgStatus::get_uri_detail_list() for
the {hosts} hash.

The complexity to do efficient asynchronous lookups of the A records is
substantial, and is best moved into the framework rather than being left to the
plugins themselves.

Either the {hosts} hash could be replaced to not map the key to the value, but
instead map the key to an array reference of corresponding A records, or else
an additional hash could be added to the PerMsgStatus class which maps hosts to
their A record(s).

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Bug 6915] PerMsgStatus::get_tag() enhancement and optimization

2013-03-07 Thread Kevin A. McGrail

On 3/6/2013 7:21 PM, bugzilla-dae...@issues.apache.org wrote:

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915

Mark Martinec mark.marti...@ijs.si changed:

What|Removed |Added

Target Milestone|Undefined   |3.4.0

--- Comment #2 from Mark Martinec mark.marti...@ijs.si ---
Out of curiosity, I did a benchmark between the old and the new get_tag
in a real-life environment (production mailer, SA called from amavisd,
time measured by Time::HiRes, perl 5.17.9).

Overall the difference is about 2 milliseconds per mail message scan
(there are about 30 calls to get_tag per scan) in favour of the new one.

The new get_tag is about twice as fast as the old one
(0.077 ms vs. 0.14 ms per one call).

Not that lot, but why not benefit if it comes for free...

Very nice!  Working on the build again now that this is committed.


[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization

2013-03-07 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915

Mark Martinec mark.marti...@ijs.si changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Mark Martinec mark.marti...@ijs.si ---
Closing, done.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 6915] New: PerMsgStatus::get_tag() enhancement and optimization

2013-03-06 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915

Bug ID: 6915
   Summary: PerMsgStatus::get_tag() enhancement and optimization
   Product: Spamassassin
   Version: SVN Trunk (Latest Devel Version)
  Hardware: PC
OS: All
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: Libraries
  Assignee: dev@spamassassin.apache.org
  Reporter: mark.marti...@ijs.si
Classification: Unclassified

1) The current get_tag method dynamically assembles the hash of anonymous
subroutines with every call.

This behaviour was introduced by Bug 2066 (Duncan Findlay, 2003)
in favour of a more self-contained code in _get_tag (avoiding a
global variable and avoiding passing of $pms object as a parameter),
but at the expense of dynamically constructing the hash of anonymous
subroutines with every call to get_tag, and depending on a closure
to access the $pms object.

As the number of tags has grown somewhat in the past 10 years,
and as the get_tag is called rather often, I'm suggesting essentially
a revert of Duncan's change.

My changes moves building of that hash of subs into a static BEGIN
phase as an optimization, but for this reason needs to pass an extra
argument (the $pms) to these subroutines. It only affects modules
PerMsgStatus and the Bayes plugin, public API remains unchanged.

2) Leverage the existing possibility of a tag value being an arrayref:
Introduces an additional (small) method get_tag_raw, which allows
the caller to receive the tag value as an array reference if it wishes
so, thus avoiding a need for splitting a string on a space for tags
which are actually lists, and makes it possible for a tag value
(as a list) to contain spaces without ambiguity. This will allow me
some future improvements in my AskDNS plugin.

3) A somewhat unrelated change in Message/Metadata.pm, replacing tags
LASTEXTERNALREVIP and FIRSTTRUSTEDREVIP with more universal lists:
RELAYSTRUSTEDREVIP, RELAYSUNTRUSTEDREVIP, RELAYSINTERNALREVIP,
and RELAYSEXTERNALREVIP. Neither the old nor the new tags are
currently in use by rukes, so this affects noone (except it enables
my future work on AskDNS plugin).

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization

2013-03-06 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915

--- Comment #1 from Mark Martinec mark.marti...@ijs.si ---
trunk:
  Sending lib/Mail/SpamAssassin/Message/Metadata.pm
  Sending lib/Mail/SpamAssassin/Message.pm
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
  Sending lib/Mail/SpamAssassin/Plugin/Bayes.pm
  Sending lib/Mail/SpamAssassin/Plugin/RelayCountry.pm
Committed revision 1453407.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization

2013-03-06 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915

Mark Martinec mark.marti...@ijs.si changed:

   What|Removed |Added

   Target Milestone|Undefined   |3.4.0

--- Comment #2 from Mark Martinec mark.marti...@ijs.si ---
Out of curiosity, I did a benchmark between the old and the new get_tag
in a real-life environment (production mailer, SA called from amavisd,
time measured by Time::HiRes, perl 5.17.9).

Overall the difference is about 2 milliseconds per mail message scan
(there are about 30 calls to get_tag per scan) in favour of the new one.

The new get_tag is about twice as fast as the old one
(0.077 ms vs. 0.14 ms per one call).

Not that lot, but why not benefit if it comes for free...

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation

2011-11-26 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707

Kevin A. McGrail kmcgr...@pccc.com changed:

   What|Removed |Added

 CC||kmcgr...@pccc.com

--- Comment #2 from Kevin A. McGrail kmcgr...@pccc.com 2011-11-26 13:44:05 
UTC ---
It's a step in the right direction.  Perhaps mentioning WHY this was done would
help.  I believe there was a potential DoS edge case this resolved and pointing
towards bug 5717 seems to agree with my memory.

Regards,
KAM

-- 
Configure bugmail: 
https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 6707] New: PerMsgStatus get_decoded_body_text_array() incorrect documentation

2011-11-25 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707

 Bug #: 6707
   Summary: PerMsgStatus get_decoded_body_text_array() incorrect
documentation
   Product: Spamassassin
   Version: 3.3 SVN branch
  Platform: All
OS/Version: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: Documentation
AssignedTo: dev@spamassassin.apache.org
ReportedBy: guent...@rudersport.de
Classification: Unclassified


The documentation for M::SA::PerMsgStatus get_decoded_body_text_array() is
incorrect. It claims to return an array of lines of the body.

What it actually returns is an array with chunks of all textual MIME-parts.

More precisely, it (used to) return an array of all textual MIME-parts, each
MIME-part in a string (member of the array) of its own. Since 3.3, long-ish
MIME-parts are broken up into chunks between 1024 and 2048 bytes.

See M::SA::Message get_decoded_body_text_array() for the actual code.

See bug 5717 for the split into chunks issue.


The directly related M::SA::Conf rawbody rule documentation has been changed
already, not to talk about lines, and that multiline RE modifiers might be
needed. Breaking up MIME-parts into chunks is not mentioned, though. Might be
worthwhile to add here, or reference the PerMsgStatus docs -- since REs don't
match, if the target sub-string happens to span a chunk boundary, even though
the message would match the RE.

-- 
Configure bugmail: 
https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation

2011-11-25 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707

--- Comment #1 from Karsten BrÀckelmann guent...@rudersport.de 2011-11-26 
00:47:09 UTC ---
Created attachment 5013
  -- https://issues.apache.org/SpamAssassin/attachment.cgi?id=5013
proposed patch

POD for PerMsgStatus get_decoded_body_text_array() fixed, to reflect the
current code.

Added a sentence to Conf rawbody POD, additionally clarifying what the RE will
be matched against.

Attaching here rather than committing right away for discussion. Feel free to
further clarify or adjust the POD changes.

-- 
Configure bugmail: 
https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.

[Bug 5948] New: uninitialized value within @pristine_headers PerMsgStatus. pm line 832

2008-07-26 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5948

   Summary: uninitialized value within @pristine_headers
PerMsgStatus.pm line 832
   Product: Spamassassin
   Version: 3.2.5
  Platform: Other
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: P5
 Component: spamassassin
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]


I noticed an error in /var/log/spamassassin/spamd.log --

Sat Jul 26 20:24:21 2008 [5009] warn: Use of uninitialized value within
@pristine_headers in pattern match (m//) at
/usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/PerMsgStatus.pm line 832,
GEN7 line 5.
Sat Jul 26 20:24:21 2008 [5009] warn: Use of uninitialized value $separator in
concatenation (.) or string at
/usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/PerMsgStatus.pm line 907,
GEN7 line 5.

Looking at the code, I think that

  my $separator = '';
  if ($pristine_headers[$#pristine_headers] =~ /^\s*$/) {
$separator = pop @pristine_headers;
  }

should read
  my $separator = '';
  if (@pristine_headers  $pristine_headers[$#pristine_headers] =~ /^\s*$/) {
$separator = pop @pristine_headers;
  }

in order to avoid setting #separator to undef if the pristine_headers array is
empty.
(This line of code is unchanged between 3.2.4 and 3.2.5, and I see the warnings
with both versions)


-- 
Configure bugmail: 
https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2007-04-16 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED




--- Additional Comments From [EMAIL PROTECTED]  2007-04-16 05:23 ---
assuming this is fixed; feel free to reopen if it still happens in 3.2.0 ;)



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4497] reorganise PerMsgStatus code

2006-12-04 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4497


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||WONTFIX




--- Additional Comments From [EMAIL PROTECTED]  2006-12-04 10:41 ---
marking this WONTFIX -- this work pretty much stalled and other stuff happened
instead.

this part of the idea is probably still worthwhile, though:

- rewrite . Mail::SpamAssassin::Plugin::Rewrite
  (a plugin that implements rewriting of mails, so that alternative
  rewriting methods can be used instead, optionally)


maybe someone will eventually have tuits for that ;)



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-07-12 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-07-12 15:26 ---
I think this may be fixed in svn trunk.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4973] get_tag and set_tag APIs missing from PerMsgStatus pod

2006-07-06 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4973





--- Additional Comments From [EMAIL PROTECTED]  2006-07-06 12:17 ---
Created an attachment (id=3569)
 -- (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3569action=view)
Trivial patch to fix the POD

I did submit an Apache CLA by snail-mail 12 months ago, I hope it's got there
now!



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4973] get_tag and set_tag APIs missing from PerMsgStatus pod

2006-07-06 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4973


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED
Summary|get_tag and set_tag APIs|get_tag and set_tag APIs
   |missing from PerMsgStatus   |missing from PerMsgStatus
   |pod |pod
   Target Milestone|Undefined   |3.1.4




--- Additional Comments From [EMAIL PROTECTED]  2006-07-06 18:42 ---
aha, interesting.  good catch! :)

I've added in the newlines and committed.

3.1:
Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data .
Committed revision 419648.

3.2:
Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data .
Committed revision 419649.


FWIW: I searched for other similar POD issues but didn't see any:
find . -name *.pm | grep -v '\.svn' | xargs grep -B1 =item | grep '#'



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-25 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-26 00:55 ---
I'm not completely clear about how often, or under what exact circumstances and
config, you're seeing this but you're probably seeing bug 4179.

The last time I tested bug 4179 out, though, two users had to be using the same
rule name(s) to see any wierd config issues. 



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-24 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-24 18:36 ---
My suspicion is that spamd is not clearing the config completely between users
when allow_user_rules is set, and the rules, but not the scores, are surviving
between users. (I use my own account as a testbed for rules before deploying
them system-wide for the handful of other users on the server.)

I'm finding it tough to find the code that dumps the old config after 
processing.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-23 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 13:29 ---
Stock RPM built from the tarball. No special plugins. A couple of custom system
rules for the float obfuscation spam but they have scores.

What's the Perl syntax for sending a line to syslog if $score is undefined?
Given that I can sprinkle some printfs through that path to trace where this
is coming from. My sample triggers the problem solidly on my setup.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-23 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 14:20 ---
if (!defined $score) { warn HELP undef score; }

that's what you're after.  curious!



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-23 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 18:40 ---
Ok, here's a clue. In /etc/mail/spamassassin/locals.cf I have allow_user_rules
1. In my userprefs I have this rule:

header__MIME_VERSION   exists:MIME-Version
header__SARE_HEAD_MIME_VALID   Mime-Version =~ m'^\s*1.0\b'
meta  SARE_HEAD_MIME_INVALID   !__SARE_HEAD_MIME_VALID  __MIME_VERSION
describe  SARE_HEAD_MIME_INVALID   Invalid mime version
score SARE_HEAD_MIME_INVALID   1.666

If I run spamc on a test message (not the one above, which isn't triggering
today), I get the error running as a different user (who has no custom rules)
but I don't get it running it from my account. The debug printf indicates this
rule as lacking the score.

if (!defined $score) { warn HELP undef score for rule $rule; }


Theory: Does spamd handle user rules inconsistently, loading other users'
personal rules but not scores? How are per-user rules handled by the daemon?



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-23 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 18:44 ---
Correction, the original test case works, just not when I test as root. Testing
as a mortal other than me elicits the bug.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] New: Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-22 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906

   Summary: Uninitialized $score in PerMsgStatus::_handle_hit
   Product: Spamassassin
   Version: 3.1.1
  Platform: Other
OS/Version: other
Status: NEW
  Severity: normal
  Priority: P5
 Component: spamc/spamd
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]


I'm seeing several uninitialized value errors in maillog which seem to center on
$score not getting set for some rules. This happens when some messages are
passed through spamc/spamd, but no error is reported when the message is passed
through spamassassin. I don't see it on my own account with a non-empty
user_prefs but I do see it with other users with the default (comments-only)
user_prefs, and I see it on mine if I rename mine to user_prefs.disabled. I'll
attach a sample message that triggers it.

Here's the list of errors I see when this happens:

Use of uninitialized value in addition (+) at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2708.
Use of uninitialized value in numeric ge (=) at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2713.
Use of uninitialized value in numeric le (=) at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2713.
Use of uninitialized value in sprintf at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2717.
Use of uninitialized value in numeric eq (==) at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/Plugin/AWL.pm line 341.
Use of uninitialized value in numeric eq (==) at
/usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 432.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-22 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-22 23:06 ---
Created an attachment (id=3519)
 -- (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3519action=view)
Sample spam for triggering the error message




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-22 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-22 23:13 ---
This is one of those can't ever happen (without something crazy going on) 
issues.  But it looks like this is 
probably the same as bug 4699.  Can you try out the current test 3.1.2 or the 
patch as found in that 
ticket?



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-22 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 00:54 ---
No joy. I applied the patch to Timeout.pm and spamd and restarted the daemon and
still get the same error messages.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit

2006-05-22 Thread bugzilla-daemon
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906





--- Additional Comments From [EMAIL PROTECTED]  2006-05-23 01:41 ---
(In reply to comment #3)
 No joy. I applied the patch to Timeout.pm and spamd and restarted the daemon 
 and
 still get the same error messages.

hrm.  can you describe your SA setup a bit more?  are there any non-standard 
patches applied?  non-
standard plugins?

using the standard code, there's no way for a rule to get in without a score.  
the copy_config issue can 
cause it because it corrupts the config.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4497] reorganise PerMsgStatus code

2005-10-17 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





--- Additional Comments From [EMAIL PROTECTED]  2005-10-17 16:36 ---
Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so
there's little point for a separate Mail::SpamAssassin::Message:Rendered.




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4497] reorganise PerMsgStatus code

2005-10-17 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





--- Additional Comments From [EMAIL PROTECTED]  2005-10-17 16:54 ---
Subject: Re:  reorganise PerMsgStatus code

On Mon, Oct 17, 2005 at 04:36:06PM -0700, [EMAIL PROTECTED] wrote:
 Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so
 there's little point for a separate Mail::SpamAssassin::Message:Rendered.

FYI, I will almost definitely -1 any such modification to the Message or
Message::Node code unless there is some huge absolute need to have it.
That code is designed to very specifically have no dependence, relation,
or access to the rest of the SA code.





--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4497] reorganise PerMsgStatus code

2005-10-17 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





--- Additional Comments From [EMAIL PROTECTED]  2005-10-17 16:59 ---
There is a work-in-progress patch attached to bug 4636.  It would be better to
discuss the issue in that bug.




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


Re: PerMsgStatus

2005-07-29 Thread Loren Wilton
  a) what the heck are priorities, who sets them, and do they really have
any
  justifiable purpose?  Ie: can they just quietly vanish into the night
with
  nobody being any the wiser?

 They order the rules -- or more correctly, sets of rules.

 Most rules are priority 500 (iirc), but some need to run earlier and some
 need to run later (e.g. AWL needs to run after all other rules).  Running
 rules earlier is how we propose to implement early-exit -- certain rules
 can run before all others, and cause an early-exit if they fire.

 They cannot just vanish. ;)

Let me challenge or at least prod around the edges of this a bit to further
my understanding.

I think what you are saying is that priority is used (at least in part) to
do the ordering that is known or believed to be required.  However, there
seems to be some ordering built into PMS itself, such as firing the net
rules first and then harvesting the results later.

That makes me believe that we probably have two methods of the same thing:
some rules are ordered because the pms code is written to do them in a given
order, and some rules are ordered because someone assigned a priority
somewhere.

I guess I'm mostly wondering if 'priority' as a number (at least one with a
seemingly rather fine granularity) is necessarily the way to do this.  It is
certainly general.  But I'm wondering if this is over-general, and can end
up forcing a rule ordering algorithm to make potentially bad ordering
decisions.

Might it be reasonable to do the enforced ordering based on a small set of
known rule types, and just flag the unusual rules of each special type?  The
unusual rules that come to mind just at the instant are net, bayes, and awl.
Maybe there are more, but I can't think what they would be at the moment.

While it could be argued that an enumeration is just a form of priority that
doesn't use numbers, it seems to me to have an advantage - you can change
the order that you look at the enumerated values without having to change
the values themselves.  Also, it would prevent assigning 'useless'
classifications such as priorities of 501, 502 and 503 to three user rules.

An example of why I think an enumeration might be better:  Right now all net
rules are started first, since they take longest.  But suppose we have a
rule that will score -100 and the total positive score, including net rules,
is only 100.  Clearly it makes more sense to evaluate that single -100 rule
before firing any of the net rules -- if the -100 rule triggers the net rule
scores are moot, and we have wasted significant system resources.  Doing
that with priorities would be awkward.

So: do we *really* need the existing priority structure?  Or do we just need
a method of identifying a very small number of rule classifications, eg:
bayes, awl, net?  (Anything else?)

Loren




Re: PerMsgStatus

2005-07-29 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Loren Wilton writes:
 Let me challenge or at least prod around the edges of this a bit to further
 my understanding.
 
 I think what you are saying is that priority is used (at least in part) to
 do the ordering that is known or believed to be required.  However, there
 seems to be some ordering built into PMS itself, such as firing the net
 rules first and then harvesting the results later.
 
 That makes me believe that we probably have two methods of the same thing:
 some rules are ordered because the pms code is written to do them in a given
 order, and some rules are ordered because someone assigned a priority
 somewhere.
 
 I guess I'm mostly wondering if 'priority' as a number (at least one with a
 seemingly rather fine granularity) is necessarily the way to do this.  It is
 certainly general.  But I'm wondering if this is over-general, and can end
 up forcing a rule ordering algorithm to make potentially bad ordering
 decisions.
 
 Might it be reasonable to do the enforced ordering based on a small set of
 known rule types, and just flag the unusual rules of each special type?  The
 unusual rules that come to mind just at the instant are net, bayes, and awl.
 Maybe there are more, but I can't think what they would be at the moment.
 
 While it could be argued that an enumeration is just a form of priority that
 doesn't use numbers, it seems to me to have an advantage - you can change
 the order that you look at the enumerated values without having to change
 the values themselves.  Also, it would prevent assigning 'useless'
 classifications such as priorities of 501, 502 and 503 to three user rules.
 
 An example of why I think an enumeration might be better:  Right now all net
 rules are started first, since they take longest.  But suppose we have a
 rule that will score -100 and the total positive score, including net rules,
 is only 100.  Clearly it makes more sense to evaluate that single -100 rule
 before firing any of the net rules -- if the -100 rule triggers the net rule
 scores are moot, and we have wasted significant system resources.  Doing
 that with priorities would be awkward.

Why?

header BIG_NEGATIVE_RULE Foo =~ /bar/
score BIG_NEGATIVE_RULE  -999
priority BIG_NEGATIVE_RULE  10  # ensure it runs first

seems pretty simple and non-awkward imo ;)

This is certainly the main form of early-exiting that I'd want to
see -- a specific few tests (like USER_IN_WHITELIST) that run first
(using priority) and cause early exit.

The way priority works is that there's only 1 priority defined in the
config files -- AWL at pri 1000.  everything else is at the same priority,
except for meta tests at META_TEST_MIN_PRIORITY and DNSBL harvesting which
only happens after HARVEST_DNSBL_PRIORITY.

We've been trying to reduce the number of wierd effects in the rule
ordering and selection code, for example the fact that DNS tests are run
first and then harvested later -- that *should* be implemented as some
kind of general mechanism, instead of being keyed off the rule type,
because doing it by rule type means that there'll be no way for plugins to
support similar models for future unforeseen rule types without changes to
PerMsgStatus and/or Conf.   Also, AWL was previously not even a real rule
- -- it was just hard-coded into PerMsgStatus.  having priority let us fix
both of these (although the DNS stuff is still only half-implemented ;).

With a very small number of priorities, and the vast majority of rules
being in one priority bucket, you can then optimise the rule selection
in that bucket and have more or less the same effect without too much
wierd stuff like side-effects of rule names, or rule types.

btw, yes, short-circuiting breaks Bayes autolearning and the AWL. this is
one of its downsides.  However, by far the biggest downside is that we
have never found a way to do automatic short-circuiting that didn't turn
out at about the same speed as not short-circuiting, when measured against
a corpus of mail that includes spam. ;)

You should read the discussion that went on when priority was implemented
last year: http://bugzilla.spamassassin.org/show_bug.cgi?id=2912 .  Much
of this was discussed then. 

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC6mXUMJF5cimLx9ARAqaVAJ0f4WfH08qteaVQyTefK8ivIqeBNQCfbSIq
ruV89/Z0Eq2xG1vITm0s+qw=
=txDh
-END PGP SIGNATURE-



Re: PerMsgStatus

2005-07-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Loren Wilton writes:
 I just spent 45 minutes or so staring at the PerMsgStatus code and figuring
 out a bit more about how it works.  Baroque!  Still, there is the basis of a
 concept underlying the implementation, and it doesn't *look* like it would
 be all that hard to flop things around to work more the way I think they
 should.
 
 It looks like the main things that aren't obvious and I'll need to figure
 out something about are:
 
 a) what the heck are priorities, who sets them, and do they really have any
 justifiable purpose?  Ie: can they just quietly vanish into the night with
 nobody being any the wiser?

They order the rules -- or more correctly, sets of rules.

Most rules are priority 500 (iirc), but some need to run earlier and some
need to run later (e.g. AWL needs to run after all other rules).  Running
rules earlier is how we propose to implement early-exit -- certain rules
can run before all others, and cause an early-exit if they fire.

They cannot just vanish. ;)

 b) why were tests broken out into groups by test type and all the tests of a
 given type run at once?  My best guess was an attempt at efficiency based on
 assumptions about data set size and cache threashing.  Is there a known
 reason that it has to be this way, or would it work just as well to just run
 tests in 'whatever' order?

two reasons:

1. reducing the number of items in a hash is good for efficiency, as it
reduces hash collisions.

2. running all tests of a certain type in one block allows some
optimisations; e.g. for the body rules, we can iterate through all lines
in the body, and for each line, call all of the active body rules of that
priority level one after another.  (I'm not sure if we still do this
or not btw.)

it may work better to run in whatever order -- benchmarks are the one
true authority here ;)

 c) are there known ways in Perl to actually dispose of memory items and have
 them really return memory to the available pool, or do you just hope that
 exiting scope and garbage collection may eventually do the job for you?

  {
my $obj = [ ...something...];
  } # $obj has gone out of scope.  GC happens now and $obj is deleted

in other words, once it goes out of scope, it is immediately GC'd.  it's
not like java, where it may be gc'd if you're lucky, the moon is full,
and you call System.gc() three times in a row.  (java programmers will
know I'm not joking about that ;)

If it's a member of a hash like $self, delete $self-{variable} is
how you force it to be deleted.  If something else has a ref to it,
it won't be deleted, of course -- everything's ref counted.

 d) can you build an array/list/hash/whatever of procedure names/pointers and
 efficiently iterate over the structure calling the procedures in sequence?
 Will this be slower than generating an eval containing a bunch of lines
 calling the same procedures in sequence?

You can indeed --

foreach my $fn_ref (@array) {
  $fn_ref-(...arguments...);
}

or even

map { $_-(arguments...); } @array;

But -- the bad news -- it will almost definitely be slower.
The only way to find out is with benchmarks.  the 'Benchmark'
CPAN module can be very useful to measure that stuff.

 Do you have any insightful (or alternately: quick) answers to the above
 questions?  I have a feeling that while I could make some deductions on the
 first two questions from tracking stuff into other modules, the *real*
 reasons are probably lost or stored in the group arcana of the dev's minds.
 
 It seems to me as a first whack, there isn't any huge reason that the rules
 couldn't be looked at en masse and a quick dependency tree built, then the
 results sorted in some convenient score-and-whatnot-based order, and then,
 instead of half a dozen essentially identical rule building procedures that
 exist now, just have one procedure that will make the test and calling
 procedures come into existance.

One thing -- watch out for $score == 0.  If a score is set to 0, the
evaluation of the rule's code (be that an eval test, a regexp or whatever)
should not happen; and rules can have their scores set to 0 in user prefs,
so assuming that because a rule is 0 in the system-wide config, it'll
never be run from then on, is not a safe assumption.

 I'm not even sure that you really need to pass much more than @self to the
 procedures, and let them find the data they want to play with as member
 variables on @self with known names.

That's entirely true.

 (Although maybe Perl requires more
 parameters, I still don't understand things like @_ and the like.)

@_ is the parameter list.   btw accessing parameters passed to a function
directly as stuff in @_ is faster than assigning variable names to
them, in other words

  sub myfunction { 
return $_[0] + $_[1];
  }

is faster than

  sub myfunction { 
my ($foo,$bar) = @_;
return $foo + $bar;
  }

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU

Re: PerMsgStatus

2005-07-28 Thread Matt Sergeant

[Lots of stuff snipped]

You know, it'd be nice if Daniel, or anyone else, checked in my 
optimised PMS.pm [*] in as a branch. That way it can be worked on 
easily by multiple people. An optimisation branch would mean you can 
continue with the current release work, while others work on 
performance.


Matt.

[*] I changed a few parts about how rules get compiled while at CEAS 
that should have been faster, only benchmarks showed no visible sign of 
improvements. I'd be happy to send it to anyone interested.



__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
__


Re: [Bug 4497] New: reorganise PerMsgStatus code

2005-07-24 Thread Daniel Quinlan
Loren Wilton [EMAIL PROTECTED] writes:

 Is there a way that [user rules] could be saved in a more-compiled
 state when used with spamd and similar?  Maybe name the rules with the
 username as part of the procedure name, and just add them to the
 namespace the first time encountered?

Beyond the memory bloating this could potentially cause and the
potentially more complicated security aspects, it's important to
optimize for the common case.  I've been thinking we might be able to
move all user rules into their own priority and then *only* those would
be slower and overall performance would be as good as reasonably
possible.  Also, when there are no user rules, the user rule priority
could just be skipped for good performance.

In the long run, it might even be possible to give user rules priorities
based on the user name (non-numeric ones) so they could be cached.

So, are any user rule sets using priorities?

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/


Re: [Bug 4497] New: reorganise PerMsgStatus code

2005-07-24 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Daniel Quinlan writes:
 Loren Wilton [EMAIL PROTECTED] writes:
 
  Is there a way that [user rules] could be saved in a more-compiled
  state when used with spamd and similar?  Maybe name the rules with the
  username as part of the procedure name, and just add them to the
  namespace the first time encountered?
 
 Beyond the memory bloating this could potentially cause and the
 potentially more complicated security aspects, it's important to
 optimize for the common case.  I've been thinking we might be able to
 move all user rules into their own priority and then *only* those would
 be slower and overall performance would be as good as reasonably
 possible.  Also, when there are no user rules, the user rule priority
 could just be skipped for good performance.
 
 In the long run, it might even be possible to give user rules priorities
 based on the user name (non-numeric ones) so they could be cached.

I don't get it -- how does that help?

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC5BJAMJF5cimLx9ARAtu/AKCaOu6/yYj46MxHwXiwiEfU54JdPwCfQhSM
6fR7KmFEJfaCgvZ/B3NJTzw=
=0/Gr
-END PGP SIGNATURE-



Re: [Bug 4497] New: reorganise PerMsgStatus code

2005-07-24 Thread Daniel Quinlan
Justin Mason [EMAIL PROTECTED] writes:

 I don't get it -- how does that help?

Basically, user rule would always run last.  Something like:

 for (@priorities) {
standard prorities loop
 }
 if ($allow_user_rules  $defined_rules{$user}) {
do_xxx_tests($user, ...);  # $user is the priority
do_yyy_tests($user, ...);
 }

Precompiling could be done by an early Plugin that runs an init message
on each user that has configuration.  The do_xxx_tests routines should
not care about whether you are doing user rules, ideally.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/


[Bug 4497] New: reorganise PerMsgStatus code

2005-07-23 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497

   Summary: reorganise PerMsgStatus code
   Product: Spamassassin
   Version: SVN Trunk (Latest Devel Version)
  Platform: Other
OS/Version: other
Status: NEW
  Severity: enhancement
  Priority: P5
 Component: Libraries
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]


talking about this at the hackathon and there's pretty unanimous agreement that
PerMsgStatus does too many different things.

So here are the conceptual blocks of functionality in PerMsgStatus, and the
proposed new classes to implement it:

- context of scan --\
  results of scan  Mail::SpamAssassin::PerMsgStatus
  (keep this where it is, badly named or not.)

- logic for check() .. Mail::SpamAssassin::Plugin::Check.
  (a new plugin to implement check logic, so that it can be swapped
  out for alternative check prioritisation implementations, short-circuiting
  implementations, etc.)

- body formats / rendering -\
- URI list  Mail::SpamAssassin::Message::Rendered

  (a new object to hold rendered state off Message.  *NOT* Message itself, as
  this object has a ptr to {main} and {conf}, so can use configuration data
  during its work; it's also deleted when Maill::SpamAssassin::PerMsgStatus
  is deleted.)

- logic for each rule type --\
- rule compilation  Mail::SpamAssassin::Scanner
- config parsing for rules --/

- rewrite . Mail::SpamAssassin::Plugin::Rewrite
  (a plugin that implements rewriting of mails, so that alternative
  rewriting methods can be used instead, optionally)

Regarding backwards compatibility -- all public APIs remain compatible.
This is essential (but easily done).

Every public member variable remains on PerMsgStatus; every public method keeps
a public facade method on PerMsgStatus that just calls into the real
implementation code.

this is post-3.1.0, just wanted to get this into bits and off the whiteboard...



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


Re: [Bug 4497] New: reorganise PerMsgStatus code

2005-07-23 Thread Loren Wilton
I know user rules aren't real popular with the sa dev community, however
that attitude isn't universally shared by sa users.  Therefore may I
suggest:

Would it be possible when reorganizing things to come up with some
semi-persistant storage for compiled user rules, so that they don't have to
be rewritten and recompiled in a bunch of evals for every message processed
for the user?

At the very least it should be moderately trivial to save the text passed to
the evals into a file someplace, and then just eval that one file to get
everything compiled.  Timestamps would indicate if the pre-built stuff was
out of date or not.

Is there a way that they could be saved in a more-compiled state when used
with spamd and similar?  Maybe name the rules with the username as part of
the procedure name, and just add them to the namespace the first time
encountered?

If there were a way to make user rules be a subclass of the main rules, then
you could just call the user rules for each user, and any user rules (or I
presume scores) would override the standard rules, and anything that didn't
override would pass through to the main rules.

Of course this assumes the rules was a class instance of some sort of
other, and you could call ruleclass-dorules(message) or the like (however
that C++ invocation would be spelled in Perl).

Having all user rules remain in memory forever would certainly add some to
the bloat of SA memory usage. However, if one assumes most users would have
few rules compared to the number of overall system rules, then the increase
might be reasonable; especially since you would be trading off that memory
for the compile and build overhead on every message.

Loren

BTW, if the rules were in a class, then it would be possible to create the
instance of that class at startup in just about exactly the same way user
rule subclasses could be created.  Then a hup, instead of taking spamd and
children down, would just cause the children to discard the current rules
instance and fabricate a new instance and continue onward.



[Bug 4411] $permsgstatus-get_uri_list too aggressive?

2005-06-22 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED




--- Additional Comments From [EMAIL PROTECTED]  2005-06-22 09:42 ---
I thought I'd changed this yesterday bug bugzilla doesn't show the mods. I'm
changing the status to 'fixed' since it's been incorporated into 3.1. However,
this doesn't really help me any since I'm not going to make any dependancies on
3.1 until 3.1.1 comes out probably (give peole time to move).

On a related issue, however, I'm wondering what kind of forward strategy there
is for unencoded i18n domain names. Right now, a raw IDN like
http://www.bravÄ.nu causes some breakage. URIDNSBL ignores these entirely, and I
also need to add similar code that ignores URIs with 8-bit characters. But at
some point, it would probably be a good idea for get_uri_list to provide these
in two forms--once in the original unencoded form, and also in the normalized
(xn--) format. Is this work in 3.1 also, or is this something that hasn't been
dealt with yet?



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4411] New: $permsgstatus-get_uri_list too aggressive?

2005-06-20 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411

   Summary: $permsgstatus-get_uri_list too aggressive?
   Product: Spamassassin
   Version: unspecified
  Platform: Other
OS/Version: other
Status: NEW
  Severity: enhancement
  Priority: P4
 Component: Plugins
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]


I'm adding support for URI processing to my LDAPfilter plugin, and have noticed
that a URI of http://username:[EMAIL PROTECTED]:nn/ will get parsed and
returned as all of the following variations:

  mailto:username:[EMAIL PROTECTED]

  http://username:[EMAIL PROTECTED]:nn/

  http://www.example.com:nn/

The mailto and the last http URIs should not be returned since they are not
accurate representations of the input URI.

I can kind of see why you'd want to return something like the last URI for your
reduced view mechanism, but such a view would theoretically be provided without
the port number, so I'm not sure if that's the intention or not.

If you need/want to keep all of these in the outputs, can we get a separate
array or parameter that just returns the normalized URIs?

Thanks



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4411] $permsgstatus-get_uri_list too aggressive?

2005-06-20 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411





--- Additional Comments From [EMAIL PROTECTED]  2005-06-20 12:51 ---
Subject: Re:   New: $permsgstatus-get_uri_list too aggressive?

On Mon, Jun 20, 2005 at 12:43:28PM -0700, [EMAIL PROTECTED] wrote:
 that a URI of http://username:[EMAIL PROTECTED]:nn/ will get parsed and
 
   mailto:username:[EMAIL PROTECTED]
   http://username:[EMAIL PROTECTED]:nn/
   http://www.example.com:nn/
 
 The mailto and the last http URIs should not be returned since they are not
 accurate representations of the input URI.

The last one is.  It's the actual URI without the login information, which is
exactly what we want.  The mailto is incorrect.

 I can kind of see why you'd want to return something like the last URI for 
 your
 reduced view mechanism, but such a view would theoretically be provided 
 without
 the port number, so I'm not sure if that's the intention or not.

The idea is to remove the obfuscation of URIs so the rules don't have to
account for everything like username and passwords.  We do want to catch port
numbers.

 If you need/want to keep all of these in the outputs, can we get a separate
 array or parameter that just returns the normalized URIs?

This is possible in 3.1: don't call get_uri_list, call get_uri_detail_list
which you can see the raw URIs as well as the canonified version
separately.





--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4218] PerMsgStatus $status-finish() not getting registered

2005-04-12 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218


[EMAIL PROTECTED] changed:

   What|Removed |Added

 AssignedTo|dev@spamassassin.apache.org |[EMAIL PROTECTED]
 Status|REOPENED|ASSIGNED




--- Additional Comments From [EMAIL PROTECTED]  2005-04-11 19:57 ---
Created an attachment (id=2783)
 -- (http://bugzilla.spamassassin.org/attachment.cgi?id=2783action=view)
fix 

argh, this looked easy and then wasn't ;)

sure enough, there was a bug.  Basically, a temporary PerMsgStatus object is
created *very* early on -- in lib/Mail/SpamAssassin/Message/Metadata/Received
-- before the plugin was loaded, resulting in the 'per_msg_finish' callbacks
being cached without the plugin's method in the list.

fix was to get that object created *once* in Mail::SpamAssassin -- not every
time a message is parsed.  (it's much more efficient this way anyway.)

long-term fix would be to get some of the DNS-lookup code out of Dns.pm, since
having it there means it unnecessarily relies on a PerMsgStatus object being in
existence... which is really not necessary for lookups in the Received header
parsing code.

couple of other minor fixes: added some doco to Plugin::per_msg_finish() to
note that only circular refs need to be cleaned up normally. and added a test
to the test suite.




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4218] PerMsgStatus $status-finish() not getting registered

2005-04-06 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|WORKSFORME  |




--- Additional Comments From [EMAIL PROTECTED]  2005-04-06 05:48 ---
doesn't work for me. perhaps there is a version comment that would be
appropriate. perhaps the docs are wrong, and the bug needs to be changed.

?




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 3885] Undefined subroutine Mail::SpamAssassin::PerMsgStatus::is_razor2_available

2005-03-30 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=3885





--- Additional Comments From [EMAIL PROTECTED]  2005-03-30 06:42 ---
Subject: Re:  Undefined subroutine 
Mail::SpamAssassin::PerMsgStatus::is_razor2_available

[EMAIL PROTECTED] writes:
 http://bugzilla.spamassassin.org/show_bug.cgi?id=3885
 
 
 
 
 
 --- Additional Comments From [EMAIL PROTECTED]  2005-03-29 18:54 ---
 Now that the reporting stuff is all in the plugin, can I assume this is no
 longer an issue?
 

Correct.  This can be closed.





--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 3885] Undefined subroutine Mail::SpamAssassin::PerMsgStatus::is_razor2_available

2005-03-30 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=3885


[EMAIL PROTECTED] changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED




--- Additional Comments From [EMAIL PROTECTED]  2005-03-30 07:09 ---
Yep, it's fixed.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[Bug 4218] New: PerMsgStatus $status-finish() not getting registered

2005-03-22 Thread bugzilla-daemon
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218

   Summary: PerMsgStatus $status-finish() not getting registered
   Product: Spamassassin
   Version: 3.0.2
  Platform: All
OS/Version: Linux
Status: NEW
  Severity: minor
  Priority: P3
 Component: Rules (Eval Tests)
AssignedTo: dev@spamassassin.apache.org
ReportedBy: [EMAIL PROTECTED]


http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_PerMsgStatus.html
says that $status-finish() should get called when the PerMsgStatus object is
destroyed. However, the finish() function in my plug-in is not getting
registered or called.



--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


speedup for PerMsgStatus

2004-09-28 Thread Justin Mason
OK, here's a trick I was thinking about. Currently we have these massive
hashtable refs:

$pms-{conf}-{rbl_evals}
  {head_tests}
  {body_tests}
  
  {scoreset}-[0,1,2,3]
  {tflags}

Each of those is keyed by the name of the rule.

Now the thing is, this is really wasteful - speed-wise (not really
RAM-wise) -- just performing all those hash lookups!   When a message is
scanned, each of the _evals and _tests hashes are iterated over,
extracting the rule name and rule text for every entry. In reality, we
only need the rule text at this point, *not* the name.

  - We have about 700 rules

  - 99% of the time, any given rule will NOT fire, so we should speedup:

foreach my $rulepat (@{all_rules_of_given_type}) {
  ...
  if ($whatever =~ /$rulepat/) {
# hit!
  }
  # otherwise miss!
}

we should speedup the 'foreach', the rule-text fetch, and the 'miss'.
note that we don't need to know the rule name until the rule gives
us a hit!

so I'm thinking that we should replace parts of this with arrays, using
integer indexes, instead of hashes with string indexes.

Array lookups are quite a bit faster than hash lookups.

Each array would have RAM usage of -- guessing -- (size_of_whats_stored +
9100) bytes, since arrays in perl have an overhead of about 13 bytes per
entry.  (this is about the same as hashes iirc, poss a bit less.  not sure
if there'd be RAM savings there, since perl hash keys are refcounted
shared strings iirc.)

we can optimize for the rules that are loaded from the system-wide config,
because (a) allow_user_rules is almost always off, and (b) even if it's
on, I'd guess that most times 99% of the rules that a scan runs would be
system-wide rules anyway.   (we can deal with user-rules by just pushing
them onto the rules array when they're defined, same as the system rules
are done.)

--j.


Re: speedup for PerMsgStatus

2004-09-28 Thread Daniel Quinlan
[EMAIL PROTECTED] (Justin Mason) writes:

 Array lookups are quite a bit faster than hash lookups.

+1

-- 
Daniel Quinlan ApacheCon! 13-17 November (3 SpamAssassin
http://www.pathname.com/~quinlan/  http://www.apachecon.com/  sessions  more)


Re: speedup for PerMsgStatus

2004-09-28 Thread Loren Wilton
 so I'm thinking that we should replace parts of this with arrays, using
 integer indexes, instead of hashes with string indexes.

 Array lookups are quite a bit faster than hash lookups.

I have no idea how painful linked lists are in Perl (or if they even exist).
But if you are essentially iterating over all rules, they could have some
distinct advantages for possible future optimizations.

For instance: just link user rules onto the end of the last system rule.

For instance: reorder rules with a trivial LRU or counted LRU algorithm to
put the most-hit rules first, by dynamic experience.  Then stop processing
rules if score total  x.  (Insert obvious arguments about negative-scoring
rules here.)

LRUs are really cheap in most languages, and if you can do them in Perl this
could be a huge advantage when coupled with an early bailout algorithm.

Loren



Re: speedup for PerMsgStatus

2004-09-28 Thread Loren Wilton
  I have no idea how painful linked lists are in Perl (or if they even
exist).

 Why are you commenting then???

Because they are very useful, as I pointed out.


 They don't exist as a native data structure.  Arrays are fast, painless,
 and dynamically sized.

They don't exist as a native data structure in C++ either.  But they get a
lot of use. Even when template classes exist to do reasonably fast and
reasonably painless dynamic arrays.  For certain things (like collections of
objects that can get reordered frequently) they are generally more efficient
than dynamic arrays.

If there is an SA coding requirement for only using native data structures,
then forget lists.  If no such requirement exists and there is an interest
in optimizing performance, then they should be a tool to be considered.

Loren



Re: speedup for PerMsgStatus

2004-09-28 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Loren Wilton writes:
   I have no idea how painful linked lists are in Perl (or if they even
 exist).
 
  Why are you commenting then???
 
 Because they are very useful, as I pointed out.
 
  They don't exist as a native data structure.  Arrays are fast, painless,
  and dynamically sized.
 
 They don't exist as a native data structure in C++ either.  But they get a
 lot of use. Even when template classes exist to do reasonably fast and
 reasonably painless dynamic arrays.  For certain things (like collections of
 objects that can get reordered frequently) they are generally more efficient
 than dynamic arrays.
 
 If there is an SA coding requirement for only using native data structures,
 then forget lists.  If no such requirement exists and there is an interest
 in optimizing performance, then they should be a tool to be considered.

Unfortunately, perl speed optimisation doesn't work like that.
The reason is that perl native data structures (arrays, hashes, strings,
numeric SVs, etc.) can be looked up in one perl OP, but a user-defined
data structure cannot.

The OP is the lowest level command in the perl VM, equivalent to an
assembly opcode, and as such is very very fast -- since the innards of an
OP is pure C.   That's why regexp matching in perl is as fast as it
is in C -- because a regexp match is compiled to a single OP.

(Perl's not like Java in that respect.  Perl's vm has quite high-level
opcodes, whereas java's is more like real assembly and more low-level.
that's why perl is faster than java ;)

Unfortunately when reading fields in a perl data structure like a hash or
array, and traversing reference chains, each variable access, and
ref derefence, is an individual OP.

So the upshot is that using a native perl data type will always be
faster than defining a new non-native data type structure in perl.

cf. http://www.ccl4.org/~nick/P/Fast_Enough/#ops_are_bad,_m%27kay
for more details...   in fact, I'm even considering looking into some
use of pack() here for the very reasons noted here ;)

(ps.  I'm sure if I got any of that wrong Matt will correct me ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBWQRrQTcbUG5Y7woRAvGHAJwOAxmPKpX09LoiZBCsYypL5UzA2ACgvbTm
6uB3igI7ObXF+vn+jeOmN98=
=cQEI
-END PGP SIGNATURE-



Re: speedup for PerMsgStatus

2004-09-28 Thread [EMAIL PROTECTED]
That is an unreasonablly nasty retort.

Original Message:
-
From: Daniel Quinlan [EMAIL PROTECTED]
Date: 27 Sep 2004 22:53:38 -0700
To: [EMAIL PROTECTED], dev@spamassassin.apache.org
Subject: Re: speedup for PerMsgStatus


Loren Wilton [EMAIL PROTECTED] writes:

 I have no idea how painful linked lists are in Perl (or if they even
exist).

Why are you commenting then???

They don't exist as a native data structure.  Arrays are fast, painless,
and dynamically sized.

-- 
Daniel Quinlan ApacheCon! 13-17 November (3 SpamAssassin
http://www.pathname.com/~quinlan/  http://www.apachecon.com/  sessions 
more)


mail2web - Check your email from the web at
http://mail2web.com/ .