[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6707 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||apa...@hege.li --- Comment #4 from Henrik Krohns --- Was even slightly more clarified in revision 1864686. Closing. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054 --- Comment #3 from Henrik Krohns --- (In reply to RW from comment #2) > It's about caching host A-record look-ups, for plugins to use. > > It not very well put as there doesn't look to be any plugin that uses > get_uri_detail_list and needs A-records. Uh yes, but what is there to cache? If some DNS A query is made, SA only does it once and caches it already. Dunno if it was the case back in 2014. All this has very little relevance to uri hashes. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054 RW changed: What|Removed |Added CC||rwmailli...@googlemail.com --- Comment #2 from RW --- It's about caching host A-record look-ups, for plugins to use. It not very well put as there doesn't look to be any plugin that uses get_uri_detail_list and needs A-records. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7054 Henrik Krohns changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |WONTFIX CC||apa...@hege.li --- Comment #1 from Henrik Krohns --- I'm not sure what the actual point here is. Plugins query all sorts of stuff, I don't see anything inherently bad in calling get_uri_detail_list(), it's not expensive, and it's been defacto for ages. I don't see any reason to change/break things. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7719] (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7719 Steadramon changed: What|Removed |Added CC||paul.st...@gmail.com Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #1 from Steadramon --- Fixed in commit below by adding ifplugin, this will be picked up during next rule build Sending20_urlshort.cf Transmitting file data .done Committing transaction... Committed revision 1860821. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7719] New: (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7719 Bug ID: 7719 Summary: (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478) Product: Spamassassin Version: 3.4.2 Hardware: PC OS: Linux Status: NEW Severity: critical Priority: P2 Component: Rules Assignee: dev@spamassassin.apache.org Reporter: h.rei...@thelounge.net Target Milestone: Undefined given that nothing was touched on that machine for days now this is pretty sure once againa bad rules update - and yes since yesterday i receive that as email beause i have running my tests Jun 8 12:30:03.449 [1517663] warn: rules: failed to run __PDS_URL_SHORTENER test, skipping: Jun 8 12:30:03.449 [1517663] warn: (Can't locate object method "check_uri_host_listed" via package "Mail: [...]:SpamAssassin::PerMsgStatus" at (eval 1478) line 19. Jun 8 12:30:03.449 [1517663] warn: ) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Bill Colechanged: What|Removed |Added CC||walt...@pobox.com --- Comment #8 from Bill Cole --- *** Bug 7480 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6707 Giovanni Bechischanged: What|Removed |Added CC||giova...@paclan.it --- Comment #3 from Giovanni Bechis --- In trunk now it is documented as: Returns the message body, with base64 or quoted-printable encodings decoded, and non-text parts or non-inline attachments stripped. It is returned as an array of strings, with each string representing one newline-separated line of the body. I think bz is no more current. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Kevin A. McGrailchanged: What|Removed |Added CC||christophe.troestler@umons. ||ac.be --- Comment #7 from Kevin A. McGrail --- *** Bug 7448 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Bill Colechanged: What|Removed |Added CC||trond.endres...@ximalas.inf ||o --- Comment #6 from Bill Cole --- *** Bug 7444 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Bill Colechanged: What|Removed |Added CC||alexcho...@gmail.com --- Comment #5 from Bill Cole --- *** Bug 7369 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 Sidney Markowitzchanged: What|Removed |Added CC||sid...@sidney.com --- Comment #5 from Sidney Markowitz --- Ported the change from comment 4 to branch 3.4 Committed revision 1791580 -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 Sidney Markowitzchanged: What|Removed |Added Status|NEW |RESOLVED CC||sid...@sidney.com Resolution|--- |FIXED --- Comment #7 from Sidney Markowitz --- The patch in comment 5 is already in the commits. It looks like Mark was posting corrections to the proposed patch that reflect what he had already committed. I'm closing this as fixed. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Mark Martinecchanged: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #4 from Mark Martinec --- closing, resolved -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7404] New: MS::PerMsgStatus::get_content_preview
On Tue, 11 Apr 2017, bugzilla-dae...@issues.apache.org wrote: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Bug ID: 7404 Summary: MS::PerMsgStatus::get_content_preview t/autolearn.t . Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.{ <-- HERE ,200}).*$/ at ../blib/lib/Mail/SpamAssassin/PerMsgStatus.pm line 923. perl 5.25.11 I remember discussing that (and embarrassing myself because that syntax IS valid in sed but not perl) a while back. That bug may be a dupe... -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- If you are "fighting for social justice," then you are defining yourself as someone who considers regular old everyday *equal* justice to be something you don't want. -- GOF at TSM --- 2 days until Thomas Jefferson's 274th Birthday
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 --- Comment #3 from Mark Martinec--- 3.4: Sending PerMsgStatus.pm Committed revision 1791013. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 --- Comment #2 from Mark Martinec--- Created attachment 5441 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5441=edit Fixes inappropriate regexp (and surrounding logic) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] Bad regexp (and logic) in MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Mark Martinec <mark.marti...@ijs.si> changed: What|Removed |Added Summary|MS::PerMsgStatus::get_conte |Bad regexp (and logic) in |nt_preview |MS::PerMsgStatus::get_conte ||nt_preview Target Milestone|Undefined |3.4.2 --- Comment #1 from Mark Martinec <mark.marti...@ijs.si> --- (sorry for hitting a submit prematurely on an unfinished PR) Testing with 5.25.11 (soon to be be released as 5.26) fills the log with warnings on a suspicious regexp, as shown above - an x{,n} is not interpreted as x{0,n}, and even if it were, the logic is wrong in that code. fixing in trunk: Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Committed revision 1791010. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7404] New: MS::PerMsgStatus::get_content_preview
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7404 Bug ID: 7404 Summary: MS::PerMsgStatus::get_content_preview Product: Spamassassin Version: 3.4.1 Hardware: PC OS: All Status: NEW Severity: normal Priority: P2 Component: Libraries Assignee: dev@spamassassin.apache.org Reporter: mark.marti...@ijs.si Target Milestone: Undefined t/autolearn.t . Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.{ <-- HERE ,200}).*$/ at ../blib/lib/Mail/SpamAssassin/PerMsgStatus.pm line 923. perl 5.25.11 -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 Kevin A. McGrail kmcgr...@pccc.com changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||kmcgr...@pccc.com --- Comment #1 from Kevin A. McGrail kmcgr...@pccc.com --- Fixed on trunk and 3.4 branch, thanks. svn commit -m 'Fixed uninitialized error with $line - Thanks to Franz Schwartau bug 7212 - trunk' Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm Transmitting file data . Committed revision 1685842. svn commit -m 'Fixed uninitialized error with $line - Thanks to Franz Schwartau bug 7212 - 3.4' lib/ Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm Transmitting file data . Committed revision 1685843. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 --- Comment #3 from Kevin A. McGrail kmcgr...@pccc.com --- I'm +1 if you want to apply. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 --- Comment #2 from Mark Martinec mark.marti...@ijs.si --- Created attachment 5309 -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5309action=edit A more compact alternative A more radical coding approach (attached) reduces the number of generated opcodes in affected code sections by 40 % (132 79 opcodes) according to B::Concise. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Target Milestone|Undefined |3.4.2 --- Comment #4 from Mark Martinec mark.marti...@ijs.si --- I'm +1 if you want to apply. trunk: Bug 7212: code compaction (was: Warning of uninitialized value) Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Committed revision 1685857. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7212] New: Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores()
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7212 Bug ID: 7212 Summary: Warning of uninitialized value in Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_h it_with_scores() Product: Spamassassin Version: 3.4.1 Hardware: PC OS: FreeBSD Status: NEW Severity: minor Priority: P2 Component: Libraries Assignee: dev@spamassassin.apache.org Reporter: fr...@electromail.org Created attachment 5308 -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5308action=edit patch for PerMsgStatus::get_names_of_tests_hit_with_scores() Using Mail::SpamAssassin::PerMsgStatus::get_names_of_tests_hit_with_scores() results in: Use of uninitialized value $line in string ne at /usr/local/lib/perl5/site_perl/Mail/SpamAssassin/PerMsgStatus.pm line 770. $line should be initialized with ''. Patch attached. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 --- Comment #6 from Kevin A. McGrail kmcgr...@pccc.com --- Can this be marked resolved? I'm not clear if the patches are committed or is there a fix needed still? -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 --- Comment #3 from Kevin A. McGrail kmcgr...@pccc.com --- (In reply to Mark Martinec from comment #2) Created attachment 5300 [details] proposed patch Is this patch what you mean or is my checking for defined sweeping something bigger under the table? Seems fine to me. Attached is my version (sorry for duplicating your work, unintentionally). Treats domains case-insensitively, retains original order instead of random. Your fix looks good too with the extra benefit of case insensitivity (not sure I care about the order that much). And definitely no worries about the overlap. Please commit when ready. Regards, KAM -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Target Milestone|Undefined |3.4.2 -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 --- Comment #4 from Mark Martinec mark.marti...@ijs.si --- trunk: Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Committed revision 1679652. 3.4: Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Committed revision 1679653. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Attachment #5300|0 |1 is obsolete|| --- Comment #5 from Mark Martinec mark.marti...@ijs.si --- Created attachment 5301 -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5301action=edit proposed patch (fixes previous patch) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] New: PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 Bug ID: 7196 Summary: PerMsgStatus Warning Product: Spamassassin Version: 3.4.1 Hardware: PC OS: Windows 7 Status: NEW Severity: normal Priority: P2 Component: spamassassin Assignee: dev@spamassassin.apache.org Reporter: kmcgr...@pccc.com This spanned from bug 7195: May 13 23:35:43 mail01 amavis[17386]: (17386-12) _WARN: Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 3082. Mark: It's because the RegistryBoundaries::uri_to_domain returns undef when it sees %hh encoding in an URL, and the caller then tries to concatenate it with 'dummy@', which produces this warning. Seems harmless but annoying, should be fixed. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7195] New: PerMsgStatus Util warnings
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7195 Bug ID: 7195 Summary: PerMsgStatus Util warnings Product: Spamassassin Version: 3.4.1 Hardware: PC OS: Linux Status: NEW Severity: normal Priority: P2 Component: spamassassin Assignee: dev@spamassassin.apache.org Reporter: mysqlstud...@gmail.com I reported the following on users@ and Mark determined the two warning messages below were independent bugs that should be addressed: I have v3.4.1 with amavisd v2.9.1 on fedora20 and receiving the following warnings: May 13 23:32:31 mail01 amavis[17306]: (17306-10) _WARN: plugin: eval failed: Undefined subroutine Mail::SpamAssassin::Util::RegistrarBoundaries::trim_domain called at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Util.pm line 1236. May 13 23:35:43 mail01 amavis[17386]: (17386-12) _WARN: Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 3082. The second warning is independent from the first, I'm seeing these too. It's because the RegistryBoundaries::uri_to_domain returns undef when it sees %hh encoding in an URL, and the caller then tries to concatenate it with 'dummy@', which produces this warning. Seems harmless but annoying, should be fixed. Please open a bug report (one should do, even though these are two independent issues). -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 Kevin A. McGrail kmcgr...@pccc.com changed: What|Removed |Added CC||kmcgr...@pccc.com --- Comment #1 from Kevin A. McGrail kmcgr...@pccc.com --- Mark, Is this patch what you mean or is my checking for defined sweeping something bigger under the table? Index: lib/Mail/SpamAssassin/PerMsgStatus.pm === --- lib/Mail/SpamAssassin/PerMsgStatus.pm (revision 1679280) +++ lib/Mail/SpamAssassin/PerMsgStatus.pm (working copy) @@ -3078,12 +3078,17 @@ dbg(eval: all '*From' addrs domains (before): . join( , @addrs)); #loop through and limit to just the domain with a dummy address + my @tmp_addrs; for (my $i = 0; $i scalar(@addrs); $i++) { -$addrs[$i] = 'dummy@'.$self-{main}-{registryboundaries}-uri_to_domain($addrs[$i]); +#Checking if the return from uri_to_domain is defined - bug 7196 KAM +my $tmp_addrs = $self-{main}-{registryboundaries}-uri_to_domain($addrs[$i]); +if (defined $tmp_addrs) { + push (@tmp_addrs, 'dummy@'.$tmp_addrs); +} } #Remove duplicate domains - my %addrs = map { $_ = 1 } @addrs; + my %addrs = map { $_ = 1 } @tmp_addrs; @addrs = keys %addrs; dbg(eval: all '*From' addrs domains (after uri to domain): . join( , @addrs)); -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7196] PerMsgStatus Warning
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7196 --- Comment #2 from Mark Martinec mark.marti...@ijs.si --- Created attachment 5300 -- https://bz.apache.org/SpamAssassin/attachment.cgi?id=5300action=edit proposed patch Is this patch what you mean or is my checking for defined sweeping something bigger under the table? Seems fine to me. Attached is my version (sorry for duplicating your work, unintentionally). Treats domains case-insensitively, retains original order instead of random. -- You are receiving this mail because: You are the assignee for the bug.
Re: Uninitialized value in PerMsgStatus
On 1/15/2015 2:43 PM, Mark Martinec wrote: What is a PR? A bugzilla ticket? Yes, a problem report. Sorry for using terminology from another context. Thanks. I wasn't sure if you meant a press release about the bug but that seemed overkill... If there are other perl critic warnings, they are likely not enabled in that test. Opened a Bug 7119. Probably depends on a version of Perl::Critic. I upgraded to $VERSION = '1.121'; of Perl::Critic so we can try and be on the same page. That statement deserves the '## no critic' annotation, something like (untested) : # see Bug 7120 return undef if !defined $_[0]; ## no critic (ProhibitExplicitReturnUndef) Exactly. ## no critic (ProhibitExplicitReturnUndef) - See Bug 7120 added and passes xt test. Will commit as soon as I have a full make test completed. For other cases I didn't check. The other cases are all returns if a $self checker is not defined so much less of a concern anyway. Regards, KAM
Re: Uninitialized value in PerMsgStatus
On 1/15/2015 11:09 AM, Mark Martinec wrote: :-) Well I don't want to change the vetting process for a release and xt/60_perlcritic.t has been used for years on the code base. Suggestions what we can do to resolve the issue that also passes that test so we don't have to go down that rabbithole? There are some other perlcritic warnings about modifying $_ in list functions (in sa-update, spamassassin, spamd). Opening a PR would be warranted. What is a PR? A bugzilla ticket? Also, running prove -v xt/60_perlcritic.t shows just the one issue and I committed the change which reverses some fixes and the original return undef change in Util.pm. not ok 38 - Test::Perl::Critic for ../blib/lib/Mail/SpamAssassin/Util.pm # Failed test 'Test::Perl::Critic for ../blib/lib/Mail/SpamAssassin/Util.pm' # at /usr/local/lib/perl5/site_perl/5.8.6/Test/Perl/Critic.pm line 110. # # Perl::Critic found these violations in ../blib/lib/Mail/SpamAssassin/Util.pm: # return statement with explicit undef at line 287, column 5. If there are other perl critic warnings, they are likely not enabled in that test. Looking back, this return undef was added after 3.4.0's release and appears to have been added during the profiling work on https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6854 Do you feel confident enough about your change that I should modify or notate the build process notes to ignore that issue for that specific return undef? The MS::Util::untaint_var() is still broken... I'm reverting some recent changes and reverting back to the one undef in there. Sorry about that. Best of intentions. Btw, common use of 'return 0' in TxRep to indicate a boolean response falls into the same category as 'return undef', although Test::Perl::Critic does not check for that. I use no warnings 'uninitialized', 'once'; for a lot of my day to day work so I can see that but my key concern is passing the tests required of a release without trying to change those tests right now. For a decade or so, we've been able to make releases that pass that 60_perlcritic test and don't cause real-world issues. I'm confident that we can do so at least in this case. I want to get the release out by Jan 30. Then I think you are correct we need to look at if this makes sense longer-term. Regards, KAM
Re: Uninitialized value in PerMsgStatus
:-) Well I don't want to change the vetting process for a release and xt/60_perlcritic.t has been used for years on the code base. Suggestions what we can do to resolve the issue that also passes that test so we don't have to go down that rabbithole? There are some other perlcritic warnings about modifying $_ in list functions (in sa-update, spamassassin, spamd). Opening a PR would be warranted. The MS::Util::untaint_var() is still broken... Btw, common use of 'return 0' in TxRep to indicate a boolean response falls into the same category as 'return undef', although Test::Perl::Critic does not check for that. Mark
Re: Uninitialized value in PerMsgStatus
There are some other perlcritic warnings about modifying $_ in list functions (in sa-update, spamassassin, spamd). Opening a PR would be warranted. What is a PR? A bugzilla ticket? Yes, a problem report. Sorry for using terminology from another context. If there are other perl critic warnings, they are likely not enabled in that test. Opened a Bug 7119. Probably depends on a version of Perl::Critic. # Perl::Critic found these violations in ../blib/lib/Mail/SpamAssassin/Util.pm: # return statement with explicit undef at line 287, column 5. Opened a Bug 7120. It can be used as a reference in a comment when adding ## no critic annotations in code. Looking back, this return undef was added after 3.4.0's release and appears to have been added during the profiling work on https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6854 I probably fixed that potential or actual problem in passing. The MS::Util::untaint_var() is still broken... I'm reverting some recent changes and reverting back to the one undef Thanks! Do you feel confident enough about your change that I should modify or notate the build process notes to ignore that issue for that specific return undef? For the 'return undef' in untaint_var() most definitely so. That function can be used when assembling arguments in a call to some other function, just as described in Bug 7120. That statement deserves the '## no critic' annotation, something like (untested) : # see Bug 7120 return undef if !defined $_[0]; ## no critic (ProhibitExplicitReturnUndef) For other cases I didn't check. Btw, common use of 'return 0' in TxRep to indicate a boolean response falls into the same category as 'return undef', although Test::Perl::Critic does not check for that. I use no warnings 'uninitialized', 'once'; for a lot of my day to day work so I can see that but my key concern is passing the tests required of a release without trying to change those tests right now. Testing in a boolean context does not issue a warning when an expression is undefined. There is no reason to return a zero (instead of undef or an empty list) when a subroutine result is considered a boolean value. Mark
Re: Uninitialized value in PerMsgStatus
--On Thursday, January 15, 2015 11:48 AM -0500 Kevin A. McGrail kmcgr...@pccc.com wrote: (in sa-update, spamassassin, spamd). Opening a PR would be warranted. What is a PR? A bugzilla ticket? PR is generally short for problem report. So yes, I'd guess a bugzilla ticket. --Quanah -- Quanah Gibson-Mount Platform Architect Zimbra, Inc. Zimbra :: the leader in open source messaging and collaboration
Re: Uninitialized value in PerMsgStatus
Yep that was done to pass the XT tests for a release. Joe can you look at those returns again? That advice from perlcritic needs to be taken with a large grain of salt. In the past I have been bitten by this several times. It is generally safer to leave 'return undef' unless one carefully analyzes in what context a subroutine is being called. I'd suggest to revert that changeset entirely. Mark
Re: Uninitialized value in PerMsgStatus
On 1/14/2015 3:11 PM, Mark Martinec wrote: Yep that was done to pass the XT tests for a release. Joe can you look at those returns again? That advice from perlcritic needs to be taken with a large grain of salt. In the past I have been bitten by this several times. It is generally safer to leave 'return undef' unless one carefully analyzes in what context a subroutine is being called. I'd suggest to revert that changeset entirely. Mark If a subroutine is supposed to return exactly one scalar and is using a return, that return must be a 'return undef'. Consider the surprising behaviour: perl -le ' sub foo {$a=shift; return if !$a; sin($a)}; ($x, $y, $z) = (foo(3), foo(0), 9); print x=$x, y=$y, z=$z' x=0.141120008059867, y=9, z= :-) Well I don't want to change the vetting process for a release and xt/60_perlcritic.t has been used for years on the code base. Suggestions what we can do to resolve the issue that also passes that test so we don't have to go down that rabbithole? Regards, KAM
Re: Uninitialized value in PerMsgStatus
Yep that was done to pass the XT tests for a release. Joe can you look at those returns again? That advice from perlcritic needs to be taken with a large grain of salt. In the past I have been bitten by this several times. It is generally safer to leave 'return undef' unless one carefully analyzes in what context a subroutine is being called. I'd suggest to revert that changeset entirely. Mark If a subroutine is supposed to return exactly one scalar and is using a return, that return must be a 'return undef'. Consider the surprising behaviour: perl -le ' sub foo {$a=shift; return if !$a; sin($a)}; ($x, $y, $z) = (foo(3), foo(0), 9); print x=$x, y=$y, z=$z' x=0.141120008059867, y=9, z= Mark
Re: Uninitialized value in PerMsgStatus
Yep that was done to pass the XT tests for a release. Joe can you look at those returns again? Regards, KAM On January 14, 2015 1:35:20 PM EST, Mark Martinec mark.martinec...@ijs.si wrote: 2015-01-14 18:06, je Alex Regan napisal Hi, I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn spamassassin snapshot from today, and receive the following message: Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 3072, GEN19 line 16875. 3071#loop through and limit to just the domain with a dummy address 3072for (my $i = 0; $i scalar(@addrs); $i++) { 3073 $addrs[$i] = 'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]); 3074} This has apparently been going on for some time, not just with today's update. I don't see any references to others having this problem, so hoped someone could help. How can I troubleshoot this? Could it be an amavis issue? Thanks, Alex Looks like someone just recently blindly changed 'return undef' to a 'return', which breaks calls to such subroutine in a list context, e.g. foo( untaint($a), bla, untaint($b) ) Mark
Uninitialized value in PerMsgStatus
Hi, I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn spamassassin snapshot from today, and receive the following message: Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 3072, GEN19 line 16875. 3071#loop through and limit to just the domain with a dummy address 3072for (my $i = 0; $i scalar(@addrs); $i++) { 3073 $addrs[$i] = 'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]); 3074} This has apparently been going on for some time, not just with today's update. I don't see any references to others having this problem, so hoped someone could help. How can I troubleshoot this? Could it be an amavis issue? Thanks, Alex
Re: Uninitialized value in PerMsgStatus
2015-01-14 18:06, je Alex Regan napisal Hi, I'm using amavisd-new-2.9.1 and perl-5.18.4 on fedora20 with the svn spamassassin snapshot from today, and receive the following message: Jan 14 11:59:21 mail01 amavis[19431]: (19431-18) _WARN: Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/vendor_perl/Mail/SpamAssassin/PerMsgStatus.pm line 3072, GEN19 line 16875. 3071#loop through and limit to just the domain with a dummy address 3072for (my $i = 0; $i scalar(@addrs); $i++) { 3073 $addrs[$i] = 'dummy@'.Mail::SpamAssassin::Util::uri_to_domain($addrs[$i]); 3074} This has apparently been going on for some time, not just with today's update. I don't see any references to others having this problem, so hoped someone could help. How can I troubleshoot this? Could it be an amavis issue? Thanks, Alex Looks like someone just recently blindly changed 'return undef' to a 'return', which breaks calls to such subroutine in a list context, e.g. foo( untaint($a), bla, untaint($b) ) Mark
[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054 Philip Prindeville phil...@redfish-solutions.com changed: What|Removed |Added CC||kmcgr...@pccc.com, ||mark.marti...@ijs.si -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7054] PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054 Philip Prindeville phil...@redfish-solutions.com changed: What|Removed |Added CC||philipp@redfish-solutions.c ||om -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7054] New: PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list()
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7054 Bug ID: 7054 Summary: PerMsgStatus could simplify plugins by caching A record lookups for get_uri_detail_list() Product: Spamassassin Version: 3.4.0 Hardware: PC OS: Linux Status: NEW Severity: normal Priority: P2 Component: Libraries Assignee: dev@spamassassin.apache.org Reporter: phil...@redfish-solutions.com There are several plugins which call PerMsgStatus::get_uri_detail_list() for the {hosts} hash. The complexity to do efficient asynchronous lookups of the A records is substantial, and is best moved into the framework rather than being left to the plugins themselves. Either the {hosts} hash could be replaced to not map the key to the value, but instead map the key to an array reference of corresponding A records, or else an additional hash could be added to the PerMsgStatus class which maps hosts to their A record(s). -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 6915] PerMsgStatus::get_tag() enhancement and optimization
On 3/6/2013 7:21 PM, bugzilla-dae...@issues.apache.org wrote: https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Target Milestone|Undefined |3.4.0 --- Comment #2 from Mark Martinec mark.marti...@ijs.si --- Out of curiosity, I did a benchmark between the old and the new get_tag in a real-life environment (production mailer, SA called from amavisd, time measured by Time::HiRes, perl 5.17.9). Overall the difference is about 2 milliseconds per mail message scan (there are about 30 calls to get_tag per scan) in favour of the new one. The new get_tag is about twice as fast as the old one (0.077 ms vs. 0.14 ms per one call). Not that lot, but why not benefit if it comes for free... Very nice! Working on the build again now that this is committed.
[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Mark Martinec mark.marti...@ijs.si --- Closing, done. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 6915] New: PerMsgStatus::get_tag() enhancement and optimization
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915 Bug ID: 6915 Summary: PerMsgStatus::get_tag() enhancement and optimization Product: Spamassassin Version: SVN Trunk (Latest Devel Version) Hardware: PC OS: All Status: NEW Severity: enhancement Priority: P2 Component: Libraries Assignee: dev@spamassassin.apache.org Reporter: mark.marti...@ijs.si Classification: Unclassified 1) The current get_tag method dynamically assembles the hash of anonymous subroutines with every call. This behaviour was introduced by Bug 2066 (Duncan Findlay, 2003) in favour of a more self-contained code in _get_tag (avoiding a global variable and avoiding passing of $pms object as a parameter), but at the expense of dynamically constructing the hash of anonymous subroutines with every call to get_tag, and depending on a closure to access the $pms object. As the number of tags has grown somewhat in the past 10 years, and as the get_tag is called rather often, I'm suggesting essentially a revert of Duncan's change. My changes moves building of that hash of subs into a static BEGIN phase as an optimization, but for this reason needs to pass an extra argument (the $pms) to these subroutines. It only affects modules PerMsgStatus and the Bayes plugin, public API remains unchanged. 2) Leverage the existing possibility of a tag value being an arrayref: Introduces an additional (small) method get_tag_raw, which allows the caller to receive the tag value as an array reference if it wishes so, thus avoiding a need for splitting a string on a space for tags which are actually lists, and makes it possible for a tag value (as a list) to contain spaces without ambiguity. This will allow me some future improvements in my AskDNS plugin. 3) A somewhat unrelated change in Message/Metadata.pm, replacing tags LASTEXTERNALREVIP and FIRSTTRUSTEDREVIP with more universal lists: RELAYSTRUSTEDREVIP, RELAYSUNTRUSTEDREVIP, RELAYSINTERNALREVIP, and RELAYSEXTERNALREVIP. Neither the old nor the new tags are currently in use by rukes, so this affects noone (except it enables my future work on AskDNS plugin). -- You are receiving this mail because: You are the assignee for the bug.
[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915 --- Comment #1 from Mark Martinec mark.marti...@ijs.si --- trunk: Sending lib/Mail/SpamAssassin/Message/Metadata.pm Sending lib/Mail/SpamAssassin/Message.pm Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Sending lib/Mail/SpamAssassin/Plugin/Bayes.pm Sending lib/Mail/SpamAssassin/Plugin/RelayCountry.pm Committed revision 1453407. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 6915] PerMsgStatus::get_tag() enhancement and optimization
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6915 Mark Martinec mark.marti...@ijs.si changed: What|Removed |Added Target Milestone|Undefined |3.4.0 --- Comment #2 from Mark Martinec mark.marti...@ijs.si --- Out of curiosity, I did a benchmark between the old and the new get_tag in a real-life environment (production mailer, SA called from amavisd, time measured by Time::HiRes, perl 5.17.9). Overall the difference is about 2 milliseconds per mail message scan (there are about 30 calls to get_tag per scan) in favour of the new one. The new get_tag is about twice as fast as the old one (0.077 ms vs. 0.14 ms per one call). Not that lot, but why not benefit if it comes for free... -- You are receiving this mail because: You are the assignee for the bug.
[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707 Kevin A. McGrail kmcgr...@pccc.com changed: What|Removed |Added CC||kmcgr...@pccc.com --- Comment #2 from Kevin A. McGrail kmcgr...@pccc.com 2011-11-26 13:44:05 UTC --- It's a step in the right direction. Perhaps mentioning WHY this was done would help. I believe there was a potential DoS edge case this resolved and pointing towards bug 5717 seems to agree with my memory. Regards, KAM -- Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 6707] New: PerMsgStatus get_decoded_body_text_array() incorrect documentation
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707 Bug #: 6707 Summary: PerMsgStatus get_decoded_body_text_array() incorrect documentation Product: Spamassassin Version: 3.3 SVN branch Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Documentation AssignedTo: dev@spamassassin.apache.org ReportedBy: guent...@rudersport.de Classification: Unclassified The documentation for M::SA::PerMsgStatus get_decoded_body_text_array() is incorrect. It claims to return an array of lines of the body. What it actually returns is an array with chunks of all textual MIME-parts. More precisely, it (used to) return an array of all textual MIME-parts, each MIME-part in a string (member of the array) of its own. Since 3.3, long-ish MIME-parts are broken up into chunks between 1024 and 2048 bytes. See M::SA::Message get_decoded_body_text_array() for the actual code. See bug 5717 for the split into chunks issue. The directly related M::SA::Conf rawbody rule documentation has been changed already, not to talk about lines, and that multiline RE modifiers might be needed. Breaking up MIME-parts into chunks is not mentioned, though. Might be worthwhile to add here, or reference the PerMsgStatus docs -- since REs don't match, if the target sub-string happens to span a chunk boundary, even though the message would match the RE. -- Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 6707] PerMsgStatus get_decoded_body_text_array() incorrect documentation
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6707 --- Comment #1 from Karsten BrÀckelmann guent...@rudersport.de 2011-11-26 00:47:09 UTC --- Created attachment 5013 -- https://issues.apache.org/SpamAssassin/attachment.cgi?id=5013 proposed patch POD for PerMsgStatus get_decoded_body_text_array() fixed, to reflect the current code. Added a sentence to Conf rawbody POD, additionally clarifying what the RE will be matched against. Attaching here rather than committing right away for discussion. Feel free to further clarify or adjust the POD changes. -- Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 5948] New: uninitialized value within @pristine_headers PerMsgStatus. pm line 832
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5948 Summary: uninitialized value within @pristine_headers PerMsgStatus.pm line 832 Product: Spamassassin Version: 3.2.5 Platform: Other OS/Version: All Status: NEW Severity: enhancement Priority: P5 Component: spamassassin AssignedTo: dev@spamassassin.apache.org ReportedBy: [EMAIL PROTECTED] I noticed an error in /var/log/spamassassin/spamd.log -- Sat Jul 26 20:24:21 2008 [5009] warn: Use of uninitialized value within @pristine_headers in pattern match (m//) at /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/PerMsgStatus.pm line 832, GEN7 line 5. Sat Jul 26 20:24:21 2008 [5009] warn: Use of uninitialized value $separator in concatenation (.) or string at /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/PerMsgStatus.pm line 907, GEN7 line 5. Looking at the code, I think that my $separator = ''; if ($pristine_headers[$#pristine_headers] =~ /^\s*$/) { $separator = pop @pristine_headers; } should read my $separator = ''; if (@pristine_headers $pristine_headers[$#pristine_headers] =~ /^\s*$/) { $separator = pop @pristine_headers; } in order to avoid setting #separator to undef if the pristine_headers array is empty. (This line of code is unchanged between 3.2.4 and 3.2.5, and I see the warnings with both versions) -- Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Additional Comments From [EMAIL PROTECTED] 2007-04-16 05:23 --- assuming this is fixed; feel free to reopen if it still happens in 3.2.0 ;) --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4497] reorganise PerMsgStatus code
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4497 [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||WONTFIX --- Additional Comments From [EMAIL PROTECTED] 2006-12-04 10:41 --- marking this WONTFIX -- this work pretty much stalled and other stuff happened instead. this part of the idea is probably still worthwhile, though: - rewrite . Mail::SpamAssassin::Plugin::Rewrite (a plugin that implements rewriting of mails, so that alternative rewriting methods can be used instead, optionally) maybe someone will eventually have tuits for that ;) --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-07-12 15:26 --- I think this may be fixed in svn trunk. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4973] get_tag and set_tag APIs missing from PerMsgStatus pod
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4973 --- Additional Comments From [EMAIL PROTECTED] 2006-07-06 12:17 --- Created an attachment (id=3569) -- (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3569action=view) Trivial patch to fix the POD I did submit an Apache CLA by snail-mail 12 months ago, I hope it's got there now! --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4973] get_tag and set_tag APIs missing from PerMsgStatus pod
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4973 [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED Summary|get_tag and set_tag APIs|get_tag and set_tag APIs |missing from PerMsgStatus |missing from PerMsgStatus |pod |pod Target Milestone|Undefined |3.1.4 --- Additional Comments From [EMAIL PROTECTED] 2006-07-06 18:42 --- aha, interesting. good catch! :) I've added in the newlines and committed. 3.1: Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm Transmitting file data . Committed revision 419648. 3.2: Sendinglib/Mail/SpamAssassin/PerMsgStatus.pm Transmitting file data . Committed revision 419649. FWIW: I searched for other similar POD issues but didn't see any: find . -name *.pm | grep -v '\.svn' | xargs grep -B1 =item | grep '#' --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-26 00:55 --- I'm not completely clear about how often, or under what exact circumstances and config, you're seeing this but you're probably seeing bug 4179. The last time I tested bug 4179 out, though, two users had to be using the same rule name(s) to see any wierd config issues. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-24 18:36 --- My suspicion is that spamd is not clearing the config completely between users when allow_user_rules is set, and the rules, but not the scores, are surviving between users. (I use my own account as a testbed for rules before deploying them system-wide for the handful of other users on the server.) I'm finding it tough to find the code that dumps the old config after processing. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 13:29 --- Stock RPM built from the tarball. No special plugins. A couple of custom system rules for the float obfuscation spam but they have scores. What's the Perl syntax for sending a line to syslog if $score is undefined? Given that I can sprinkle some printfs through that path to trace where this is coming from. My sample triggers the problem solidly on my setup. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 14:20 --- if (!defined $score) { warn HELP undef score; } that's what you're after. curious! --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 18:40 --- Ok, here's a clue. In /etc/mail/spamassassin/locals.cf I have allow_user_rules 1. In my userprefs I have this rule: header__MIME_VERSION exists:MIME-Version header__SARE_HEAD_MIME_VALID Mime-Version =~ m'^\s*1.0\b' meta SARE_HEAD_MIME_INVALID !__SARE_HEAD_MIME_VALID __MIME_VERSION describe SARE_HEAD_MIME_INVALID Invalid mime version score SARE_HEAD_MIME_INVALID 1.666 If I run spamc on a test message (not the one above, which isn't triggering today), I get the error running as a different user (who has no custom rules) but I don't get it running it from my account. The debug printf indicates this rule as lacking the score. if (!defined $score) { warn HELP undef score for rule $rule; } Theory: Does spamd handle user rules inconsistently, loading other users' personal rules but not scores? How are per-user rules handled by the daemon? --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 18:44 --- Correction, the original test case works, just not when I test as root. Testing as a mortal other than me elicits the bug. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] New: Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 Summary: Uninitialized $score in PerMsgStatus::_handle_hit Product: Spamassassin Version: 3.1.1 Platform: Other OS/Version: other Status: NEW Severity: normal Priority: P5 Component: spamc/spamd AssignedTo: dev@spamassassin.apache.org ReportedBy: [EMAIL PROTECTED] I'm seeing several uninitialized value errors in maillog which seem to center on $score not getting set for some rules. This happens when some messages are passed through spamc/spamd, but no error is reported when the message is passed through spamassassin. I don't see it on my own account with a non-empty user_prefs but I do see it with other users with the default (comments-only) user_prefs, and I see it on mine if I rename mine to user_prefs.disabled. I'll attach a sample message that triggers it. Here's the list of errors I see when this happens: Use of uninitialized value in addition (+) at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2708. Use of uninitialized value in numeric ge (=) at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2713. Use of uninitialized value in numeric le (=) at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2713. Use of uninitialized value in sprintf at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 2717. Use of uninitialized value in numeric eq (==) at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/Plugin/AWL.pm line 341. Use of uninitialized value in numeric eq (==) at /usr/lib/perl5/site_perl/5.8.3/Mail/SpamAssassin/PerMsgStatus.pm line 432. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-22 23:06 --- Created an attachment (id=3519) -- (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3519action=view) Sample spam for triggering the error message --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-22 23:13 --- This is one of those can't ever happen (without something crazy going on) issues. But it looks like this is probably the same as bug 4699. Can you try out the current test 3.1.2 or the patch as found in that ticket? --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 00:54 --- No joy. I applied the patch to Timeout.pm and spamd and restarted the daemon and still get the same error messages. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4906] Uninitialized $score in PerMsgStatus::_handle_hit
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4906 --- Additional Comments From [EMAIL PROTECTED] 2006-05-23 01:41 --- (In reply to comment #3) No joy. I applied the patch to Timeout.pm and spamd and restarted the daemon and still get the same error messages. hrm. can you describe your SA setup a bit more? are there any non-standard patches applied? non- standard plugins? using the standard code, there's no way for a rule to get in without a score. the copy_config issue can cause it because it corrupts the config. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4497] reorganise PerMsgStatus code
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497 --- Additional Comments From [EMAIL PROTECTED] 2005-10-17 16:36 --- Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so there's little point for a separate Mail::SpamAssassin::Message:Rendered. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4497] reorganise PerMsgStatus code
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497 --- Additional Comments From [EMAIL PROTECTED] 2005-10-17 16:54 --- Subject: Re: reorganise PerMsgStatus code On Mon, Oct 17, 2005 at 04:36:06PM -0700, [EMAIL PROTECTED] wrote: Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so there's little point for a separate Mail::SpamAssassin::Message:Rendered. FYI, I will almost definitely -1 any such modification to the Message or Message::Node code unless there is some huge absolute need to have it. That code is designed to very specifically have no dependence, relation, or access to the rest of the SA code. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4497] reorganise PerMsgStatus code
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497 --- Additional Comments From [EMAIL PROTECTED] 2005-10-17 16:59 --- There is a work-in-progress patch attached to bug 4636. It would be better to discuss the issue in that bug. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
Re: PerMsgStatus
a) what the heck are priorities, who sets them, and do they really have any justifiable purpose? Ie: can they just quietly vanish into the night with nobody being any the wiser? They order the rules -- or more correctly, sets of rules. Most rules are priority 500 (iirc), but some need to run earlier and some need to run later (e.g. AWL needs to run after all other rules). Running rules earlier is how we propose to implement early-exit -- certain rules can run before all others, and cause an early-exit if they fire. They cannot just vanish. ;) Let me challenge or at least prod around the edges of this a bit to further my understanding. I think what you are saying is that priority is used (at least in part) to do the ordering that is known or believed to be required. However, there seems to be some ordering built into PMS itself, such as firing the net rules first and then harvesting the results later. That makes me believe that we probably have two methods of the same thing: some rules are ordered because the pms code is written to do them in a given order, and some rules are ordered because someone assigned a priority somewhere. I guess I'm mostly wondering if 'priority' as a number (at least one with a seemingly rather fine granularity) is necessarily the way to do this. It is certainly general. But I'm wondering if this is over-general, and can end up forcing a rule ordering algorithm to make potentially bad ordering decisions. Might it be reasonable to do the enforced ordering based on a small set of known rule types, and just flag the unusual rules of each special type? The unusual rules that come to mind just at the instant are net, bayes, and awl. Maybe there are more, but I can't think what they would be at the moment. While it could be argued that an enumeration is just a form of priority that doesn't use numbers, it seems to me to have an advantage - you can change the order that you look at the enumerated values without having to change the values themselves. Also, it would prevent assigning 'useless' classifications such as priorities of 501, 502 and 503 to three user rules. An example of why I think an enumeration might be better: Right now all net rules are started first, since they take longest. But suppose we have a rule that will score -100 and the total positive score, including net rules, is only 100. Clearly it makes more sense to evaluate that single -100 rule before firing any of the net rules -- if the -100 rule triggers the net rule scores are moot, and we have wasted significant system resources. Doing that with priorities would be awkward. So: do we *really* need the existing priority structure? Or do we just need a method of identifying a very small number of rule classifications, eg: bayes, awl, net? (Anything else?) Loren
Re: PerMsgStatus
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Loren Wilton writes: Let me challenge or at least prod around the edges of this a bit to further my understanding. I think what you are saying is that priority is used (at least in part) to do the ordering that is known or believed to be required. However, there seems to be some ordering built into PMS itself, such as firing the net rules first and then harvesting the results later. That makes me believe that we probably have two methods of the same thing: some rules are ordered because the pms code is written to do them in a given order, and some rules are ordered because someone assigned a priority somewhere. I guess I'm mostly wondering if 'priority' as a number (at least one with a seemingly rather fine granularity) is necessarily the way to do this. It is certainly general. But I'm wondering if this is over-general, and can end up forcing a rule ordering algorithm to make potentially bad ordering decisions. Might it be reasonable to do the enforced ordering based on a small set of known rule types, and just flag the unusual rules of each special type? The unusual rules that come to mind just at the instant are net, bayes, and awl. Maybe there are more, but I can't think what they would be at the moment. While it could be argued that an enumeration is just a form of priority that doesn't use numbers, it seems to me to have an advantage - you can change the order that you look at the enumerated values without having to change the values themselves. Also, it would prevent assigning 'useless' classifications such as priorities of 501, 502 and 503 to three user rules. An example of why I think an enumeration might be better: Right now all net rules are started first, since they take longest. But suppose we have a rule that will score -100 and the total positive score, including net rules, is only 100. Clearly it makes more sense to evaluate that single -100 rule before firing any of the net rules -- if the -100 rule triggers the net rule scores are moot, and we have wasted significant system resources. Doing that with priorities would be awkward. Why? header BIG_NEGATIVE_RULE Foo =~ /bar/ score BIG_NEGATIVE_RULE -999 priority BIG_NEGATIVE_RULE 10 # ensure it runs first seems pretty simple and non-awkward imo ;) This is certainly the main form of early-exiting that I'd want to see -- a specific few tests (like USER_IN_WHITELIST) that run first (using priority) and cause early exit. The way priority works is that there's only 1 priority defined in the config files -- AWL at pri 1000. everything else is at the same priority, except for meta tests at META_TEST_MIN_PRIORITY and DNSBL harvesting which only happens after HARVEST_DNSBL_PRIORITY. We've been trying to reduce the number of wierd effects in the rule ordering and selection code, for example the fact that DNS tests are run first and then harvested later -- that *should* be implemented as some kind of general mechanism, instead of being keyed off the rule type, because doing it by rule type means that there'll be no way for plugins to support similar models for future unforeseen rule types without changes to PerMsgStatus and/or Conf. Also, AWL was previously not even a real rule - -- it was just hard-coded into PerMsgStatus. having priority let us fix both of these (although the DNS stuff is still only half-implemented ;). With a very small number of priorities, and the vast majority of rules being in one priority bucket, you can then optimise the rule selection in that bucket and have more or less the same effect without too much wierd stuff like side-effects of rule names, or rule types. btw, yes, short-circuiting breaks Bayes autolearning and the AWL. this is one of its downsides. However, by far the biggest downside is that we have never found a way to do automatic short-circuiting that didn't turn out at about the same speed as not short-circuiting, when measured against a corpus of mail that includes spam. ;) You should read the discussion that went on when priority was implemented last year: http://bugzilla.spamassassin.org/show_bug.cgi?id=2912 . Much of this was discussed then. - --j. -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Exmh CVS iD8DBQFC6mXUMJF5cimLx9ARAqaVAJ0f4WfH08qteaVQyTefK8ivIqeBNQCfbSIq ruV89/Z0Eq2xG1vITm0s+qw= =txDh -END PGP SIGNATURE-
Re: PerMsgStatus
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Loren Wilton writes: I just spent 45 minutes or so staring at the PerMsgStatus code and figuring out a bit more about how it works. Baroque! Still, there is the basis of a concept underlying the implementation, and it doesn't *look* like it would be all that hard to flop things around to work more the way I think they should. It looks like the main things that aren't obvious and I'll need to figure out something about are: a) what the heck are priorities, who sets them, and do they really have any justifiable purpose? Ie: can they just quietly vanish into the night with nobody being any the wiser? They order the rules -- or more correctly, sets of rules. Most rules are priority 500 (iirc), but some need to run earlier and some need to run later (e.g. AWL needs to run after all other rules). Running rules earlier is how we propose to implement early-exit -- certain rules can run before all others, and cause an early-exit if they fire. They cannot just vanish. ;) b) why were tests broken out into groups by test type and all the tests of a given type run at once? My best guess was an attempt at efficiency based on assumptions about data set size and cache threashing. Is there a known reason that it has to be this way, or would it work just as well to just run tests in 'whatever' order? two reasons: 1. reducing the number of items in a hash is good for efficiency, as it reduces hash collisions. 2. running all tests of a certain type in one block allows some optimisations; e.g. for the body rules, we can iterate through all lines in the body, and for each line, call all of the active body rules of that priority level one after another. (I'm not sure if we still do this or not btw.) it may work better to run in whatever order -- benchmarks are the one true authority here ;) c) are there known ways in Perl to actually dispose of memory items and have them really return memory to the available pool, or do you just hope that exiting scope and garbage collection may eventually do the job for you? { my $obj = [ ...something...]; } # $obj has gone out of scope. GC happens now and $obj is deleted in other words, once it goes out of scope, it is immediately GC'd. it's not like java, where it may be gc'd if you're lucky, the moon is full, and you call System.gc() three times in a row. (java programmers will know I'm not joking about that ;) If it's a member of a hash like $self, delete $self-{variable} is how you force it to be deleted. If something else has a ref to it, it won't be deleted, of course -- everything's ref counted. d) can you build an array/list/hash/whatever of procedure names/pointers and efficiently iterate over the structure calling the procedures in sequence? Will this be slower than generating an eval containing a bunch of lines calling the same procedures in sequence? You can indeed -- foreach my $fn_ref (@array) { $fn_ref-(...arguments...); } or even map { $_-(arguments...); } @array; But -- the bad news -- it will almost definitely be slower. The only way to find out is with benchmarks. the 'Benchmark' CPAN module can be very useful to measure that stuff. Do you have any insightful (or alternately: quick) answers to the above questions? I have a feeling that while I could make some deductions on the first two questions from tracking stuff into other modules, the *real* reasons are probably lost or stored in the group arcana of the dev's minds. It seems to me as a first whack, there isn't any huge reason that the rules couldn't be looked at en masse and a quick dependency tree built, then the results sorted in some convenient score-and-whatnot-based order, and then, instead of half a dozen essentially identical rule building procedures that exist now, just have one procedure that will make the test and calling procedures come into existance. One thing -- watch out for $score == 0. If a score is set to 0, the evaluation of the rule's code (be that an eval test, a regexp or whatever) should not happen; and rules can have their scores set to 0 in user prefs, so assuming that because a rule is 0 in the system-wide config, it'll never be run from then on, is not a safe assumption. I'm not even sure that you really need to pass much more than @self to the procedures, and let them find the data they want to play with as member variables on @self with known names. That's entirely true. (Although maybe Perl requires more parameters, I still don't understand things like @_ and the like.) @_ is the parameter list. btw accessing parameters passed to a function directly as stuff in @_ is faster than assigning variable names to them, in other words sub myfunction { return $_[0] + $_[1]; } is faster than sub myfunction { my ($foo,$bar) = @_; return $foo + $bar; } - --j. -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.5 (GNU
Re: PerMsgStatus
[Lots of stuff snipped] You know, it'd be nice if Daniel, or anyone else, checked in my optimised PMS.pm [*] in as a branch. That way it can be worked on easily by multiple people. An optimisation branch would mean you can continue with the current release work, while others work on performance. Matt. [*] I changed a few parts about how rules get compiled while at CEAS that should have been faster, only benchmarks showed no visible sign of improvements. I'd be happy to send it to anyone interested. __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __
Re: [Bug 4497] New: reorganise PerMsgStatus code
Loren Wilton [EMAIL PROTECTED] writes: Is there a way that [user rules] could be saved in a more-compiled state when used with spamd and similar? Maybe name the rules with the username as part of the procedure name, and just add them to the namespace the first time encountered? Beyond the memory bloating this could potentially cause and the potentially more complicated security aspects, it's important to optimize for the common case. I've been thinking we might be able to move all user rules into their own priority and then *only* those would be slower and overall performance would be as good as reasonably possible. Also, when there are no user rules, the user rule priority could just be skipped for good performance. In the long run, it might even be possible to give user rules priorities based on the user name (non-numeric ones) so they could be cached. So, are any user rule sets using priorities? Daniel -- Daniel Quinlan http://www.pathname.com/~quinlan/
Re: [Bug 4497] New: reorganise PerMsgStatus code
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Daniel Quinlan writes: Loren Wilton [EMAIL PROTECTED] writes: Is there a way that [user rules] could be saved in a more-compiled state when used with spamd and similar? Maybe name the rules with the username as part of the procedure name, and just add them to the namespace the first time encountered? Beyond the memory bloating this could potentially cause and the potentially more complicated security aspects, it's important to optimize for the common case. I've been thinking we might be able to move all user rules into their own priority and then *only* those would be slower and overall performance would be as good as reasonably possible. Also, when there are no user rules, the user rule priority could just be skipped for good performance. In the long run, it might even be possible to give user rules priorities based on the user name (non-numeric ones) so they could be cached. I don't get it -- how does that help? - --j. -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.5 (GNU/Linux) Comment: Exmh CVS iD8DBQFC5BJAMJF5cimLx9ARAtu/AKCaOu6/yYj46MxHwXiwiEfU54JdPwCfQhSM 6fR7KmFEJfaCgvZ/B3NJTzw= =0/Gr -END PGP SIGNATURE-
Re: [Bug 4497] New: reorganise PerMsgStatus code
Justin Mason [EMAIL PROTECTED] writes: I don't get it -- how does that help? Basically, user rule would always run last. Something like: for (@priorities) { standard prorities loop } if ($allow_user_rules $defined_rules{$user}) { do_xxx_tests($user, ...); # $user is the priority do_yyy_tests($user, ...); } Precompiling could be done by an early Plugin that runs an init message on each user that has configuration. The do_xxx_tests routines should not care about whether you are doing user rules, ideally. Daniel -- Daniel Quinlan http://www.pathname.com/~quinlan/
[Bug 4497] New: reorganise PerMsgStatus code
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497 Summary: reorganise PerMsgStatus code Product: Spamassassin Version: SVN Trunk (Latest Devel Version) Platform: Other OS/Version: other Status: NEW Severity: enhancement Priority: P5 Component: Libraries AssignedTo: dev@spamassassin.apache.org ReportedBy: [EMAIL PROTECTED] talking about this at the hackathon and there's pretty unanimous agreement that PerMsgStatus does too many different things. So here are the conceptual blocks of functionality in PerMsgStatus, and the proposed new classes to implement it: - context of scan --\ results of scan Mail::SpamAssassin::PerMsgStatus (keep this where it is, badly named or not.) - logic for check() .. Mail::SpamAssassin::Plugin::Check. (a new plugin to implement check logic, so that it can be swapped out for alternative check prioritisation implementations, short-circuiting implementations, etc.) - body formats / rendering -\ - URI list Mail::SpamAssassin::Message::Rendered (a new object to hold rendered state off Message. *NOT* Message itself, as this object has a ptr to {main} and {conf}, so can use configuration data during its work; it's also deleted when Maill::SpamAssassin::PerMsgStatus is deleted.) - logic for each rule type --\ - rule compilation Mail::SpamAssassin::Scanner - config parsing for rules --/ - rewrite . Mail::SpamAssassin::Plugin::Rewrite (a plugin that implements rewriting of mails, so that alternative rewriting methods can be used instead, optionally) Regarding backwards compatibility -- all public APIs remain compatible. This is essential (but easily done). Every public member variable remains on PerMsgStatus; every public method keeps a public facade method on PerMsgStatus that just calls into the real implementation code. this is post-3.1.0, just wanted to get this into bits and off the whiteboard... --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
Re: [Bug 4497] New: reorganise PerMsgStatus code
I know user rules aren't real popular with the sa dev community, however that attitude isn't universally shared by sa users. Therefore may I suggest: Would it be possible when reorganizing things to come up with some semi-persistant storage for compiled user rules, so that they don't have to be rewritten and recompiled in a bunch of evals for every message processed for the user? At the very least it should be moderately trivial to save the text passed to the evals into a file someplace, and then just eval that one file to get everything compiled. Timestamps would indicate if the pre-built stuff was out of date or not. Is there a way that they could be saved in a more-compiled state when used with spamd and similar? Maybe name the rules with the username as part of the procedure name, and just add them to the namespace the first time encountered? If there were a way to make user rules be a subclass of the main rules, then you could just call the user rules for each user, and any user rules (or I presume scores) would override the standard rules, and anything that didn't override would pass through to the main rules. Of course this assumes the rules was a class instance of some sort of other, and you could call ruleclass-dorules(message) or the like (however that C++ invocation would be spelled in Perl). Having all user rules remain in memory forever would certainly add some to the bloat of SA memory usage. However, if one assumes most users would have few rules compared to the number of overall system rules, then the increase might be reasonable; especially since you would be trading off that memory for the compile and build overhead on every message. Loren BTW, if the rules were in a class, then it would be possible to create the instance of that class at startup in just about exactly the same way user rule subclasses could be created. Then a hup, instead of taking spamd and children down, would just cause the children to discard the current rules instance and fabricate a new instance and continue onward.
[Bug 4411] $permsgstatus-get_uri_list too aggressive?
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411 [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Additional Comments From [EMAIL PROTECTED] 2005-06-22 09:42 --- I thought I'd changed this yesterday bug bugzilla doesn't show the mods. I'm changing the status to 'fixed' since it's been incorporated into 3.1. However, this doesn't really help me any since I'm not going to make any dependancies on 3.1 until 3.1.1 comes out probably (give peole time to move). On a related issue, however, I'm wondering what kind of forward strategy there is for unencoded i18n domain names. Right now, a raw IDN like http://www.bravÄ.nu causes some breakage. URIDNSBL ignores these entirely, and I also need to add similar code that ignores URIs with 8-bit characters. But at some point, it would probably be a good idea for get_uri_list to provide these in two forms--once in the original unencoded form, and also in the normalized (xn--) format. Is this work in 3.1 also, or is this something that hasn't been dealt with yet? --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4411] New: $permsgstatus-get_uri_list too aggressive?
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411 Summary: $permsgstatus-get_uri_list too aggressive? Product: Spamassassin Version: unspecified Platform: Other OS/Version: other Status: NEW Severity: enhancement Priority: P4 Component: Plugins AssignedTo: dev@spamassassin.apache.org ReportedBy: [EMAIL PROTECTED] I'm adding support for URI processing to my LDAPfilter plugin, and have noticed that a URI of http://username:[EMAIL PROTECTED]:nn/ will get parsed and returned as all of the following variations: mailto:username:[EMAIL PROTECTED] http://username:[EMAIL PROTECTED]:nn/ http://www.example.com:nn/ The mailto and the last http URIs should not be returned since they are not accurate representations of the input URI. I can kind of see why you'd want to return something like the last URI for your reduced view mechanism, but such a view would theoretically be provided without the port number, so I'm not sure if that's the intention or not. If you need/want to keep all of these in the outputs, can we get a separate array or parameter that just returns the normalized URIs? Thanks --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4411] $permsgstatus-get_uri_list too aggressive?
http://bugzilla.spamassassin.org/show_bug.cgi?id=4411 --- Additional Comments From [EMAIL PROTECTED] 2005-06-20 12:51 --- Subject: Re: New: $permsgstatus-get_uri_list too aggressive? On Mon, Jun 20, 2005 at 12:43:28PM -0700, [EMAIL PROTECTED] wrote: that a URI of http://username:[EMAIL PROTECTED]:nn/ will get parsed and mailto:username:[EMAIL PROTECTED] http://username:[EMAIL PROTECTED]:nn/ http://www.example.com:nn/ The mailto and the last http URIs should not be returned since they are not accurate representations of the input URI. The last one is. It's the actual URI without the login information, which is exactly what we want. The mailto is incorrect. I can kind of see why you'd want to return something like the last URI for your reduced view mechanism, but such a view would theoretically be provided without the port number, so I'm not sure if that's the intention or not. The idea is to remove the obfuscation of URIs so the rules don't have to account for everything like username and passwords. We do want to catch port numbers. If you need/want to keep all of these in the outputs, can we get a separate array or parameter that just returns the normalized URIs? This is possible in 3.1: don't call get_uri_list, call get_uri_detail_list which you can see the raw URIs as well as the canonified version separately. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4218] PerMsgStatus $status-finish() not getting registered
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218 [EMAIL PROTECTED] changed: What|Removed |Added AssignedTo|dev@spamassassin.apache.org |[EMAIL PROTECTED] Status|REOPENED|ASSIGNED --- Additional Comments From [EMAIL PROTECTED] 2005-04-11 19:57 --- Created an attachment (id=2783) -- (http://bugzilla.spamassassin.org/attachment.cgi?id=2783action=view) fix argh, this looked easy and then wasn't ;) sure enough, there was a bug. Basically, a temporary PerMsgStatus object is created *very* early on -- in lib/Mail/SpamAssassin/Message/Metadata/Received -- before the plugin was loaded, resulting in the 'per_msg_finish' callbacks being cached without the plugin's method in the list. fix was to get that object created *once* in Mail::SpamAssassin -- not every time a message is parsed. (it's much more efficient this way anyway.) long-term fix would be to get some of the DNS-lookup code out of Dns.pm, since having it there means it unnecessarily relies on a PerMsgStatus object being in existence... which is really not necessary for lookups in the Received header parsing code. couple of other minor fixes: added some doco to Plugin::per_msg_finish() to note that only circular refs need to be cleaned up normally. and added a test to the test suite. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4218] PerMsgStatus $status-finish() not getting registered
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218 [EMAIL PROTECTED] changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|WORKSFORME | --- Additional Comments From [EMAIL PROTECTED] 2005-04-06 05:48 --- doesn't work for me. perhaps there is a version comment that would be appropriate. perhaps the docs are wrong, and the bug needs to be changed. ? --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 3885] Undefined subroutine Mail::SpamAssassin::PerMsgStatus::is_razor2_available
http://bugzilla.spamassassin.org/show_bug.cgi?id=3885 --- Additional Comments From [EMAIL PROTECTED] 2005-03-30 06:42 --- Subject: Re: Undefined subroutine Mail::SpamAssassin::PerMsgStatus::is_razor2_available [EMAIL PROTECTED] writes: http://bugzilla.spamassassin.org/show_bug.cgi?id=3885 --- Additional Comments From [EMAIL PROTECTED] 2005-03-29 18:54 --- Now that the reporting stuff is all in the plugin, can I assume this is no longer an issue? Correct. This can be closed. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 3885] Undefined subroutine Mail::SpamAssassin::PerMsgStatus::is_razor2_available
http://bugzilla.spamassassin.org/show_bug.cgi?id=3885 [EMAIL PROTECTED] changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Additional Comments From [EMAIL PROTECTED] 2005-03-30 07:09 --- Yep, it's fixed. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
[Bug 4218] New: PerMsgStatus $status-finish() not getting registered
http://bugzilla.spamassassin.org/show_bug.cgi?id=4218 Summary: PerMsgStatus $status-finish() not getting registered Product: Spamassassin Version: 3.0.2 Platform: All OS/Version: Linux Status: NEW Severity: minor Priority: P3 Component: Rules (Eval Tests) AssignedTo: dev@spamassassin.apache.org ReportedBy: [EMAIL PROTECTED] http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_PerMsgStatus.html says that $status-finish() should get called when the PerMsgStatus object is destroyed. However, the finish() function in my plug-in is not getting registered or called. --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee.
speedup for PerMsgStatus
OK, here's a trick I was thinking about. Currently we have these massive hashtable refs: $pms-{conf}-{rbl_evals} {head_tests} {body_tests} {scoreset}-[0,1,2,3] {tflags} Each of those is keyed by the name of the rule. Now the thing is, this is really wasteful - speed-wise (not really RAM-wise) -- just performing all those hash lookups! When a message is scanned, each of the _evals and _tests hashes are iterated over, extracting the rule name and rule text for every entry. In reality, we only need the rule text at this point, *not* the name. - We have about 700 rules - 99% of the time, any given rule will NOT fire, so we should speedup: foreach my $rulepat (@{all_rules_of_given_type}) { ... if ($whatever =~ /$rulepat/) { # hit! } # otherwise miss! } we should speedup the 'foreach', the rule-text fetch, and the 'miss'. note that we don't need to know the rule name until the rule gives us a hit! so I'm thinking that we should replace parts of this with arrays, using integer indexes, instead of hashes with string indexes. Array lookups are quite a bit faster than hash lookups. Each array would have RAM usage of -- guessing -- (size_of_whats_stored + 9100) bytes, since arrays in perl have an overhead of about 13 bytes per entry. (this is about the same as hashes iirc, poss a bit less. not sure if there'd be RAM savings there, since perl hash keys are refcounted shared strings iirc.) we can optimize for the rules that are loaded from the system-wide config, because (a) allow_user_rules is almost always off, and (b) even if it's on, I'd guess that most times 99% of the rules that a scan runs would be system-wide rules anyway. (we can deal with user-rules by just pushing them onto the rules array when they're defined, same as the system rules are done.) --j.
Re: speedup for PerMsgStatus
[EMAIL PROTECTED] (Justin Mason) writes: Array lookups are quite a bit faster than hash lookups. +1 -- Daniel Quinlan ApacheCon! 13-17 November (3 SpamAssassin http://www.pathname.com/~quinlan/ http://www.apachecon.com/ sessions more)
Re: speedup for PerMsgStatus
so I'm thinking that we should replace parts of this with arrays, using integer indexes, instead of hashes with string indexes. Array lookups are quite a bit faster than hash lookups. I have no idea how painful linked lists are in Perl (or if they even exist). But if you are essentially iterating over all rules, they could have some distinct advantages for possible future optimizations. For instance: just link user rules onto the end of the last system rule. For instance: reorder rules with a trivial LRU or counted LRU algorithm to put the most-hit rules first, by dynamic experience. Then stop processing rules if score total x. (Insert obvious arguments about negative-scoring rules here.) LRUs are really cheap in most languages, and if you can do them in Perl this could be a huge advantage when coupled with an early bailout algorithm. Loren
Re: speedup for PerMsgStatus
I have no idea how painful linked lists are in Perl (or if they even exist). Why are you commenting then??? Because they are very useful, as I pointed out. They don't exist as a native data structure. Arrays are fast, painless, and dynamically sized. They don't exist as a native data structure in C++ either. But they get a lot of use. Even when template classes exist to do reasonably fast and reasonably painless dynamic arrays. For certain things (like collections of objects that can get reordered frequently) they are generally more efficient than dynamic arrays. If there is an SA coding requirement for only using native data structures, then forget lists. If no such requirement exists and there is an interest in optimizing performance, then they should be a tool to be considered. Loren
Re: speedup for PerMsgStatus
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Loren Wilton writes: I have no idea how painful linked lists are in Perl (or if they even exist). Why are you commenting then??? Because they are very useful, as I pointed out. They don't exist as a native data structure. Arrays are fast, painless, and dynamically sized. They don't exist as a native data structure in C++ either. But they get a lot of use. Even when template classes exist to do reasonably fast and reasonably painless dynamic arrays. For certain things (like collections of objects that can get reordered frequently) they are generally more efficient than dynamic arrays. If there is an SA coding requirement for only using native data structures, then forget lists. If no such requirement exists and there is an interest in optimizing performance, then they should be a tool to be considered. Unfortunately, perl speed optimisation doesn't work like that. The reason is that perl native data structures (arrays, hashes, strings, numeric SVs, etc.) can be looked up in one perl OP, but a user-defined data structure cannot. The OP is the lowest level command in the perl VM, equivalent to an assembly opcode, and as such is very very fast -- since the innards of an OP is pure C. That's why regexp matching in perl is as fast as it is in C -- because a regexp match is compiled to a single OP. (Perl's not like Java in that respect. Perl's vm has quite high-level opcodes, whereas java's is more like real assembly and more low-level. that's why perl is faster than java ;) Unfortunately when reading fields in a perl data structure like a hash or array, and traversing reference chains, each variable access, and ref derefence, is an individual OP. So the upshot is that using a native perl data type will always be faster than defining a new non-native data type structure in perl. cf. http://www.ccl4.org/~nick/P/Fast_Enough/#ops_are_bad,_m%27kay for more details... in fact, I'm even considering looking into some use of pack() here for the very reasons noted here ;) (ps. I'm sure if I got any of that wrong Matt will correct me ;) - --j. -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Exmh CVS iD8DBQFBWQRrQTcbUG5Y7woRAvGHAJwOAxmPKpX09LoiZBCsYypL5UzA2ACgvbTm 6uB3igI7ObXF+vn+jeOmN98= =cQEI -END PGP SIGNATURE-
Re: speedup for PerMsgStatus
That is an unreasonablly nasty retort. Original Message: - From: Daniel Quinlan [EMAIL PROTECTED] Date: 27 Sep 2004 22:53:38 -0700 To: [EMAIL PROTECTED], dev@spamassassin.apache.org Subject: Re: speedup for PerMsgStatus Loren Wilton [EMAIL PROTECTED] writes: I have no idea how painful linked lists are in Perl (or if they even exist). Why are you commenting then??? They don't exist as a native data structure. Arrays are fast, painless, and dynamically sized. -- Daniel Quinlan ApacheCon! 13-17 November (3 SpamAssassin http://www.pathname.com/~quinlan/ http://www.apachecon.com/ sessions more) mail2web - Check your email from the web at http://mail2web.com/ .