Re: Attachments with no Content-Type mime header
I've checked: the loop used in the plugin,

    foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {

does find each attachment, including the ones without a Content-Type header. The method below can be used on the parts it finds, regardless of the missing Content-Type.

Paul

From: Pedro David Marco
Reply-To: Pedro David Marco
Date: Wednesday, 16 August 2017 at 23:49
To: Paul Stead, "users@spamassassin.apache.org"
Subject: Re: Attachments with no Content-Type mime header

Thanks Paul, but your plugin uses find_parts(), which makes it pointless if there is no Content-Type mime header...

PedroD

>The magic number or file signature can be helpful in determining the filetype:
>https://en.wikipedia.org/wiki/List_of_file_signatures
>I make use of this in the OLEMacro plugin:
>https://github.com/fmbla/spamassassin-olemacro/
>Paul Stead

--
Paul Stead
Systems Engineer
Zen Internet
Re: Attachments with no Content-Type mime header
Thanks Paul, but your plugin uses find_parts(), which makes it pointless if there is no Content-Type mime header...

PedroD

>The magic number or file signature can be helpful in determining the filetype:
>https://en.wikipedia.org/wiki/List_of_file_signatures
>I make use of this in the OLEMacro plugin:
>https://github.com/fmbla/spamassassin-olemacro/
>Paul Stead
Re: Attachments with no Content-Type mime header
From: Pedro David Marco
Reply-To: Pedro David Marco
Date: Wednesday, 16 August 2017 at 22:32
To: David Niklas, "users@spamassassin.apache.org"
Subject: Re: Attachments with no Content-Type mime header

Hi David...

I agree with you... but some functions like find_parts() do not work if there are no Content-Type headers... making the analysis of some attachments impossible... I am writing a plugin to detect suspicious PDFs...

Maybe there's a better way to analyze attachments than using find_parts()

Thanks!

--
PedroD

The magic number or file signature can be helpful in determining the filetype:
https://en.wikipedia.org/wiki/List_of_file_signatures

I make use of this in the OLEMacro plugin:
https://github.com/fmbla/spamassassin-olemacro/

--
Paul Stead
Systems Engineer
Zen Internet
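A minimal sketch of the magic-number idea, written in Python rather than the Perl that SpamAssassin plugins use, so it is an illustration only and not the OLEMacro plugin's code. It assumes Python's standard-library `email` package; the `SIGNATURES` table, the helper names, and the sample message are hypothetical. The three signatures are taken from the Wikipedia list linked above. The attachment in the sample has no Content-Type header, as in the case discussed in this thread.

```python
import base64
from email import message_from_string

# Hypothetical signature table: leading bytes -> human-readable label.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole2 (legacy MS Office)",
    b"PK\x03\x04": "zip (also OOXML Office files)",
}


def sniff_type(payload: bytes) -> str:
    """Return a label based on the first bytes of the decoded payload."""
    for magic, label in SIGNATURES.items():
        if payload.startswith(magic):
            return label
    return "unknown"


# Sample message: the second part deliberately has no Content-Type header.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "See attachment.\n"
    "--b1\n"
    'Content-Disposition: attachment; filename="details.pdf"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(b"%PDF-1.4\n%fake body\n").decode("ascii")
    + "\n--b1--\n"
)


def attachment_types(raw: str) -> dict:
    """Map each attachment filename to the type guessed from its content."""
    msg = message_from_string(raw)
    found = {}
    for part in msg.walk():
        if part.is_multipart():
            continue
        if part.get_content_disposition() != "attachment":
            continue
        # decode=True undoes the Content-Transfer-Encoding (base64 here).
        data = part.get_payload(decode=True) or b""
        found[part.get_filename()] = sniff_type(data)
    return found


RESULT = attachment_types(RAW)
```

The point of the sketch is that the guess is made from the decoded bytes alone, so neither the declared Content-Type nor the filename extension has to be trusted.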
Re: Attachments with no Content-Type mime header
Hi David...

I agree with you... but some functions like find_parts() do not work if there are no Content-Type headers... making the analysis of some attachments impossible... I am writing a plugin to detect suspicious PDFs...

Maybe there's a better way to analyze attachments than using find_parts()

Thanks!

--
PedroD

>You should not trust what the file's extension says the file is. Also,
>file(1) does not yet do a good enough job to be reliable this way.
>As for guessing, I think that the best guess that could be applied would
>be a test of the file to see if, once decoded, it is a utf-8 encoded,
>ASCII, or iso8859-X encoded text file. Failing that I would assume it is
>either an MS doc/ppt/spreadsheet/etc, pdf file, or pure binary. Then you
>could try trusting the file extension.
>Otherwise, it is a text file and could contain an innocent html or
>an uncompressed ps file or a dangerous JS infection program.
>Either way I'd be really careful.
>
>What is your use case?
>What do you intend to do with a pdf file vs. an html one?
>
>Sincerely,
>David
Re: TxRep can't use SQLBasedAddrList factory module
> Please open a bug on bugzilla. Nothing jumps to mind. If you can
> include the version of mysql just for completeness sake as well as how
> you created the tables, that would be good.
>

I'll do that, unless you can spot an error below.

> Do you have a line like user_awl_sql_table txrep?

Yes.

> And can you run a show tables and describe like such?
> show tables;
> describe txrep;

MariaDB [spamdb]> show tables; describe txrep;
+-------------------+
| Tables_in_spamdb  |
+-------------------+
| bayes_expire      |
| bayes_global_vars |
| bayes_seen        |
| bayes_token       |
| bayes_vars        |
| txrep             |
+-------------------+
6 rows in set (0.00 sec)

+----------+--------------+------+-----+-------------------+-----------------------------+
| Field    | Type         | Null | Key | Default           | Extra                       |
+----------+--------------+------+-----+-------------------+-----------------------------+
| username | varchar(100) | NO   | PRI | NULL              |                             |
| email    | varchar(191) | NO   | PRI | NULL              |                             |
| ip       | varchar(48)  | NO   | PRI | NULL              |                             |
| count    | int(11)      | NO   |     | 0                 |                             |
| totscore | float        | NO   |     | 0                 |                             |
| signedby | varchar(191) | NO   | PRI | NULL              |                             |
| last_hit | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+----------+--------------+------+-----+-------------------+-----------------------------+
7 rows in set (0.00 sec)

I didn't have a 'last_hit'-column originally, but adding it (as above) did not change anything.

One further difference: 'email' and 'signedby' are varchar(191) instead of (255) because the db is utf8mb4, which means varchar can be at most 191 characters long.

Best,
christopher
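The 191 figure can be checked with a little arithmetic. The sketch below is in Python and assumes the limit in play is InnoDB's 767-byte maximum for a single indexed column under the older COMPACT/REDUNDANT row formats. That is an assumption about this setup, not something stated in the thread, and the constant and function names are hypothetical. Under that assumption the limit applies to indexed columns such as 'email' and 'signedby' (both part of the primary key), not to every varchar.

```python
# Assumed per-column index limit (InnoDB, COMPACT/REDUNDANT row formats).
INDEX_PREFIX_LIMIT_BYTES = 767

# Worst-case bytes needed to store one character in each character set.
MAX_BYTES_PER_CHAR = {
    "utf8": 3,      # MySQL's 3-byte utf8 (utf8mb3)
    "utf8mb4": 4,   # full 4-byte UTF-8
}


def max_indexed_chars(charset: str) -> int:
    """Longest varchar (in characters) that still fits the index limit."""
    return INDEX_PREFIX_LIMIT_BYTES // MAX_BYTES_PER_CHAR[charset]


UTF8_MAX = max_indexed_chars("utf8")        # 767 // 3
UTF8MB4_MAX = max_indexed_chars("utf8mb4")  # 767 // 4
```

This gives 255 characters for 3-byte utf8 and 191 for utf8mb4, which matches the difference between the two table definitions shown in this thread.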
Re: Attachments with no Content-Type mime header
On Fri, 11 Aug 2017 18:28:56 +0000 (UTC) Pedro David Marco wrote:

> Hi everybody...
> When an email has a MIME part with no Content-Type header, is there any
> way to force SA to "guess" the format based on other criteria... file
> extension, for example? Example:
>
> Content-Disposition: attachment; filename="details.pdf"
> Content-Transfer-Encoding: base64
>
> Thanks!
> PedroD

You should not trust what the file's extension says the file is. Also, file(1) does not yet do a good enough job to be reliable this way.

As for guessing, I think that the best guess that could be applied would be a test of the file to see if, once decoded, it is a utf-8 encoded, ASCII, or iso8859-X encoded text file. Failing that I would assume it is either an MS doc/ppt/spreadsheet/etc, pdf file, or pure binary. Then you could try trusting the file extension.

Otherwise, it is a text file and could contain an innocent html or an uncompressed ps file or a dangerous JS infection program. Either way I'd be really careful.

What is your use case?
What do you intend to do with a pdf file vs. an html one?

Sincerely,
David
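David's text-versus-binary test could be sketched roughly as follows. This is Python and only one possible reading of the suggestion, not SpamAssassin code. The function names are hypothetical, and the rules used here (control bytes mean binary; bytes in the 0x80-0x9F range rule out ISO-8859-X, where that range holds only control codes) are assumptions chosen for the sketch. The input is the payload after base64 or quoted-printable decoding.

```python
TEXT_CONTROLS = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return


def has_binary_controls(data: bytes) -> bool:
    """True if data contains control bytes that plain text rarely uses."""
    return any((b < 0x20 and b not in TEXT_CONTROLS) or b == 0x7F for b in data)


def classify_payload(data: bytes) -> str:
    """Guess whether an already-decoded MIME payload is text or binary."""
    if has_binary_controls(data):
        return "binary"
    try:
        data.decode("ascii")
        return "ascii text"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8 text"
    except UnicodeDecodeError:
        pass
    # Any byte string decodes as ISO-8859-1, so decoding alone proves nothing.
    # Assumed heuristic: reject data that uses the C1 control range.
    if not any(0x80 <= b <= 0x9F for b in data):
        return "iso-8859-x text (guess)"
    return "binary"


SAMPLES = {
    "ascii": classify_payload(b"hello\n"),
    "utf8": classify_payload("h\u00e9llo".encode("utf-8")),
    "latin1": classify_payload("h\u00e9llo".encode("latin-1")),
    "binary": classify_payload(b"%PDF-1.4\n\x00\x01\x02"),
}
```

A "text" result says nothing about safety: as noted above, text can still be HTML, PostScript, or JavaScript, and a file that happens to contain only printable bytes is classified as text even if its format is not.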
Re: TxRep can't use SQLBasedAddrList factory module
On 8/16/2017 4:38 AM, Christopher Engelhard wrote:

I'd start by giving it all perms (excepting things like GRANT), see if it works, and then scale back the perms until you find the minimal necessary set.

After giving the user full permissions I still get the exact same error message(s).

For completeness' sake I tried the Bayes module with just spamassassin (no amavis), that works as well. 'spamdb' contains the tables for Bayes and TxRep, and all are accessed using the same user/password/privileges. Bayes works, TxRep doesn't, even with full privileges.

Please open a bug on bugzilla. Nothing jumps to mind. If you can include the version of mysql just for completeness sake as well as how you created the tables, that would be good.

Actually one more set of questions before a bug.

Do you have a line like user_awl_sql_table txrep?

And can you run a show tables and describe like such?

show tables;
describe txrep;

+----------+--------------+------+-----+-------------------+-------+
| Field    | Type         | Null | Key | Default           | Extra |
+----------+--------------+------+-----+-------------------+-------+
| username | varchar(100) | NO   | PRI |                   |       |
| email    | varchar(255) | NO   | PRI |                   |       |
| ip       | varchar(40)  | NO   | PRI |                   |       |
| count    | int(11)      | NO   |     | 0                 |       |
| totscore | float        | NO   |     | 0                 |       |
| signedby | varchar(255) | NO   | PRI |                   |       |
| last_hit | timestamp    | NO   | MUL | CURRENT_TIMESTAMP |       |
+----------+--------------+------+-----+-------------------+-------+
7 rows in set (0.00 sec)

Regards,
KAM
Off-topic RoboCall Spammers Pay Up & Support Inclusion in Technology
Morning Everyone!

So 1st, a sort-of spammy topic. There is a US legal settlement for $300 to $900 if some robocallers bugged you with voice-spam about their cruises. Check whether your number is included at: https://www.rmgtcpasettlement.com/Landing.aspx

Article for more info: http://www.miamiherald.com/news/nation-world/national/article167331957.html

2nd, I was annoyed by recent discussions on a woman's place (or lack thereof) in technical fields. So I took a few minutes and explained why Meritocracy is so important to me (and the ASF). Read more at https://www.linkedin.com/feed/update/urn:li:activity:6303220271335698432

NOTE: free Apache wristbands almost gone.

Regards,
KAM
Re: TxRep can't use SQLBasedAddrList factory module
> I'd start by giving it all perms (excepting things like GRANT), see if
> it works, and then scale back the perms until you find the minimal
> necessary set.

After giving the user full permissions I still get the exact same error message(s).

For completeness' sake I tried the Bayes module with just spamassassin (no amavis), that works as well. 'spamdb' contains the tables for Bayes and TxRep, and all are accessed using the same user/password/privileges. Bayes works, TxRep doesn't, even with full privileges.