Re: base64 encoded emails

2019-10-18 Thread Ralph Seichter
* Fourhundred Thecat:

> What is the legitimate reason to use base64 encoded emails ?

Languages with an alphabet size of more than the puny 26 characters used
in English, for example. ;-) While quoted-printable encoding would do as
well in many cases, MUAs may opt to use base64 instead. The primary
disadvantage I see is the inherent waste of space, but given the
evolution of storage space and network connectivity, that's a minor
issue to me in 2019.

> Would it be crazy to want to configure Postfix to not accept base64 ?

Very much so, assuming that you wish to communicate with anybody not
using only US-ASCII, or that you wish to keep receiving attachments.

> I believe email should be plaintext.

The introduction of RFC1341 in 1992 made it quite obvious that other
people think differently. While plain text is still very much a thing in
mailing lists (and rightly so), modern email uses encompass much more
than sending text messages.

-Ralph


Re: base64 encoded emails

2019-10-18 Thread Jaroslaw Rafa
On 18.10.2019 at 08:53:38, Bernardo Reino wrote:
> On 2019-10-17 12:17, Jaroslaw Rafa wrote:
> >So you just can't block HTML, because you'll cut yourself off of many
> >important messages that you actually want to receive. (However, I give
> >HTML-only messages without a plaintext part quite a large spam score
> >in my antispam filter).
> 
> Interesting to read that you give a spam score based on non-content
> aspects, where you have, in the last few days and ad nauseam,
> repeatedly argued that spam/no-spam should be based solely on
> content and not on other meta-data :)

HTML vs plaintext *is* a content aspect - at least in my opinion.
-- 
Regards,
   Jaroslaw Rafa
   r...@rafa.eu.org
--
"In a million years, when kids go to school, they're gonna know: once there
was a Hushpuppy, and she lived with her daddy in the Bathtub."


Re: base64 encoded emails

2019-10-18 Thread Bernardo Reino

On 2019-10-17 12:17, Jaroslaw Rafa wrote:
> So you just can't block HTML, because you'll cut yourself off of many
> important messages that you actually want to receive. (However, I give
> HTML-only messages without a plaintext part quite a large spam score
> in my antispam filter).

Interesting to read that you give a spam score based on non-content
aspects, where you have, in the last few days and ad nauseam, repeatedly
argued that spam/no-spam should be based solely on content and not on
other meta-data :)


Cheers.


Re: base64 encoded emails

2019-10-17 Thread Chris Wedgwood
> What is the legitimate reason to use base64 encoded emails ?

i see quite a lot of legitimate email as base64 encoded

> Seems to me, it is only being used by spammers to complicate
> body_checks

any modern checker can and will decode base64 or indeed other message
details (the cost of doing so is quite low)

> Would it be crazy to want to configure Postfix to not accept base64?

that will break a lot of legitimate sources


Re: base64 encoded emails

2019-10-17 Thread @lbutlr
On 17 Oct 2019, at 08:35, Bill Cole wrote:
> On 17 Oct 2019, at 7:51, @lbutlr wrote:
>> Also, of course, some plaintext messages still have to be enⅽoded.
>> 
>> Like this one.
> 
> But not always in Base64. :)

True, the sender rarely has any control over how the message is encoded.

> Many clients only do QP or only do Base64 when the content makes it 
> necessary, some make a judgment call based (usually) on the frequency of 
> non-ASCII characters. From a code standpoint, just doing Base64 is simpler 
> and more robust.

I would love to see an end to QP; it’s a mess and it breaks too frequently.


-- 
I laugh in the face of danger. Then I hide until it goes away.



Re: base64 encoded emails

2019-10-17 Thread Bill Cole

On 17 Oct 2019, at 7:51, @lbutlr wrote:
> Also, of course, some plaintext messages still have to be enⅽoded.
>
> Like this one.

But not always in Base64. :)

Many clients only do QP or only do Base64 when the content makes it
necessary; some make a judgment call based (usually) on the frequency of
non-ASCII characters. From a code standpoint, just doing Base64 is
simpler and more robust.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)


Re: base64 encoded emails

2019-10-17 Thread Bill Cole

On 17 Oct 2019, at 4:48, Fourhundred Thecat wrote:
> Hello,
>
> I would like to ask what the Postfix community thinks about base64
> encoded emails.

They are necessary.

> What is the legitimate reason to use base64 encoded emails ?


Any characters outside of the US-ASCII set must be encoded using Base64 
or Quoted-Printable, because it remains unsafe to assume that all mail 
transport paths are 8-bit safe. Choosing to use Base64 instead of QP is 
a reasonable decision for messages with a substantial number of 
non-ASCII characters and is simpler for the composing client, since it 
encodes all bytes in a manner that always decodes perfectly to the 
original input, while QP demands per-character judgment by the encoder 
and has corner cases where the encode/decode cycle is imperfect.
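
To illustrate the point about Base64 decoding back to the original input, here is a minimal Python sketch using the standard library's email package. The sample text is made up for the example, and forcing the encoding with cte="base64" is only one way a composing client might behave; it does not describe any particular MUA.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Made-up sample body: plain text, but with non-ASCII characters.
text = "Grüße aus Köln\n"

msg = EmailMessage()
# Ask the library for Base64; cte="quoted-printable" is the other 7-bit option.
msg.set_content(text, cte="base64")

# What would travel over the wire: headers plus a pure-ASCII Base64 body.
wire = msg.as_bytes()
print(wire.decode("ascii"))

# The receiving side parses the message and gets the exact text back.
parsed = message_from_bytes(wire, policy=policy.default)
decoded = parsed.get_content()
```

The message is still text/plain as far as its content type is concerned; only the transfer encoding differs.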


> Seems to me, it is only being used by spammers to complicate
> body_checks


That is simply not true. It is used by some email clients as their 
default encoding for UTF-8 text and by Sendmail when it needs to pass an 
8-bit message to a mailer that isn't defined as 8-bit safe.



> Would it be crazy to want to configure Postfix to not accept base64 ?


Yes.


> I believe email should be plaintext.


I do not disagree.

There are many things in the world which should not exist but do or 
should exist but do not. Recognizing which 'shoulds' can be achieved is 
an important cognitive skill.



> I don't like HTML emails either.


I do not disagree.

I also do not like getting old. Short of terminating the process, I will 
continue to age indefinitely.



> If somebody feels that his message needs fancy formatting, he should send
> it as pdf attachment. But emails should stay plaintext.


General-purpose PDF has an even greater capacity than HTML for being
terrible in email.


This is a classic "Lost Cause" argument. The transition from strictly 
plaintext email to complex formatting and graphics in email was 
imperfect but was essential to its survival and we have the suboptimal 
result of HTML email as a de facto standard. That's the reality, and the 
world is not going to revert that change.


> Hypothetically, If I wanted to block base64 and HTML, what would be the
> best way to do it ?


Shut off your mail system?

It would be trivially easy with header_checks. MIME exists so that MUAs
can identify the nature of non-plaintext email. See the header_checks(5)
man page and CONTENT_INSPECTION_README.



> At what stage in the mail delivery pipeline ?


By necessity, content inspection has to happen in the second stage of 
the DATA command or after message acceptance and queueing. Rejection by 
header_checks or mime_header_checks occurs at the end of DATA.


If you want to try something more subtle, as you should, you need a real 
mail filter like MIMEDefang with or without Apache SpamAssassin. 
SpamAssassin can also be used via other milters, in a Postfix 
content_filter, or via the Postfix SMTP proxy architecture. MIMEDefang 
includes a robust MIME parser that can be used to strip MIME parts out 
and/or reconstruct them and SpamAssassin exposes the decoded and 
rendered content of Base64/HTML parts to its many filtering rules. It 
also has rule types that operate on decoded but not rendered HTML and on 
raw undecoded message data. The default ruleset includes many rules that
identify malformed HTML, gratuitous Base64 or QP encoding, and other
technical quirks that correlate with mail being spam.
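
The idea of matching rules against decoded content rather than raw message data can be sketched in Python with the standard library's email package. This is not how SpamAssassin or MIMEDefang are implemented; the function names and the sample rule below are hypothetical and only show why Base64 does not hide text from a filter that decodes first.

```python
import re
from email import message_from_bytes, policy
from email.message import EmailMessage

def decoded_text_parts(raw_message):
    """Yield (content_type, decoded_text) for every text/* part."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            # get_content() undoes Base64/QP and decodes the charset.
            yield part.get_content_type(), part.get_content()

# Hypothetical rule, for illustration only.
RULE = re.compile(r"cheap\s+pills", re.IGNORECASE)

def matches_rule(raw_message):
    """True if any decoded text part matches the rule."""
    return any(RULE.search(text) for _, text in decoded_text_parts(raw_message))

# Build a Base64-encoded sample so the rule text is not visible in the raw bytes.
sample = EmailMessage()
sample.set_content("Buy CHEAP pills today\n", cte="base64")
raw = sample.as_bytes()

print("visible in raw bytes:", b"CHEAP" in raw)
print("matched after decoding:", matches_rule(raw))
```

A real filter would also need to cope with unknown charsets and malformed MIME, which this sketch ignores.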



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)


Re: base64 encoded emails

2019-10-17 Thread Richard Damon
On 10/17/19 4:48 AM, Fourhundred Thecat wrote:
> Hello,
>
> I would like to ask what the Postfix community thinks about base64
> encoded emails.
>
> What is the legitimate reason to use base64 encoded emails ?
>
> Seems to me, it is only being used by spammers to complicate body_checks
>
> Would it be crazy to want to configure Postfix to not accept base64 ?
>
> I believe email should be plaintext. I don't like HTML emails either. If
> somebody feels that his message needs fancy formatting, he should send
> it as pdf attachment. But emails should stay plaintext.
>
> Hypothetically, If I wanted to block base64 and HTML, what would be the
> best way to do it ?
> At what stage in the mail delivery pipeline ?
>
> thanks,
>
Email transport is not always 8-bit clean, so messages that use
characters beyond ASCII (and most languages other than English need
some extra characters) need some method to encode those characters
in a 7-bit way.

There are two basic encoding choices to do this, Quoted-Printable and
Base64. In Quoted-Printable every byte outside of ASCII (and the
character =) is encoded as =XX, where XX is the hex value of the byte
(so with UTF-8 character encoding, every non-ASCII character becomes
2-4 sets of =XX). This leaves the message mostly readable undecoded, but
can greatly expand the size of the message (up to roughly three times)
if there are a lot of non-ASCII characters.
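
To make the =XX form concrete, here is a minimal Python sketch using the standard library's quopri module. The two sample words are arbitrary choices for illustration: one mostly ASCII, one entirely non-ASCII.

```python
import quopri

def qp(text):
    """Quoted-Printable encode the UTF-8 bytes of text."""
    return quopri.encodestring(text.encode("utf-8"))

# Arbitrary samples: a Latin word with one accented letter, and a Cyrillic word.
latin = "café"
cyrillic = "привет"

for sample in (latin, cyrillic):
    raw = sample.encode("utf-8")
    encoded = qp(sample)
    # Only the encoded (pure ASCII) form is printed.
    print(encoded.decode("ascii"), f"({len(raw)} bytes -> {len(encoded)} bytes)")
```

The Latin word stays mostly readable, while every byte of the Cyrillic word turns into a three-character =XX group.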

The other encoding is Base64, where every character of the encoded
output carries 6 bits of the message, so every 3 bytes of content are
encoded as 4 ASCII characters. This always makes the message longer (by
about a third), but the worst-case expansion is much smaller.
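
The size trade-off between the two encodings can be checked with a short Python sketch using the standard library's base64 and quopri modules. The sample strings are made up for illustration, and the sizes ignore the line breaks that a mail encoder inserts into long Base64 bodies.

```python
import base64
import quopri

def encoded_sizes(text):
    """Return (raw, Base64, Quoted-Printable) sizes in bytes for UTF-8 text."""
    raw = text.encode("utf-8")
    return len(raw), len(base64.b64encode(raw)), len(quopri.encodestring(raw))

# Made-up samples: one pure ASCII, one entirely Cyrillic.
ascii_text = "plain text"
cyrillic_text = "привет"

for label, text in (("ASCII", ascii_text), ("Cyrillic", cyrillic_text)):
    raw, b64, qp = encoded_sizes(text)
    print(f"{label}: raw={raw} base64={b64} quoted-printable={qp}")
```

Base64 costs about a third more in both cases, whereas Quoted-Printable is free for ASCII and far more expensive for text that is mostly non-ASCII.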

These transfer encodings can be used even for 'plain text'; all that is
required is that the character set not be ASCII. HTML messages actually
have their own way to avoid needing this sort of encoding, by using HTML
entities for the extended characters.

It is also quite possible that the sender didn't choose the encoding
method: the message was sent out as 8-bit (because their mail system
accepts 8-bit messages) and somewhere in the transport pipeline it got
converted to be 7-bit clean with Base64 or Quoted-Printable encoding,
with the choice being made in the pipeline.

Optimally, the encoder would choose the best encoding for a given
message, and if that were done Quoted-Printable would likely be chosen
for most messages in a language that uses the Latin alphabet, but many
encoders will just fall back to base64 as it is simpler, and
Quoted-Printable can be very bad for some messages.

For me, a significant share (maybe 10-20%) of ordinary messages that
look to be plain text are base64 encoded, so reject them only if you are
willing to lose that much legitimate email. (And those messages are
fully in accordance with the RFCs.)

-- 
Richard Damon



Re: base64 encoded emails

2019-10-17 Thread @lbutlr
On 17 Oct 2019, at 02:48, Fourhundred Thecat <400the...@gmx.ch> wrote:
> I believe email should be plaintext. I don't like HTML emails either. If
> somebody feels that his message needs fancy formatting, he should send
> it as pdf attachment. But emails should stay plaintext.

Have fun with that windmill.

It is not 1989. Or even 1999. That battle was lost ages ago.

Also, of course, some plaintext messages still have to be enⅽoded.

Like this one.


-- 
People who would not believe a High Priest if he said the sky was blue,
and was able to produce signed affidavits to this effect from his
white-haired old mother and three Vestal virgins, would trust just about
anything whispered darkly behind their hand by a complete stranger.



Re: base64 encoded emails

2019-10-17 Thread Jaroslaw Rafa
On 17.10.2019 at 10:48:21, Fourhundred Thecat wrote:
> 
> I believe email should be plaintext. I don't like HTML emails either. If
> somebody feels that his message needs fancy formatting, he should send
> it as pdf attachment. But emails should stay plaintext.

I second that emails should be plaintext, but the reality is that a lot of
services send automated emails, such as website registration confirmations,
password change links, order confirmations from an Internet shop, parcel
tracking status etc., as HTML. It's not that bad when it is
multipart/alternative HTML + plaintext (as it should be), however more and
more often it's HTML only. I also know of one webmail provider, quite popular
in my country, that formats outgoing messages as HTML only(!).

So you just can't block HTML, because you'll cut yourself off of many
important messages that you actually want to receive. (However, I give
HTML-only messages without a plaintext part quite a large spam score in my
antispam filter).
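
A check for HTML-only messages, like the one mentioned above, could look roughly like this in Python with the standard library's email package. The function name is made up, and how a real antispam filter implements or scores such a check is not shown here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

def is_html_only(raw_message):
    """True if the message has an HTML body but no plaintext body."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # get_body() looks for body parts and skips attachments.
    has_html = msg.get_body(preferencelist=("html",)) is not None
    has_plain = msg.get_body(preferencelist=("plain",)) is not None
    return has_html and not has_plain

# Sample 1: HTML only.
html_only = EmailMessage()
html_only.set_content("<p>Your parcel is on its way.</p>", subtype="html")

# Sample 2: multipart/alternative with plaintext and HTML.
both = EmailMessage()
both.set_content("Your parcel is on its way.\n")
both.add_alternative("<p>Your parcel is on its way.</p>", subtype="html")

print("HTML only:", is_html_only(html_only.as_bytes()))
print("plaintext + HTML:", is_html_only(both.as_bytes()))
```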
-- 
Regards,
   Jaroslaw Rafa
   r...@rafa.eu.org
--
"In a million years, when kids go to school, they're gonna know: once there
was a Hushpuppy, and she lived with her daddy in the Bathtub."


Re: base64 encoded emails

2019-10-17 Thread Wesley Peng

Hi

on 2019/10/17 16:48, Fourhundred Thecat wrote:
> I believe email should be plaintext. I don't like HTML emails either. If
> somebody feels that his message needs fancy formatting, he should send
> it as pdf attachment. But emails should stay plaintext.


Non-Latin message bodies should be encoded for reliable transfer.

regards.