Re: Thoughts and queries (n=1) on the filter API

Edgar Pettijohn Sun, 23 Dec 2018 05:28:52 -0800

On Dec 23, 2018 5:27 AM, Gilles Chehade <[email protected]> wrote:
>
> On Sun, Dec 23, 2018 at 12:06:02PM +0100, Aham Brahmasmi wrote:
> > Bonjour Monsieur Gilles,
> > 
> > Merci beaucoup for your exhaustive explanations.
> > 
> > > > 1) What is the difference between the "report" and "filter" prefixes?
> > > > My current understanding is that "report" is oriented towards reporting
> > > > and "filter" is oriented towards writing filters.
> > > >
> > > 
> > > very simple:
> > > 
> > > - event reporting lets smtpd notify filters of any event that leads to
> > >   a change in the SMTP session, they are informative and do not accept
> > >   any answer.
> > > 
> > > - event filtering lets smtpd notify filters of any event that leads to
> > >   a decision in the SMTP session, they are decisive and require that a
> > >   filter provides a decision.
> > > 
> > > report lines are one way from smtpd to filters, filters must not reply
> > > 
> > > filter lines are two way between smtpd and filters, for any request an
> > > answer is required.
> > > 
> > > both reporting and filtering are "subscribed" so filters will only get
> > > the events they subscribed to.
> > 
> > So in case one wants to reject a session based only on reporting events,
> > one still has to wait till the filter events start coming in. Or one
> > could write a built-in like fcrdns. Is this correct?
> >  
>
> To reject a session based only on reporting events, you should subscribe
> to the relevant events in order to gather the info you want, then to one
> filter event to actually perform the action of rejecting.
>
> For example, I could subscribe to tx-connect, tx-helo and tx-ehlo to get
> source address, rdns, fcrdns and helo name from reporting events, but to
> reject the session based on this information I must decide when I want a
> rejection (do I want to reject at MAIL FROM? RCPT TO? DATA? COMMIT?) and
> register for the appropriate filter event to actually reject a session.
>
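
A rough sketch in C of the pattern described above: collect facts from
reporting events, then decide at one filter event. Nothing here is smtpd
code. The struct, the function names and the rejection policy are all
hypothetical, and the wire format of the events is deliberately left out.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-session state, filled in from reporting events only. */
struct session_info {
	char src[64];		/* source address */
	char rdns[256];		/* reverse DNS name */
	bool fcrdns_ok;		/* did forward-confirmed rDNS succeed? */
	char helo[256];		/* name given in HELO/EHLO */
};

/* Reporting event (e.g. tx-connect): record the data, send no answer. */
static void
on_report_connect(struct session_info *s, const char *src, const char *rdns,
    bool fcrdns_ok)
{
	snprintf(s->src, sizeof s->src, "%s", src);
	snprintf(s->rdns, sizeof s->rdns, "%s", rdns);
	s->fcrdns_ok = fcrdns_ok;
}

/* Reporting event (e.g. tx-helo or tx-ehlo): record the data, send no answer. */
static void
on_report_helo(struct session_info *s, const char *helo)
{
	snprintf(s->helo, sizeof s->helo, "%s", helo);
}

/*
 * Filter event we chose to subscribe to (here: MAIL FROM). An answer is
 * required. Returns true to reject. The policy is an example only: reject
 * when FCrDNS failed and the HELO name contains no dot.
 */
static bool
reject_at_mail_from(const struct session_info *s)
{
	return !s->fcrdns_ok && strchr(s->helo, '.') == NULL;
}
```

The point of the split is that the two report handlers never produce a
reply, while reject_at_mail_from() is the only place a decision is made.
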
> Builtin filters are a bit special because we are going to be VERY, VERY,
> VERY selective about them. Anything that can be done with a builtin, can
> be done with a proc filter and proc filters don't run in the same memory
> space so ... a good approach is to write a proc filter and if we find it
> makes sense to convert to a builtin, it can be discussed.
>
>
> > > > 3) Are there time limits for a filter to return response?
> > > >
> > > 
> > > time limits are not implemented yet, but yes there will be a time limit,
> > > very likely related to the SMTP session timeouts.
> > > 
> > > it's not tricky to implement, it'll be a notification sent to the filter
> > > that the last query timed out and it shouldn't respond anymore.
> > > 
> > > this will result in a Temporary Failure in the SMTP session, filters are
> > > not allowed to exceed that timeout.
> > 
> > So in a chain {f1,f2,f3}, if f2 takes too much time to respond, both
> > f1 and f3 will be notified of the session timeout.
> > 
>
> yes, definitely.
>
> in filter chains, all filters always receive all _reporting_ events.
>
> this is a reason why reports and filters are separate: if you look at the
> connect phase, all filters will receive the tx-connect report event, THEN
> the connect filter event is triggered and not all filters may receive it
> because the first filter may reject. if it rejects, then the reporting
> event tx-disconnected is sent to all filters.
>
> the action of one filter will never prevent the other filters from
> receiving the report events.
>
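
The connect-phase behaviour described above can be modelled with a small
C sketch. This is a toy model, not smtpd code: struct chain_filter and
the three functions are hypothetical names, and each event is reduced to
a counter so the difference between the two kinds is easy to see.

```c
#include <stdbool.h>
#include <stddef.h>

#define NFILTERS 3

/* Hypothetical model of one filter in a chain. */
struct chain_filter {
	bool rejects_connect;	/* would this filter reject at connect? */
	int reports_seen;	/* reporting events delivered to it */
	int filter_events_seen;	/* filter events delivered to it */
};

/* Reporting: every filter in the chain gets the event, no answer expected. */
static void
broadcast_report(struct chain_filter *chain, size_t n)
{
	for (size_t i = 0; i < n; i++)
		chain[i].reports_seen++;
}

/*
 * Filtering: walk the chain in order and stop at the first rejection, so
 * later filters never see the filter event. Returns true if every filter
 * let the session proceed.
 */
static bool
run_filter_event(struct chain_filter *chain, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		chain[i].filter_events_seen++;
		if (chain[i].rejects_connect)
			return false;
	}
	return true;
}

/* Connect phase: report, then filter, then a disconnect report on reject. */
static bool
connect_phase(struct chain_filter *chain, size_t n)
{
	broadcast_report(chain, n);		/* stands in for tx-connect */
	if (run_filter_event(chain, n))
		return true;
	broadcast_report(chain, n);		/* stands in for tx-disconnected */
	return false;
}
```

With the first filter rejecting, the other two see no filter event at
all but still receive both reports.
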
>
> > > > 5) Could we have the filter builtin for helo be different from the
> > > > ehlo? It might be instructional to understand that the client asked
> > > > for ehlo.
> > > > { FILTER_EHLO, "ehlo", filter_builtins_helo },
> > > 
> > > it is already the case, helo and ehlo are different filter hooks:
> > > 
> > >    filter foo1 builtin helo [...]
> > >    filter foo1 builtin ehlo [...]
> > > 
> > > the same applies for proc filters, they can subscribe to helo and ehlo
> > > as different filtering events.
> > 
> > I may be wrong here, but I was unable to find a filter_builtins_ehlo in
> > the code. From lka_filter.c
> > ..
> > static int filter_builtins_notimpl(struct filter_session *, struct filter *, uint64_t, const char *);
> > static int filter_builtins_connect(struct filter_session *, struct filter *, uint64_t, const char *);
> > static int filter_builtins_helo(struct filter_session *, struct filter *, uint64_t, const char *);
> > static int filter_builtins_mail_from(struct filter_session *, struct filter *, uint64_t, const char *);
> > static int filter_builtins_rcpt_to(struct filter_session *, struct filter *, uint64_t, const char *);
> > ..
> > { FILTER_CONNECT, "connect", filter_builtins_connect },
> > { FILTER_HELO, "helo", filter_builtins_helo },
> > { FILTER_EHLO, "ehlo", filter_builtins_helo },
> > { FILTER_STARTTLS, "starttls", filter_builtins_notimpl },
> > ..
> > 
>
> yes, the same implementation of the builtin is used so I factored it, but
> the filter phase and reporting events differentiate them.
>
> don't worry about that, I ran a filter rejecting HELO and accepting EHLO,
> the two are distinct as far as filtering is concerned.
>


My tests showed the same.
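
A minimal sketch in C of why one shared function does not merge the two
phases: the table entry carries the phase, so the handler can still tell
HELO from EHLO. This is not the lka_filter.c code. The enum, the table
and the accept-EHLO/reject-HELO policy are hypothetical and only mirror
the test described above.

```c
#include <stddef.h>
#include <string.h>

enum phase { PHASE_CONNECT, PHASE_HELO, PHASE_EHLO };

/* Handlers return 1 to accept and 0 to reject. */
typedef int (*phase_handler)(enum phase, const char *);

static int
handle_connect(enum phase p, const char *param)
{
	(void)p;
	(void)param;
	return 1;
}

/*
 * Shared by HELO and EHLO. The phase argument says which one fired.
 * Example policy only: accept EHLO, reject HELO.
 */
static int
handle_helo(enum phase p, const char *param)
{
	(void)param;
	return p == PHASE_EHLO;
}

/* Two table rows point at the same function but keep distinct phases. */
static const struct {
	enum phase	 phase;
	const char	*name;
	phase_handler	 fn;
} phase_table[] = {
	{ PHASE_CONNECT, "connect", handle_connect },
	{ PHASE_HELO,    "helo",    handle_helo },
	{ PHASE_EHLO,    "ehlo",    handle_helo },
};

/* Look up a phase by name. Returns the handler result, or -1 if unknown. */
static int
dispatch(const char *name, const char *param)
{
	size_t n = sizeof phase_table / sizeof phase_table[0];

	for (size_t i = 0; i < n; i++)
		if (strcmp(phase_table[i].name, name) == 0)
			return phase_table[i].fn(phase_table[i].phase, param);
	return -1;
}
```

Here dispatch("helo", ...) and dispatch("ehlo", ...) run the same code
path yet give different answers, because the phase travels with the
table row rather than with the function.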

>
> > > > 7) In lka_filter.c, if a filter feeds back more than LINE_MAX, should
> > > > we handle that?
> > > >
> > > > (void)strlcpy(buffer, line, sizeof buffer);
> > > >
> > > 
> > > that's an interesting question.
> > > 
> > > LINE_MAX is not the correct value but we need to have a maximum value
> > > for the line and filters will need to ensure they don't produce lines
> > > bigger than that.
> > 
> > Understood. I based the LINE_MAX on the following within lka_filter.c:
> > int
> > lka_filter_process_response(const char *name, const char *line)
> > {
> > ...
> > char buffer[LINE_MAX];
> > ...
> > (void)strlcpy(buffer, line, sizeof buffer);
> > 
>
> yes, I had to start with something.
>
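
Whatever the final limit turns out to be, an over-long line can at least
be detected instead of silently truncated. A minimal sketch in portable
C, under stated assumptions: copy_line_checked() is a hypothetical
helper, not something in lka_filter.c, and what to do with a rejected
line is left open.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy line into buf only if it fits, including the terminating NUL.
 * Returns 0 on success. Returns -1 if the line is too long, in which
 * case buf is left as an empty string rather than a truncated copy.
 */
static int
copy_line_checked(char *buf, size_t bufsz, const char *line)
{
	size_t len = strlen(line);

	if (bufsz == 0)
		return -1;
	if (len >= bufsz) {
		buf[0] = '\0';
		return -1;
	}
	memcpy(buf, line, len + 1);	/* len + 1 copies the NUL as well */
	return 0;
}
```

On systems that provide strlcpy(), the same check can be written by
comparing its return value, which is the length of the source string,
against the buffer size: a result >= sizeof buffer means truncation.
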
> to be very transparent, my goal was to get well-behaved filters to fully
> work before the end of the year, then jan/feb/mar will be to ensure smtpd
> can cope with misbehaving filters.
>
> the areas we know need to be improved:
>
> - all kinds of timeouts: smtp session timeout, filters timeout, ...
> - all kinds of DATA issues: filter responding with end of message, while
>   client hasn't responded with end of message yet, ...
> - all kinds of exhaustions: failure to allocate filter sessions, failure
>   to send data to filters because the pipe is exhausted, ...
> - all kinds of filters fuckup: filters responding with bad phases or bad
>   sessions or bad action, etc... some are bugs, some are legit, ...
>
> now that we know filters work, including in chains, we can focus on what
> is needed to make them rock solid for April :-)
>
> please continue raising questions because the more people play with them
> the more we can spot what needs to be investigated.
>
> -- 
> Gilles Chehade        @poolpOrg
>
> https://www.poolp.org                 tip me: https://paypal.me/poolpOrg
>
