Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
On Mon, Sep 18, 2023 at 09:14:22PM +0000, Eric Wong wrote:
> > Oh, I did notice what is probably unintentional behaviour -- passing
> > ?limit=XXX affects all mailbox access, not just the initial retrieval.
> >
> > E.g. if I configured pop3 with ?limit=128, then leave for the weekend and
> > return on Monday, I will only be able to retrieve 128 new messages,
> > regardless of how many arrived over the weekend.
> >
> > I'm not sure if this is what was intended -- I think it makes more sense to
> > have ?limit=XXX only affect the initial retrieval. In all other cases, when
> > a tracking uuid cookie is present, it should return all messages regardless
> > of ?limit=.
> >
> > Does that make sense?
>
> I think there should be an initial_limit parameter in addition to the
> current limit. initial_limit would be more suited for cronjobs and
> such running on 24/7 systems. The regular limit would be better
> for systems with intermittent access and could go weeks w/o being
> online (including situations where somebody restored a system from
> a months/years-old backup).

I'm game with that. Maybe even shorten that to l= and il=? I'm still worried
about the field size limit a bit.

> Not feeling well, will try to work on it once (or if) I feel better.

Please take care!

-K
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Konstantin Ryabitsev wrote:
> On Fri, Sep 15, 2023 at 08:41:10PM +0000, Eric Wong wrote:
> > Thanks, pushed the series as
> >
> > a37e3ab3740c24c3 (pop3: limit default mailbox to 1K messages, 2023-09-14)
> > 392d251f97d46579 (pop3: support `?limit=$NUM' parameter in mailbox name,
> > 2023-09-12)
>
> Oh, I did notice what is probably unintentional behaviour -- passing
> ?limit=XXX affects all mailbox access, not just the initial retrieval.
>
> E.g. if I configured pop3 with ?limit=128, then leave for the weekend and
> return on Monday, I will only be able to retrieve 128 new messages, regardless
> of how many arrived over the weekend.
>
> I'm not sure if this is what was intended -- I think it makes more sense to
> have ?limit=XXX only affect the initial retrieval. In all other cases, when a
> tracking uuid cookie is present, it should return all messages regardless of
> ?limit=.
>
> Does that make sense?

I think there should be an initial_limit parameter in addition to the
current limit. initial_limit would be more suited for cronjobs and
such running on 24/7 systems. The regular limit would be better
for systems with intermittent access and could go weeks w/o being
online (including situations where somebody restored a system from
a months/years-old backup).

Not feeling well, will try to work on it once (or if) I feel better.
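For illustration only, here is one possible reading of how the two parameters
discussed above might interact, sketched in Python rather than the project's
Perl. Nothing in it comes from the public-inbox code: the function name
`first_uid_to_serve`, the `last_seen` argument (standing in for whatever state
the server keeps per UUID cookie), and the rule that `initial_limit` falls back
to `limit` on first access are all assumptions.

```python
def first_uid_to_serve(uidmax, last_seen=None, limit=None, initial_limit=None):
    """Return the base UID; messages with a UID above it would be served.

    uidmax        -- highest UID currently in the inbox
    last_seen     -- highest UID this cookie already fetched (None on first use);
                     hypothetical stand-in for the server's per-cookie state
    limit         -- cap applied on every access (hypothetical semantics)
    initial_limit -- cap applied only on the first access (hypothetical)
    """
    if last_seen is None:
        # First retrieval: prefer initial_limit, fall back to limit.
        cap = initial_limit if initial_limit is not None else limit
        start = 0
    else:
        # Returning client: only the regular limit applies.
        cap = limit
        start = last_seen
    if cap is not None:
        # Never reach further back than the newest `cap` messages.
        start = max(start, uidmax - cap)
    return max(start, 0)
```

Under these assumptions, the weekend scenario from the thread plays out as
follows: with 500 new messages and `limit=128`, only the newest 128 are
offered, whereas a client configured with `initial_limit=128` alone would
receive all 500 on Monday.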
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
On Fri, Sep 15, 2023 at 08:41:10PM +0000, Eric Wong wrote:
> Thanks, pushed the series as
>
> a37e3ab3740c24c3 (pop3: limit default mailbox to 1K messages, 2023-09-14)
> 392d251f97d46579 (pop3: support `?limit=$NUM' parameter in mailbox name,
> 2023-09-12)

Oh, I did notice what is probably unintentional behaviour -- passing
?limit=XXX affects all mailbox access, not just the initial retrieval.

E.g. if I configured pop3 with ?limit=128, then leave for the weekend and
return on Monday, I will only be able to retrieve 128 new messages, regardless
of how many arrived over the weekend.

I'm not sure if this is what was intended -- I think it makes more sense to
have ?limit=XXX only affect the initial retrieval. In all other cases, when a
tracking uuid cookie is present, it should return all messages regardless of
?limit=.

Does that make sense?

-K
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Konstantin Ryabitsev wrote:
> Tested-by: Konstantin Ryabitsev

Thanks, pushed the series as

a37e3ab3740c24c3 (pop3: limit default mailbox to 1K messages, 2023-09-14)
392d251f97d46579 (pop3: support `?limit=$NUM' parameter in mailbox name, 2023-09-12)
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
On Thu, Sep 14, 2023 at 12:38:28AM +0000, Eric Wong wrote:
> > My initial target for deploying POP3 support is to allow Gmail users to
> > pull-subscribe to mailing lists, since Gmail is the #1 provider that we have
> > trouble with message delivery due to their draconian threshold limits.
> > However, I think if the default behaviour results in dumping 50,000 messages
> > into people's inboxes, they wouldn't use it, which is why I think we should
> > have a default that is lighter both on the server side and on the users.
>
> OK, I think this could work (goes on top of my previous limit patch):

Yes, it looks good in my tests:

- specifying the username as `[uuid]@org.kernel.vger.linux-kernel` downloads
  1000 messages
- specifying the username as `[uuid]@org.kernel.vger.linux-kernel?limit=128`
  properly downloads only 128
- tested in both Claws-mail and Thunderbird

Tested-by: Konstantin Ryabitsev

Thanks!

-K
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
On Wed, Sep 13, 2023 at 10:03:26PM +0000, Eric Wong wrote:
> > What if we move the uuid into the password field -- it seems it belongs
> > there anyway, as it's tied to the user cookie.
>
> I've thought about that, too; but it can get tricky since passwords
> aren't visible in most UIs. I've also seen some UIs (not POP3) which
> forbid copy+paste in password fields.
>
> Furthermore, if a user wants to migrate to a different POP3 client;
> carrying their UUID with them is easier when it's readable in the
> username. (I'm assuming users won't be bothered to back up their UUID
> anywhere)

That makes sense.

> I'm open to supporting both ways; but I'm also not inclined to
> do so unless there's evidence of real-world POP3 clients being
> unable to handle the user names.
>
> Documenting both ways can be overwhelming to users.

Yes, let's keep it as-is -- I'll test the patches shortly and follow up with
details.

-K
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Konstantin Ryabitsev wrote:
> On Tue, Sep 12, 2023 at 10:40:34PM +0000, Eric Wong wrote:
> > Perhaps 50K is too much? I figured clients would have a way to
> > limit that, but I don't really pay attention to POP3 clients...
>
> The few clients I looked at didn't give any option to specify how many remote
> messages I want to retrieve, so I think defaulting to 50,000 is not the right
> approach. Maybe the default limit should be something "last 7 days or 1000
> messages, whichever is larger"?

OK, 1K (or any other fixed limit) is fine and easiest. 7 days
(or any other time window) could get flooded if there's a nasty
spike of some sort.

> My initial target for deploying POP3 support is to allow Gmail users to
> pull-subscribe to mailing lists, since Gmail is the #1 provider that we have
> trouble with message delivery due to their draconian threshold limits.
> However, I think if the default behaviour results in dumping 50,000 messages
> into people's inboxes, they wouldn't use it, which is why I think we should
> have a default that is lighter both on the server side and on the users.

OK, I think this could work (goes on top of my previous limit patch):

---8<---
Subject: [PATCH] pop3: limit default mailbox to 1K messages

This is probably friendlier to webmail providers which support
importing mail from POP3.
---
 lib/PublicInbox/POP3.pm    | 7 ++++---
 lib/PublicInbox/WwwText.pm | 3 +++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/POP3.pm b/lib/PublicInbox/POP3.pm
index 4a21ef5e..f97eccfd 100644
--- a/lib/PublicInbox/POP3.pm
+++ b/lib/PublicInbox/POP3.pm
@@ -72,7 +72,7 @@ sub cmd_user ($$) {
 	$user =~ tr/-//d; # most have dashes, some (dbus-uuidgen) don't
 	$user =~ m!\A[a-f0-9]{32}\z!i or return \"-ERR user has no UUID\r\n";
 
-	my $limit = UID_SLICE;
+	my $limit;
 	$mailbox =~ s/\?limit=([0-9]+)\z// and
 		$limit = $1 > UID_SLICE ? UID_SLICE : $1;
 
@@ -86,10 +86,11 @@ sub cmd_user ($$) {
 		my $tip = "$mailbox.$max";
 		return \"-ERR $mailbox.$slice does not exist ($tip does)\r\n"
 			if $slice > $max;
+		$limit //= UID_SLICE;
 		$self->{uid_base} = ($slice * UID_SLICE) + UID_SLICE - $limit;
 		$self->{slice} = $slice;
-	} else { # latest $limit messages
-		my $base = $uidmax - $limit;
+	} else { # latest $limit messages, 1k if unspecified
+		my $base = $uidmax - ($limit // 1000);
 		$self->{uid_base} = $base < 0 ? 0 : $base;
 		$self->{slice} = -1;
 	}
diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm
index c31a7f86..f4508b3f 100644
--- a/lib/PublicInbox/WwwText.pm
+++ b/lib/PublicInbox/WwwText.pm
@@ -293,6 +293,9 @@ The POP3 password is: anonymous
 The POP3 username is: \$(uuidgen)\@$ctx->{ibx}->{newsgroup}
 where \$(uuidgen) in the output of the `uuidgen' command on your system.
 The UUID in the username functions as a private cookie (don't share it).
+By default, only 1000 messages are retrieved. You may download more
+by appending `?limit=NUM' (without quotes) to the username, where
+`NUM' is an integer between 1 and 50000.
 Idle accounts will expire periodically.
 EOM
 }
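The arithmetic in the patch above can be followed more easily with concrete
numbers. The sketch below restates it in Python as a reading aid, not as the
project's code: `UID_SLICE = 50_000` is assumed from the thread's "latest 50K"
description, `compute_uid_base` is a made-up name, and raising `LookupError`
stands in for the `-ERR ... does not exist` reply.

```python
UID_SLICE = 50_000     # assumption: slice size implied by "latest 50K" in the thread
DEFAULT_LIMIT = 1_000  # default introduced by the patch when no ?limit= is given


def compute_uid_base(uidmax, slice_no=None, limit=None):
    """Mirror the patched branch logic of cmd_user for the session's UID base.

    `limit` is expected to be already clamped to UID_SLICE, as the Perl code
    does while parsing the mailbox name.
    """
    if slice_no is not None:
        if slice_no > uidmax // UID_SLICE:
            raise LookupError("slice does not exist")
        # Explicit slice: default to the whole slice, else its last `limit` UIDs.
        effective = UID_SLICE if limit is None else limit
        return slice_no * UID_SLICE + UID_SLICE - effective
    # No slice: latest `limit` messages, 1K if unspecified, never below zero.
    base = uidmax - (DEFAULT_LIMIT if limit is None else limit)
    return max(base, 0)
```

With an inbox whose highest UID is 120000, no parameters give a base of 119000
(the newest 1000 messages), `?limit=128` gives 119872, and slice 1 without a
limit gives 50000.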
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Konstantin Ryabitsev wrote:
> On Wed, Sep 13, 2023 at 06:20:40AM +0000, Eric Wong wrote:
> > Eric Wong wrote:
> > > I'm not sure if `?' or `=' are allowed characters in POP3
> > > mailbox names. In fact, I can't find any information on
> > > valid characters allowed in RFC 1081 nor RFC 1939.
>
> It's a username, though, not mailbox name? There's no restriction on the
> characters or length of the username, though I'm guessing some UI clients may
> have their own limits regarding the length of the username field.

username == mailbox name as far as POP3 goes.

> > Of course, the parameters and all manner of special characters
> > can also be placed in the password, so `anonymous?limit=1000'.
> >
> > But somehow putting parameters in the "password" (even a
> > well-known and obvious one) feels wrong.
>
> What if we move the uuid into the password field -- it seems it belongs there
> anyway, as it's tied to the user cookie.

I've thought about that, too; but it can get tricky since passwords
aren't visible in most UIs. I've also seen some UIs (not POP3) which
forbid copy+paste in password fields.

Furthermore, if a user wants to migrate to a different POP3 client;
carrying their UUID with them is easier when it's readable in the
username. (I'm assuming users won't be bothered to back up their UUID
anywhere)

> username: newsgroup.name?params
> password: $(uuidgen)
>
> So, in my example it becomes:
>
> username: org.kernel.vger.git?limit=1000
> password: 288e5e35-1a35-46ef-b3d5-6d94c20aeab8
>
> This could be backward-compatible with the current implementation -- if there
> is an @ in the username field, then the cookie is based on what's preceding
> it. If there's none, then we use the password field (unless it's "anonymous").
>
> This way we're less likely to run into any problems with username length
> limitations set by MUAs.

Right, backwards compatibility isn't a problem either way.

I'm open to supporting both ways; but I'm also not inclined to
do so unless there's evidence of real-world POP3 clients being
unable to handle the user names.

Documenting both ways can be overwhelming to users.

Thanks.
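The backward-compatible lookup proposed in the quoted text (and not adopted in
this thread) could look roughly like the following. This is a Python sketch
under stated assumptions: `resolve_cookie` is a hypothetical helper, the UUID
check copies the 32-hex-digit pattern visible in the patch context, and
`ValueError` stands in for a POP3 `-ERR` reply.

```python
import re

# Same shape as the check in the patch context: 32 hex digits once dashes are removed.
UUID_RE = re.compile(r"\A[a-f0-9]{32}\Z", re.IGNORECASE)


def resolve_cookie(username, password):
    """Return (uuid_hex, mailbox) for either credential layout.

    Layout 1 (current):  username = UUID@newsgroup[?params], password = anonymous
    Layout 2 (proposed): username = newsgroup[?params],      password = UUID
    """
    if "@" in username:
        candidate, mailbox = username.split("@", 1)
    elif password != "anonymous":
        candidate, mailbox = password, username
    else:
        raise ValueError("user has no UUID")
    candidate = candidate.replace("-", "")  # most have dashes, some don't
    if not UUID_RE.match(candidate):
        raise ValueError("user has no UUID")
    return candidate, mailbox
```

Both layouts from the example above would resolve to the same cookie and
mailbox string, which is why backwards compatibility is not a concern either
way; the open question in the thread is only whether real clients need the
second form.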
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
On Tue, Sep 12, 2023 at 10:40:34PM +0000, Eric Wong wrote:
> Perhaps 50K is too much? I figured clients would have a way to
> limit that, but I don't really pay attention to POP3 clients...

The few clients I looked at didn't give any option to specify how many remote
messages I want to retrieve, so I think defaulting to 50,000 is not the right
approach. Maybe the default limit should be something "last 7 days or 1000
messages, whichever is larger"?

My initial target for deploying POP3 support is to allow Gmail users to
pull-subscribe to mailing lists, since Gmail is the #1 provider that we have
trouble with message delivery due to their draconian threshold limits.
However, I think if the default behaviour results in dumping 50,000 messages
into people's inboxes, they wouldn't use it, which is why I think we should
have a default that is lighter both on the server side and on the users.

-K
Re: [RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Eric Wong wrote:
> I'm not sure if `?' or `=' are allowed characters in POP3
> mailbox names. In fact, I can't find any information on
> valid characters allowed in RFC 1081 nor RFC 1939.

Of course, the parameters and all manner of special characters
can also be placed in the password, so `anonymous?limit=1000'.

But somehow putting parameters in the "password" (even a
well-known and obvious one) feels wrong.

*shrug*
[RFC] pop3: support `?limit=$NUM' parameter in mailbox name
Konstantin Ryabitsev wrote:
> Hello:
>
> I've been playing around with pop3, and I'm wondering if we can improve its
> usability by adding a "last NNN messages" pseudo-folder. Currently, if someone
> wants to access the git mailing list archive via pop3, they have to do the
> following:
>
> - know that the username should be $(uuidgen)@org.kernel.vger.git.1 (the
>   default username would access slice 0, right? Or is it the last 50,000
>   messages?)

The /\.[0-9]+$/ slice is actually optional for POP3.
`$(uuidgen)@org.kernel.vger.git' alone will get you the latest 50k.

> - wait for their client to retrieve tens of thousands of unread messages on
>   first access

Perhaps 50K is too much? I figured clients would have a way to
limit that, but I don't really pay attention to POP3 clients...

Patch below adds a `?limit=$NUM' parameter, but I'm not sure if
`?' or `=' are allowed in POP3 mailbox names. mpop(1) doesn't
complain... Haven't looked at other POP3 clients.

> - if the remote archive rolls over to the next slice, they have to edit their
>   account info to get new messages (unless I'm wrong about #1)

Yeah, that only applies to IMAP. IMAP is a pain since connections
can be long-lived and per-connection MSN <=> UID mappings can grow
without bound after more messages arrive. Perhaps our -imapd can
be less nice and forcibly terminate connections if the most recent
window gets too big.

> Perhaps the default could be slightly different:
>
> - $(uuidgen)@org.kernel.vger.git would start with an empty view (or something
>   like the last 10 messages)

Small numbers would be very unuseful, too, I think...

> - it would only get any new messages added to the archive
>
> I think this would be a friendlier experience, but not sure how difficult it
> would be to implement. I'm also not 100% sure all my assumptions are correct,
> so please feel free to correct me.

No worries, the POP3 stuff hasn't seen much use. IMAP's been
hammered relentlessly by bots on my server, at least :>

Lightly-tested patch to support ?limit=$NUM

---8<---
Subject: [PATCH] pop3: support `?limit=$NUM' parameter in mailbox name

I'm not sure if `?' or `=' are allowed characters in POP3
mailbox names. In fact, I can't find any information on
valid characters allowed in RFC 1081 nor RFC 1939.

In any case, it seems to work fine with mpop.
---
 lib/PublicInbox/POP3.pm | 18 ++++++++++++------
 xt/pop3d-mpop.t         |  4 ++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/lib/PublicInbox/POP3.pm b/lib/PublicInbox/POP3.pm
index d32793e4..4a21ef5e 100644
--- a/lib/PublicInbox/POP3.pm
+++ b/lib/PublicInbox/POP3.pm
@@ -41,6 +41,7 @@ use PublicInbox::IMAP; # for UID slice stuff
 
 use constant {
 	LINE_MAX => 512, # XXX unsure
+	UID_SLICE => PublicInbox::IMAP::UID_SLICE,
 };
 
 # XXX FIXME: duplicated stuff from NNTP.pm and IMAP.pm
@@ -70,20 +71,25 @@ sub cmd_user ($$) {
 	my $user = $1;
 	$user =~ tr/-//d; # most have dashes, some (dbus-uuidgen) don't
 	$user =~ m!\A[a-f0-9]{32}\z!i or return \"-ERR user has no UUID\r\n";
-	my $slice;
-	$mailbox =~ s/\.([0-9]+)\z// and $slice = $1 + 0;
+
+	my $limit = UID_SLICE;
+	$mailbox =~ s/\?limit=([0-9]+)\z// and
+		$limit = $1 > UID_SLICE ? UID_SLICE : $1;
+
+	my $slice = $mailbox =~ s/\.([0-9]+)\z// ? $1 + 0 : undef;
+
 	my $ibx = $self->{pop3d}->{pi_cfg}->lookup_newsgroup($mailbox) //
 		return \"-ERR $mailbox does not exist\r\n";
 	my $uidmax = $ibx->mm(1)->num_highwater // 0;
 	if (defined $slice) {
-		my $max = int($uidmax / PublicInbox::IMAP::UID_SLICE);
+		my $max = int($uidmax / UID_SLICE);
 		my $tip = "$mailbox.$max";
 		return \"-ERR $mailbox.$slice does not exist ($tip does)\r\n"
 			if $slice > $max;
-		$self->{uid_base} = $slice * PublicInbox::IMAP::UID_SLICE;
+		$self->{uid_base} = ($slice * UID_SLICE) + UID_SLICE - $limit;
 		$self->{slice} = $slice;
-	} else { # latest 50K messages
-		my $base = $uidmax - PublicInbox::IMAP::UID_SLICE;
+	} else { # latest $limit messages
+		my $base = $uidmax - $limit;
 		$self->{uid_base} = $base < 0 ? 0 : $base;
 		$self->{slice} = -1;
 	}
diff --git a/xt/pop3d-mpop.t b/xt/pop3d-mpop.t
index fc82bc6b..9da1050c 100644
--- a/xt/pop3d-mpop.t
+++ b/xt/pop3d-mpop.t
@@ -53,7 +53,7 @@ delivery maildir $tmpdir/md
 account default
 host ${\$sock->sockhost}
 port ${\$sock->sockport}
-user $uuid\@$newsgroup
+user $uuid\@$newsgroup?limit=1
 auth user
 password anonymous
 received_header off
@@ -65,7 +65,7 @@ EOM
 	my $pid = spawn($cmd, undef, { 1 => 2 });
 	$pids{$pid} = $cmd;
 }
-
+diag "mpop is writing to $tmpdir/md ...";
 while (scalar keys %pids) {
 	my $pid = waitpid(-1, 0) or next;
 	my $cmd = delete $pids{$pid}
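To make the order of the two substitutions in the patch explicit (the
`?limit=` suffix is stripped first, then the optional `.SLICE` suffix), here is
a small Python restatement of that parsing step. It is a sketch, not the Perl
implementation: `parse_mailbox` is a made-up name and `UID_SLICE = 50_000` is
assumed from the "latest 50K" wording in the thread.

```python
import re

UID_SLICE = 50_000  # assumption: slice size implied by "latest 50K" in the thread


def parse_mailbox(mailbox):
    """Split 'newsgroup[.SLICE][?limit=NUM]' into (newsgroup, slice_no, limit)."""
    limit = UID_SLICE
    m = re.search(r"\?limit=([0-9]+)\Z", mailbox)
    if m:
        # Requests above one slice are clamped, as in the patch.
        limit = min(int(m.group(1)), UID_SLICE)
        mailbox = mailbox[:m.start()]
    slice_no = None
    m = re.search(r"\.([0-9]+)\Z", mailbox)
    if m:
        slice_no = int(m.group(1))
        mailbox = mailbox[:m.start()]
    return mailbox, slice_no, limit
```

Stripping the limit first matters: a name such as
`org.kernel.vger.git.1?limit=128` would not match the trailing-slice pattern
until the query-style suffix has been removed.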