[PATCH 5/8] doc: speling fickses

2020-08-27 Thread Eric Wong
--- Documentation/public-inbox-edit.pod | 2 +- Documentation/public-inbox-purge.pod | 2 +- Documentation/public-inbox-tuning.pod | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/public-inbox-edit.pod b/Documentation/public-inbox-edit.pod index 3853fa9c..68

[PATCH 8/8] doc: watch: expand on NNTP and IMAP-specific knobs

2020-08-27 Thread Eric Wong
There's a few more, but maybe they're too esoteric to be worth documenting at the moment (batch sizes, timeouts, etc). --- Documentation/public-inbox-watch.pod | 36 +++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/Documentation/public-inbox-watch.pod b/D

[PATCH 7/8] doc: move watch config docs to -watch manpage

2020-08-27 Thread Eric Wong
The -config manpage is a bit long and the -watch stuff is isolated from the rest of it while we start documenting NNTP and IMAP support. While I'm not entirely happy with the way IMAP and NNTP are configured, it's still good enough for small setups. This also fixes a long-standing misplaced comment abo

[PATCH 1/2] www: improve navigation around comtemporary threads

2020-08-27 Thread Eric Wong
Sometimes it's useful to quickly get to threads and messages which are contemporaries of the current thread/message being focused on. This hopefully improves navigation by making: a) the top line (where $INBOX_DIR/description is shown) a link to the latest topics in search results and per-th

[PATCH 2/2] www: more descriptive pagination

2020-08-27 Thread Eric Wong
Being an easily confused person, I find "next" and "prev" ambiguous as to whether messages on the next or previous page will be newer or older than the current page. Clarify that for the threaded /$INBOX/ view and search results. For search results sorted by relevance, we'll use "[>= $SCORE]" or

[PATCH 0/2] www: navigation tweaks

2020-08-27 Thread Eric Wong
Just some things I noticed could use improvement while browsing around old archives. All subjective... Eric Wong (2): www: improve navigation around comtemporary threads www: more descriptive pagination Documentation/mknews.perl | 3 +- lib/PublicInbox/Feed.pm | 5 ++- lib

[PATCH] Makefile.PL: run check-man for <= 80 columns on check-run, too

2020-08-27 Thread Eric Wong
I mostly use "make check-run" instead of the slower "make check" target, nowadays, so add this check to ensure the rendered manpage is always visible to more users who need big fonts. --- Makefile.PL | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile.PL b/Makefile.PL i

Re: [PATCH 8/8] doc: watch: expand on NNTP and IMAP-specific knobs

2020-08-27 Thread Eric Wong
Eric Wong wrote: > --- a/Documentation/public-inbox-watch.pod > +++ b/Documentation/public-inbox-watch.pod > @@ -78,7 +78,12 @@ public-inbox 1.6.0 supports C<nntp://>, C<nntps://>, > C<imap://> and C<imaps://> URLs: > > watch = nntp://news.example.com/inbox.test.group >

Re: [PATCH 1/2] www: improve navigation around comtemporary threads

2020-08-27 Thread Eric Wong
Kyle Meyer wrote: > Eric Wong writes: > > > Sometimes it's useful to quickly get to threads and messages > > which are contemporaries of the current thread/message being > > focused on. This hopefully improves navigatin by making: > > s/navigatin/navigation

[PATCH 2/3] imaptracker: update_last: simplify callers

2020-08-28 Thread Eric Wong
By making it a no-op if last_uid is not defined. This isn't a hot code path, so the extra method dispatch isn't an issue. It'll save some indentation/wrapping in future commits. --- lib/PublicInbox/IMAPTracker.pm | 5 +++-- lib/PublicInbox/WatchMaildir.pm | 4 ++-- 2 files changed, 5 insertions(

[PATCH 1/3] watch: flush changes to inbox before updating IMAPTracker

2020-08-28 Thread Eric Wong
Data needs to hit inboxes, first. Otherwise it's possible to skip messages in case git-fast-import is killed before it sees "done\n". Now, -watch will just waste a little bandwidth in re-downloading a seen message if it's interrupted immediately before updating IMAPTracker. --- lib/PublicInbox/W

[PATCH 3/3] tests: check-run: show skipped tests

2020-08-28 Thread Eric Wong
We'll deduplicate redundant lines and show counts of skipped tests to ensure it's easy to notice if something is unexpectedly skipped. --- t/run.perl | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/t/run.perl b/t/run.perl index b1a0d2fe..e3e3e075 100755

[PATCH 0/3] more watch-related stuff

2020-08-28 Thread Eric Wong
1/3 is the most important; more watch tweaks coming... Eric Wong (3): watch: flush changes to inbox before updating IMAPTracker imaptracker: update_last: simplify callers tests: check-run: show skipped tests lib/PublicInbox/IMAPTracker.pm | 5 +++-- lib/PublicInbox/WatchMaildir.pm | 4

Re: [PATCH 3/3] tests: check-run: show skipped tests

2020-08-28 Thread Eric Wong
Eric Wong wrote: > + my %nr; > + $nr{$_}++ for @sk; > + for (@sk) { > + my $n = delete $nr{$_} or next; > + print OLDERR
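
The quoted hunk counts repeated skip messages and prints each one once. A minimal sketch of that idiom follows; the sample messages and the "(N times)" output format are assumptions for illustration, not what t/run.perl actually prints:

```perl
use strict;
use warnings;

# Collapse repeated lines, keeping first-seen order and noting the count.
sub dedup_with_counts {
    my (@lines) = @_;
    my %nr;
    $nr{$_}++ for @lines;
    my @out;
    for (@lines) {
        # delete returns the count the first time a line is seen and
        # undef on later occurrences, so duplicates are skipped
        my $n = delete $nr{$_} or next;
        push @out, $n > 1 ? "$_ ($n times)" : $_;
    }
    @out;
}

# hypothetical skip messages
my @sk = ('no Xapian', 'no IMAP server', 'no Xapian');
print "$_\n" for dedup_with_counts(@sk);
```

Deleting from the hash inside the loop is what keeps the output in first-seen order without needing a second "seen" hash.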

[PATCH] imapd: filter out unusable flags from search

2020-08-29 Thread Eric Wong
Quiet down logs from -imapd when clients are blindly sending some unsupported flag conditions (e.g. "DRAFT", "DELETED") specified in RFC 3501. --- lib/PublicInbox/IMAPsearchqp.pm | 21 - t/imapd.t | 8 +++- 2 files changed, 27 insertions(+), 2 deletio

[PATCH] doc: expand on indexBatchSize regarding fragementation

2020-08-30 Thread Eric Wong
And change the documentation reference in -tuning to point to the -index manpage while we're at it. --- Documentation/public-inbox-index.pod | 5 +++-- Documentation/public-inbox-tuning.pod | 6 -- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/Documentation/public-inbox-index

[PATCH 07/11] watch: comments and tiny cleanups

2020-08-30 Thread Eric Wong
From: Eric Wong Get rid of an unused variable, prefix a warning and try to better document control flow around various callbacks. --- lib/PublicInbox/Watch.pm | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/Watch.pm b/lib/PublicInbox/Watch.pm

[PATCH 11/11] replace ParentPipe with EOFpipe

2020-08-30 Thread Eric Wong
ParentPipe was a subset of EOFpipe, except EOFpipe correctly accounts for theoretical(*) spurious wakeups on the pipe. (*) AFAIK, spurious wakeups are/were more likely on TCP sockets due to checksum failures, something that's not a problem on local pipes. We're also not sharing pipes like

[PATCH 08/11] ds: avoid excessive queueing when reaping PIDs

2020-08-30 Thread Eric Wong
We should not enqueue reap_pids() to run more than once per EventLoop iteration. We'll start reformatting reap_pids to tabs, too, since we're no longer Danga::Socket. We should also be able to remove timer usage for reaping down-the-line once we stop abusing dwaitpid() in -watch. --- lib/PublicI

[PATCH 03/11] rename WatchMaildir => Watch

2020-08-30 Thread Eric Wong
From: Eric Wong This is no longer limited to Maildirs now that IMAP and NNTP support exist; so give it a shorter name. --- MANIFEST | 2 +- lib/PublicInbox/{WatchMaildir.pm => Watch.pm} | 2 +- script/public-inbox-watch |

[PATCH 05/11] watch: avoid unnecessary spawning on spam removals

2020-08-30 Thread Eric Wong
From: Eric Wong This should further mitigate lock contention problems when -watch is configured to watch on a Maildir for spam while performing a large NNTP import. There is now a small risk a message won't get removed if it's in the current (uncommitted) fast-import batch, bu

[PATCH 02/11] watchmaildir: use v5.10.1, drop warnings

2020-08-30 Thread Eric Wong
From: Eric Wong Declare 5.10.1 to avoid potential compatibility problems with Perl 7/8 down the line. We'll rely on the command-line to set or drop warnings during development, at least. --- lib/PublicInbox/WatchMaildir.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH 10/11] ds: avoid unnecessary timer for waitpid

2020-08-30 Thread Eric Wong
It doesn't seem necessary, since we won't call dwaitpid() until we see an EOF. --- lib/PublicInbox/DS.pm | 13 +++-- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm index b252ea3c..661be1fd 100644 --- a/lib/PublicInbox/DS.pm +++

[PATCH 04/11] watch: log signal activities to STDERR

2020-08-30 Thread Eric Wong
From: Eric Wong Sometimes it may not be apparent when/if a signal is processed; this hopefully improves the situation. We'll also change the process title when we're quitting to better inform users. --- script/public-inbox-watch | 24 ++-- 1 file changed, 18

[PATCH 06/11] watch: block signals before fork on non-signalfd/kevent systems

2020-08-30 Thread Eric Wong
In case there are non-Linux or BSD users w/o IO::KQueue, we shouldn't let signal handlers fire in the child processes. The child processes always assumed signals were blocked by the parent, so no changes were necessary, there. --- lib/PublicInbox/Watch.pm | 21 ++--- 1 file changed,

[PATCH 01/11] watch: limit batch size of NNTP and IMAP workers, too

2020-08-30 Thread Eric Wong
From: Eric Wong We don't want to monopolize locks because processes can easily block each other if using `watchspam' on a Maildir while a big NNTP or IMAP import is happening. This can also happen if somebody configured a single inbox to watch from several sources to merge several

[PATCH 09/11] watch: use EOFpipe to reduce dwaitpid wakeups

2020-08-30 Thread Eric Wong
It's a bit inefficient to use a pipe, here. However, using dwaitpid() on a process that's not expected to exit soon is also inefficient as it causes excessive wakeups as most of our inbox-writing code expects synchronous waitpid(). This only affects -watch instances configured for NNTP and IMAP c

[PATCH 00/11] watch: fix contention w/ Maildir & NNTP

2020-08-30 Thread Eric Wong
or `watchspam' removals. These affect IMAP, too; but I've been mainly using NNTP. Eric Wong (11): watch: limit batch size of NNTP and IMAP workers, too watchmaildir: use v5.10.1, drop warnings rename WatchMaildir => Watch watch: log signal activities to STDERR watch: avoid un

[PATCH] t/run: Perl future proofing

2020-08-31 Thread Eric Wong
Bareword file handles outside of STD(IN|OUT|ERR) seem to be on the chopping block for Perl 8. We'll also "use v5.10.1" to guard against future incompatibilities. --- t/run.perl | 24 +--- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/t/run.perl b/t/run.perl i
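
For readers unfamiliar with the distinction, here is a small sketch of replacing a bareword file handle with a lexical one. It writes to an in-memory buffer so it is self-contained; the variable names and the message are assumptions, not taken from t/run.perl:

```perl
use strict;
use warnings;
use v5.10.1;

my $buf = '';

# Old style (bareword handle, global to the package):
#   open(OUT, '>', \$buf) or die "open: $!";
#   print OUT "skipped: 2\n";

# Lexical handle: scoped like any other "my" variable and closed
# automatically when it goes out of scope.
open(my $out, '>', \$buf) or die "open: $!";
print $out "skipped: 2\n";
close($out) or die "close: $!";

print $buf;
```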

Re: [PATCH] doc: expand on indexBatchSize regarding fragmentation

2020-08-31 Thread Eric Wong
pushed, dropping extraneous `e' in Subject. -- unsubscribe: one-click, see List-Unsubscribe header archive: https://public-inbox.org/meta/

[PATCH 07/10] config: use defined-or (//) in a few places

2020-08-31 Thread Eric Wong
Just some golfing to reduce scrolling and hopefully improve readability. --- lib/PublicInbox/Config.pm | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm index f9184bd2..ae9ad8de 100644 --- a/lib/PublicInbox/Config.pm +++ b/lib/P
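
As a quick illustration of why defined-or is handy for config defaults, the sketch below contrasts it with `||'. The %opt hash and its keys are hypothetical and not taken from Config.pm:

```perl
use strict;
use warnings;
use v5.10.1; # `//' is available since Perl 5.10

my %opt = (limit => 0); # hypothetical settings; 0 is a valid value

# `||' treats any false value (0, '', undef) as missing
my $with_or = $opt{limit} || 25;

# `//' only falls back when the left side is undef
my $with_dor = $opt{limit} // 25;
my $missing  = $opt{depth} // 3;

print "or=$with_or dor=$with_dor missing=$missing\n";
```

The `//' form also replaces the longer `defined($x) ? $x : $default' pattern, which is presumably where the line savings come from.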

[PATCH 06/10] mda+learn: add --help / -h support

2020-08-31 Thread Eric Wong
"use Getopt::Long" doesn't seem too slow on a hot page cache, and it's probably used frequently enough to be in cache. We'll also start reducing the amount of markup in the .pod and favoring verbatim text in documentation for readability in source form, since the bold text seems excessive. --- Do

[PATCH 01/10] script/*: set executable bit on -learn and -imapd

2020-08-31 Thread Eric Wong
It's useful to mark that they're meant to be executable, even if the shebang is useless. --- script/public-inbox-imapd | 0 script/public-inbox-learn | 0 2 files changed, 0 insertions(+), 0 deletions(-) mode change 100644 => 100755 script/public-inbox-imapd mode change 100644 => 100755 script/public

[PATCH 09/10] doc: remove B<> (bold) markup from the remaining POD

2020-08-31 Thread Eric Wong
B<> decreases readability of the POD source and is of dubious usefulness in the man page. --- Documentation/public-inbox-httpd.pod | 2 +- Documentation/public-inbox-imapd.pod | 2 +- Documentation/public-inbox-nntpd.pod | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Docume

[PATCH 10/10] init+convert: create non-existing directory hierarchies

2020-08-31 Thread Eric Wong
Following "git init" as an example, we'll create every parent path up to the one specified, instead of attempting to continue on when Cwd::abs_path returns `undef'. --- script/public-inbox-convert | 7 +-- script/public-inbox-init| 10 -- t/convert-compact.t | 20 +

[PATCH 05/10] daemon: support --help/-h in -httpd/imapd/nntpd

2020-08-31 Thread Eric Wong
For consistency with other commands, though the protocol-specific options should refer users to the manpage. --- lib/PublicInbox/Daemon.pm | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm index 000ba169..

[PATCH 00/10] some usability tweaks

2020-08-31 Thread Eric Wong
minor cleanups and future proofing along the way, too. Eric Wong (10): script/*: set executable bit on -learn and -imapd admin: improve minimum version text edit+purge: support `--help' and `-h' like other commands script/*: fold $usage into $help, support `-h' instead of

[PATCH 03/10] edit+purge: support `--help' and `-h' like other commands

2020-08-31 Thread Eric Wong
And while we're at it, note edit is *destructive* to encourage reading the fine manual. --- Documentation/public-inbox-edit.pod | 2 +- lib/PublicInbox/AdminEdit.pm| 2 +- script/public-inbox-edit| 21 ++--- script/public-inbox-purge | 17 +++

[PATCH 08/10] watch: add --help/-h support

2020-08-31 Thread Eric Wong
And avoid unnecessary POD markup in the man page. --- Documentation/public-inbox-watch.pod | 2 +- script/public-inbox-watch| 18 ++ 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/Documentation/public-inbox-watch.pod b/Documentation/public-inbox-watch

[PATCH 02/10] admin: improve minimum version text

2020-08-31 Thread Eric Wong
"inboxes 1 inboxes not supported by ..." was non-sensical. Now it'll show "-V1 inbox not supported by ...", instead. --- lib/PublicInbox/Admin.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/Admin.pm b/lib/PublicInbox/Admin.pm index b8ead6f7..fb88e621 100644

[PATCH 04/10] script/*: fold $usage into $help, support `-h' instead of -?

2020-08-31 Thread Eric Wong
`-h' doesn't conflict with anything, and some users (including git users) may be more accustomed to using it rather than the rarely-seen-outside-of-Getopt::Long `-?' switch. We can also rely on the GetOptions() function to emit a proper error message instead of just "bad command-line args". --- s
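
A rough sketch of accepting both `--help' and `-h' through Getopt::Long, and letting it report bad switches itself. The option names and usage text are assumptions for illustration; the real scripts differ:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# hypothetical usage text
my $help = "usage: example-tool [--verbose] [-h|--help]\n";

sub parse_args {
    my (@argv) = @_;
    my $opt = {};
    # 'help|h' accepts both --help and -h; on an unknown switch,
    # GetOptionsFromArray warns on STDERR and returns false
    GetOptionsFromArray(\@argv, $opt, 'help|h', 'verbose|v') or return;
    $opt;
}

my $opt = parse_args('-h') or die "bad command-line args\n";
print $help if $opt->{help};
```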

[PATCH] t/v2dupindex: test indexing mirrors with duplicate messages

2020-08-31 Thread Eric Wong
While it's not a known problem, our deduplicating logic may change in the future; or a BOFH could be manually injecting duplicate messages directly into the git epoch repositories. Ensure indexing in mirrors doesn't break when there's duplicates. This is in preparation for detached indices for mu

[PATCH] index: check for xapian-compact when using --compact

2020-09-01 Thread Eric Wong
Otherwise, users may be frustrated to discover it missing after a long indexing run. --- script/public-inbox-index | 4 1 file changed, 4 insertions(+) diff --git a/script/public-inbox-index b/script/public-inbox-index index 89c6b782..5dad6ecb 100755 --- a/script/public-inbox-index +++ b/script/pu

Re: [PATCH] index: check for xapian-compact when using --compact

2020-09-01 Thread Eric Wong
Eric Wong wrote: > Otherwise, users may be frustrated to discover it missing > a long indexing run. "after a long indexing run" :x

[PATCH 2/3] use "\&" where possible when referring to subroutines

2020-09-01 Thread Eric Wong
"*foo" is ambiguous in that it may refer to a bareword file handle; so we'll use it where we can without triggering warnings. PublicInbox::TestCommon::run_script_exit required dropping the prototype, however. We'll also future-proof by dropping "use warnings" in Cgit.pm and use the less-ambiguous

[PATCH 3/3] www: manifest.js.gz generation no longer hogs event loop

2020-09-01 Thread Eric Wong
It's still as slow as before with hundreds/thousands of inboxes, but at least it's fair. Future changes will allow it to be cached and memoized with persistent HTTP servers. --- MANIFEST| 1 + lib/PublicInbox/ManifestJsGz.pm | 153 lib/Pu

[PATCH 0/3] www: cleanups + scheduling improvements

2020-09-01 Thread Eric Wong
Some more stuff to do along these lines, but I might be eaten by a bear in the next few hours... Eric Wong (3): solver: drop warnings, modernize use v5.10.1, use SEEK_SET use "\&" where possible when referring to subroutines www: manifest.js.gz generation no longer h

[PATCH 1/3] solver: drop warnings, modernize use v5.10.1, use SEEK_SET

2020-09-01 Thread Eric Wong
With Perl upstream preparing to deprecate things, we'll move towards only enabling warnings during development via shebang and stop enabling them via "use". We'll also favor "use v5.10.1" over the Perl 5.6-compatible "use 5.010_001", since our code base never worked on 5.6. Finally, we're also imp

[PATCH 00/11] cleanups, mostly indexing related

2020-09-02 Thread Eric Wong
Some cleanups ahead of detached index support. Found some dead code, too. Eric Wong (11): msgmap: note how we use ->created_at disambiguate OverIdx and Over by field name use more idiomatic internal API for ->over access search: remove special case for blank query tests: ad

[PATCH 02/11] disambiguate OverIdx and Over by field name

2020-09-02 Thread Eric Wong
We'll use {oidx} as the common field name for the read-write OverIdx, here, to disambiguate it from the read-only {over} field. This hopefully makes it clearer which code paths are read-only and which are read-write. --- lib/PublicInbox/SearchIdx.pm | 32 ++- lib/Publ

[PATCH 03/11] use more idiomatic internal API for ->over access

2020-09-02 Thread Eric Wong
{over_ro} being a part of the Search object is a historical oddity which will go away, soon. Let's start removing its use in tests and rarely-used helper scripts. --- scripts/dupe-finder | 3 +-- t/search.t | 14 +++--- t/v2mirror.t| 2 +- t/v2writable.t | 4 ++--

[PATCH 01/11] msgmap: note how we use ->created_at

2020-09-02 Thread Eric Wong
It'll likely be used in the future for JMAP, detached indices, and maybe other things. --- lib/PublicInbox/Msgmap.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/Msgmap.pm b/lib/PublicInbox/Msgmap.pm index d696ce83..f15875e3 100644 --- a/lib/PublicInbox/Msgmap.pm +++ b/lib/P

[PATCH 04/11] search: remove special case for blank query

2020-09-02 Thread Eric Wong
The special case (if any) belongs at a higher-level, and this is another step towards removing {over_ro}-dependence in our Search object. --- lib/PublicInbox/Search.pm | 13 - t/v2mda.t | 6 +++--- t/watch_maildir_v2.t | 19 +-- 3 files changed, 16

[PATCH 05/11] tests: add "use strict" and declare v5.10.1 compatibility

2020-09-02 Thread Eric Wong
strict.pm helped me find a typo in an upcoming recent change, so ensure we use it since it does more good than harm. We'll also take the opportunity here to declare v5.10.1 compatibility level to future-proof against Perl incompatibilities. --- t/index-git-times.t | 3 +++ xt/eml_check_limits.t

[PATCH 07/11] search: remove {over_ro} field

2020-09-02 Thread Eric Wong
Only inbox accesses the read-only {over}, now, instead of going through ->search. This simplifies our object graph and avoids potentially redundant FDs and DB handles pointing to the same over.sqlite3 file. --- lib/PublicInbox/Inbox.pm | 11 +-- lib/PublicInbox/Search.pm | 2 -- 2 files

[PATCH 09/11] wwwaltid: drop unused sqlite3_missing function

2020-09-02 Thread Eric Wong
It's inlined into the main function, which we'll shorten slightly with the defined-or (`//') operator. Also noticed and fixed a mismatched HTML tag. --- lib/PublicInbox/WwwAltId.pm | 16 +--- 1 file changed, 1 insertion(+), 15 deletions(-) diff --git a/lib/PublicInbox/WwwAltId.pm b/l

[PATCH 08/11] imap: drop old, pre-Parse::RecDescent search parser

2020-09-02 Thread Eric Wong
We switched to Parse::RecDescent during development and left some dead code behind. --- lib/PublicInbox/IMAP.pm | 61 - 1 file changed, 61 deletions(-) diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm index d540fd0b..2d0d005e 100644 --- a/lib/

[PATCH 06/11] search: replace ->query with ->mset

2020-09-02 Thread Eric Wong
Nearly all of the search uses in the production code rely on a Xapian mset iterator being returned (instead of an array of $smsg objects). So default to returning the mset and move the burden of smsg array conversion into the test cases. --- lib/PublicInbox/ExtMsg.pm | 4 +- lib/PublicInbox

[PATCH 11/11] v2writable: reuse read-only shard counting code

2020-09-02 Thread Eric Wong
We'll also fix the read-only code to ensure we notice missing Xapian shards, since gaps would throw off our expectation that Xapian document IDs and NNTP article numbers are interchangeable. --- lib/PublicInbox/Search.pm | 5 - lib/PublicInbox/V2Writable.pm | 23 +++ 2

[PATCH 10/11] overidx: document column uses

2020-09-02 Thread Eric Wong
This may be useful for keeping our heads on straight dealing with IMAP, NNTP, JMAP, etc. --- lib/PublicInbox/OverIdx.pm | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm index 6f0477f0..db4b7738 100644 ---

message bloat over time...

2020-09-02 Thread Eric Wong
I've been indexing and reindexing a local mirror of https://lore.kernel.org/lkml a bit, and it's kinda depressing to see newer messages being more and more bloated even on a plain-text-only mailing list :< The first column ("$X.git") is the epoch number; older epochs are lower-numbered: "0.git" is

[PATCH] www: make mirror instructions more prominent

2020-09-08 Thread Eric Wong
In order to fight the misconception that public-inboxes are centralized, anchor "#mirror" to the clone instructions and place an emphasis on "mirror", not just cloning. While we're at it, better describe multi-epoch -V2 inboxes, since some users do not seem to realize epochs consist of different d

[PATCH 03/11] use "\&" where possible when referring to subroutines

2020-09-08 Thread Eric Wong
"*foo" is ambiguous in that it may refer to a bareword file handle; so we'll use it where we can without triggering warnings. PublicInbox::TestCommon::run_script_exit required dropping the prototype, however. We'll also future-proof by dropping "use warnings" in Cgit.pm and use the less-ambiguous

[PATCH 07/11] t/cgi.t: show stderr on failures

2020-09-08 Thread Eric Wong
This helped me diagnose an error I would've introduced in the next commit. --- t/cgi.t | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/cgi.t b/t/cgi.t index 366d6594..96c627c3 100644 --- a/t/cgi.t +++ b/t/cgi.t @@ -158,7 +158,7 @@ sub cgi_run { my ($in, $out, $err) =

[PATCH 06/11] config: split out iterator into separate object

2020-09-08 Thread Eric Wong
We will need to allow simultaneous iterators on the same config object, since we'll need this for ExtMsg, NNTPD, WwwListing, NewsWWW, and other places. --- MANIFEST | 1 + lib/PublicInbox/Config.pm | 18 -- lib/PublicInbox/ConfigIter.pm | 28 ++

[PATCH 00/11] httpd: further reduce event loop monopolization

2020-09-08 Thread Eric Wong
A couple more things to mitigate the effects of slow storage with many inboxes. Mostly solver-related, and still more to come... (Hoping the electrical grid stays up and dust bunny removal solved overheating problems). Eric Wong (11): xt/solver: test with public-inbox-httpd, too solver

[PATCH 02/11] solver: drop warnings, modernize use v5.10.1, use SEEK_SET

2020-09-08 Thread Eric Wong
With Perl upstream preparing to deprecate things, we'll move towards only enabling warnings during development via shebang and stop enabling them via "use". We'll also favor "use v5.10.1" over the Perl 5.6-compatible "use 5.010_001", since our code base never worked on 5.6. Finally, we're also imp

[PATCH 04/11] www: manifest.js.gz generation no longer hogs event loop

2020-09-08 Thread Eric Wong
It's still as slow as before with hundreds/thousands of inboxes, but at least it's fair. Future changes will allow it to be cached and memoized with persistent HTTP servers. --- MANIFEST| 1 + lib/PublicInbox/ManifestJsGz.pm | 153 lib/Pu

[PATCH 08/11] extmsg: prevent cross-inbox matches from hogging event loop

2020-09-08 Thread Eric Wong
With many inboxes, checking multiple SQLite repos will be slow and time-consuming, so ensure we can schedule it fairly between multiple inboxes. --- lib/PublicInbox/ExtMsg.pm | 101 ++ 1 file changed, 70 insertions(+), 31 deletions(-) diff --git a/lib/PublicInb

[PATCH 01/11] xt/solver: test with public-inbox-httpd, too

2020-09-08 Thread Eric Wong
We'll be making changes to solver to make it even fairer to slow clients on slow storage. Ensure we test with public-inbox-httpd-specific codepaths, since the generic PSGI code paths are rare in production use. --- xt/solver.t | 31 +-- 1 file changed, 25 insertions(+)

[PATCH 11/11] solver: break apart inbox blob retrieval

2020-09-08 Thread Eric Wong
To avoid hogging the event loop in public-inbox-httpd when many candidate messages match, we'll separate the steps to ensure fairness on slow storage. --- lib/PublicInbox/SolverGit.pm | 136 +-- 1 file changed, 83 insertions(+), 53 deletions(-) diff --git a/lib/Pub

[PATCH 09/11] wwwlisting: avoid hogging event loop

2020-09-08 Thread Eric Wong
By using the just-introduced ConfigIter class. And make ManifestJsGz a subclass of it to reduce duplication. --- lib/PublicInbox/ConfigIter.pm | 12 +++ lib/PublicInbox/ManifestJsGz.pm | 92 -- lib/PublicInbox/WWW.pm | 19 ++-- lib/PublicInbox/WwwListing.pm | 163 ++

[PATCH 10/11] solver: check one git coderepo and inbox at a time

2020-09-08 Thread Eric Wong
With public-inbox-httpd, this mitigates the effect of slow git blob storage with multiple coderepos configured for an inbox. It's still synchronous for now (and may need to remain that way for ->last_check_err), but no longer monopolizes the event loop when checking multiple coderepos. We don't ye

[PATCH 05/11] config: flatten each_inbox and iterate_start args

2020-09-08 Thread Eric Wong
In Perl, we can simplify callers by passing a single array all the way down the stack instead of a single array ref which needs to be expanded every call. --- lib/PublicInbox/Config.pm | 12 ++-- lib/PublicInbox/ExtMsg.pm | 7 +++ lib/PublicInbox/Watch.pm | 13 ++-

[PATCH] contrib/css: limit coloring to links, only

2020-09-09 Thread Eric Wong
We don't want <a> tags without href= attributes to be colored, since the `<a>' tag in the HTML footer is intended as an anchor destination for the `<a>' link at the top. --- contrib/css/216dark.css | 2 +- contrib/css/216light.css | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/contrib/

[SQUASH] update UserContent.pm for CSS change

2020-09-09 Thread Eric Wong
Oops, this gets auto-updated via "make" :x Not a huge fan of the duplication in this file, but not having to worry about installation locations for non-Perl files is nice... diff --git a/lib/PublicInbox/UserContent.pm b/lib/PublicInbox/UserContent.pm index b6b43900..789da2f1 100644 --- a/lib/Publi

[PATCH 2/3] wwwtext: don't blindly quote "git clone" destination

2020-09-09 Thread Eric Wong
Save screen space and light up fewer pixels to reduce visual noise. --- lib/PublicInbox/WwwText.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm index 14470b45..99aec370 100644 --- a/lib/PublicInbox/WwwText.pm +++

[PATCH 0/3] wwwtext: minor config example tweaks

2020-09-09 Thread Eric Wong
Just a couple of things which hopefully make things easier for newbies... Eric Wong (3): wwwtext: describe the use of `coderepo' entries wwwtext: don't blindly quote "git clone" destination wwwtext: config comment improvements lib/PublicInbox/WwwText.pm | 23 ++

[PATCH 1/3] wwwtext: describe the use of `coderepo' entries

2020-09-09 Thread Eric Wong
The `solver' feature is not very obvious, so give potential users a hint about it. --- lib/PublicInbox/WwwText.pm | 6 ++ 1 file changed, 6 insertions(+) diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm index fa3774f8..14470b45 100644 --- a/lib/PublicInbox/WwwText.pm +++ b/li

[PATCH 3/3] wwwtext: config comment improvements

2020-09-09 Thread Eric Wong
Use the full URL of the inbox being mirrored to reduce ambiguity (instead of just the inbox name). Using asymmetric quotes (e.g. `foo') improves readability for me in that it's more obvious when a quote begins and ends. It also lights up fewer pixels and reduces visual noise compared to double-quo

[PATCH] wwwstream: fix "Atom feed" link

2020-09-09 Thread Eric Wong
Oops, I wanted to stop escaping double-quotes with `qq()' but used `q()' instead :x Fixes: 2f61828fcb727e51 ("www: make mirror instructions more prominent") --- lib/PublicInbox/WwwStream.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/WwwStream.pm b/lib/Publ
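
The difference behind this bug can be sketched in a few lines. The `$href' value and the markup are assumptions for illustration, not the actual WwwStream.pm code:

```perl
use strict;
use warnings;

my $href = 'new.atom'; # hypothetical link target

# q() behaves like single quotes: no interpolation, so the variable
# name ends up in the output literally
my $wrong = q(<a href="$href">Atom feed</a>);

# qq() behaves like double quotes: $href is interpolated, and the
# embedded double-quotes need no backslashes
my $right = qq(<a href="$href">Atom feed</a>);

print "$wrong\n$right\n";
```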

[PATCH 12/11] solver: async blob retrieval for diff extraction

2020-09-09 Thread Eric Wong
Like the rest of the WWW code, public-inbox-httpd now uses git_async_cat to retrieve blobs without blocking the event loop. This improves fairness when git blobs are on slow storage and allows us to take better advantage of SMP systems. --- lib/PublicInbox/SolverGit.pm | 85 +++

[PATCH] wwwstream: show init + index instructions for -V1, too

2020-09-09 Thread Eric Wong
This should've always been there. I'm not sure how widely spread 1.0 and earlier releases were, but we'll keep documenting the version requirement. --- lib/PublicInbox/WwwStream.pm | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/lib/PublicInbox/WwwStream.pm

[PATCH] nntp: fix cross-newsgroup Message-ID lookups

2020-09-10 Thread Eric Wong
We cannot blindly use the selected newsgroup for HEAD/ARTICLE/BODY requests using Message-ID, since those commands look across all newsgroups; not just the selected one (if any). So stuff a reference to the Inbox object into $smsg. We can reduce args passed into set_nntp_headers() and msg_hdr_writ

[PATCH 0/3] mostly NNTP stuff

2020-09-11 Thread Eric Wong
Just a couple of cleanups and small tweaks while I figure out how to get the NNTP code to support tens of thousands of inboxes; since it's hampered by most of the existing code dealing with mere dozens of inboxes... I think configuring detached index support will be a requirement. Eric Wo

[PATCH 1/3] treewide: avoid `goto &NAME' for tail recursion

2020-09-11 Thread Eric Wong
Perl implements tail recursion via `goto', which allows avoiding warnings on deep recursion. However, it doesn't (as of 5.28) optimize the speed of such dispatches, though it may reduce ephemeral memory usage. Make the code less alien to hackers coming from other languages by using normal subroutine
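
For comparison, here are the two dispatch styles side by side. The summing functions are hypothetical and only meant to show the shape of each call:

```perl
use strict;
use warnings;

# Tail call via `goto &NAME': the current stack frame is replaced, so
# the "Deep recursion" warning never fires, but @_ must be set by hand.
sub sum_goto {
    my ($n, $acc) = @_;
    return $acc if $n == 0;
    @_ = ($n - 1, $acc + $n);
    goto &sum_goto;
}

# Normal recursive call: familiar to readers from other languages, at
# the cost of one stack frame per level.
sub sum_call {
    my ($n, $acc) = @_;
    return $acc if $n == 0;
    sum_call($n - 1, $acc + $n);
}

print sum_goto(10, 0), "\n"; # 55
print sum_call(10, 0), "\n"; # 55
```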

[PATCH 3/3] nntp: share more code between art_lookup callers

2020-09-11 Thread Eric Wong
This prepares us for future changes to improve scalability to many inboxes. --- lib/PublicInbox/NNTP.pm | 44 + 1 file changed, 18 insertions(+), 26 deletions(-) diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 46398cd4..88fe2bb0 100644

[PATCH 2/3] t/nntpd: add test for the XPATH command

2020-09-11 Thread Eric Wong
It's only in RFC 2980 (not 977 or 3977), but Net::NNTP has supported it since 2001, at least. We'll be making changes to avoid pathological behavior, so test it, first. --- t/nntpd.t | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/nntpd.t b/t/nntpd.t index a3d974cf..14db1a93 100644 --- a/

brain dump detached/external index so far...

2020-09-12 Thread Eric Wong
[This should eventually be put into a section 5 manpage similar to our existing v1+v2 format manpages] One feature I've been working on is detached/external indices for Xapian search. Currently (and since the earliest days of this project supporting Xapian), indices were per-inbox. This allowed

Re: [PATCH] doc: Add piem to list of clients

2020-09-13 Thread Eric Wong
Kyle Meyer wrote: > It's of course only of potential interest to Emacs users, but would it > be okay to add a pointer in clients.txt? Sure thing; applied and pushed. Thanks. -- unsubscribe: one-click, see List-Unsubscribe header archive: https://public-inbox.org/meta/

[PATCH] sigfd: fix typos and scoping on systems w/o epoll+kqueue

2020-09-13 Thread Eric Wong
Unfortunately, I'm not sure how easy it is to catch these at compile time. Prototypes do not seem to check these at compile time when crossing packages (not even with exported subroutines). --- lib/PublicInbox/Daemon.pm | 8 script/public-inbox-watch | 2 +- 2 files changed, 5 insertions(+

[PATCH] tests: consistently check for xapian-compact

2020-09-13 Thread Eric Wong
We may need to test against development versions of Xapian, which may rely on setting `XAPIAN_COMPACT=xapian-compact-1.5'. Ensure it's possible to do that. And add a missing check in t/xcpdb-reshard.t, too. --- lib/PublicInbox/TestCommon.pm | 9 - t/convert-compact.t | 3 +-- t/

[PATCH] doc: TODO and release notes updates ahead of 1.6

2020-09-13 Thread Eric Wong
Some more things have happened... And drop some items which are too expensive to support, such as automatic mirroring. --- Documentation/RelNotes/v1.6.0.eml | 31 --- TODO | 20 ++-- 2 files changed, 38 insertions(+), 13 del

1.6 in a few hours/days?

2020-09-14 Thread Eric Wong
It seems like as good a time as any to tag and release 1.6... Detached index + libgit2 support is being worked on for 1.7, which might be soon after 1.6. Something's felt off for months even though I can't quite put my finger on it... The last round of -watch-related fixed contention problems might

Re: brain dump detached/external index so far...

2020-09-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > I think several virtual inboxes makes more sense than always one global > search, as people may want to search something like "all Linux kernel > discussions" or "all gcc/compiler discussions". There could be different > frontends to indicate which search is runnin

[PATCH] t/imapd.t: skip dependent test on failure

2020-09-14 Thread Eric Wong
We don't want to cascade failures/warnings when something else breaks. There's likely more of these to be fixed as we encounter them. --- t/imapd.t | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/t/imapd.t b/t/imapd.t index f743bf06..cb95fa5d 100644 --- a/t/imapd.t +++

[PATCH] ci/deps: add Plack::Test::ExternalServer for devtest

2020-09-14 Thread Eric Wong
More of our Plack tests exercise public-inbox-httpd, nowadays; and ExternalServer lets us test it easily alongside generic PSGI stuff. --- ci/deps.perl | 1 + 1 file changed, 1 insertion(+) diff --git a/ci/deps.perl b/ci/deps.perl index 77d95fc8..4c273337 100755 --- a/ci/deps.perl +++ b/ci/deps.p

[PATCH] imap: quiet uninitialized variable warning on FETCH

2020-09-14 Thread Eric Wong
This was triggered by blindly trying to FETCH an MSN (not "UID FETCH") on an empty dummy inbox. It's harmless, and probably triggered by a wayward client or misbehaving bot. --- lib/PublicInbox/IMAP.pm | 2 +- t/imapd.t | 4 2 files changed, 5 insertions(+), 1 deletion(-) diff

[PATCH] wwwtext: link to public-inbox.org/meta archives

2020-09-15 Thread Eric Wong
Since we're advertising our address at meta@public-inbox.org, we should advertise the archives, too. --- lib/PublicInbox/WwwText.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/WwwText.pm b/lib/PublicInbox/WwwText.pm index 2ed7d0d2..04c9b1c4 100644 --- a/lib/PublicInbox/WwwT

[PATCH 1/2] mid: rename MID_MAX to ID_MAX

2020-09-15 Thread Eric Wong
It's only used for HTML anchors which we will need indefinitely. --- lib/PublicInbox/MID.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/MID.pm b/lib/PublicInbox/MID.pm index e9a3b0c0..369bb034 100644 --- a/lib/PublicInbox/MID.pm +++ b/lib/PublicInbox/MID
