[PATCH 12/36] lei_to_mail: support for non-seekable outputs

2020-12-31 Thread Eric Wong
Users may wish to pipe output to "git am", "spamc", or similar, so we need to support those cases and not bail out on lseek(2) or ftruncate(2) failures. --- lib/PublicInbox/LeiToMail.pm | 24 t/lei_to_mail.t | 29 - 2 files changed,
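
The idea of tolerating lseek(2)/ftruncate(2) failures on pipes can be sketched as follows. This is a minimal illustration in Python rather than the project's Perl; the helper name `truncate_if_seekable` is hypothetical and is not taken from LeiToMail.pm.

```python
import errno
import os


def truncate_if_seekable(fd):
    """Rewind and truncate fd when possible; tolerate pipes, FIFOs and sockets.

    Returns True if the output was truncated, False if it is non-seekable.
    """
    try:
        os.lseek(fd, 0, os.SEEK_SET)
    except OSError as e:
        if e.errno == errno.ESPIPE:  # pipe, FIFO or socket: nothing to rewind
            return False
        raise  # any other failure is still a real error
    os.ftruncate(fd, 0)
    return True
```

A caller writing to `git am` or `spamc` through a pipe would get False and simply start writing, while a regular file is emptied first.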

[PATCH 15/36] ipc: support Sereal

2020-12-31 Thread Eric Wong
Some testing will be needed to see if it's worth the code and maintenance overhead, but it seems easy-enough to get working. --- lib/PublicInbox/IPC.pm | 29 - t/ipc.t| 2 +- 2 files changed, 25 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbo

[PATCH 17/36] lei: rename "extinbox" => "external"

2020-12-31 Thread Eric Wong
The words "extinbox" and "extindex" are too close and easy to confuse with the other. Rename "extinbox" to "external", since these could be IMAP, JMAP or other non-public-inbox search APIs. Link: https://public-inbox.org/meta/20201226112649.GB6226@dcvr/ --- MANIFEST

[PATCH 16/36] lei_store: add ->set_eml, ->add_eml can return smsg

2020-12-31 Thread Eric Wong
Add a ->set_eml method which can be a useful fire-and-forget way of either adding new files to store OR setting keywords on them. When seeing brand-new messages, add_eml can afford to return more information in the smsg instead of just the OID. --- lib/PublicInbox/LeiStore.pm | 8 +++- t/lei

[PATCH 18/36] mid: use defined-or with `push' for uniqueness check

2020-12-31 Thread Eric Wong
As shown recently in commit a05445fb400108e60ede7d377cf3b26a0392eb24 ("config: config_fh_parse: micro-optimize"), relying on the return value of `push' and the defined-or operator can avoid modifying the hash value scalar with an increment. --- lib/PublicInbox/MID.pm | 2 +- 1 file changed, 1 i

[PATCH 21/36] ipc: use shutdown(2), base atfork* callback

2020-12-31 Thread Eric Wong
shutdown(2) on a socket can be preferable if there are multiple forked processes writing to a single worker and we really want to shut things down ASAP. It may also be good to provide an ipc_worker_exit method which subclasses can override if needed for graceful shutdown. But we won't need equivale
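
Why shutdown(2) differs from close(2) when several processes hold the same socket can be shown with a small sketch. It is written in Python, not the project's Perl, and uses `dup()` as a stand-in for a forked sibling that still holds the descriptor; the function name is hypothetical.

```python
import socket


def close_vs_shutdown():
    """Return (eof_after_close, eof_after_shutdown) as seen by the reader."""
    reader, writer = socket.socketpair()
    extra = writer.dup()  # stands in for a forked process sharing the socket
    reader.setblocking(False)

    writer.close()  # the other reference keeps the connection open
    try:
        reader.recv(1)
        eof_after_close = True
    except BlockingIOError:
        eof_after_close = False  # no EOF yet: the reader would keep waiting

    extra.shutdown(socket.SHUT_WR)  # affects the socket itself, not one FD
    eof_after_shutdown = reader.recv(1) == b""

    extra.close()
    reader.close()
    return eof_after_close, eof_after_shutdown
```

Closing one descriptor leaves the reader waiting as long as any copy remains open, whereas shutdown delivers EOF right away regardless of how many copies exist.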

[PATCH 20/36] lei_store: handle messages without Message-ID at all

2020-12-31 Thread Eric Wong
For personal mail, unsent draft messages are a common source of messages without Message-IDs. --- lib/PublicInbox/LeiStore.pm | 20 lib/PublicInbox/OverIdx.pm | 2 ++ lib/PublicInbox/Smsg.pm | 6 ++ t/lei_store.t | 24 4 files

[PATCH 19/36] mid: hoist out mids_in sub

2020-12-31 Thread Eric Wong
We'll be using it for Resent-Message-ID with lei, and possibly other places. --- lib/PublicInbox/MID.pm | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/lib/PublicInbox/MID.pm b/lib/PublicInbox/MID.pm index 601f4c9b..28739011 100644 --- a/lib/PublicInbox/MID.pm +++

[PATCH 23/36] lei: add --mfolder as an --output alias

2020-12-31 Thread Eric Wong
This will be helpful for mairix users. --- lib/PublicInbox/LEI.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm index f960aa72..bb77198e 100644 --- a/lib/PublicInbox/LEI.pm +++ b/lib/PublicInbox/LEI.pm @@ -69,7 +69,7 @@ sub _con

[PATCH 25/36] init: remove embedded UnlinkMe package

2020-12-31 Thread Eric Wong
PublicInbox::OnDestroy can do the same thing --- script/public-inbox-init | 19 +++ 1 file changed, 3 insertions(+), 16 deletions(-) diff --git a/script/public-inbox-init b/script/public-inbox-init index 85d14377..693f5ca1 100755 --- a/script/public-inbox-init +++ b/script/public-

[PATCH 24/36] spawn: move run_die here from PublicInbox::Import

2020-12-31 Thread Eric Wong
It seems like a more logical place for it, but we'll favor the newly-added xsys_e() in tests for BAIL_OUT use. --- lib/PublicInbox/Import.pm | 9 + lib/PublicInbox/LEI.pm| 5 ++--- lib/PublicInbox/Spawn.pm | 9 - lib/PublicInbox/TestCommon.pm | 25 ++

[PATCH 22/36] lei_to_mail: unlink mboxes if not augmenting

2020-12-31 Thread Eric Wong
This matches mairix(1) behavior and may be safer if there's concurrent readers on the existing mbox, especially since we don't currently implement mbox locking (nor does mairix). --- lib/PublicInbox/LeiToMail.pm | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/lib/Pu

[PATCH 26/36] t/run: avoid uninitialized var on incomplete test

2020-12-31 Thread Eric Wong
Diagnosing an occasional FIFO failure in t/lei_to_mail.t... --- t/run.perl | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/run.perl b/t/run.perl index 1c7bcfc3..5c056356 100755 --- a/t/run.perl +++ b/t/run.perl @@ -71,7 +71,8 @@ sub test_status () { my $skip

[PATCH 28/36] lei_to_mail: open FIFOs O_WRONLY so we block

2020-12-31 Thread Eric Wong
Opening a FIFO with O_RDWR always succeeds on Linux, which causes the cat(1) process invoked by t/lei_to_mail.t to get stuck. Furthermore, O_APPEND makes no sense on FIFOs and perhaps there's some kernel out there which will reject it. --- lib/PublicInbox/LeiToMail.pm | 5 +++-- 1 file changed, 3 i
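
The O_RDWR versus O_WRONLY difference can be probed with a short Linux-only sketch. It is in Python rather than Perl, the function name is hypothetical, and O_NONBLOCK is used only so the probe reports what a blocking open would wait for instead of hanging.

```python
import errno
import os
import tempfile


def probe_fifo_open():
    """Return (rdwr_opened, wronly_errno) for a FIFO that has no reader."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "fifo")
        os.mkfifo(path)

        # O_RDWR succeeds at once on Linux: the opener counts as its own reader
        fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
        os.close(fd)
        rdwr_opened = True

        # O_WRONLY needs a real reader; a blocking open would wait for one,
        # and with O_NONBLOCK the kernel reports ENXIO instead
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as e:
            return rdwr_opened, e.errno
        os.close(fd)
        return rdwr_opened, None
```

With a blocking O_WRONLY open, the writer and the reading cat(1) rendezvous at open time, which is the behaviour the patch subject asks for.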

[PATCH 29/36] searchidxshard: call DS->Reset at worker start

2020-12-31 Thread Eric Wong
The daemon for the local email interface will be inside the DS->EventLoop. -watch currently doesn't trigger this bug since it doesn't enable parallelism, but it may in the future. --- lib/PublicInbox/SearchIdxShard.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/SearchIdxSh

[PATCH 27/36] gcf2client: reap process on DESTROY

2020-12-31 Thread Eric Wong
We don't want to leave it for the Xapcmd waitpid(-1, ...) call to hit. --- lib/PublicInbox/Gcf2Client.pm | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Gcf2Client.pm b/lib/PublicInbox/Gcf2Client.pm index ab486de5..4bda5520 100644 --- a/lib/PublicI

[PATCH 31/36] use PublicInbox::DS for dwaitpid

2020-12-31 Thread Eric Wong
This simplifies our code and provides a more consistent API for error handling. PublicInbox::DS can be loaded nowadays on all *BSDs and Linux distros easily without extra packages to install. The downside is possibly increased startup time, but it's probably not as big a problem with lei being a

[PATCH 30/36] t/ipc.t: test for references via `die'

2020-12-31 Thread Eric Wong
We'll probably start using references as exceptions in some places for more exact matching. --- t/ipc.t | 13 + 1 file changed, 13 insertions(+) diff --git a/t/ipc.t b/t/ipc.t index f3715e2c..5ee45e63 100644 --- a/t/ipc.t +++ b/t/ipc.t @@ -45,6 +45,19 @@ my $test = sub { is((v

[PATCH 33/36] lei: avoid Spawn package when starting daemon

2020-12-31 Thread Eric Wong
Spawn was designed to speed up process spawning inside long-lived daemons with largish memory usage. It does not help for short-lived scripts which only exist to start and connect to a daemon. This change actually speeds up initial lei startup from ~190ms to ~140ms(!). Normal usage once the daem

[PATCH 32/36] syscall: SFD_NONBLOCK can be a constant, again

2020-12-31 Thread Eric Wong
Since Perl exposes O_NONBLOCK as a constant, we can safely make SFD_NONBLOCK a constant, too. This is not the case for SFD_CLOEXEC, since O_CLOEXEC is not exposed by Perl despite being used internally in the interpreter. --- lib/PublicInbox/DSKQXS.pm | 4 ++-- lib/PublicInbox/Daemon.pm | 4 ++--

[PATCH 34/36] avoid calling waitpid from children in DESTROY

2020-12-31 Thread Eric Wong
Objects with DESTROY callbacks get propagated to children, so we must be careful to not invoke waitpid from children on their sibling processes. Only parents (and their parents...) can reap child processes. --- lib/PublicInbox/DS.pm | 4 ++-- lib/PublicInbox/Gcf2Client.pm | 8 ++-- li

[PATCH 35/36] ds: clobber $in_loop first at reset

2020-12-31 Thread Eric Wong
This may help ensure DESTROY callbacks will see in_loop before the others. --- lib/PublicInbox/DS.pm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm index 4f1558c7..8a560ae8 100644 --- a/lib/PublicInbox/DS.pm +++ b/lib/PublicInbox

[PATCH 36/36] on_destroy: support PID owner guard

2020-12-31 Thread Eric Wong
Since we'll be forking for Xapian indexing and maybe other places, having a simple guard in place to ensure OnDestroy doesn't unexpectedly unlink files or similar is a safer option. --- lib/PublicInbox/LEI.pm | 5 ++--- lib/PublicInbox/Lock.pm | 4 ++-- lib/PublicInbox/OnDestroy.pm | 5
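
A PID owner guard of this kind can be sketched as below. This is a Python analogue, not the OnDestroy.pm code: the class name mirrors the Perl package only for readability, and `__del__` stands in for Perl's DESTROY.

```python
import os


class OnDestroy:
    """Run a cleanup callback on destruction, but only in the creating process."""

    def __init__(self, callback, *args):
        self._owner_pid = os.getpid()  # remembered before any fork
        self._callback = callback
        self._args = args

    def __del__(self):
        # A forked child inherits this object; it must not run the
        # parent's cleanup (e.g. unlinking a file the parent still uses).
        if os.getpid() == self._owner_pid:
            self._callback(*self._args)
```

A guard created as `OnDestroy(os.unlink, path)` before forking then leaves `path` alone when a child drops its copy, and removes it only when the parent's copy goes away.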

[DRAFT] doc: 1.6.1 draft release notes

2020-12-31 Thread Eric Wong
create mode 100644 Documentation/RelNotes/v1.6.1.wip diff --git a/Documentation/RelNotes/v1.6.1.wip b/Documentation/RelNotes/v1.6.1.wip new file mode 100644 index ..13b41956 --- /dev/null +++ b/Documentation/RelNotes/v1.6.1.wip @@ -0,0 +1,55 @@ +From: Eric Wong +To: meta@public-inbo

Re: Unexpected white-on-black text in QtWebEngine-based browsers

2020-12-31 Thread Eric Wong
Johannes Altmanninger wrote: > Hi, > > Sometime during the last two weeks, https://public-inbox.org/ started to be > displayed as white text on black background. I haven't changed anything in public-inbox.org > This happens with Qutebrowser and Falkon, two QtWebEngine-based browsers. > On Firef

[ANNOUNCE] public-inbox 1.6.1

2020-12-31 Thread Eric Wong
A small, bugfix release on top of 1.6.0 from September 2020. Bug fixes: * MIME header decoding no longer warns on undefined variables, with Perl <5.28. Thanks to a bug report by Ali Alnubani. https://public-inbox.org/meta/dm6pr12mb49106f8e3bd697b63b943a22da...@dm6pr12mb4910.namprd12.prod.ou

[PATCH] Makefile.PL: add update-copyrights target

2020-12-31 Thread Eric Wong
It might save me a few cycles every year to not have to scroll through git history to see how it's run. --- Makefile.PL | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Makefile.PL b/Makefile.PL index 924e8dfd..613a72ae 100644 --- a/Makefile.PL +++ b/Makefile.PL @@ -1,5 +

[PUSHED] update copyrights for 2021

2020-12-31 Thread Eric Wong
commit af0b0fb7a454470a32c452119d0392e0dedb3fe1 update copyrights for 2021 Using "make update-copyrights" after setting GNULIB_PATH in my config.mak Full-diff here (yes, I keep meaning to make "/s/" better for non-blobs, I think lei can help with experiments there) https://public

[PATCH 0/4] TEST_RUN_MODE=0 fixes

2020-12-31 Thread Eric Wong
Oops :x I should use TEST_RUN_MODE more often to give my hands a rest. None of these surface with the quick "check-run" target Eric Wong (4): search: do not use $QP_FLAGS until Xapian is loaded t/lei: fix TEST_RUN_MODE=0, simplify oneshot fallback test import: unset GIT_CONFIG

[PATCH 1/4] search: do not use $QP_FLAGS until Xapian is loaded

2020-12-31 Thread Eric Wong
The default $QP_FLAGS won't be set until after Xapian is loaded, duh... This fixes t/imapd.t with TEST_RUN_MODE=0 --- lib/PublicInbox/Search.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm index 547b6fbe..0bdf6fc6 100644

[PATCH 3/4] import: unset GIT_CONFIG with `git config --global'

2020-12-31 Thread Eric Wong
GIT_CONFIG is set by -convert, and the user may have it set for other reasons. In either case, it conflicts with any attempt to use `git config --global`, so we have to unset it. This fixes t/multi-mid.t under TEST_RUN_MODE=0 --- lib/PublicInbox/Import.pm | 1 + 1 file changed, 1 insertion(+) di

[PATCH 4/4] treewide: reduce load_xapian* callsites

2020-12-31 Thread Eric Wong
Hopefully this will make it easier to spot dependency bugs in the future. --- lib/PublicInbox/LEI.pm | 1 - lib/PublicInbox/LeiStore.pm | 5 + t/indexlevels-mirror.t | 4 +--- t/replace.t | 3 +-- 4 files changed, 3 insertions(+), 10 deletions(-) diff --git a/lib/Pub

[PATCH 2/4] t/lei: fix TEST_RUN_MODE=0, simplify oneshot fallback

2020-12-31 Thread Eric Wong
We need to use an absolute path after chdir in run modes where scripts aren't loaded into in-memory subs. The oneshot test was also failing under TEST_RUN_MODE=0 due to no "lei-oneshot" command existing on the FS. So we force a socket failure by making XDG_RUNTIME_DIR too large to fit into the 10

[PATCH] lei_store: quiet down "git var" failures

2021-01-01 Thread Eric Wong
$git->qx and $git->popen now accept $env and $opt for redirects like lower-level popen_rd. This may be beneficial in other places. --- lib/PublicInbox/Git.pm | 14 +- lib/PublicInbox/LeiStore.pm | 4 +++- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/G

[PATCH] lei_store: alternative unconfigured "git var" workaround

2021-01-02 Thread Eric Wong
While the changes to git->qx/git->popen from commit 171a9c24022ad7ef will be useful for the lei daemon, hiding git error messages from actual users is probably wrong and we'll just localize GIT_* vars for testing. --- lib/PublicInbox/LeiStore.pm | 4 +--- t/lei.t | 2 ++ t/lei_

[PATCH 0/6] process pipe improvements

2021-01-02 Thread Eric Wong
G) call in GitAsyncCat. Maybe switching --batch + DS to use UNIX sockets can be done to save FDs (or I'm too brain-damaged to figure this out). But right now, all of our codebase is robust against children attempting to reap siblings (or PIDs of former siblings) Eric Wong (6): processpipe: allow sy

[PATCH 1/6] processpipe: allow synchronous close to set $?

2021-01-02 Thread Eric Wong
To get rid of the ugly $PublicInbox::DS::in_loop localization in MboxReader, we'll distinguish between ->CLOSE and ->DESTROY with ProcessPipe. If we end up closing via ->DESTROY, we'll assume the caller will want to deal with $? asynchronously via the event loop (or not even care about $?). If we

[PATCH 3/6] git: qx: waitpid synchronously via ProcessPipe->CLOSE

2021-01-02 Thread Eric Wong
If we're using ->qx, we're operating synchronously anyways, so there's little point in relying on the event loop for waitpid. --- lib/PublicInbox/Git.pm | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index f7332bb6

[PATCH 2/6] processpipe: lazy-require PublicInbox::DS for dwaitpid

2021-01-02 Thread Eric Wong
This saves over 20ms with scripts that only use PublicInbox::Spawn. --- lib/PublicInbox/ProcessPipe.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/ProcessPipe.pm b/lib/PublicInbox/ProcessPipe.pm index 400a22f3..e540dc22 100644 --- a/lib/PublicInbox/Proce

[PATCH 6/6] qspawn: switch to ProcessPipe via popen_rd

2021-01-02 Thread Eric Wong
ProcessPipe has a built-in mechanism to prevent siblings from reaping children. --- lib/PublicInbox/Qspawn.pm | 15 ++- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/lib/PublicInbox/Qspawn.pm b/lib/PublicInbox/Qspawn.pm index 68b71112..7e50a59a 100644 --- a/lib/PublicIn

[PATCH 5/6] git: manifest_entry: use ProcessPipe via popen_rd

2021-01-02 Thread Eric Wong
Only saves us one line of code, but that's better than nothing. --- lib/PublicInbox/Git.pm | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index cdd2b400..3d97300c 100644 --- a/lib/PublicInbox/Git.pm +++ b/lib/PublicInbox/G

[PATCH 4/6] import: switch to using ProcessPipe

2021-01-02 Thread Eric Wong
This saves us a few lines of code, but also prevents misreaping by sibling processes. --- lib/PublicInbox/Import.pm | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm index 47a529ff..b5780d2b 100644 ---

[PATCH 0/7] v2: swap in new IPC package

2021-01-02 Thread Eric Wong
SearchIdxShard was too big and adding the new extindex stuff made things worse. Since I intend to use IPC in more places, I figured it'd be good to prove it works well by dropping it into the old v2 mix. The below diffstat is nice Eric Wong (7): ipc: some documentation com

[PATCH 1/7] ipc: some documentation comments

2021-01-02 Thread Eric Wong
Fix some comments and add some short summary descriptions to hopefully make things easier-to-follow. --- lib/PublicInbox/IPC.pm | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm index 288a8c94..79cd34fe 100644 --- a/lib/P

[PATCH 2/7] searchidxshard: use PublicInbox::IPC to kill lots of code

2021-01-02 Thread Eric Wong
It's nice to prove the new code works by swapping it into the current V2Writable / SearchIdxShard packages. This is only the first step for the core bits, and we'll be able to delete more code in a subsequent patch. --- lib/PublicInbox/IPC.pm| 2 +- lib/PublicInbox/SearchIdx.pm

[PATCH 3/7] searchidxshard: IPC conversion, part 2

2021-01-02 Thread Eric Wong
We can remove some now-pointless wrapper functions by using ->ipc_do in even more places. --- lib/PublicInbox/ExtSearchIdx.pm | 23 --- lib/PublicInbox/LeiStore.pm | 13 +++-- lib/PublicInbox/SearchIdx.pm | 8 +--- lib/PublicInbox/SearchIdxShard.pm | 3

[PATCH 4/7] searchidxshard: replace index_raw with index_eml

2021-01-02 Thread Eric Wong
Since Storable and Sereal are designed for lossless serialization, we'll just pass $eml objects to whatever process is running SearchIdx. --- lib/PublicInbox/ExtSearchIdx.pm | 4 ++-- lib/PublicInbox/LeiStore.pm | 3 ++- lib/PublicInbox/SearchIdxShard.pm | 9 ++--- lib/PublicInbox/V

[PATCH 5/7] use Eml (or MIME) objects for all indexing paths

2021-01-02 Thread Eric Wong
We don't need to be keeping the raw message around after it hits git. Shard work now relies on Storable (or Sereal) and all of the indexing code relies on the Email::MIME-like API of Eml to access interesting parts of the message. Similarly, smsg->{raw_bytes} is no longer carried around and we do

[PATCH 7/7] searchidxshard: use add_xapian directly for v2

2021-01-02 Thread Eric Wong
We can more clearly distinguish between v1 and v2-only code paths this way, and may be able to save a few cycles this way. --- lib/PublicInbox/SearchIdx.pm | 1 + lib/PublicInbox/SearchIdxShard.pm | 2 +- lib/PublicInbox/V2Writable.pm | 8 ++-- 3 files changed, 8 insertions(+), 3 dele

[PATCH 6/7] ipc: switch to one-way pipes

2021-01-02 Thread Eric Wong
This fixes a performance regression in multi-process v2 indexing due to the switch to PublicInbox::IPC. While Unix sockets are fewer FDs to manage, pipes allow unprivileged processes to use larger buffers (up to 1M) on out-of-the-box Linux instances. A larger buffer via F_SETPIPE_SZ afforded by p
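
Enlarging a pipe buffer with F_SETPIPE_SZ can be sketched as follows. This is a Linux-only Python illustration, not the IPC.pm code; `big_pipe` is a hypothetical helper, and the numeric fallbacks are the Linux values of the two fcntl commands for Python versions that lack the named constants.

```python
import fcntl
import os

# Linux-specific fcntl commands; older Pythons lack the named constants
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def big_pipe(want=1 << 20):
    """Create a pipe and try to grow its buffer to `want` bytes.

    Returns (read_fd, write_fd, actual_buffer_size).
    """
    r, w = os.pipe()
    try:
        fcntl.fcntl(w, F_SETPIPE_SZ, want)
    except OSError:
        # over /proc/sys/fs/pipe-max-size or a per-user limit: keep the default
        pass
    return r, w, fcntl.fcntl(w, F_GETPIPE_SZ)
```

The request is best-effort: the caller checks the returned size instead of assuming the full 1M was granted.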

[PATCH] gcf2client: split out request API from regular git

2021-01-02 Thread Eric Wong
While Gcf2Client is designed to mimic what git-cat-file writes to stdout, its request format is different to support requests with a git repository path included. We'll highlight the distinction and make the GitAsyncCat support code easier-to-follow as a result. Since Gcf2Client relies on DS, we

[PATCH 0/3] lei-related test fixes

2021-01-03 Thread Eric Wong
PublicInbox::Spawn should probably be renamed PublicInbox::C... Eric Wong (3): t/lei: use $lei->() callback wrapper testcommon: prepare_redirects: fix error message spawn: support send_fd+recv_fd w/o IO::FDPass lib/PublicInbox/LEI.pm| 6 ++- lib/PublicInbox/Spawn.pm | 78

[PATCH 1/3] t/lei: use $lei->() callback wrapper

2021-01-03 Thread Eric Wong
This shortens the test and should make it easier to debug and add new tests. --- t/lei.t | 78 - 1 file changed, 33 insertions(+), 45 deletions(-) diff --git a/t/lei.t b/t/lei.t index 6f6a5888..541d83ce 100644 --- a/t/lei.t +++ b/t/lei.t @@

[PATCH 2/3] testcommon: prepare_redirects: fix error message

2021-01-03 Thread Eric Wong
I never hit these die() calls, but noticed it while debugging another problem on FreeBSD. --- lib/PublicInbox/TestCommon.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/TestCommon.pm b/lib/PublicInbox/TestCommon.pm index 56f04bd4..16ae2650 100644 --- a/li

[PATCH 3/3] spawn: support send_fd+recv_fd w/o IO::FDPass

2021-01-03 Thread Eric Wong
IO::FDPass may be an extra installation burden I don't want to impose on users. We only support Linux and *BSDs, however. --- lib/PublicInbox/LEI.pm | 6 ++-- lib/PublicInbox/Spawn.pm | 78 +--- script/lei | 7 ++-- t/lei.t |

[PATCH 0/2] fix race from stdout buffering in FD pass exit

2021-01-03 Thread Eric Wong
I implemented 1/2 in hopes it would help, but it's a nice syscall reduction anyways. 2/2 is the actual fix I've been struggling to find, and I could only reproduce it on this FreeBSD VM I have access to. It seems so obvious in retrospect :x Eric Wong (2): send and receive

[PATCH 1/2] send and receive all 3 FDs at once

2021-01-03 Thread Eric Wong
We'll always be transferring stdin, stdout, and stderr together for lei. Perhaps I lack imagination or foresight, but I can't think of a reason to send more or fewer FDs. --- lib/PublicInbox/LEI.pm | 27 ++-- lib/PublicInbox/Spawn.pm | 53 ++--

[PATCH 2/2] lei: fix output race in client/daemon mode

2021-01-03 Thread Eric Wong
The daemon needs to flush stdout before disconnecting or killing clients, otherwise they may reread empty data on redirected outputs. We also don't want to unbuffer stdout too early in case we have lots of small chunks of data to output. The received ($self->{2}) will always have autoflush, match

[PATCH] lei: prefer IO::FDPass over our Inline::C recv_3fds

2021-01-03 Thread Eric Wong
While our recv_3fds() implementation is more efficient syscall-wise, loading Inline takes nearly 50ms on my machine even after Inline::C memoizes the build. The current ~20ms in the fast path is barely acceptable to me, and 50ms would be unusable. Eventually, script/lei may invoke tcc(1) or cc(1)

Re: Mailman 3 archiver for public-inbox

2021-01-03 Thread Eric Wong
Toke Høiland-Jørgensen wrote: > Hi > > I created an archiver for mailman3 that will use public-inbox as the > archiving backend (using public-inbox-mda). It's somewhat rudimentary, Cool. How's performance? I've been thinking of some better importers(*) for batch work and -mda has always been h

[PATCH 0/2] lei: some usage bits

2021-01-03 Thread Eric Wong
Still trying to wrap my head around xsearch but my head hurts :< Eric Wong (2): lei: fix opt_dash to pass non-dash args to @argv lei: improve idempotent "init" error message lib/PublicInbox/LEI.pm | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) -- uns

[PATCH 1/2] lei: fix opt_dash to pass non-dash args to @argv

2021-01-03 Thread Eric Wong
The special "<>" handling in Getopt::Long actually invokes the callback for every single command-line arg, not just those prefixed by "-". This will let us pass arbitrary non-dashed words for search queries so users can type queries naturally without quoting (unless they want phrase search). ---

[PATCH 2/2] lei: improve idempotent "init" error message

2021-01-03 Thread Eric Wong
Showing "leistore.dir= already initialized" because $cur is undefined isn't useful. --- lib/PublicInbox/LEI.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm index 50453dde..9a3b1ee3 100644 --- a/lib/PublicInbox/LEI.pm +++ b/lib/

Re: public-inbox + mlmmj best practices?

2021-01-04 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Mon, Dec 28, 2020 at 09:31:39PM +0000, Eric Wong wrote: > > AFAIK, V2Writable always does the right thing on -purge/-edit; > > at least for WWW users(*). > > > > V2W does more work in rare cases when history gets rewritten, > >

[PATCH] v2writable: exact discontiguous history handling

2021-01-04 Thread Eric Wong
Eric Wong wrote: > That would allow the new version of the edited message to be > piped and seen by NNTP/IMAP readers. > > You *do* want to pipe the new version of the message you've > edited, right? -8< Subject: [PATCH] v2writable: exact discontiguous

[PATCH 1/4] lei: completion: fix filename completion

2021-01-05 Thread Eric Wong
"-o default" is what we want from "complete", "-o filename" just tells readline the result from the "_lei" function might be a filename and quote appropriately. --- contrib/completion/lei-completion.bash | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/completion/lei-com

[PATCH 0/4] more lei usability stuff

2021-01-05 Thread Eric Wong
Eric Wong (4): lei: completion: fix filename completion lei: automatic pager support lei: use client env as-is, drop daemon-env command address: pairs: new helper for JMAP (and maybe lei) contrib/completion/lei-completion.bash | 2 +- lib/PublicInbox/Address.pm | 11

[PATCH 4/4] address: pairs: new helper for JMAP (and maybe lei)

2021-01-05 Thread Eric Wong
Per JMAP RFC 8621 sec 4.1.2.3, we should be able to denote the lack of a phrase/comment corresponding to an email address with a JSON "null" (or Perl `undef'). [ { "name": "James Smythe", "email": "ja...@example.com" }, { "name": null, "email": "j...@example.com" }, { "name": "John S
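
The name/email pairing with a null name can be sketched like this. It is a Python illustration built on the standard `email.utils.getaddresses`, not the Address.pm helper; the function names and the example.com addresses are hypothetical.

```python
import json
from email.utils import getaddresses


def pairs(header_value):
    """Split an address header into [name, email] pairs; name is None if absent."""
    return [[name or None, addr] for name, addr in getaddresses([header_value])]


def to_jmap(header_value):
    """Shape the pairs like JMAP EmailAddress objects (None becomes JSON null)."""
    return [{"name": n, "email": e} for n, e in pairs(header_value)]


if __name__ == "__main__":
    print(json.dumps(to_jmap("James Smythe <james@example.com>, jane@example.com")))
```

Keeping the missing name as None, instead of an empty string, lets the JSON encoder emit `null` without any special casing.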

[PATCH 3/4] lei: use client env as-is, drop daemon-env command

2021-01-05 Thread Eric Wong
There may be subtle misbehaviours when mixing the existing daemon env and the client-supplied env. Just do the simplest thing and use the client env as-is. We'll also start the ->event_step callback since we'll need to remember some things for long-lived commands. --- lib/PublicInbox/LEI.pm | 38

[PATCH 2/4] lei: automatic pager support

2021-01-05 Thread Eric Wong
Just like git, we'll start a pager when outputting to a terminal for user-friendliness when reading many messages. --- lib/PublicInbox/LEI.pm | 30 -- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm index 9a

JSON pretty-printing [was: [4/4] ... (and maybe lei)]

2021-01-05 Thread Eric Wong
Eric Wong wrote: > [ > { "name": "James Smythe", "email": "ja...@example.com" }, > { "name": null, "email": "j...@example.com" }, > { "name": "John Smith", "email": "

JSON field names in terminal/pager output

2021-01-05 Thread Eric Wong
avior which # needs MMDDHHMMSS (all digits). Getting Xapian to parse # dates from Perl (w/o custom C++) isn't possible, yet. # dt: is the date header, "UTCDate" in JMAP. "f": "Eric Wong ", # "from": might be more obvious, but seeing it th

[PATCH] imap: fix uninitialized var on MSN search miss

2021-01-05 Thread Eric Wong
It seems only triggered by bots trying to steal information. --- lib/PublicInbox/IMAP.pm | 2 +- t/imapd.t | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/IMAP.pm b/lib/PublicInbox/IMAP.pm index 68a7e050..226e98a2 100644 --- a/lib/PublicInbox/

Re: JSON field names in terminal/pager output

2021-01-06 Thread Eric Wong
Kyle Meyer wrote: > Eric Wong writes: > > > Are "f", "s", "t", "c" acceptable field names to show in JSON > > output? (instead of from/subject/to/cc) > > In my view they are, and, as you mention next, I like that they align >

Re: [PATCH] v2writable: exact discontiguous history handling

2021-01-09 Thread Eric Wong
Pushed as 392533147f50061d93cb9ed82abf98067dde5472 -- unsubscribe: one-click, see List-Unsubscribe header archive: https://public-inbox.org/meta/

[PATCH 00/22] lei query overview views

2021-01-10 Thread Eric Wong
terruptible... The wq_* IPC stuff will be reused in the normal read-only WWW/IMAP search at some point, too. Eric Wong (22): lei query + pagination sorta working lei q: deduplicate smsg ds: block signals when reaping ipc: add support for asynchronous callbacks cmd_ipc: send FDs with buff

[PATCH 03/22] ds: block signals when reaping

2021-01-10 Thread Eric Wong
This lets us call dwaitpid long before a process exits and not have to wait around for it. This is advantageous for lei where we can run dwaitpid on the pager as soon as we spawn it, instead of waiting for a client socket to go away on DESTROY. --- lib/PublicInbox/DS.pm | 16 +++

[PATCH 02/22] lei q: deduplicate smsg

2021-01-10 Thread Eric Wong
We don't want duplicate messages in results overviews, either. --- lib/PublicInbox/LeiDedupe.pm | 29 - lib/PublicInbox/LeiQuery.pm | 5 + t/lei_dedupe.t | 14 ++ 3 files changed, 47 insertions(+), 1 deletion(-) diff --git a/lib/PublicIn

[PATCH 01/22] lei query + pagination sorta working

2021-01-10 Thread Eric Wong
Parallelism and interactivity with pager + SIGPIPE needs work; but results are shown and phrase search works without shell users having to apply Xapian quoting rules on top of standard shell quoting. --- MANIFEST | 1 + lib/PublicInbox/LEI.pm | 12 +-- lib/PublicIn

[PATCH 05/22] cmd_ipc: send FDs with buffer payload

2021-01-10 Thread Eric Wong
For another step in syscall reduction, we'll support transferring 3 FDs and a buffer with a single sendmsg/recvmsg syscall using Socket::MsgHdr if available. Beyond script/lei itself, this will be used for internal IPC between search backends (perhaps with SOCK_SEQPACKET). There's a chance thi
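
Passing three descriptors together with a buffer in one sendmsg/recvmsg pair can be sketched as follows. This is a Python illustration (it needs Python 3.9 or later for `socket.send_fds` and `socket.recv_fds`), not the CmdIPC code; `send_cmd` and `recv_cmd` are hypothetical names.

```python
import os
import socket


def send_cmd(sock, payload, fds):
    """Send `payload` plus open descriptors in a single sendmsg(2) call."""
    return socket.send_fds(sock, [payload], fds)


def recv_cmd(sock, bufsize=4096, maxfds=3):
    """Receive one payload and up to `maxfds` descriptors via recvmsg(2)."""
    data, fds, _flags, _addr = socket.recv_fds(sock, bufsize, maxfds)
    return data, fds


if __name__ == "__main__":
    a, b = socket.socketpair()
    send_cmd(a, b"hello", [0, 1, 2])  # stdin, stdout, stderr
    data, fds = recv_cmd(b)
    os.write(fds[1], data + b"\n")  # writes through the received stdout copy
```

The descriptors travel as SCM_RIGHTS ancillary data alongside the payload, so the receiver gets the command and its stdio in the same call.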

[PATCH 08/22] ipc: eliminate ipc_worker_stop method

2021-01-10 Thread Eric Wong
We can just EOF the pipe, and instead rely on per-class error handling to deal with uncommitted transactions and what not. --- lib/PublicInbox/IPC.pm | 8 1 file changed, 8 deletions(-) diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm index 27ea90de..0c5205c1 100644 --- a/lib

[PATCH 04/22] ipc: add support for asynchronous callbacks

2021-01-10 Thread Eric Wong
Similar to git->cat_async, this will let us deal with responses asynchronously, as well as being able to mix synchronous and asynchronous code transparently (though perhaps not optimally). --- lib/PublicInbox/IPC.pm | 53 +++--- t/ipc.t| 25 +

[PATCH 07/22] ipc: work queue support via SOCK_SEQPACKET

2021-01-10 Thread Eric Wong
This will allow any number of younger sibling processes to communicate with older siblings directly without relying on a mediator process. This is intended to be useful for distributing search work across multiple workers without caring which worker hits it (we only care about shard members). And
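
The property that makes SOCK_SEQPACKET suitable for a shared work queue is that each send arrives as one intact record. A minimal Python sketch, assuming a platform with AF_UNIX SOCK_SEQPACKET support such as Linux; the function name is hypothetical and no worker processes are forked here.

```python
import socket


def seqpacket_demo(messages):
    """Send each message over a SOCK_SEQPACKET pair and read them back one by one."""
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    try:
        for m in messages:
            tx.send(m)
        # each recv() returns exactly one record, never a merge of two
        return [rx.recv(4096) for _ in messages]
    finally:
        tx.close()
        rx.close()
```

With several workers blocked in recv on the same socket, each record would go to exactly one of them, so no mediator process is needed to split the stream.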

[PATCH 06/22] ipc: avoid excessive evals

2021-01-10 Thread Eric Wong
We should not need an eval for warning with our code base. Nowadays, dwaitpid() automatically does the right thing regardless of whether we're in the event loop, so no eval is needed there, either. --- lib/PublicInbox/IPC.pm | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff

[PATCH 09/22] ipc: wq: support dynamic worker count change

2021-01-10 Thread Eric Wong
Increasing/decreasing workers count will be useful in some situations. --- lib/PublicInbox/IPC.pm | 99 ++ t/ipc.t| 9 2 files changed, 81 insertions(+), 27 deletions(-) diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm inde

[PATCH 10/22] ipc: drop -ipc_parent_pid field

2021-01-10 Thread Eric Wong
It is not used anywhere. --- lib/PublicInbox/IPC.pm | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/PublicInbox/IPC.pm b/lib/PublicInbox/IPC.pm index 5bca3627..4d29532c 100644 --- a/lib/PublicInbox/IPC.pm +++ b/lib/PublicInbox/IPC.pm @@ -107,7 +107,6 @@ sub ipc_worker_spawn { define

[PATCH 12/22] lei: rename $w to $wpager for warning message

2021-01-10 Thread Eric Wong
Perl keeps track of the variable name for error messages when auto-closing an FD fails, so this will help identify the source of a close error. --- lib/PublicInbox/LEI.pm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm inde

[PATCH 11/22] ipc: DESTROY and wq_workers methods

2021-01-10 Thread Eric Wong
We'll enable automatic cleanup when IPC classes go out-of-scope to avoid leaving zombies around. ->wq_workers will be a useful convenience method to change worker counts. --- lib/PublicInbox/IPC.pm | 23 +-- t/ipc.t| 10 +++--- 2 files changed, 28 insertion

[PATCH 14/22] lei: query: ensure pager exit is instantaneous

2021-01-10 Thread Eric Wong
Improve interactivity and user experience by allowing the user to return to the terminal immediately when the pager is exited (e.g. hitting the `q' key in less(1)). This is a massive change which restructures query handling to allow parallel search when --thread expansion is in use and offloading

[PATCH 13/22] lei: fix oneshot TTY detection by passing STD*{GLOB}

2021-01-10 Thread Eric Wong
... instead of STD*{IO}. I'm not sure why *STDOUT{IO} being an IO::File object disqualifies it from the "-t" perlop check returning true on TTY, but it does. So use *STDOUT{GLOB} for now. http://nntp.perl.org/group/perl.perl5.porters/258760 Message-ID: --- lib/PublicInbox/LEI.pm | 6 +++--- 1

[PATCH 17/22] ipc: drop unused fields, default sighandlers for wq

2021-01-10 Thread Eric Wong
Relying on signal handlers to kill a particular worker was a laggy/racy idea and I gave up on the idea of targeting workers explicitly and instead chose to make wq_worker_decr stop the next idle worker ->wq_exit. We will however attempt to support sending signals to a process group. --- lib/Publ

[PATCH 16/22] ipc: fix IO::FDPass use with a worker limit of 1

2021-01-10 Thread Eric Wong
IO::FDPass is our last choice for implementing the workqueue because its lack of atomicity makes it impossible to guarantee all requests of a single group hit a single worker out of many. So the only way to use IO::FDPass for workqueues is to only have a single worker. A single worker still buys

[PATCH 15/22] ipc: start supporting sending/receiving more than 3 FDs

2021-01-10 Thread Eric Wong
Actually, sending 4 FDs will be useful for lei internal xsearch work once we start accepting input from stdin. It won't be used with the lightweight lei(1) client, however. For WWW (eventually), a single FD may be enough. --- lib/PublicInbox/CmdIPC1.pm| 16 +++- lib/PublicInbox/CmdIP

[PATCH 18/22] lei: get rid of client {pid} field

2021-01-10 Thread Eric Wong
Using kill(2) is too dangerous since extremely long queries may allow the original PID of the aborted lei(1) client process to be recycled by a new process. It would be bad if the lei_xsearch worker process issued a kill on the wrong process. So just rely on sending the exit message via socket. --

[PATCH 19/22] lei: fork + FD cleanup

2021-01-10 Thread Eric Wong
Do a better job of closing FDs that we don't want shared with the work queue workers. We'll also fix naming and use "atfork_prepare" instead of "atfork_parent" to match pthread_atfork(3) naming. --- lib/PublicInbox/IPC.pm| 57 +++ lib/PublicInbox/LEI.pm

[PATCH 22/22] lei: query: restore JSON output overview

2021-01-10 Thread Eric Wong
This internal API is better suited for fork-friendliness (but locking + dedupe still needs to be re-added). Normal "json" is the default, though stream-friendly "concatjson" and "jsonl" (AKA "ndjson" AKA "ldjson") all seem working (though tests aren't working, yet). For normal "json", the biggest
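
For readers unfamiliar with the three output names mentioned above, the following is a minimal sketch of how they differ, using only the core JSON::PP module. It is not taken from the patch: the `@results` data and all variable names are made up for illustration, and lei's actual output code may be structured quite differently.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical search results; the field names are illustrative only.
my @results = (
    { subject => 'first message',  n => 1 },
    { subject => 'second message', n => 2 },
);

# canonical(1) sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical(1);

# "json": a single array holding every result.  A reader generally
# needs the closing "]" before the document is valid.
my $as_json = $json->encode(\@results);

# "jsonl" (AKA "ndjson"): one complete object per line, so a reader
# can process results as each line arrives.
my $as_jsonl = join('', map { $json->encode($_) . "\n" } @results);

# "concatjson": complete objects written back to back with no
# separator required between them.
my $as_concat = join('', map { $json->encode($_) } @results);

# An incremental parser can split a concatenated stream back into
# objects; in list context incr_parse returns every object it finds.
my @decoded = JSON::PP->new->incr_parse($as_concat);

print $as_json, "\n", $as_jsonl, $as_concat, "\n";
```

The stream-friendly forms let a writer emit each result as soon as it is ready, which is presumably why they pair well with a fork-friendly design; the plain array form needs some coordination over who writes the opening and closing brackets.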

[PATCH 21/22] lei_xsearch: transfer 4 FDs internally, drop IO::FDPass

2021-01-10 Thread Eric Wong
It's easier to make the code more generic by transferring all four FDs (std(in|out|err) + socket) instead of omitting stdin. We'll be reading from stdin on some imports, and possibly outputting to stdout, so omitting stdin now would needlessly complicate things. The differences with IO::FDPass "1

[PATCH 20/22] lei: run pager in client script

2021-01-10 Thread Eric Wong
While most single keystrokes work fine when the pager is launched from the background daemon, Ctrl-C and WINCH can cause strangeness when connected to the wrong terminal. --- lib/PublicInbox/LEI.pm | 26 +++--- lib/PublicInbox/LeiQuery.pm | 5 +++-- script/lei

SOCK_SEQPACKET portability for AF_UNIX?

2021-01-11 Thread Eric Wong
Any concerns about the portability of SOCK_SEQPACKET for AF_UNIX (aka AF_LOCAL) sockets? I know Linux has had it for ages, and FreeBSD 9+, too, I think... (FreeBSD 11.x+ definitely does) I've been using them for many years in less-popular projects and (AFAIK) those projects only have Linux users.
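
One rough way to check a given platform is a small probe along the following lines. This is only a sketch and not code from public-inbox: it assumes a Perl whose Socket module knows the SOCK_SEQPACKET constant (the `eval` guards against platforms where it does not), and the message strings and variable names are arbitrary.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX);

my @got;
# Socket croaks when asked for a constant the platform lacks,
# so resolve SOCK_SEQPACKET inside an eval.
my $type = eval { Socket::SOCK_SEQPACKET() };
my $supported = defined($type) &&
    socketpair(my $s1, my $s2, AF_UNIX, $type, 0) ? 1 : 0;

if ($supported) {
    # Two separate sends should arrive as two separate records,
    # unlike SOCK_STREAM, where they could be merged into one read.
    defined(send($s1, 'first', 0))  or die "send: $!";
    defined(send($s1, 'second', 0)) or die "send: $!";
    for (1 .. 2) {
        defined(recv($s2, my $buf, 4096, 0)) or die "recv: $!";
        push @got, $buf;
    }
    print "SOCK_SEQPACKET over AF_UNIX works, records: @got\n";
} else {
    print "SOCK_SEQPACKET over AF_UNIX is not usable here\n";
}
```

A successful run only shows that the socket type exists and preserves record boundaries on that machine; it says nothing about limits on message size or FD passing, which would need separate checks.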

[PATCH 01/14] cmd_ipc: support + test EINTR + EAGAIN, no FDs

2021-01-13 Thread Eric Wong
We'll ensure our {send,recv}_cmd4 implementations are consistent w.r.t. non-blocking and interrupted sockets. We'll also support receiving messages without FDs associated so we don't have to send dummy FDs to keep receivers from reporting EOF. --- lib/PublicInbox/CmdIPC4.pm | 6 +++--- lib/Publi
