Re: [PATCH] lei: support reading MH for convert+import+index

2023-12-16 Thread Eric Wong
Konstantin Ryabitsev wrote:
> Nice, so eventually we should be able to specify the following instead of
> faking out a maildir?
> 
> watch=mh:/var/spool/mlmmj/list.name/archive

Yes, that's the plan.

> > inotify|EVFILT_VNODE watches aren't supported, yet, either.
> 
> In the case of mlmmj it's sufficient to watch the
> /var/spool/mlmmj/list.name/index file for updates, but I don't know how well
> this lends itself to other implementations (I am not at all familiar with MH).

Just watching the directory itself is sufficient (like Maildir)
and will report new files.  We just have to check /\A[0-9]+\z/
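
A minimal sketch of that check, not the code in this patch: the
helper name mh_msg_numbers and the throwaway directory are made up
for illustration.  It lists a directory and keeps only the
all-digit names, which is the same filter a directory watch would
need to apply to each reported file name.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the message numbers found in an MH-style directory, in
# numeric order.  Only names made up entirely of ASCII digits count
# as messages; everything else (.mh_sequences, temporary files,
# an mlmmj-style index file, ...) is skipped.
# NOTE: mh_msg_numbers is a hypothetical helper for illustration.
sub mh_msg_numbers {
	my ($dir) = @_;
	opendir(my $dh, $dir) or die "opendir($dir): $!";
	my @n = grep { /\A[0-9]+\z/ } readdir($dh);
	closedir($dh);
	return sort { $a <=> $b } @n;
}

# demo against a throwaway directory
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(1 2 10 .mh_sequences 3.tmp index)) {
	open(my $fh, '>', "$dir/$name") or die "open($dir/$name): $!";
	close($fh) or die "close($dir/$name): $!";
}
my @found = mh_msg_numbers($dir);
print "@found\n"; # prints "1 2 10"
```

\A and \z anchor the match to the whole name, so "3.tmp" and a name
with a trailing newline are both rejected, and [0-9] avoids the
non-ASCII digits that \d can match on Unicode strings.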



Re: [PATCH] lei: support reading MH for convert+import+index

2023-12-16 Thread Konstantin Ryabitsev
On Sat, Dec 16, 2023 at 01:09:32PM +, Eric Wong wrote:
> The MH format is widely-supported and used by various MUAs such
> as mutt and sylpheed, and a MH-like format is used by mlmmj for
> archives, as well.  Locking implementations for writes are
> inconsistent, so this commit doesn't support writes, yet.

Nice, so eventually we should be able to specify the following instead of
faking out a maildir?

watch=mh:/var/spool/mlmmj/list.name/archive

> inotify|EVFILT_VNODE watches aren't supported, yet, either.

In the case of mlmmj it's sufficient to watch the
/var/spool/mlmmj/list.name/index file for updates, but I don't know how well
this lends itself to other implementations (I am not at all familiar with MH).

-K



[PATCH] lei: support reading MH for convert+import+index

2023-12-16 Thread Eric Wong
The MH format is widely-supported and used by various MUAs such
as mutt and sylpheed, and a MH-like format is used by mlmmj for
archives, as well.  Locking implementations for writes are
inconsistent, so this commit doesn't support writes, yet.

inotify|EVFILT_VNODE watches aren't supported, yet, either.
---
 MANIFEST                       |   3 +
 lib/PublicInbox/LEI.pm         |  13 ++--
 lib/PublicInbox/LeiConvert.pm  |   5 ++
 lib/PublicInbox/LeiImport.pm   |  23 +++
 lib/PublicInbox/LeiImportKw.pm |   2 +-
 lib/PublicInbox/LeiIndex.pm    |   2 +-
 lib/PublicInbox/LeiInput.pm    |  52 +---
 lib/PublicInbox/LeiMailSync.pm |  39
 lib/PublicInbox/LeiToMail.pm   |   5 ++
 lib/PublicInbox/MHreader.pm    | 103 +++
 lib/PublicInbox/MdirReader.pm  |   2 +-
 lib/PublicInbox/MdirSort.pm    |  46 ++
 lib/PublicInbox/TestCommon.pm  |  22 ---
 t/mh_reader.t                  | 108 +
 14 files changed, 392 insertions(+), 33 deletions(-)
 create mode 100644 lib/PublicInbox/MHreader.pm
 create mode 100644 lib/PublicInbox/MdirSort.pm
 create mode 100644 t/mh_reader.t

diff --git a/MANIFEST b/MANIFEST
index e22674b7..8bcc3179 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -293,6 +293,7 @@ lib/PublicInbox/Linkify.pm
 lib/PublicInbox/Listener.pm
 lib/PublicInbox/Lock.pm
 lib/PublicInbox/MDA.pm
+lib/PublicInbox/MHreader.pm
 lib/PublicInbox/MID.pm
 lib/PublicInbox/MIME.pm
 lib/PublicInbox/MailDiff.pm
@@ -302,6 +303,7 @@ lib/PublicInbox/MboxGz.pm
 lib/PublicInbox/MboxLock.pm
 lib/PublicInbox/MboxReader.pm
 lib/PublicInbox/MdirReader.pm
+lib/PublicInbox/MdirSort.pm
 lib/PublicInbox/MiscIdx.pm
 lib/PublicInbox/MiscSearch.pm
 lib/PublicInbox/MsgIter.pm
@@ -543,6 +545,7 @@ t/mda-mime.eml
 t/mda.t
 t/mda_filter_rubylang.t
 t/mdir_reader.t
+t/mh_reader.t
 t/mid.t
 t/mime.t
 t/miscsearch.t
diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index 17431518..e0cfd55a 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -267,7 +267,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
'one-time import/update from URL or filesystem',
qw(stdin| offset=i recursive|r exclude=s include|I=s new-only
lock=s@ in-format|F=s kw! verbose|v+ incremental! mail-sync!
-   commit-delay=i),
+   commit-delay=i sort|s:s@),
@net_opt, @c_opt ],
 'forget-mail-sync' => [ 'LOCATION...',
'forget sync information for a mail folder', @c_opt ],
@@ -280,7 +280,7 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 'convert' => [ 'LOCATION...|--stdin',
'one-time conversion from URL or filesystem to another format',
qw(stdin| in-format|F=s out-format|f=s output|mfolder|o=s lock=s@ kw!
-   rsyncable),
+   rsyncable sort|s:s@),
@net_opt, @c_opt ],
 'p2q' => [ 'LOCATION_OR_COMMIT...|--stdin',
"use a patch to generate a query for `lei q --stdin'",
@@ -321,6 +321,9 @@ import => [ 'LOCATION...|--stdin [LABELS...]',
 my $stdin_formats = [ 'MAIL_FORMAT|eml|mboxrd|mboxcl2|mboxcl|mboxo',
'specify message input format' ];
 my $ls_format = [ 'OUT|plain|json|null', 'listing output format' ];
+my $sort_out = [ 'VAL|received|relevance|docid',
+   "order of results is `--output'-dependent"];
+my $sort_in = [ 'sequence|mtime|size', 'sort input (format-dependent)' ];
 
 # we use \x{a0} (non-breaking SP) to avoid wrapping in PublicInbox::LeiHelp
 my %OPTDESC = (
@@ -428,8 +431,10 @@ my %OPTDESC = (
 'limit|n=i@' => ['NUM', 'limit on number of matches (default: 1)' ],
 'offset=i' => ['OFF', 'search result offset (default: 0)'],
 
-'sort|s=s' => [ 'VAL|received|relevance|docid',
-   "order of results is `--output'-dependent"],
+'sort|s=s  q' => $sort_out,
+'sort|s=s  lcat' => $sort_out,
+'sort|s:s@ convert' => $sort_in,
+'sort|s:s@ import' => $sort_in,
 'reverse|r' => 'reverse search results', # like sort(1)
 
 'boost=i' => 'increase/decrease priority of results (default: 0)',
diff --git a/lib/PublicInbox/LeiConvert.pm b/lib/PublicInbox/LeiConvert.pm
index 8f628562..17a952f2 100644
--- a/lib/PublicInbox/LeiConvert.pm
+++ b/lib/PublicInbox/LeiConvert.pm
@@ -28,6 +28,11 @@ sub input_maildir_cb {
$self->{wcb}->(undef, { kw => $kw }, $eml);
 }
 
+sub input_mh_cb {
+   my ($dn, $bn, $kw, $eml, $self) = @_;
+   $self->{wcb}->(undef, { kw => $kw }, $eml);
+}
+
 sub process_inputs { # via wq_do
my ($self) = @_;
local $PublicInbox::DS::in_loop = 0; # force synchronous awaitpid
diff --git a/lib/PublicInbox/LeiImport.pm b/lib/PublicInbox/LeiImport.pm
index c2552bf0..5521188c 100644
--- a/lib/PublicInbox/LeiImport.pm
+++ b/lib/PublicInbox/LeiImport.pm
@@ -53,6 +53,29 @@ sub pmdir_cb { # called via wq_io_do from LeiPmdir->each_mdir_fn
}
 }
 
+sub input_mh_cb {
+   my ($mhdir, $n, $kw, $eml, $self) = @_;
+   substr($mhdir, 0, 0) = 'mh:'; # add prefix
+   my $lse 

[PATCH 2/2] lei: use ->child_error API properly

2023-12-16 Thread Eric Wong
I noticed this bug while developing another feature and tests
were getting SIGHUP (since SIGHUP == 1 on most systems).
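
To illustrate why a literal 1 reads as SIGHUP, here is a small
sketch.  It assumes ->child_error expects a wait(2)-style status
encoded like Perl's $?, which the text above implies but does not
spell out.  decode_status is a hypothetical helper and not part of
lei; how ->child_error treats 0 is not shown here.

```perl
use strict;
use warnings;

# Split a wait(2)-style status (same encoding as Perl's $?) into
# the terminating signal number and the exit code.
# NOTE: decode_status is a hypothetical helper for illustration.
sub decode_status {
	my ($status) = @_;
	my $sig = $status & 127;  # low 7 bits: terminating signal, if any
	my $code = $status >> 8;  # high byte: exit code
	return ($sig, $code);
}

for my $status (1, 1 << 8) {
	my ($sig, $code) = decode_status($status);
	print "status=$status signal=$sig exit=$code\n";
}
# prints:
# status=1 signal=1 exit=0
# status=256 signal=0 exit=1
```

A status of 1 therefore means "killed by signal 1", while an
ordinary exit code of 1 is encoded as 1 << 8.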
---
 lib/PublicInbox/LeiExportKw.pm | 4 ++--
 lib/PublicInbox/LeiMirror.pm   | 2 +-
 lib/PublicInbox/LeiToMail.pm   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/PublicInbox/LeiExportKw.pm b/lib/PublicInbox/LeiExportKw.pm
index d2396fa7..16f069da 100644
--- a/lib/PublicInbox/LeiExportKw.pm
+++ b/lib/PublicInbox/LeiExportKw.pm
@@ -38,7 +38,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
} elsif ($! == EEXIST) { # lost race with lei/store?
return;
} elsif ($! != ENOENT) {
-   $lei->child_error(1,
+   $lei->child_error(0,
"E: rename_noreplace($src -> $dst): $!");
} # else loop @try
}
@@ -46,7 +46,7 @@ sub export_kw_md { # LeiMailSync->each_src callback
# both tries failed
my $oidhex = unpack('H*', $oidbin);
my $src = "$mdir/{".join(',', @try)."}/$$id";
-   $lei->child_error(1, "rename_noreplace($src -> $dst) ($oidhex): $e");
+   $lei->child_error(0, "rename_noreplace($src -> $dst) ($oidhex): $e");
for (@try) { return if -e "$mdir/$_/$$id" }
$self->{lms}->clear_src("maildir:$mdir", $id);
 }
diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm
index 0c77a8b5..5353ae61 100644
--- a/lib/PublicInbox/LeiMirror.pm
+++ b/lib/PublicInbox/LeiMirror.pm
@@ -1175,7 +1175,7 @@ sub try_manifest {
local $self->{-local_manifest} = load_current_manifest($self);
local $self->{-new_symlinks} = [];
my ($path_pfx, $n, $multi) = multi_inbox($self, \$path, $m);
-   return $lei->child_error(1, $multi) if !ref($multi);
+   return $lei->child_error(0, $multi) if !ref($multi);
my $v2 = delete $multi->{v2};
if ($v2) {
for my $name (sort keys %$v2) {
diff --git a/lib/PublicInbox/LeiToMail.pm b/lib/PublicInbox/LeiToMail.pm
index a930fc30..071ba113 100644
--- a/lib/PublicInbox/LeiToMail.pm
+++ b/lib/PublicInbox/LeiToMail.pm
@@ -147,9 +147,9 @@ sub git_to_mail { # git->cat_async callback
$type = 'blob';
$size = length($$bref);
}
-   $type eq 'blob' or return $self->{lei}->child_error(1,
+   $type eq 'blob' or return $self->{lei}->child_error(0,
"W: $oid is $type (!= blob)");
-   $size or return $self->{lei}->child_error(1,"E: $oid is empty");
+   $size or return $self->{lei}->child_error(0,"E: $oid is empty");
$smsg->{blob} eq $oid or die "BUG: expected=$smsg->{blob}";
$self->{wcb}->($bref, $smsg);
};



[PATCH 1/2] lei index: support +L: labels

2023-12-16 Thread Eric Wong
`lei index' should be capable of indexing the same way
`lei import' does, but without the indexing.  I only noticed
this omission while developing a new feature.
---
 lib/PublicInbox/LEI.pm | 2 +-
 t/lei-index.t          | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/LEI.pm b/lib/PublicInbox/LEI.pm
index a89bdc51..17431518 100644
--- a/lib/PublicInbox/LEI.pm
+++ b/lib/PublicInbox/LEI.pm
@@ -259,7 +259,7 @@ tag => [ 'KEYWORDS... LOCATION...|--stdin',
 
 'reindex' => [ '', 'reindex all locally-indexed messages', @c_opt ],
 
-'index' => [ 'LOCATION...', 'one-time index from URL or filesystem',
+'index' => [ 'LOCATION... [LABELS...]', 'one-time index from URL or filesystem',
qw(in-format|F=s kw! offset=i recursive|r exclude=s include|I=s
verbose|v+ incremental!), @net_opt, # mainly for --proxy=
 @c_opt ],
diff --git a/t/lei-index.t b/t/lei-index.t
index c31b1c3c..2b28f1be 100644
--- a/t/lei-index.t
+++ b/t/lei-index.t
@@ -48,9 +48,10 @@ symlink(File::Spec->rel2abs('t/mda-mime.eml'), "$tmpdir/md1/cur/x:2,S") or
 test_lei({ tmpdir => $tmpdir }, sub {
my $store_path = "$ENV{HOME}/.local/share/lei/store/";
 
-   lei_ok('index', "$tmpdir/md");
+   lei_ok qw(index +L:md), "$tmpdir/md";
lei_ok(qw(q mid:q...@example.com));
my $res_a = json_utf8->decode($lei_out);
+   is_deeply $res_a->[0]->{L}, [ 'md' ], 'label set on index';
my $blob = $res_a->[0]->{'blob'};
like($blob, qr/\A[0-9a-f]{40,}\z/, 'got blob from qp@example');
lei_ok(qw(-C / blob), $blob);



[PATCH 0/2] lei bugfixes

2023-12-16 Thread Eric Wong
Eric Wong (2):
  lei index: support +L: labels
  lei: use ->child_error API properly

 lib/PublicInbox/LEI.pm         | 2 +-
 lib/PublicInbox/LeiExportKw.pm | 4 ++--
 lib/PublicInbox/LeiMirror.pm   | 2 +-
 lib/PublicInbox/LeiToMail.pm   | 4 ++--
 t/lei-index.t                  | 3 ++-
 5 files changed, 8 insertions(+), 7 deletions(-)