Re: Troubleshooting threads missing from /all/

2021-10-17 Thread Eric Wong
Eric Wong  wrote:
> Konstantin Ryabitsev  wrote:
> > On Sat, Oct 16, 2021 at 09:43:24AM +, Eric Wong wrote:
> > > Eric Wong  wrote:
> > > > Yes.  Though given the current situation with missing messages
> > > > from /all/, I'd wait until a reindex recovers the missing
> > > > messages (and probably a fast fsck checker).
> > > 
> > > I think "public-inbox-extindex --reindex --all --fast" is
> > > reasonably ready as an fsck checker.  I've been running it a
> > > bunch in recent days/weeks and also found+fixed some other bugs
> > > along the way.
> 
> Btw, I'm chasing a separate bug in v2 which causes recycled
> Message-IDs to go missing sometimes from a v2 over.sqlite3;
> which then causes -extindex to lose a message...

I just pushed out commit 325fbe26c3e7731e
(v2: mirrors don't clobber msgs w/ reused Message-IDs, 2021-10-18)

Now I'm reindexing all my v2 inboxes before running
"-extindex --all --reindex --fast".  Fortunately, v2 inboxes
are all "-L basic" so they're not too expensive to reindex.

Really hoping this is the last bug related to indexing for a
while...
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 2/2] v2: mirrors don't clobber msgs w/ reused Message-IDs

2021-10-17 Thread Eric Wong
For odd messages with reused Message-IDs, the second message
showing up in a mirror (via git-fetch + -index) should never
clobber an entry with a different blob in over.

This is noticeable only if the messages arrive in-between
indexing runs.

Fixes: 4441a38481ed ("v2: index forwards (via `git log --reverse')")
---
 MANIFEST  |  1 +
 lib/PublicInbox/V2Writable.pm |  7 ++-
 t/v2index-late-dupe.t | 37 +++
 3 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 t/v2index-late-dupe.t

diff --git a/MANIFEST b/MANIFEST
index b5aae77747dd..af1522d71bd1 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -552,6 +552,7 @@ t/v1-add-remove-add.t
 t/v1reindex.t
 t/v2-add-remove-add.t
 t/v2dupindex.t
+t/v2index-late-dupe.t
 t/v2mda.t
 t/v2mirror.t
 t/v2reindex.t
diff --git a/lib/PublicInbox/V2Writable.pm b/lib/PublicInbox/V2Writable.pm
index 3914383cc9d3..ed5182ae8460 100644
--- a/lib/PublicInbox/V2Writable.pm
+++ b/lib/PublicInbox/V2Writable.pm
@@ -813,8 +813,8 @@ sub index_oid { # cat_async callback
}
}
}
+   my $oidx = $self->{oidx};
if (!defined($num)) { # reuse if reindexing (or duplicates)
-   my $oidx = $self->{oidx};
for my $mid (@$mids) {
($num, $mid0) = $oidx->num_mid0_for_oid($oid, $mid);
last if defined $num;
@@ -822,6 +822,11 @@ sub index_oid { # cat_async callback
}
$mid0 //= do { # is this a number we got before?
$num = $arg->{mm_tmp}->num_for($mids->[0]);
+
+   # don't clobber existing if Message-ID is reused:
+   if (my $x = defined($num) ? $oidx->get_art($num) : undef) {
+   undef($num) if $x->{blob} ne $oid;
+   }
defined($num) ? $mids->[0] : undef;
};
if (!defined($num)) {
diff --git a/t/v2index-late-dupe.t b/t/v2index-late-dupe.t
new file mode 100644
index ..c83e3409044f
--- /dev/null
+++ b/t/v2index-late-dupe.t
@@ -0,0 +1,37 @@
+# Copyright (C) all contributors 
+# License: AGPL-3.0+ 
+#
+# this simulates a mirror path: git fetch && -index
+use strict; use v5.10.1; use PublicInbox::TestCommon;
+use Test::More; # redundant, used for bisect
+require_mods 'v2';
+require PublicInbox::Import;
+require PublicInbox::Inbox;
+require PublicInbox::Git;
+my ($tmpdir, $for_destroy) = tmpdir();
+my $inboxdir = "$tmpdir/i";
+PublicInbox::Import::init_bare(my $e0 = "$inboxdir/git/0.git");
+open my $fh, '>', "$inboxdir/inbox.lock" or xbail $!;
+my $git = PublicInbox::Git->new($e0);
+my $im = PublicInbox::Import->new($git, qw(i i...@example.com));
+$im->{lock_path} = undef;
+$im->{path_type} = 'v2';
+my $eml = eml_load('t/plack-qp.eml');
+ok($im->add($eml), 'add original');
+$im->done;
+run_script([qw(-index -Lbasic), $inboxdir]);
+is($?, 0, 'basic index');
+my $ibx = PublicInbox::Inbox->new({ inboxdir => $inboxdir });
+my $orig = $ibx->over->get_art(1);
+
+my @mid = $eml->header_raw('Message-ID');
+$eml->header_set('Message-ID', @mid, '');
+ok($im->add($eml), 'add another');
+$im->done;
+run_script([qw(-index -Lbasic), $inboxdir]);
+is($?, 0, 'basic index again');
+
+my $after = $ibx->over->get_art(1);
+is_deeply($after, $orig, 'original unchanged') or note explain([$orig,$after]);
+
+done_testing;
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 1/2] extindex: show mismatches for messages deleted from inbox

2021-10-17 Thread Eric Wong
There seems to be a bug in v2 inbox reindexing somewhere...
---
 lib/PublicInbox/ExtSearchIdx.pm | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index f479cf9e1a3f..4b46fa1622ea 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -292,8 +292,8 @@ sub ck_existing { # git->cat_async callback
 
 # is the messages visible in the inbox currently being indexed?
 # return the number if so
-sub cur_ibx_xnum ($$) {
-   my ($req, $bref) = @_;
+sub cur_ibx_xnum ($$;$) {
+   my ($req, $bref, $mismatch) = @_;
my $ibx = $req->{ibx} or die 'BUG: current {ibx} missing';
 
$req->{eml} = PublicInbox::Eml->new($bref);
@@ -303,6 +303,7 @@ sub cur_ibx_xnum ($$) {
my ($id, $prev);
while (my $x = $ibx->over->next_by_mid($mid, \$id, \$prev)) {
return $x->{num} if $x->{blob} eq $req->{oid};
+   push @$mismatch, $x if $mismatch;
}
}
undef;
@@ -317,8 +318,15 @@ sub index_oid { # git->cat_async callback for 'm'
blob => $oid,
}, 'PublicInbox::Smsg';
$new_smsg->set_bytes($$bref, $size);
-   defined($req->{xnum} = cur_ibx_xnum($req, $bref)) or return;
++${$req->{nr}};
+   my $mismatch = [];
+   $req->{xnum} = cur_ibx_xnum($req, $bref, $mismatch) // do {
+   warn "# deleted\n";
+   warn "# mismatch $_->{blob}\n" for @$mismatch;
+   ${$req->{latest_cmt}} = $req->{cur_cmt} //
+   die "BUG: {cur_cmt} unset ($oid)\n";
+   return;
+   };
do_step($req);
 }
 
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 0/2] fix v2 mirrors of reused Message-IDs

2021-10-17 Thread Eric Wong
Eeep! :<

Eric Wong (2):
  extindex: show mismatches for messages deleted from inbox
  v2: mirrors don't clobber msgs w/ reused Message-IDs

 MANIFEST|  1 +
 lib/PublicInbox/ExtSearchIdx.pm | 14 ++---
 lib/PublicInbox/V2Writable.pm   |  7 ++-
 t/v2index-late-dupe.t   | 37 +
 4 files changed, 55 insertions(+), 4 deletions(-)
 create mode 100644 t/v2index-late-dupe.t
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



Re: Troubleshooting threads missing from /all/

2021-10-17 Thread Eric Wong
Konstantin Ryabitsev  wrote:
> On Sat, Oct 16, 2021 at 09:43:24AM +, Eric Wong wrote:
> > Eric Wong  wrote:
> > > Yes.  Though given the current situation with missing messages
> > > from /all/, I'd wait until a reindex recovers the missing
> > > messages (and probably a fast fsck checker).
> > 
> > I think "public-inbox-extindex --reindex --all --fast" is
> > reasonably ready as an fsck checker.  I've been running it a
> > bunch in recent days/weeks and also found+fixed some other bugs
> > along the way.

Btw, I'm chasing a separate bug in v2 which causes recycled
Message-IDs to go missing sometimes from a v2 over.sqlite3;
which then causes -extindex to lose a message...

For example, patches v7 and v8 of
  "btrfs: consolidate device_list_mutex in prepare_sprout to its parent"
reused the same Message-ID, but hitting the v2 inbox directly
https://lore.kernel.org/linux-btrfs/6585e7d938e6600189c1bc7b61a7c76badef18dd.1633003671.git.anand.j...@oracle.com/
doesn't show it, anymore.  --reindex on the v2 inbox seems to
work, but not always...

> Thanks, Eric! I've been out this week for some family time (it was
> Thanksgiving in Canada), which is why I was staying conspicuously silent. :)
> I'll give --reindex --fast a whirl in the next few days.

No problem, but there's more bugs to fix :/
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



Re: Troubleshooting threads missing from /all/

2021-10-17 Thread Konstantin Ryabitsev
On Sat, Oct 16, 2021 at 09:43:24AM +, Eric Wong wrote:
> Eric Wong  wrote:
> > Yes.  Though given the current situation with missing messages
> > from /all/, I'd wait until a reindex recovers the missing
> > messages (and probably a fast fsck checker).
> 
> I think "public-inbox-extindex --reindex --all --fast" is
> reasonably ready as an fsck checker.  I've been running it a
> bunch in recent days/weeks and also found+fixed some other bugs
> along the way.

Thanks, Eric! I've been out this week for some family time (it was
Thanksgiving in Canada), which is why I was staying conspicuously silent. :)
I'll give --reindex --fast a whirl in the next few days.

> With --fast, --reindex takes around 20 minutes for me with
> "--batch-size=20m --no-fsync".  The first run may take longer
> if it has stuff to do.  But running it repeatedly should not
> cause it to complain about unseen/stale/mismatched messages
> (likely the first run will).
> 
> So it's not /really/ fast, but compared to ~35 hours w/o --fast,
> then it's alright.   Either way, --reindex should be safe
> with parallel -index and -extindex being run by cronjobs.

Absolutely, thanks for working on this.

-K



[PATCH 2/4] extindex: retry sync_inbox before reindex

2021-10-17 Thread Eric Wong
Ensure the num highwater mark of the target inbox is stable
before using it.  Otherwise we may end up repeating work
done to index a message.
---
 lib/PublicInbox/ExtSearchIdx.pm | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 67d720368922..daff656d1ac5 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -859,14 +859,20 @@ sub _reindex_check_ibx ($$$) {
my $slice = 1;
my $opt = { limit => $slice };
my ($beg, $end) = (1, $slice);
-   my $err = sync_inbox($self, $sync, $ibx) and return;
-   my $max = $ibx->mm->num_highwater;
+   my $ekey = $ibx->eidx_key;
+   my ($max, $max0);
+   do {
+   $max0 = $ibx->mm->num_highwater;
+   sync_inbox($self, $sync, $ibx) and return; # warned
+   $max = $ibx->mm->num_highwater;
+   return if $sync->{quit};
+   } while ($max > $max0 &&
+   warn("# $ekey moved $max0..$max, resyncing..\n"));
$end = $max if $end > $max;
 
# first, check if we missed any messages in target $ibx
my $msgs;
my $pr = $sync->{-opt}->{-progress};
-   my $ekey = $ibx->eidx_key;
local $sync->{-regen_fmt} = "$ekey checking %u/$max\n";
${$sync->{nr}} = 0;
my $fast = $sync->{-opt}->{fast};
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 4/4] extindex: better locations for {quit} checks

2021-10-17 Thread Eric Wong
Check for graceful termination at every message since it's
a fairly inexpensive check.
---
 lib/PublicInbox/ExtSearchIdx.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index cb5256a2c562..f479cf9e1a3f 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -908,10 +908,9 @@ ibx_id = ? AND xnum >= ? AND xnum <= ?
for my $num (@$docids) {
$self->{oidx}->eidxq_add($num);
}
-   return if $sync->{quit};
}
+   return if $sync->{quit};
}
-   return if $sync->{quit};
next unless scalar keys %x3m;
$self->git->async_wait_all; # wait for reindex_unseen
 
@@ -936,6 +935,7 @@ BUG: (non-fatal) $ekey #$xnum $smsg->{blob} still matches 
(old exp: $exp)
for my $i (@$docids) {
_unref_doc($sync, $i, $ibx, $xnum, $bin);
}
+   return if $sync->{quit};
}
}
defined($hi) and ($hi < $max) and
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 3/4] extindex: guard against false mismatch unrefs

2021-10-17 Thread Eric Wong
I'm not sure if this is a bug or not (or it could be
an old bug in the v2 indexing code).
---
 lib/PublicInbox/ExtSearchIdx.pm | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index daff656d1ac5..cb5256a2c562 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -921,6 +921,16 @@ ibx_id = ? AND xnum >= ? AND xnum <= ?
my ($xnum, $hex) = unpack('JH*', $k);
my $bin = pack('H*', $hex);
my $exp = $mismatch{$xnum};
+   if (defined $exp) {
+   my $smsg = $ibx->over->get_art($xnum) // next;
+   # $xnum may be expired by another process
+   if ($smsg->{blob} eq $hex) {
+   warn <<"";
+BUG: (non-fatal) $ekey #$xnum $smsg->{blob} still matches (old exp: $exp)
+
+   next;
+   } # else: continue to unref
+   }
my $m = defined($exp) ? "mismatch (!= $exp)" : 'stale';
warn("# $xnum:$hex (#@$docids): $m\n");
for my $i (@$docids) {
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 0/4] extindex tweaks and small fixes

2021-10-17 Thread Eric Wong
Probably nothing of note, but some extra safety and
redundant work elimination.

Eric Wong (4):
  extindex: use localtime to display lock time
  extindex: retry sync_inbox before reindex
  extindex: guard against false mismatch unrefs
  extindex: better locations for {quit} checks

 lib/PublicInbox/ExtSearchIdx.pm | 31 ---
 1 file changed, 24 insertions(+), 7 deletions(-)

--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



[PATCH 1/4] extindex: use localtime to display lock time

2021-10-17 Thread Eric Wong
Since this is intended for use on the command-line,
include TZ offset in time and try to shorten the
message a bit so it wraps less on a terminal.
---
 lib/PublicInbox/ExtSearchIdx.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/PublicInbox/ExtSearchIdx.pm b/lib/PublicInbox/ExtSearchIdx.pm
index 69d048fb7342..67d720368922 100644
--- a/lib/PublicInbox/ExtSearchIdx.pm
+++ b/lib/PublicInbox/ExtSearchIdx.pm
@@ -719,11 +719,12 @@ sub eidxq_lock_acquire ($) {
return $locked if $locked eq $cur;
}
my ($pid, $time, $euid, $ident) = split(/-/, $cur, 4);
-   my $t = strftime('%Y-%m-%d %k:%M:%S', gmtime($time));
+   my $t = strftime('%Y-%m-%d %k:%M %z', localtime($time));
+   local $self->{current_info} = 'eidxq';
if ($euid == $> && $ident eq host_ident) {
if (kill(0, $pid)) {
warn