Re: Troubleshooting threads missing from /all/

2021-10-07 Thread Eric Wong
Konstantin Ryabitsev wrote:
> On Thu, Oct 07, 2021 at 08:36:52AM +, Eric Wong wrote:
> > Also, did you capture any error messages to stderr?
> > I suppose you would've told us if you did.
> 
> Yeah, I looked through any place that would have logged an error and I didn't
> really see anything. I expect this would have happened during an extindex run,
> but I didn't see any non-zero exits when I looked through the logs.
> 
> Regarding reindex -- is that something that would make sense to do
> occasionally simply for potential improvements, e.g. similarly to how we
> periodically repack repos with -f for better packs? Or would that be pointless
> churn in the context of xapian?

Yes, I try to note when a reindex is necessary in commit messages.
It takes my system around 2 days to do, typically, but it should
run safely in parallel with everything else.

It should probably be done every release, but I suck at making those :<
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/



Re: Troubleshooting threads missing from /all/

2021-10-07 Thread Eric Wong
Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.

(resend, MTA dropped this part:)

In particular, I just posted a patch to fix the
"Can't bless non-reference value" errors, which could've been causing
some messages to fail indexing completely.

<20211007082932.6985-...@80x24.org>
(overidx: each_by_mid: account for messages being deleted)

Error reporting/handling needs some work... :x



Re: Troubleshooting threads missing from /all/

2021-10-07 Thread Eric Wong
(resend, screwed up something with my MTA :x)

OK.  I tried reproducing the problem even with f28fdcd6d8d6
(content_hash: normalize whitespace before hashing addresses, 2021-10-02)
reverted, but haven't been able to...

So far I've found some gc and dedupe bugs, but something's still
eluding me.  And I also noticed and started fixing another bug
which may necessitate a full --reindex, anyways (at least for
non-ASCII subjects).

Konstantin Ryabitsev wrote:
> [publicinbox "regressions"]
>   address = regressi...@lists.linux.dev
>   url = regressions
>   inboxdir = /srv/public-inbox/lore.kernel.org/regressions
>   indexlevel = full

Btw, "indexlevel = basic" ought to be sufficient if an inbox
is in extindex once bugs are ironed out.  full/medium is
of course helpful if messages are missing from extindex,
though...



Re: Troubleshooting threads missing from /all/

2021-10-07 Thread Konstantin Ryabitsev
On Thu, Oct 07, 2021 at 08:36:52AM +, Eric Wong wrote:
> Also, did you capture any error messages to stderr?
> I suppose you would've told us if you did.

Yeah, I looked through any place that would have logged an error and I didn't
really see anything. I expect this would have happened during an extindex run,
but I didn't see any non-zero exits when I looked through the logs.

Regarding reindex -- is that something that would make sense to do
occasionally simply for potential improvements, e.g. similarly to how we
periodically repack repos with -f for better packs? Or would that be pointless
churn in the context of xapian?

-K



Re: Troubleshooting threads missing from /all/

2021-10-07 Thread Eric Wong
Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.



[PATCH] overidx: each_by_mid: account for messages being deleted

2021-10-07 Thread Eric Wong
This may fix some extindex problems and should get rid of
the "Can't bless non-reference value" errors.
---
 lib/PublicInbox/OverIdx.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 0c8a4d9ee3f8..985abbf4e693 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -158,7 +158,8 @@ SELECT $cols FROM over WHERE over.num = ? LIMIT 1
 
 	foreach (@$nums) {
 		$sth->execute($_->[0]);
-		my $smsg = $sth->fetchrow_hashref;
+		# $cb may delete rows and invalidate nums
+		my $smsg = $sth->fetchrow_hashref // next;
 		$smsg = PublicInbox::Over::load_from_row($smsg);
 		$cb->($self, $smsg, @arg) or return;
 	}
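
For readers who don't follow Perl/DBI, the failure mode the patch guards
against can be sketched in Python with the stdlib sqlite3 module. This is
only an analogue, not the public-inbox code: the table name msg_overview,
its columns, and the helpers each_by_num and drop_next are hypothetical
names made up for illustration. The idea is the same as in the patch: row
numbers are collected up front, the callback may delete rows while the loop
is running, and so a later lookup can come back empty and has to be skipped
rather than passed on.

```python
import sqlite3


def each_by_num(conn, nums, cb):
    """Call cb(conn, row) for every number in nums whose row still exists."""
    for num in nums:
        row = conn.execute(
            "SELECT num, mid FROM msg_overview WHERE num = ? LIMIT 1", (num,)
        ).fetchone()
        if row is None:
            # cb may have deleted this row after nums was collected,
            # so skip it instead of handing None to later code
            continue
        if not cb(conn, row):
            return


# Hypothetical schema, much smaller than a real overview table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msg_overview (num INTEGER PRIMARY KEY, mid TEXT)")
conn.executemany(
    "INSERT INTO msg_overview VALUES (?, ?)",
    [(1, "a@example"), (2, "a@example"), (3, "a@example")],
)

seen = []


def drop_next(conn, row):
    """Record the row number, then delete the following row (if any)."""
    seen.append(row[0])
    conn.execute("DELETE FROM msg_overview WHERE num = ?", (row[0] + 1,))
    return True


# Collect the numbers first, then iterate while the callback deletes rows
nums = [r[0] for r in conn.execute("SELECT num FROM msg_overview ORDER BY num")]
each_by_num(conn, nums, drop_next)
remaining = [r[0] for r in conn.execute("SELECT num FROM msg_overview ORDER BY num")]
```

Without the `row is None` check, the callback would receive None for the
deleted row, which is roughly what the unpatched Perl loop did when it
passed an undefined value on to load_from_row.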