Re: Troubleshooting threads missing from /all/
Konstantin Ryabitsev wrote:
> On Thu, Oct 07, 2021 at 08:36:52AM +, Eric Wong wrote:
> > Also, did you capture any error messages to stderr?
> > I suppose you would've told us if you did.
>
> Yeah, I looked through any place that would have logged an error and I didn't
> really see anything. I expect this would have happened during an extindex run,
> but I didn't see any non-zero exits when I looked through the logs.
>
> Regarding reindex -- is that something that would make sense to do
> occasionally simply for potential improvements, e.g. similarly to how we
> periodically repack repos with -f for better packs? Or would that be pointless
> churn in the context of xapian?

Yes, I try to note when a reindex is necessary in commit messages.

It takes my system around 2 days to do, typically, but it should
run safely in parallel with everything else.

It should probably be done every release, but I suck at making those :<
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/
Re: Troubleshooting threads missing from /all/
Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.

(resend, MTA dropped this part:)

In particular, I just posted a patch to fix the "Can't bless
non-reference value" errors, which could've been causing some
messages to fail indexing completely.

<20211007082932.6985-...@80x24.org>
(overidx: each_by_mid: account for messages being deleted)

Error reporting/handling needs some work... :x
Re: Troubleshooting threads missing from /all/
(resend, screwed up something with my MTA :x)

OK. I tried reproducing the problem even with f28fdcd6d8d6
(content_hash: normalize whitespace before hashing addresses, 2021-10-02)
reverted, but haven't been able to...

So far I've found some gc and dedupe bugs, but something's still eluding me.
And I also noticed and started fixing another bug which may necessitate
a full --reindex, anyways (at least for non-ASCII subjects).

Konstantin Ryabitsev wrote:
> [publicinbox "regressions"]
>         address = regressi...@lists.linux.dev
>         url = regressions
>         inboxdir = /srv/public-inbox/lore.kernel.org/regressions
>         indexlevel = full

Btw, "indexlevel = basic" ought to be sufficient if an inbox is in
extindex once bugs are ironed out. full/medium is of course helpful
if messages are missing from extindex, though...
Re: Troubleshooting threads missing from /all/
On Thu, Oct 07, 2021 at 08:36:52AM +, Eric Wong wrote:
> Also, did you capture any error messages to stderr?
> I suppose you would've told us if you did.

Yeah, I looked through any place that would have logged an error and I didn't
really see anything. I expect this would have happened during an extindex run,
but I didn't see any non-zero exits when I looked through the logs.

Regarding reindex -- is that something that would make sense to do
occasionally simply for potential improvements, e.g. similarly to how we
periodically repack repos with -f for better packs? Or would that be pointless
churn in the context of xapian?

-K
Re: Troubleshooting threads missing from /all/
Also, did you capture any error messages to stderr?
I suppose you would've told us if you did.
[PATCH] overidx: each_by_mid: account for messages being deleted
This may fix some extindex problems and should get rid of the
"Can't bless non-reference value" errors.
---
 lib/PublicInbox/OverIdx.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/OverIdx.pm b/lib/PublicInbox/OverIdx.pm
index 0c8a4d9ee3f8..985abbf4e693 100644
--- a/lib/PublicInbox/OverIdx.pm
+++ b/lib/PublicInbox/OverIdx.pm
@@ -158,7 +158,8 @@ SELECT $cols FROM over WHERE over.num = ? LIMIT 1
 	foreach (@$nums) {
 		$sth->execute($_->[0]);
-		my $smsg = $sth->fetchrow_hashref;
+		# $cb may delete rows and invalidate nums
+		my $smsg = $sth->fetchrow_hashref // next;
 		$smsg = PublicInbox::Over::load_from_row($smsg);
 		$cb->($self, $smsg, @arg) or return;
 	}
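
For readers who want to see the failure mode in isolation, here is a
rough Python/sqlite3 analogue of the loop the patch touches. It is only
a sketch and not public-inbox code: the table name "msgs", its columns,
and the demo() and cb() helpers are all made up for illustration. The
idea it shows is the one the patch comment names: the list of row
numbers is collected up front, the callback may delete a later row, and
the per-row lookup then comes back empty, so the loop skips that row
instead of passing an empty result onward.

```python
import sqlite3


def each_by_mid(conn, mid, cb):
    """Call cb(conn, row) for every row matching mid; cb may delete rows."""
    # Snapshot the matching row numbers first (the role @$nums plays in the patch).
    nums = [r[0] for r in conn.execute(
        "SELECT num FROM msgs WHERE mid = ? ORDER BY num", (mid,))]
    visited = []
    for num in nums:
        row = conn.execute(
            "SELECT num, mid FROM msgs WHERE num = ? LIMIT 1", (num,)).fetchone()
        if row is None:
            # cb deleted this row after nums was collected: skip it,
            # which is what "// next" does in the Perl loop.
            continue
        visited.append(row[0])
        if not cb(conn, row):
            break
    return visited


def demo():
    # Hypothetical schema, far simpler than the real over.sqlite3 layout.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE msgs (num INTEGER PRIMARY KEY, mid TEXT)")
    conn.executemany("INSERT INTO msgs VALUES (?, ?)",
                     [(1, "a@example"), (2, "a@example"), (3, "a@example")])

    def cb(conn, row):
        # Hypothetical callback: handling row 1 deletes row 2 as a side effect.
        if row[0] == 1:
            conn.execute("DELETE FROM msgs WHERE num = 2")
        return True

    return each_by_mid(conn, "a@example", cb)
```

Calling demo() returns [1, 3]: row 2 is gone by the time the loop
reaches it and is skipped. Without the None check, the code after the
lookup would receive an empty result, which is the situation that
produced "Can't bless non-reference value" in the Perl version.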