Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 28, 2023 at 06:20:03PM +, Eric Wong wrote: > > Though being able to find unanswered threads could be helpful. > > Note, I'm not saying it's not a cool feature. :) However, I imagine people > would be more interested in searching for something like

Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Konstantin Ryabitsev
On Tue, Nov 28, 2023 at 06:20:03PM +, Eric Wong wrote: > > Ah. I think here is enough to just say "s:* AND NOT s:PATCH" without > > introducing additional xapian indexing parameters. Though, perhaps the web > > interface can also gain a "collapse threads" view? > > topics_new.html /

Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 28, 2023 at 05:35:09PM +, Eric Wong wrote: > > > I understand the reasoning, but I'm not sure we should be trying too hard > > > to > > > make public-inbox a patch tracking platform. What makes lei great is > > > ability > > > to automatically find

[PATCH 15/14] www: load cindex join data for ->ALL, too

2023-11-28 Thread Eric Wong
This ensures the /all/ extindex can have auto-associations with coderepos just like normal inboxes do. --- lib/PublicInbox/CodeSearch.pm | 9 + 1 file changed, 9 insertions(+) diff --git a/lib/PublicInbox/CodeSearch.pm b/lib/PublicInbox/CodeSearch.pm index 7c0dd063..5c5774cf 100644 ---

Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Konstantin Ryabitsev
On Tue, Nov 28, 2023 at 05:35:09PM +, Eric Wong wrote: > > I understand the reasoning, but I'm not sure we should be trying too hard to > > make public-inbox a patch tracking platform. What makes lei great is ability > > to automatically find and retrieve entire threads -- I feel like we

[PATCH 1/4] lei q: fix --no-import-before completion + docs

2023-11-28 Thread Eric Wong
--no-import-before skips importing entire messages, not just keywords, so it can cause permanent data loss if -o is pointed to precious data. --- Documentation/lei-q.pod | 5 +++-- lib/PublicInbox/LEI.pm | 1 + t/lei-q-kw.t| 19 --- 3 files changed, 20

[PATCH 2/4] www: mail_diff: fix optional address obfuscation

2023-11-28 Thread Eric Wong
We need to load the proper package and fully-qualify the sub call since we shouldn't load Hval in lei. Some users use this feature even if it's broken, oh well :< --- lib/PublicInbox/MailDiff.pm | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/MailDiff.pm

[PATCH 4/4] www: mail_diff: add missing tag

2023-11-28 Thread Eric Wong
Found by tidy(1) while dealing with other stuff. --- lib/PublicInbox/MailDiff.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/MailDiff.pm b/lib/PublicInbox/MailDiff.pm index 89284e39..e4e262ef 100644 --- a/lib/PublicInbox/MailDiff.pm +++

[PATCH 0/4] non-cindex-related stuff

2023-11-28 Thread Eric Wong
Well, I actually found the mail_diff bugs while looking into micro-optimizing -cindex. Eric Wong (4): lei q: fix --no-import-before completion + docs www: mail_diff: fix optional address obfuscation www: mail_diff: add final newline before diffing www: mail_diff: add missing tag

[PATCH 3/4] www: mail_diff: add final newline before diffing

2023-11-28 Thread Eric Wong
This gets rid of the "\ No newline at end of file" since it's distracting noise. --- lib/PublicInbox/MailDiff.pm | 2 +- t/lei-mail-diff.t | 1 + t/psgi_v2.t | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/MailDiff.pm

Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 28, 2023 at 12:10:28AM +, Eric Wong wrote: > > Would they be useful? > > > > It's not currently possible to quickly search for whether or not > > a term (e.g. patchid:) is present in a Xapian document. Having > > the ability to do so would make it

Re: extra search flags and params? (ispatch, replycount, ...)

2023-11-28 Thread Konstantin Ryabitsev
On Tue, Nov 28, 2023 at 12:10:28AM +, Eric Wong wrote: > Would they be useful? > > It's not currently possible to quickly search for whether or not > a term (e.g. patchid:) is present in a Xapian document. Having > the ability to do so would make it easier to find non-patch messages, > or

[PATCH 05/14] xap_helper.h: move cindex endpoints to separate file

2023-11-28 Thread Eric Wong
It ought to help a bit with organization since xap_helper.h is getting somewhat large and we'll need new endpoints to support WWW, lei, and whatever else that needs to come. --- MANIFEST| 1 + lib/PublicInbox/XapHelperCxx.pm | 10 +- lib/PublicInbox/xap_helper.h|

[PATCH 02/14] t/cindex*: require SCM_RIGHTS for these tests

2023-11-28 Thread Eric Wong
Code search will require SCM_RIGHTS, and Inline::C on BSDs probably isn't too onerous a dependency for new features as all the ones I've tested have it packaged. Furthermore, requiring SCM_RIGHTS isn't far off since OpenBSD's Perl is patched to route the `syscall' perlop through libc[1], while

[PATCH 12/14] admin: resolve_git_dir respects symlinks

2023-11-28 Thread Eric Wong
Absolute pathnames of git coderepos are stored in the cindex, but we should favor paths relative to $ENV{PWD} since it respects symlinks in the hierarchy. Respecting symlinks makes it easier to migrate cindex to new storage as old storage wears out and to relocate the storage device onto another

[PATCH 09/14] git: speed up ->git_path for non-worktrees

2023-11-28 Thread Eric Wong
Only worktrees need to use `git rev-parse --git-path', so avoid the spawn overhead of a new process. With the SolverGit.pm limit on coderepo scans disabled and scanning over 800 git repos for git@vger matches, this reduces xt/solver.t times by roughly 25%. --- lib/PublicInbox/Git.pm | 17

[PATCH 07/14] hval: use File::Spec to make relative paths for href

2023-11-28 Thread Eric Wong
File::Spec->abs2rel doesn't touch the filesystem at all when given an absolute base arg ($env->{PATH_INFO}), so we can rely on it to generate relative links that work with the `mount' from Plack::Builder and also people running `wget -r' mirrors. --- lib/PublicInbox/Hval.pm | 12 +++- 1

[PATCH 04/14] solver: schedule cleanup after synchronous git->check

2023-11-28 Thread Eric Wong
We don't want hundreds of git cat-file processes for coderepos lingering around. --- lib/PublicInbox/Git.pm | 7 ++- lib/PublicInbox/SolverGit.pm | 3 +++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index

[PATCH 14/14] www: start working on a repo listing

2023-11-28 Thread Eric Wong
The HTML is still extremely rough, but links seem to be mostly working... --- MANIFEST | 1 + lib/PublicInbox/CodeSearch.pm | 8 +++ lib/PublicInbox/RepoList.pm| 39 ++ lib/PublicInbox/WwwCoderepo.pm | 3 +++

[PATCH 01/14] test_common: create_*: detect changes all parameters

2023-11-28 Thread Eric Wong
Data::Dumper+B::Deparse seems fast enough to generate cache keys with, so this makes updating and developing tests easier (as opposed to forcing the developer to change the identifier). The main downside is we'll have to deal with cache expiration, but "make clean" seems overly aggressive already

[PATCH 10/14] cindex: require `-g GIT_DIR' or `-r PROJECT_ROOT'

2023-11-28 Thread Eric Wong
Accepting @ARGV without switches ends up being ambiguous with optional parameters for --join and --show. Requiring users to specify `--join=' or `--show=' is a bit awkward (as it is with -clone --objstore= and the like, but that is historical baggage we need to carry at this point...) ---

[PATCH 00/14] IT'S ALIVE! www loads cindex join data

2023-11-28 Thread Eric Wong
8/14 is the killer one which actually makes the cindex data useful for WWW and powering solver. Keep in mind, I've had to cap solver at 3 coderepos as a temporary measure since there's a lot of "weak" joins we should be weeding out. More documentation coming, but cindex joins are very much a

[PATCH 06/14] xap_helper: implement mset endpoint for WWW, IMAP, etc...

2023-11-28 Thread Eric Wong
The C++ version will allow us to take full advantage of Xapian's APIs for better queries, and the Perl bindings version can still be advantageous in the future since we'll be able to support timeouts effectively. --- MANIFEST| 1 + Makefile.PL | 8

[PATCH 11/14] git: speed up Git->new by 5% or so

2023-11-28 Thread Eric Wong
This becomes noticeable when loading lots of coderepos on my local mirror of git.kernel.org now that we can load repos from cindex. --- lib/PublicInbox/Git.pm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm index

[PATCH 08/14] www: load and use cindex join data

2023-11-28 Thread Eric Wong
This is a major step in solving the problem of having to manually associate hundreds/thousands of coderepos with hundreds/thousands of public-inboxes to power solver (and more). --- lib/PublicInbox/CodeSearch.pm| 153 +-- lib/PublicInbox/CodeSearchIdx.pm | 42

[PATCH 03/14] codesearch: eliminate redundant substitutions

2023-11-28 Thread Eric Wong
We store the full path name and xap_terms already removes the `P' character, so the loop and substr calls are a no-op replacing `/' with `/'. --- lib/PublicInbox/CodeSearch.pm | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/PublicInbox/CodeSearch.pm b/lib/PublicInbox/CodeSearch.pm index

[PATCH 13/14] cindex: extra quit checks

2023-11-28 Thread Eric Wong
We don't want to be accessing uninitialized variables on process teardown since much of our control flow revolves around DESTROY for dependency handling. --- lib/PublicInbox/CodeSearchIdx.pm | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/PublicInbox/CodeSearchIdx.pm