This brings us closer to the behavior of mairix(1) for search
by supporting n:, t:, c:, f:, tc:, tcf:, b:, and bs:
prefixes as documented in the mairix(1) manpage.
We also introduce the use of q: and nq: prefixes for quoted and
non-quoted text, respectively.
There is a schema version change i
Specifying the "d:" field only worked for
NumberValueRangeProcessor in older versions of Xapian, such
as the one in Debian wheezy (libsearch-xapian-perl=1.2.10.0-1).
This slipped through since I rarely use wheezy anymore, and
perhaps nobody else does, either. Perhaps wheezy support may be
dropped
We only document the "s:" prefix anyway. While the long name is more
descriptive, the ambiguity makes agnostic caching (by Varnish or
similar) slightly harder and longer URLs are more likely to be
accidentally truncated when shared.
---
lib/PublicInbox/Search.pm | 1 -
t/search.t| 14 +
It's not worth entering a complex codepath in Email::MIME to
save a (probably immeasurable) amount of memory here. We've
already stopped doing this in our WWW code a while back, too.
If we really cared enough about it, we'd prioritize work on a
streaming replacement for Email::MIME.
---
lib/P
"bs:" and "b:" are adapted from mairix(1).
We will also support searching explicitly for quoted vs
non-quoted text via "q:" and "nq:" prefixes since sometimes
readers will not care for quoted text.
In the future, we will support parsing diffs (perhaps when
repobrowse integration is complete).
Not
And while we're at it, ensure searching inside displayable
attachment bodies works.
---
lib/PublicInbox/Search.pm| 3 ++-
lib/PublicInbox/SearchIdx.pm | 4
t/search.t | 44
3 files changed, 50 insertions(+), 1 deletion(-)
d
The basic rule is that if it is displayable via our WWW
interface, it should be indexable text for Xapian search.
---
lib/PublicInbox/SearchIdx.pm | 21 -
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm
As of Xapian 1.0.4 (from 2007), it is possible to use
Search::Xapian::QueryParser::add_prefix multiple times with the
same user field name but different term prefixes.
This brings my current git@vger mirror from 6.5GB to 2.1GB
(both sizes are after xapian-compact).
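
As a rough illustration of the idea (this is not the Search::Xapian API itself), one user-facing field name can fan out to several term prefixes, with the resulting terms OR-ed together at query time. The prefix strings, the mapping, and the function names below are all made up for the example; a minimal Python sketch:

```python
# Hypothetical mapping from one user-facing field name to several term
# prefixes. The prefix strings are illustrative assumptions only; they are
# not taken from public-inbox.
FIELD_PREFIXES = {
    "b": ["XNQ", "XQUOT"],  # assumed: one field covering two kinds of body text
    "s": ["S"],             # assumed: a field with a single prefix
}

def expand_field(field, word):
    """Return the prefixed terms that a query like `field:word` would cover."""
    return [prefix + word.lower() for prefix in FIELD_PREFIXES[field]]

def as_or_query(field, word):
    """Render the expansion as a readable OR expression."""
    return " OR ".join(expand_field(field, word))
```

With this toy mapping, a single "b:" query covers both kinds of terms without the index having to store the same text a second time under a combined prefix.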
---
lib/PublicInbox/Search.pm|
We pay a storage cost for storing positional information
in Xapian, so make good use of it by attempting to preserve
it for (hopefully) better search results.
---
lib/PublicInbox/SearchIdx.pm | 23 +++
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/lib/PublicInbox
This is stricter than the mutt quote_regexp default
("^([ \t]*[|>:}#])+" on Debian jessie),
but matches what we have in View.pm.
I prefer the stricter quote detection since it is less ambiguous
and less likely to hide/obscure important details.
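
To make the ambiguity concrete, the sketch below compares the mutt default quoted above with a stricter rule. The strict pattern here (a line must begin with ">") is an assumption for illustration; the exact pattern used in View.pm is not shown in this message. Python is used only as a convenient way to try the patterns:

```python
import re

# mutt's default quote_regexp on Debian jessie, as quoted above.
MUTT_QUOTE = re.compile(r"^([ \t]*[|>:}#])+")

# Assumed stricter rule for illustration: only a leading ">" marks a quote.
STRICT_QUOTE = re.compile(r"^>")

def is_quoted(line, pattern):
    """Return True if `pattern` classifies `line` as quoted text."""
    return pattern.match(line) is not None

samples = [
    "> this is a reply quote",
    "# a shell comment pasted into the message",
    "  : an indented line starting with a colon",
]
for line in samples:
    print(is_quoted(line, MUTT_QUOTE), is_quoted(line, STRICT_QUOTE), line)
```

The looser pattern treats the shell comment and the colon line as quotes, which is the kind of misclassification a stricter rule avoids.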
---
lib/PublicInbox/SearchIdx.pm | 2 +-
1 file chan
Sometimes it can be useful to search based on who the
message was sent to, sent by, or Cc:-ed. Of course,
headers can be faked, but they usually are not...
Anyway, this mostly matches the behavior of mairix(1).
---
lib/PublicInbox/Search.pm| 10 +++-
lib/PublicInbox/SearchIdx.pm | 59 +++
We need to prevent excessive repository growth for
public-inbox-watch and public-inbox-mda users.
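
The general shape of this, sketched in Python rather than the Perl used in Import.pm: a small helper that runs a command and fails loudly on a non-zero exit, then a call to "git gc --auto" once importing is done. The function names and the git_dir parameter are assumptions for the example, not the actual code in this series:

```python
import subprocess
import sys

def run_die(cmd):
    """Run cmd and raise if it exits non-zero (a stand-in for Perl's die)."""
    status = subprocess.call(cmd)
    if status != 0:
        raise RuntimeError("%r failed: exit status %d" % (cmd, status))

def gc_auto(git_dir):
    """Let git decide, by its own heuristics, whether a repack is needed."""
    run_die(["git", "--git-dir=" + git_dir, "gc", "--auto"])

if __name__ == "__main__":
    # Harmless demonstration: a command that succeeds.
    run_die([sys.executable, "-c", "pass"])
```

Since "git gc --auto" does nothing unless git's thresholds are exceeded, calling it after every import should be cheap in the common case.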
---
lib/PublicInbox/Import.pm | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 083fb1b..611f7b1 100644
--- a/lib/PublicInbox/Import.pm
++
We will be reusing this in the next commit, too.
---
lib/PublicInbox/Import.pm | 21 ++---
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index 09dd38d..083fb1b 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/P
This change is way overdue :x Better late than never, I guess.
Eric Wong (2):
import: hoist out common run_die subroutine
import: run "git gc --auto" when done
lib/PublicInbox/Import.pm | 22 +++---
1 file changed, 11 insertions(+), 11 deletions(-)
For now, we will document this since it allows better
performance without the burden of extensions. Perhaps one day
far in the future Perl can natively support vfork(2) AND that
version of Perl will be widely available, but I suspect that day
is at least a decade away, if not two:
https:/
This reduces duplication, slightly. We may be using it
yet again in a to-be-introduced function (or we may not
introduce it).
---
lib/PublicInbox/Import.pm | 37 ++---
1 file changed, 18 insertions(+), 19 deletions(-)
diff --git a/lib/PublicInbox/Import.pm b/lib/P
Email::MIME internally assumes "text/plain" for messages
missing a Content-Type, but does not expose that in the
Email::MIME::content_type API method. We must assume it
ourselves to avoid uninitialized value warnings for the
rare (nowadays) MUAs which do not set it.
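
The same pitfall can be shown with Python's standard email package, used here only as an analogue (it is not Email::MIME, and the helper name is hypothetical): the raw header lookup yields nothing for such a message, so the caller has to supply the "text/plain" default before using the value.

```python
from email import message_from_string

# A message from an MUA that did not set Content-Type.
raw = "From: a@example.com\nSubject: no content type\n\nhello\n"
msg = message_from_string(raw)

header = msg["Content-Type"]        # None: the header is absent
effective = msg.get_content_type()  # the library's built-in default

def content_type_or_default(header_value, default="text/plain"):
    """Fall back to text/plain when no Content-Type header was present."""
    return header_value if header_value is not None else default

print(header, effective, content_type_or_default(header))
```

Applying the default at the point where the header is read keeps later string operations from ever seeing an undefined value.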
---
lib/PublicInbox/View.pm |
17 matches