On Thu, Mar 21, 2024 at 06:01:12PM +0200, Laurențiu Nicola wrote:
> Package: apt
> Version: 2.7.12
> 
> I noticed that searching for packages is very slow if the package lists are 
> compressed. To reproduce, remove `/var/lib/apt/lists`, enable
> 
>     Acquire::GzipIndexes "true"; Acquire::CompressionTypes::Order:: "gz";
> 
> , run `apt update`. This enables LZ4 compression on my systems, but I don't 
> think the exact method matters. You can then run `apt search librust`, which 
> takes about 19 seconds in a Debian 12 container (docker.io/debian:12 has 
> compression already set up), compared to 0.4 seconds without compression.
> 
> Also tested on Ubuntu 22.04 and 24.04, so the exact APT version shouldn't 
> matter too much.
> 
> I tried to look into it, and `strace -e trace=openat apt-cache search 
> librust` shows it reopen and re-read one of the package lists:
> 
> openat(AT_FDCWD, 
> "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4",
>  O_RDONLY) = 16
> librust-addr2line+default-dev - Cross-platform symbolication library - 
> feature "default"
> openat(AT_FDCWD, 
> "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4",
>  O_RDONLY) = 16
> librust-addr2line+object-dev - Cross-platform symbolication library - feature 
> "object"
> openat(AT_FDCWD, 
> "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4",
>  O_RDONLY) = 16
> librust-addr2line+rustc-demangle-dev - Cross-platform symbolication library - 
> feature "rustc-demangle"
> openat(AT_FDCWD, 
> "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4",
>  O_RDONLY) = 16
> librust-addr2line+std-dev - Cross-platform symbolication library - feature 
> "std"
> 
> (you can use -e trace=openat,read to confirm that it's actually reading the 
> file)
> 
> I believe it's quadratic in the number of search results, and this is related 
> to the pseudo-indexing mechanism used by APT (see `pkgRecords::Lookup` in 
> apt-pkg). Each lookup call will have to decompress the file in order to seek 
> to the destination.
> 
> Unfortunately, I suspect this isn't exactly an easy fix, given the current 
> design.
> 

Going to respond to this but also including responses to your followup email
which has a broken Subject:


Searching works by ordering the packages by (file, offset), then
iterating over them and looking each one up. Seeking forward to a
higher offset does not involve a reopen; we just skip the content
in between.
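
To illustrate why the lookup order matters, here is a toy sketch in
Python. It is not APT's code: `ForwardOnlyReader`, `read_at` and
`count_opens` are made-up names, and gzip stands in for whatever
compressor is configured. It only models the idea that a forward seek
skips data while a backward seek has to reopen and decompress from the
start.

```python
import gzip
import io


class ForwardOnlyReader:
    """Toy model of a compressed index that can only be read front to back."""

    def __init__(self, compressed: bytes):
        self._compressed = compressed
        self.opens = 0  # how many times decompression (re)started
        self._open()

    def _open(self):
        # Start decompressing from the beginning of the stream again.
        self._stream = gzip.GzipFile(fileobj=io.BytesIO(self._compressed))
        self._pos = 0
        self.opens += 1

    def read_at(self, offset: int, length: int) -> bytes:
        if offset < self._pos:
            # Backward seek: the only option is to reopen and start over.
            self._open()
        # Forward seek: decompress and discard the bytes in between.
        self._stream.read(offset - self._pos)
        data = self._stream.read(length)
        self._pos = offset + len(data)
        return data


# Build a fake package list and remember each record's (offset, length).
records = [f"Package: pkg{i}\n".encode() for i in range(50)]
offsets = []
pos = 0
for record in records:
    offsets.append((pos, len(record)))
    pos += len(record)
blob = gzip.compress(b"".join(records))


def count_opens(order):
    """Look up every record in the given order and report the open count."""
    reader = ForwardOnlyReader(blob)
    for offset, length in order:
        reader.read_at(offset, length)
    return reader.opens


sorted_opens = count_opens(sorted(offsets))
reversed_opens = count_opens(sorted(offsets, reverse=True))
print(sorted_opens, reversed_opens)
```

With the lookups sorted by offset the stream is opened once; in the
worst-case (reversed) order every lookup after the first forces a
reopen, which is the pattern the strace output above looks like.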

The full-text search runs over the description in the section parsed
for each package.

It's not clear why this fails on bookworm - I can reproduce that -
it certainly is fine in git main on my Ubuntu 24.04 system.


-- 
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer                              i speak de, en
