On Thu, Mar 21, 2024 at 06:01:12PM +0200, Laurențiu Nicola wrote: > Package: apt > Version: 2.7.12 > > I noticed that searching for packages is very slow if the package lists are > compressed. To reproduce, remove `/var/lib/apt/lists`, enable > > Acquire::GzipIndexes "true"; Acquire::CompressionTypes::Order:: "gz"; > > , run `apt update`. This enables LZ4 compression on my systems, but I don't > think the exact method matters. You can then run `apt search librust`, which > takes about 19 seconds in a Debian 12 container (docker.io/debian:12 has > compression already set up), compared to 0.4 seconds without compression. > > Also tested on Ubuntu 22.04 and 24.04, so the exact APT version shouldn't > matter too much. > > I tried to look into it, and `strace -e trace=openat apt-cache search > librust` shows it reopen and re-read one of the package lists: > > openat(AT_FDCWD, > "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4", > O_RDONLY) = 16 > librust-addr2line+default-dev - Cross-platform symbolication library - > feature "default" > openat(AT_FDCWD, > "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4", > O_RDONLY) = 16 > librust-addr2line+object-dev - Cross-platform symbolication library - feature > "object" > openat(AT_FDCWD, > "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4", > O_RDONLY) = 16 > librust-addr2line+rustc-demangle-dev - Cross-platform symbolication library - > feature "rustc-demangle" > openat(AT_FDCWD, > "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_jammy_universe_binary-amd64_Packages.lz4", > O_RDONLY) = 16 > librust-addr2line+std-dev - Cross-platform symbolication library - feature > "std" > > (you can use -e trace=openat,read to confirm that it's actually reading the > file) > > I believe it's quadratic in the number of search results, and this is related > to the pseudo-indexing mechanism used by APT (see `pkgRecords::Lookup` in > apt-pkg). Each lookup call will have to decompress the file in order to seek > to the destination. > > Unfortunately, I suspect this isn't exactly an easy fix, given the current > design. >
Going to respond to this but also including responses to your followup email which has a broken Subject: Searching works by ordering the packages based on file, offset and then iterating over them and looking them up. Seeking forward to a higher offset does not involve a reopen, we just skip content in betwene. Full-text search is inside the description in the section parsed for each package. It's not clear why this fails on bookworm - I can reproduce that - t certainly is fine in git main on my Ubuntu 24.04 system. -- debian developer - deb.li/jak | jak-linux.org - free software dev ubuntu core developer i speak de, en