Re: File search

2022-12-20 Thread Antoine R. Dumont (@ardumont)
Hello Guix, Thanks for the feedback! Note: @civodul, assuming you are subscribed to the ml, I currently kept you as a `To:` recipient but I can drop you from it, right? I'm "also" subscribed to the ml so you may drop me from the `To:` too (if i'm not mistaken). Ludovic Courtès writes: > Hi Ant

Re: File search

2022-12-19 Thread zimoun
Hi Ludo, On Mon, 19 Dec 2022 at 22:25, Ludovic Courtès wrote: > I think at this point we could consider integration in Guix proper, > under ‘guix/scripts’. For that we could dismiss commit history. > > That’ll entail extra work (d’oh!) such as fine-tuning, writing tests, > and writing a section

Re: File search

2022-12-19 Thread Ludovic Courtès
Hi Antoine! "Antoine R. Dumont (@ardumont)" skribis: > Here is the rough changelog: > > - The local db cache is now versioned. Migration will transparently > happen for users at each index command calls (if need be). Perfect! > - The cli parsing got rewritten to be more flexible (inspired fr
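A versioned local database with transparent migration, as described in the changelog above, could look roughly like the sketch below. It is written in Python with the standard `sqlite3` module purely for illustration (the actual code in this thread is Guile), and it assumes the version is stored in SQLite's `user_version` pragma; the `files` table, its columns, the `MIGRATIONS` list and `migrate` are hypothetical names, not the real schema.

```python
import sqlite3

# Hypothetical migration steps; step N brings the schema to version N + 1.
MIGRATIONS = [
    "CREATE TABLE files (package TEXT, path TEXT)",
    "ALTER TABLE files ADD COLUMN output TEXT DEFAULT 'out'",
]

def migrate(db):
    """Apply any pending migrations and return the resulting schema version."""
    (version,) = db.execute("PRAGMA user_version").fetchone()
    for target in range(version, len(MIGRATIONS)):
        db.execute(MIGRATIONS[target])
        # PRAGMA values cannot be bound as parameters; target is an int.
        db.execute(f"PRAGMA user_version = {target + 1}")
    db.commit()
    return db.execute("PRAGMA user_version").fetchone()[0]

db = sqlite3.connect(":memory:")
print(migrate(db))  # a fresh database runs every step
print(migrate(db))  # a second call finds nothing left to do
```

Running `migrate` at the start of every indexing command is what would make the upgrade invisible to users: an up-to-date database passes through without any change.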

Re: File search

2022-12-15 Thread Antoine R. Dumont (@ardumont)
anifests. This is the fastest implementation but it indexes fewer packages. That would typically be the use case for user-local indexing. - out of the local store. This is slower due to implementation details (it talks to the store daemon, for one). That would typically be the use case for b

Re: File search

2022-12-11 Thread Ludovic Courtès
Hi! "Antoine R. Dumont (@ardumont)" skribis:
> |-----------+-------------+----------+----------|
> | Iteration | Host System | Time (s) | Packages |
> |-----------+-------------+----------+----------|
> | 1st       | Debian      | 121.88   | 284      |
> |           | Guix System | 413.55   |

Re: File search

2022-12-09 Thread zimoun
Hi Antoine, Cool! I have not really looked yet. Just a minor answer to one of your questions. :-) On Fri, 09 Dec 2022 at 11:05, "Antoine R. Dumont (@ardumont)" wrote: >> It should instead show “git@2.38.1:send-email”. We probably need an >> ‘output’ field in the ‘Packages’ table. > > Why must t

Re: File search

2022-12-09 Thread Antoine R. Dumont (@ardumont)
Hello, > So we went from 413s to 11s (on the Guix System node) for only 6% fewer > files in the latter case? Do I get that right? That’s pretty cool. Not 6% of loss, a bit more, around half is only detected between the first and second round. Here is the summary [1] (org-mode) table I should ha

Re: File search

2022-12-08 Thread Ludovic Courtès
lient > when [built a package derivation] > [add files to local database] > > when [user ran guix find] > if [send find request to substitute server] didn't work > [search through local database] > [display result] Upthread I started the discussion of criter

Re: File search

2022-12-06 Thread (
On Tue Dec 6, 2022 at 10:01 AM GMT, Ludovic Courtès wrote: > The implementation based on manifests can of course miss packages, so > it’s a tradeoff. Purely local indexing will only find packages you > already have anyway, so eventually we’ll need a second mode that would > download a database. S

Re: File search

2022-12-06 Thread zimoun
Hi, On Tue, 06 Dec 2022 at 11:01, Ludovic Courtès wrote: > "Antoine R. Dumont (@ardumont)" > skribis: > >> Please, find enclosed the latest implementation as a patch (somewhat vcs >> code ;). I've edited commits to mark Ludo as author with his >> started/amended implementations first [0] (that s

Re: File search

2022-12-06 Thread Ludovic Courtès
Howdy! "Antoine R. Dumont (@ardumont)" skribis: > Please, find enclosed the latest implementation as a patch (somewhat vcs > code ;). I've edited commits to mark Ludo as author with his > started/amended implementations first [0] (that should be in the patch). Nice! > For information, I extrac

Re: File search

2022-12-04 Thread Antoine R. Dumont (@ardumont)
(cons (package-match package version > (string-append directory "/" file)) > lst >'() lookup-stmt)) > > > > > (define (index-packages-with-db db-pathname) > "

Re: File search

2022-12-03 Thread Ludovic Courtès
g-results matches) "Print the MATCHES matching results." (for-each (lambda (result) (format #t "~20a ~a~%" (string-append (package-match-name result) "@" (package-match-version result)) (p

Re: File search

2022-12-02 Thread Antoine R. Dumont (@ardumont)
arching package for a given filename Report bugs to: bug-g...@gnu.org. $ guix help index # or: guix index [--help|-h] Usage: guix index [OPTIONS...] [search FILE...] Without FILE, index (package, file) relationships in the local store. With 'search FILE', search for packages installing

Re: File search

2022-12-02 Thread antoine . romain . dumont
Hello Guix! Guix is top so thanks for the awesome work! Just to give some feedback on this thread. It's good news that the file search functionality is on the radar. > Lately I found myself going several times to > <https://packages.debian.org> to look for packages providing a

Re: File search

2022-02-06 Thread André A . Gomes
Ludovic Courtès writes: > Hello Guix! > > Lately I found myself going several times to > <https://packages.debian.org> to look for packages providing a given > file and I thought it’s time to do something about it. My understanding is very limited but I thought that the following blog post could

Re: File search

2022-02-05 Thread Ludovic Courtès
Hi, Ryan Prior skribis: > On Friday, January 21st, 2022 at 9:03 AM, Ludovic Courtès > wrote: > >> The database for 18K packages is quite big: >> >> --8<---cut here---start->8--- >> >> $ du -h /tmp/db* >> >> 389M /tmp/db >> >> 82M /tmp/db.gz >> >> 61M /tmp/db

Re: File search

2022-02-05 Thread Ludovic Courtès
Hi, Maxim Cournoyer skribis: > It used to be broken, but with the c-u-f merge the 'tlmgr' tool now > works as expected to search for things in the local texlive.tlpdb > database: > > $ guix shell --pure texlive-bin grep which coreutils sed gnupg -- tlmgr info > cite.sty > tlmgr: cannot find pac

Re: File search

2022-02-02 Thread Maxim Cournoyer
Hi, Ricardo Wurmus writes: > Ludovic Courtès writes: > >> Ricardo Wurmus skribis: >> >>> raingloom writes: >>> One use case that I hope can be addressed is TeXlive packages. Trying to figure out which package corresponded to which missing file was a nightmare the last I had to

Re: File search

2022-01-25 Thread Ryan Prior
On Friday, January 21st, 2022 at 9:03 AM, Ludovic Courtès wrote: > The database for 18K packages is quite big: > > --8<---cut here---start->8--- > > $ du -h /tmp/db* > > 389M /tmp/db > > 82M /tmp/db.gz > > 61M /tmp/db.zst > > --8<---cut here

Re: File search

2022-01-25 Thread Oliver Propst
On 2022-01-25 12:20, Oliver Propst wrote: On 2022-01-25 12:15, Ludovic Courtès wrote: I'm also not an expert at SQLite but I can state that the effort looks very nice and promising Ludovic :) And definitely a step-up from the current implementation (obviously).. -- Kind regards Oliver Props

Re: File search

2022-01-25 Thread Oliver Propst
On 2022-01-25 12:15, Ludovic Courtès wrote: I'm also not an expert at SQLite but I can state that the effort looks very nice and promising Ludovic :) -- Kind regards Oliver Propst https://twitter.com/Opropst

Re: File search

2022-01-25 Thread Ludovic Courtès
Maxim Cournoyer skribis: > I also had the idea of making it a package... this way only the people > who opt to install the database locally would incur the cost (in > bandwidth). > > Perhaps a question for Vagrant: talking about size, is this SQLite > database file comparable or smaller in size t

Re: File search

2022-01-24 Thread Ricardo Wurmus
Ludovic Courtès writes: > Ricardo Wurmus skribis: > >> raingloom writes: >> >>> One use case that I hope can be addressed is TeXlive packages. Trying >>> to figure out which package corresponded to which missing file was a >>> nightmare the last I had to use LaTeX. >> >> The texlive package d

Re: File search

2022-01-24 Thread Ludovic Courtès
Ricardo Wurmus skribis: > raingloom writes: > >> One use case that I hope can be addressed is TeXlive packages. Trying >> to figure out which package corresponded to which missing file was a >> nightmare the last I had to use LaTeX. > > The texlive package database is the authoritative source of

Re: File search

2022-01-21 Thread Ricardo Wurmus
raingloom writes: > One use case that I hope can be addressed is TeXlive packages. Trying > to figure out which package corresponded to which missing file was a > nightmare the last I had to use LaTeX. The texlive package database is the authoritative source of information. The file texlive.tl

Re: File search

2022-01-21 Thread raingloom
(where the shell tells you what > package to install when a command is not found). > > Based on that, it is tempting to just distribute a full database from > ci.guix, say, that the client command would regularly fetch. The > downside is that that’s quite a lot of data to download; if yo

Re: File search

2022-01-21 Thread Maxim Cournoyer
Hi Ludovic, Thank you for this valuable initiative :-). I like that it sits in few lines and should already be useful for local searches with a minimal front command to query it. Ludovic Courtès writes: > Hi! > > Vagrant Cascadian skribis: > >> What about ... a roughly weekly job that runs on

Re: File search

2022-01-21 Thread Ludovic Courtès
Hi! Vagrant Cascadian skribis: > What about ... a roughly weekly job that runs on ci.guix. to create the > database and packages of parts of the database and a channel that > includes those and utilities to query them so that you can install the > packages and refresh them at your leisure... > >

Re: File search

2022-01-21 Thread Ludovic Courtès
Hi! Mathieu Othacehe skribis: >> I think accuracy (making sure you get results that correspond precisely >> to, say, your current channel revisions and your current system) is not >> a high priority: some result is better than no result. Likewise for >> freshness: results for an older version o

Re: File search

2022-01-21 Thread Vagrant Cascadian
that the client command would regularly fetch. The > downside is that that’s quite a lot of data to download; if you use the > file search command infrequently, you might find yourself spending more > time downloading the database than actually searching it. > > We could have a hybrid

Re: File search

2022-01-21 Thread Mathieu Othacehe
Hello Ludo! > Lately I found myself going several times to > <https://packages.debian.org> to look for packages providing a given > file and I thought it’s time to do something about it. Yeah, I'm also thinking regularly about it but giving up because setting up this mechanism properly turns ou

File search

2022-01-21 Thread Ludovic Courtès
ad; if you use the file search command infrequently, you might find yourself spending more time downloading the database than actually searching it. We could have a hybrid solution: distribute a database that contains only files in /bin and /sbin (it should be much smaller), and for everything els

Re: File search progress: database review and question on triggers

2020-10-21 Thread Pierre Neidhardt
Ludovic Courtès writes: > A client-side approach (not involving guix-daemon) would be more readily > usable, though some of the questions above remain open. I'd also prefer to stick to the client side. But how can I trigger an event when a package gets built? Maybe we could hook into specific

Re: File search progress: database review and question on triggers

2020-10-21 Thread Ludovic Courtès
Pierre Neidhardt skribis: > Ludovic Courtès writes: > >> It first tries ‘query-path-info’, which succeeds if the store item is >> available and contains info about its size, references, and so on. >> >> When ‘query-path-info’ fails, it falls back to >> ‘query-substitutable-path-info’, which allo

Re: File search progress: database review and question on triggers

2020-10-17 Thread Pierre Neidhardt
Pierre Neidhardt writes: > So we could do the same with `guix filesearch`: > > - First try the entry in the database. > > - If not there, try query-path-info and if it succeeds, populate the > database. > > - If query-path-info does not succeed, try our new > query-substitutable-filesearch-in

Re: File search progress: database review and question on triggers

2020-10-16 Thread Ludovic Courtès
Pierre Neidhardt skribis: > Ludovic Courtès writes: > >>> Question: How do I hook onto =guix build=? >> >> You would need a build-completion hook in the daemon, which doesn’t >> exist (yet!). Note also that at this level we only see derivations, not >> packages. > > Hmm... Can you explain me ho

Re: File search progress: database review and question on triggers

2020-10-14 Thread Pierre Neidhardt
Ludovic Courtès writes: >> Question: How do I hook onto =guix build=? > > You would need a build-completion hook in the daemon, which doesn’t > exist (yet!). Note also that at this level we only see derivations, not > packages. Hmm... Can you explain me how =guix size= works with local builds?

Re: File search progress: database review and question on triggers

2020-10-13 Thread Ludovic Courtès
Pierre Neidhardt skribis: >> “Something” needs to build the file-to-package database (which is what >> you’re working on), and then there needs to be a way for users to fetch >> that database. This is all orthogonal to substitutes, as I see it, >> which is why I think we need to think about inte

Re: File search progress: database review and question on triggers

2020-10-13 Thread Ludovic Courtès
Pierre Neidhardt skribis: > Ludovic Courtès writes: > >> I would lean towards keeping it separate, so that it’s an optional >> feature (given that it relies on downloading an external database). > > I was leaning towards downloading the database with "guix pull", so that > the "filesearch" subco

Re: File search progress: database review and question on triggers

2020-10-13 Thread Ludovic Courtès
Hi Pierre, Pierre Neidhardt skribis: > Ludovic Courtès writes: [...] >> It would be nice to see whether/how this could be integrated with >> third-party channels. Of course it’s not a priority, but while >> designing this feature, we should keep in mind that we might want >> third-party chan

Re: File search progress: database review and question on triggers

2020-10-12 Thread zimoun
synopsis and descriptions. Maybe we should include all fields >> that are searched by `guix search`. This incurs a cost on the >> database size but it would fix the `guix search` speed issue. Size >> increases by some 10 MiB. > > Oh so this is going beyond file

Re: File search progress: database review and question on triggers

2020-10-12 Thread Ludovic Courtès
Hi, Pierre Neidhardt skribis: > Of course, `guix filesearch` hasn't been implemented yet ;) > > We still need to decide whether we want to make it part of `guix search' > or define a separate command. I would lean towards keeping it separate, so that it’s an optional feature (given that it reli

Re: File search progress: database review and question on triggers

2020-10-12 Thread Ludovic Courtès
search`. This incurs a cost on the > database size but it would fix the `guix search` speed issue. Size > increases by some 10 MiB. Oh so this is going beyond file search, right? Perhaps it would make sense to focus on file search only as a first step, and see what can be done with synop

Re: File search progress: database review and question on triggers

2020-10-11 Thread zimoun
On Sun, 11 Oct 2020 at 16:25, Pierre Neidhardt wrote: > Maybe you misunderstood a point: the filesearch database is not a > database of _all store items_, but only of the items that correspond to > the packages of a given Guix generation. Yes, it is clear for me. I meant: “all the store items o

Re: File search progress: database review and question on triggers

2020-10-11 Thread zimoun
Hi Pierre, I am trying to resume the work on "guix search" to improve it (faster). That's why I am asking the details. :-) Because with the introduction of this database, as mentioned earlier, 2 annoyances could be fixed at once. On Sun, 11 Oct 2020 at 13:19, Pierre Neidhardt wrote: > > --8<-

Re: File search progress: database review and question on triggers

2020-10-11 Thread Pierre Neidhardt
Hi Zimoun, Thanks for the feedback! > --8<---cut here---start->8--- > echo 3 > /proc/sys/vm/drop_caches > time updatedb --output=/tmp/store.db --database-root=/gnu/store/ > > real 0m19.903s > user 0m1.549s > sys 0m4.500s I don't know the size of your

Re: File search progress: database review and question on triggers

2020-10-10 Thread zimoun
Hi, On Mon, 05 Oct 2020 at 20:53, Pierre Neidhardt wrote: > - Textual database: slow and not lighter than SQLite. Not worth it I believe. Maybe I am out-of-scope, but re-reading *all* the discussion about “filesearch”, is it possible to really do better than “locate”? As Ricardo mentioned. --8

Re: File search progress: database review and question on triggers

2020-10-10 Thread zimoun
Hi, On Sat, 10 Oct 2020 at 10:57, Pierre Neidhardt wrote: > Of course, `guix filesearch` hasn't been implemented yet ;) Sorry, I have overlooked the status. :-) > We still need to decide whether we want to make it part of `guix search' > or define a separate command. From my point of view, i

Re: File search progress: database review and question on triggers

2020-10-10 Thread Pierre Neidhardt
Of course, `guix filesearch` hasn't been implemented yet ;) We still need to decide whether we want to make it part of `guix search' or define a separate command. Thoughts? -- Pierre Neidhardt https://ambrevar.xyz/

Re: File search progress: database review and question on triggers

2020-10-09 Thread zimoun
Hi Pierre, On Mon, 5 Oct 2020 at 20:53, Pierre Neidhardt wrote: > Comments and help welcome! :) I have just checked out and I am probably failing but "./pre-inst-env guix filesearch whatever" raises an error with the backtrace: --8<---cut here---start->8---

Re: File search progress: database review and question on triggers

2020-10-05 Thread Pierre Neidhardt
Hi Ludo! Ludovic Courtès writes: > Nice! Thanks! > Could you post a summary of what you have done, what’s left to do, and > how you’d like to integrate it? (If you’ve already done it, my > apologies, but you can resend a link. :-)) What I've done: mostly a database benchmark. - Textual dat

Re: File search progress: database review and question on triggers

2020-10-05 Thread Ludovic Courtès
Hi, Pierre Neidhardt skribis: > I've pushed a commit which adds the synopsis and the description to the > database. > > 0127cfa5d089857a716bf7b0a167f31cc6dd > > The quite surprising result is that these new details only cost 1 MiB extra! > > I can then search packages and it's super fast: >

Re: File search progress: database review and question on triggers

2020-09-06 Thread Arun Isaac
> There is a subtle difference: in the latter, (search-file-package) is > allowed and won't trigger a compile time error, while the former does. > This "foo . more-foo" paradigm is a way to say "1 or more arguments", > instead of "0 or more". Ah, ok, that makes sense! Regards, Arun

Re: File search progress: database review and question on triggers

2020-09-06 Thread Arun Isaac
> Oops! Forgot to push. > > It's actually commit 25147f983bdf432b03e8271abe0318f4812f94ba on > wip-filesearch. I checked and it works now. Just a tiny nitpick: The function signature of search-file-package, --8<---cut here---start->8--- (define (search-file-

Re: File search progress: database review and question on triggers

2020-09-04 Thread Arun Isaac
Hi Pierre! Sorry for the very late response. > I've fixed it in > e08f913d20428a9a925cc46d177c7446f55e6443. The downside is that we can't > use any special character like boolean operators. I'm not sure how we > can get the best of both worlds. Maybe add a command line flag that > would enab

Re: File search progress: database review and question on triggers

2020-08-27 Thread Pierre Neidhardt
zimoun writes: > I am not sure to see how. One needs all the database to search inside > and cannot know in advance which packages etc.. Contrary to “guix size” > which manipulates the graph and then download the missing parts. > > Therefore, your suggestion is to download all the database the

Re: File search progress: database review and question on triggers

2020-08-27 Thread zimoun
Hi Pierre, On Thu, 27 Aug 2020 at 13:15, Pierre Neidhardt wrote: > zimoun writes: > >> If you are going to an local SQL database, my two questions are: >> >> a) >> Which part would update it? “guix pull”? Other? Even using >> substitutes, the channels and co could lead to an extra cost and s

Re: File search progress: database review and question on triggers

2020-08-27 Thread Pierre Neidhardt
Hi Simon! zimoun writes: > If you are going to an local SQL database, my two questions are: > > a) > Which part would update it? “guix pull”? Other? Even using > substitutes, the channels and co could lead to an extra cost and so what > is acceptable and what is not? I suggest fetching datab

Re: File search progress: database review and question on triggers

2020-08-27 Thread zimoun
Hi Pierre, Cool to revive the topic. :-) (disclaimer: I have not yet processed the whole thread and all the emails) On Mon, 10 Aug 2020 at 16:32, Pierre Neidhardt wrote: > 1. An SQLite database with the following schema: If you are going for a local SQL database, my two questions are: a) Which part would

Re: File search progress: database review and question on triggers

2020-08-24 Thread Pierre Neidhardt
>> - You should use SQL prepared statements with sqlite-prepare, >> sqlite-bind, etc. That would correctly handle escaping special >> characters in the search string. Currently, searching for >> "transmission-gtk", "libm.so", etc. errors out. > > Thanks for pointing this out, I'll look into i
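The advice about prepared statements can be illustrated with a small sketch. It uses Python's standard `sqlite3` module only for illustration; the wip-filesearch code is Guile and would go through `sqlite-prepare` and `sqlite-bind` as mentioned above. The `files` table, its columns, the sample rows and `search_files` are hypothetical, not the branch's actual schema.

```python
import sqlite3

def search_files(db, name):
    """Return (package, path) rows whose path ends with "/" + NAME.

    NAME is bound as a parameter instead of being spliced into the SQL
    text, so quotes, dashes and dots cannot break the statement.
    (LIKE wildcards inside NAME are a separate concern.)
    """
    query = ("SELECT package, path FROM files "
             "WHERE path LIKE '%/' || ? ORDER BY package")
    return db.execute(query, (name,)).fetchall()

# Hypothetical schema and rows, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (package TEXT, path TEXT)")
db.executemany("INSERT INTO files VALUES (?, ?)", [
    ("transmission", "/bin/transmission-gtk"),
    ("glibc", "/lib/libm.so"),
    ("odd-package", "/share/it's a file"),
])

print(search_files(db, "transmission-gtk"))
print(search_files(db, "it's a file"))
```

Because the search string never becomes part of the SQL text, no manual escaping of the query is needed for the statement to parse.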

Re: File search progress: database review and question on triggers OFF TOPIC PRAISE

2020-08-18 Thread Joshua Branson
Thanks for working on this! This is a super awesome feature! Best of luck! -- Joshua Branson Sent from Emacs and Gnus

Re: File search progress: database review and question on triggers

2020-08-16 Thread Hartmut Goebel
Am 15.08.20 um 23:20 schrieb Bengt Richter: > If you are on debian, have you tried > dpkg -l '*your*globbed*name*here*' No, since for Debian I'm not using a command-line tool, but the Web-Interface - which allows querying even packages I have not installed. (And the latter is my specific use-ca

Re: File search progress: database review and question on triggers

2020-08-15 Thread Bengt Richter
Hi Hartmut, et al On +2020-08-15 14:47:12 +0200, Hartmut Goebel wrote: > Am 13.08.20 um 12:04 schrieb Pierre Neidhardt: > > SQLite pattern search queries are extremely fast (<0.1s) and cover all > > examples named so far: > > > > - exact basename match > > - partial path match > > - pattern match

Re: File search progress: database review and question on triggers

2020-08-15 Thread Arun Isaac
Hi Pierre, I tried the wip-filesearch branch. Nice work! :-) persist-all-local-packages takes around 350 seconds on my machine (slow machine with spinning disk) and the database is 50 MB. Some other comments follow. - Maybe, we shouldn't index hidden files, particularly all the .xxx-real files

Re: File search progress: database review and question on triggers

2020-08-15 Thread Hartmut Goebel
Am 11.08.20 um 14:35 schrieb Pierre Neidhardt: > Unlike Nix, we would like to do more than just index executable files. > Indeed, it's very useful to know where to find, say, a C header, a .so > library, a TeXlive .sty file, etc. +1 Most of the time I'm searching for non-executable files. -- Re

Re: File search progress: database review and question on triggers

2020-08-15 Thread Hartmut Goebel
Am 13.08.20 um 12:04 schrieb Pierre Neidhardt: > SQLite pattern search queries are extremely fast (<0.1s) and cover all > examples named so far: > > - exact basename match > - partial path match > - pattern match (e.g. "/include/%foo%") For comparison: These are the options Debian Package search

Re: File search progress: database review and question on triggers

2020-08-13 Thread Pierre Neidhardt
I've pushed my experiment to the `wip-filesearch' branch. As of this writing it is not automatically triggered by "guix build". To test it: - Load the module from a REPL. - Run --8<---cut here---start->8--- (test-index-git) --8<---cut here--

Re: File search progress: database review and question on triggers

2020-08-13 Thread Pierre Neidhardt
Arun Isaac writes: > But filenames usually don't have diacritics. So, I'm not sure if > diacritic insensitivity is useful. Probably not, but if there ever is this odd file name with an accent, then we won't have to worry about it, it will be handled. Better too much than too little! > This is

Re: File search progress: database review and question on triggers

2020-08-13 Thread Arun Isaac
> Yes, but full text search brings us a few niceties here: These are nice features, but I don't know if all of them are useful for file search. Normally, with Arch's pkgfile, I search for some missing header file, shared library, etc. Usually, I know the exact filename I am looking f

Re: File search progress: database review and question on triggers

2020-08-13 Thread Pierre Neidhardt
Ricardo Wurmus writes: > Pierre Neidhardt writes: > >> - Or do you think SQLite patterns (using "%") would do for now? As >> Mathieu pointed out, it's an unfortunate inconsistency with the rest of >> Guix. But maybe regexp support can be added in a second stage. > > These patterns could be

Re: File search progress: database review and question on triggers

2020-08-13 Thread Arun Isaac
e controversial. In this specific case of file search, we could use the sqlite like patterns, but not expose them to the user. For example, if the search query is "", we search for the LIKE pattern "%%". I think this addresses how users normally search for files. I don't thi
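The idea of building the LIKE pattern internally, without exposing the syntax to the user, might be sketched as follows. This is Python with the standard `sqlite3` module, for illustration only (the thread's implementation is Guile); `like_pattern`, `search`, the `files` table and the sample paths are hypothetical. The sketch additionally escapes `%` and `_` in the user's input, which is an assumption about the desired behaviour rather than something stated in the thread.

```python
import sqlite3

def like_pattern(query, escape="\\"):
    """Wrap QUERY as a substring LIKE pattern, escaping LIKE wildcards."""
    # Escape the escape character first, then the two LIKE wildcards.
    for ch in (escape, "%", "_"):
        query = query.replace(ch, escape + ch)
    return "%" + query + "%"

def search(db, query):
    """Return the paths containing QUERY as a literal substring."""
    sql = "SELECT path FROM files WHERE path LIKE ? ESCAPE '\\' ORDER BY path"
    return [row[0] for row in db.execute(sql, (like_pattern(query),))]

# Hypothetical sample data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT)")
db.executemany("INSERT INTO files VALUES (?)", [
    ("/lib/libfoo_bar.so",),
    ("/lib/libfooxbar.so",),
    ("/share/100%.txt",),
])

print(search(db, "foo_bar"))
print(search(db, "100%"))
```

Without the escaping step, a query containing `_` would also match any single character at that position, which is rarely what someone typing a file name expects.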

Re: File search progress: database review and question on triggers

2020-08-13 Thread Ricardo Wurmus
Pierre Neidhardt writes: > - Or do you think SQLite patterns (using "%") would do for now? As > Mathieu pointed out, it's an unfortunate inconsistency with the rest of > Guix. But maybe regexp support can be added in a second stage. These patterns could be generated from user input that

Re: File search progress: database review and question on triggers

2020-08-13 Thread Ricardo Wurmus
Pierre Neidhardt writes: > Julien Lepiller writes: > >> Why wouldn't it help? Can't you make it a trie from basename -> >> complete name? If I'm looking for "libcord.so" (which is a key in the >> trie), I don't think I need to look for every path. I only need to >> follow the trie until I find

Re: File search progress: database review and question on triggers

2020-08-12 Thread Pierre Neidhardt
Hi Ricardo, Ricardo Wurmus writes: >> Why wouldn't it help? Can't you make it a trie from basename -> complete >> name? If I'm looking for "libcord.so" (which is a key in the trie), I don't >> think I need to look for every path. I only need to follow the trie until I >> find a pointer to som

Re: File search progress: database review and question on triggers

2020-08-12 Thread Arun Isaac
Hi, > 1. I tried to fine-tune the SQL a bit: > - Open/close the database only once for the whole indexing. > - Use "insert" instead of "insert or replace". > - Use numeric ID as key instead of path. > > Result: Still around 15-20 minutes to build. Switching to numeric > indices shrank

Re: File search progress: database review and question on triggers

2020-08-12 Thread Ricardo Wurmus
Julien Lepiller writes: > Why wouldn't it help? Can't you make it a trie from basename -> complete > name? If I'm looking for "libcord.so" (which is a key in the trie), I don't > think I need to look for every path. I only need to follow the trie until I > find a pointer to some structure th

Re: File search progress: database review and question on triggers

2020-08-12 Thread Julien Lepiller
Why wouldn't it help? Can't you make it a trie from basename -> complete name? If I'm looking for "libcord.so" (which is a key in the trie), I don't think I need to look for every path. I only need to follow the trie until I find a pointer to some structure that contains the data I look for (ex:
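A trie keyed on basenames, as suggested here, could be sketched like this. It is a plain in-memory Python illustration, not the serialized structure from the linked code; `BasenameTrie` and the sample paths are hypothetical.

```python
import posixpath

class BasenameTrie:
    """Character trie mapping a file's basename to its complete paths."""

    def __init__(self):
        self.children = {}  # next character -> child node
        self.paths = []     # complete paths whose basename ends at this node

    def insert(self, path):
        node = self
        for ch in posixpath.basename(path):
            node = node.children.setdefault(ch, BasenameTrie())
        node.paths.append(path)

    def lookup(self, basename):
        """Return the complete paths whose basename is exactly BASENAME."""
        node = self
        for ch in basename:
            node = node.children.get(ch)
            if node is None:
                return []
        return list(node.paths)

trie = BasenameTrie()
trie.insert("libgc/lib/libcord.so")   # hypothetical paths
trie.insert("glibc/lib/libm.so")

print(trie.lookup("libcord.so"))
print(trie.lookup("libcord"))
```

A lookup only follows one branch per character of the basename, so its cost depends on the length of the name rather than on the number of indexed paths. The trade-off is that only exact basename matches are served this way; partial-path or pattern queries would need a different index.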

Re: File search progress: database review and question on triggers

2020-08-12 Thread Pierre Neidhardt
Pierre Neidhardt writes: > Result: Takes between 20 and 2 minutes to complete and the result is > 32 MiB big. (I don't know why the timing varies.) Typo: 20 _seconds_ to 2 minutes! So it's faster than SQL by 1 or 2 orders of magnitude. -- Pierre Neidhardt https://ambrevar.xyz/

Re: File search progress: database review and question on triggers

2020-08-12 Thread Julien Lepiller
Have you tried something more structured? I have some code for creating a binary search tree and even compressing/decompressing strings with huffman, as well as code to serialize all that (my deserialization is in Java though, so not very useful to you): https://framagit.org/nani-project/nani-we

Re: File search progress: database review and question on triggers

2020-08-11 Thread Ricardo Wurmus
Pierre Neidhardt writes: > Pierre Neidhardt writes: > >> Ricardo Wurmus writes: >> >>> I’m not suggesting to use updatedb, but I think it can be instructive to >>> look at how the file database is implemented there. We don’t have to >>> use SQlite if it is much slower and heavier than a cust

Re: File search progress: database review and question on triggers

2020-08-11 Thread Pierre Neidhardt
Ricardo Wurmus writes: > Oof. The updatedb hack above takes 6 seconds on my i7-6500U CPU @ > 2.50GHz with SSD. > > I’m not suggesting to use updatedb, but I think it can be instructive to > look at how the file database is implemented there. We don’t have to > use SQlite if it is much slower an

Re: File search progress: database review and question on triggers

2020-08-11 Thread Ricardo Wurmus
Pierre Neidhardt writes: > 3. Size of the database: >I've persisted all locally-present store items for my current Guix version >and it produced a database of 72 MiB. It compresses down to 8 MiB >in zstd. For comparison, my laptop’s store contains 1,103,543 files, excluding .links

Re: File search progress: database review and question on triggers

2020-08-11 Thread Pierre Neidhardt
ion: This bounds us to the SQLite syntax for pattern matching. Is >> it a >>problem? >>It seems powerful enough in practice. But maybe we can use regular >> expression in SQLite as well? > > From the UI perspective, we already have "guix search" th

Re: File search progress: database review and question on triggers

2020-08-11 Thread Mathieu Othacehe
>Question: This bounds us to the SQLite syntax for pattern matching. Is it > a >problem? >It seems powerful enough in practice. But maybe we can use regular >expression in SQLite as well? From the UI perspective, we already have "guix search" that expec

File search progress: database review and question on triggers

2020-08-10 Thread Pierre Neidhardt
Hi! After much delay I finally got down to work on file search support for Guix. By "file search", I mean the ability to find which package contains files matching the queried pattern. If we want to be able to know which package to install, we need file search to be able to work fo