At a brief glance, I'm guessing this is the issue:
https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/d96b3509890f4c9b9af9119f64b64947337d8da7/zim/notebook/index/files.py (line 89)

    def _update_iter_inner(self, prefix=''):
        # sort folders before files: first index structure, then contents
        # this makes e.g. index links more efficient and robust
        # sort by id to ensure parents are found before children
        while True:
            row = self.db.execute(
                'SELECT id, path, node_type FROM files'
                ' WHERE index_status = ? AND path LIKE ?'
                ' ORDER BY node_type, id',
                (STATUS_NEED_UPDATE, prefix + '%')
            ).fetchone()
            if row:
                node_id, path, node_type = row
                #print ">> UPDATE", node_id, path, node_type
            else:
                break

It seems like the whole files table is being queried and re-sorted
again for the import of every single file. As the number of files in a
notebook increases, this per-file database operation seems to scale
not linearly but at some much higher order. Something like globbing
the entire notebook subdirectory structure once, and then importing
into the db in a loop over that glob, would be vastly more efficient
for large numbers of files.



On Sun, 25 Mar 2018 14:30:33 -0500
<hawk...@bitmessage.ch> wrote:

> Would you mind pointing me to the source file(s) that manage this
> indexing? I'd like to see if there is any way to speed the process
> up for large numbers of files.
>
>
>
> On Mon, 3 Jul 2017 17:42:07 +0000
> <hawk...@bitmessage.ch> wrote:
>
> > Yes, for medium sized notebooks, and those with a "normal" amount of
> > files, indexing is still under 5 minutes. I also use Zim to manage a
> > notebook under which there are lots of small work data files
> > (>350,000).
> >
> > The progress bar suggests there is some part of the parsing process
> > that slows down over time, as does a cursory check on the contents
> > of the database updating over time. There are many more files added
> > within the first few minutes, and many fewer over time, such that
> > after a while, only one or two files are added every several minutes.
> > It suggests to me that the whole list is being re-processed or
> > re-opened as part of the indexing loop, perhaps re-opening the
> > sqlite file for every new file or something. Ultimately, I don't
> > think that exponential slowdown is a necessity, but I have not had
> > a free moment to familiarize myself with the source yet.
> >
> > Thanks!
> >
> >
> >
> > On Mon, 03 Jul 2017 08:04:36 +0000
> > Jaap Karssenberg <jaap.karssenb...@gmail.com> wrote:
> >
> > > Yes, zim does indeed now build a table of all files in the
> > > notebook folder, not just text files. However it doesn't access
> > > them, it just stores file names and mtime.
> > >
> > > Despite this change, the indexing is faster than with 0.65 in most
> > > of my test cases. The behavior you describe suggests a huge amount
> > > of files under the notebook folder, is this the case?
> > >
> > > -- Jaap
> > >
> > > On Sun, Jul 2, 2017 at 8:43 PM <hawk...@bitmessage.ch> wrote:
> > >
> > > > The notebooks that used to take me about 5 minutes to re-index
> > > > are taking close to 40 hours for me (they are larger notebooks).
> > > >
> > > > It looks like the sql database is indexing every file under the
> > > > root directory of the notebook, even those not associated with
> > > > Zim directly, like zip or data files. I'm not sure if that was
> > > > happening with earlier versions.
> > > >
> > > >
> > > >
> > > > On Sat, 1 Jul 2017 23:32:16 +0200
> > > > Olivier Boesch <boe...@free.fr> wrote:
> > > >
> > > > > 6 minutes to reindex. pretty long in comparison with the 0.65.
> > > > >
> > > > >
> > > > > On 01/07/2017 at 23:24, Olivier Boesch wrote:
> > > > > >
> > > > > > I seem to experience the same issue...
> > > > > >
> > > > > > I clicked the "cancel" button after several minutes...
> > > > > >
> > > > > > testing now how long it takes to re-index...
> > > > > >
> > > > > >
> > > > > > On 01/07/2017 at 23:04, hawk...@bitmessage.ch wrote:
> > > > > >> After this latest upgrade came through (it looks great),
> > > > > >> notebooks that took me several minutes to re-index are now
> > > > > >> taking multiple days of time, and it seems like an
> > > > > >> exponential slowdown with the number (and maybe size) of
> > > > > >> files under the notebook root directory. Has anyone else
> > > > > >> experienced this?
> > > > > >>
> > > > > >>
> > > > > >> _______________________________________________
> > > > > >> Mailing list: https://launchpad.net/~zim-wiki
> > > > > >> Post to     : zim-wiki@lists.launchpad.net
> > > > > >> Unsubscribe : https://launchpad.net/~zim-wiki
> > > > > >> More help   : https://help.launchpad.net/ListHelp
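
To make the point at the top of this message concrete, here is a rough, self-contained sketch that contrasts the two patterns: re-running the ordered SELECT and taking one row per processed file (as in the quoted loop), versus running the ordered SELECT once and looping over its result. Everything except the SELECT text is an assumption made for illustration: the table schema is a simplified stand-in for the real files table, the status constants are invented values, and make_db, update_requery and update_single_pass are hypothetical names, not zim functions. It is not a patch.

```python
import sqlite3

# Assumed status values for illustration; the real constants are defined in zim.
STATUS_UPTODATE = 0
STATUS_NEED_UPDATE = 1

SELECT_PENDING = (
    'SELECT id, path, node_type FROM files'
    ' WHERE index_status = ? AND path LIKE ?'
    ' ORDER BY node_type, id'
)


def make_db(n_files):
    """Build an in-memory table loosely modelled on the 'files' table."""
    db = sqlite3.connect(':memory:')
    db.execute(
        'CREATE TABLE files ('
        ' id INTEGER PRIMARY KEY,'
        ' path TEXT,'
        ' node_type INTEGER,'
        ' index_status INTEGER)'
    )
    db.executemany(
        'INSERT INTO files (path, node_type, index_status) VALUES (?, ?, ?)',
        [('data/file%06d.txt' % i, i % 2, STATUS_NEED_UPDATE)
         for i in range(n_files)]
    )
    return db


def mark_done(db, node_id):
    """Stand-in for the real per-file work: just flag the row as up to date."""
    db.execute('UPDATE files SET index_status = ? WHERE id = ?',
               (STATUS_UPTODATE, node_id))


def update_requery(db, prefix=''):
    """Pattern from the quoted loop: one ordered SELECT per processed row."""
    done = []
    while True:
        row = db.execute(
            SELECT_PENDING, (STATUS_NEED_UPDATE, prefix + '%')
        ).fetchone()
        if row is None:
            break
        mark_done(db, row[0])
        done.append(row[0])
    return done


def update_single_pass(db, prefix=''):
    """Alternative: run the ordered SELECT once, then loop over the rows."""
    rows = db.execute(
        SELECT_PENDING, (STATUS_NEED_UPDATE, prefix + '%')
    ).fetchall()
    for node_id, _path, _node_type in rows:
        mark_done(db, node_id)
    return [row[0] for row in rows]
```

Both functions visit the rows in the same order, so timing each one against make_db(n) for a few increasing values of n (for example with time.perf_counter()) is one way to check whether the re-query pattern is really where the time goes. Note that the single-pass version ignores any rows that become STATUS_NEED_UPDATE while the loop is running; if the real indexer adds child rows while processing a folder, it would need an outer loop or a different batching scheme.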