Re: [Zim-wiki] caching and progressively slower page loads
Sounds good, I just opened an issue here: https://github.com/jaap-karssenberg/zim-desktop-wiki/issues/311 Best, Nathaniel On 02/02/2018 06:25 AM, Stephen LARROQUE wrote: Thank you very much Nathaniel for tracking the culprit, someone else reported a similar issue on the mailing list a few months ago but could not find the root. I think you should open an issue on the issue tracker, else this will get lost in the archives of the mailing list. Best regards, Stephen L. Le Jeu. 1 Fév. 2018 22:56, Nathaniel Beaver> a écrit : When Zim starts up, switching pages is nice and snappy. https://www.youtube.com/watch?v=tl4_t6dW1l0 After a while, loading even the smallest page has a noticeable delay. https://www.youtube.com/watch?v=euyGgEwLaIk Note the time required to display the new page in addition to the time before the cursor becomes active. This is running a recent Github version (commit e46a4b40a90d6903b6fa860bd72a4464dab7d704) with all add-ons disabled, including built-in add-ons. However, it was originally reproduced on older versions, so it is not a new effect. A few backtraces with GDB during a page switch shows most of the time is spent in this query from the "get_mytreeiter()" function in zim/notebook/index/pages.py: for i, row in enumerate(self.db.execute(''' SELECT * FROM pages WHERE parent=? ORDER BY sortkey, name LIMIT 20 OFFSET ? ''', (parent_id, offset) )): https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/e46a4b40a90d6903b6fa860bd72a4464dab7d704/zim/notebook/index/pages.py#L804 This is being called from the main loop. Traceback (most recent call first): File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/notebook/index/pages.py", line 805, in get_mytreeiter (parent_id, offset) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", line 224, in on_get_iter return self.get_mytreeiter(treepath) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", line 238, in on_iter_next return self.on_get_iter(treepath) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 703, in _run_main_loop gtk.main() File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 656, in _run_cmd self._run_main_loop(cmd) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 626, in run self._run_cmd(cmd, args) # test seam File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 845, in main The query itself is fast, so it seems strange that so much time is spent here. However, running Zim with "cprofile" for about a day shows there are roughly 1.3e6 calls to the sqlite3 execute method, so even though the time per call is less than a millisecond, the cumulative time is considerable. Wed Jan 17 10:33:24 2018 zim.notebook.index.pages_1516168791.prof 43403639 function calls (42725089 primitive calls) in 86784.644 seconds Ordered by: internal time List reduced from 2168 to 4 due to restriction <4> ncalls tottime percall cumtime percall filename:lineno(function) 1 85453.595 85453.595 86765.186 86765.186 {gtk._gtk.main} 1347825 1154.685 0.001 1154.686 0.001 {method 'execute' of 'sqlite3.Connection' objects} 5138231 50.881 0.000 1207.454 0.000 /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py:231(on_iter_next) 11 43.953 3.996 48.992 4.454 /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/widgets.py:3152(run) By setting a breakpoint in GDB: break _pysqlite_query_execute and then using "ignore" and "info break", we can count how many queries are run. In my notebook of about 1,400 pages, there are queries amounting to roughly twice that on startup. When the pages begin loading slowly, it runs correspondingly more queries. Since "get_mytreeiter()" caches the page on each call, is it possible that the cache is being duplicated or otherwise not working properly, resulting in an excessive number of queries? If the cache is working as intended, could it be changed so that it does not become progressively slower? Should I file a bug report on Github, or should I wait until there is something more concrete to submit? Nathaniel Beaver P.S. Upon request, I am happy to provide full data and details on
Re: [Zim-wiki] caching and progressively slower page loads
Thank you very much Nathaniel for tracking the culprit, someone else reported a similar issue on the mailing list a few months ago but could not find the root. I think you should open an issue on the issue tracker, else this will get lost in the archives of the mailing list. Best regards, Stephen L. Le Jeu. 1 Fév. 2018 22:56, Nathaniel Beavera écrit : > When Zim starts up, switching pages is nice and snappy. > > https://www.youtube.com/watch?v=tl4_t6dW1l0 > > After a while, loading even the smallest page has a noticeable delay. > > https://www.youtube.com/watch?v=euyGgEwLaIk > > Note the time required to display the new page in addition to the time > before the cursor becomes active. > > This is running a recent Github version (commit > e46a4b40a90d6903b6fa860bd72a4464dab7d704) with all add-ons disabled, > including built-in add-ons. However, it was originally reproduced on > older versions, so it is not a new effect. > > A few backtraces with GDB during a page switch shows most of the time is > spent in this query from the "get_mytreeiter()" function in > zim/notebook/index/pages.py: > > for i, row in enumerate(self.db.execute(''' > SELECT * FROM pages WHERE parent=? > ORDER BY sortkey, name LIMIT 20 OFFSET ? > ''', > (parent_id, offset) > )): > > > https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/e46a4b40a90d6903b6fa860bd72a4464dab7d704/zim/notebook/index/pages.py#L804 > > This is being called from the main loop. > > Traceback (most recent call first): >File > > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/notebook/index/pages.py", > line 805, in get_mytreeiter > (parent_id, offset) >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", > line 224, in on_get_iter > return self.get_mytreeiter(treepath) >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", > line 238, in on_iter_next > return self.on_get_iter(treepath) >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", > line 703, in _run_main_loop > gtk.main() >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", > line 656, in _run_cmd > self._run_main_loop(cmd) >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", > line 626, in run > self._run_cmd(cmd, args) # test seam >File > "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", > line 845, in main > > The query itself is fast, so it seems strange that so much time is spent > here. However, running Zim with "cprofile" for about a day shows there > are roughly 1.3e6 calls to the sqlite3 execute method, so even though > the time per call is less than a millisecond, the cumulative time is > considerable. > > Wed Jan 17 10:33:24 2018zim.notebook.index.pages_1516168791.prof > > 43403639 function calls (42725089 primitive calls) in > 86784.644 seconds > > Ordered by: internal time > List reduced from 2168 to 4 due to restriction <4> > > ncalls tottime percall cumtime percall > filename:lineno(function) > 1 85453.595 85453.595 86765.186 86765.186 {gtk._gtk.main} >1347825 1154.6850.001 1154.6860.001 {method 'execute' of > 'sqlite3.Connection' objects} >5138231 50.8810.000 1207.4540.000 > > /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py:231(on_iter_next) > 11 43.9533.996 48.9924.454 > > /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/widgets.py:3152(run) > > By setting a breakpoint in GDB: > > break _pysqlite_query_execute > > and then using "ignore" and "info break", we can count how many queries > are run. In my notebook of about 1,400 pages, there are queries > amounting to roughly twice that on startup. When the pages begin loading > slowly, it runs correspondingly more queries. > > Since "get_mytreeiter()" caches the page on each call, is it possible > that the cache is being duplicated or otherwise not working properly, > resulting in an excessive number of queries? > > If the cache is working as intended, could it be changed so that it does > not become progressively slower? > > Should I file a bug report on Github, or should I wait until there is > something more concrete to submit? > > > Nathaniel Beaver > > P.S. Upon request, I am happy to provide full data and details on > collection methods. > > Please let me know if there is something I overlooked or a flaw in the > methodology. > > This is a follow-up to my message from December 29, 2017 with subject > "profiling zim for performance?" > > ___ > Mailing list: https://launchpad.net/~zim-wiki > Post to : zim-wiki@lists.launchpad.net > Unsubscribe :
[Zim-wiki] caching and progressively slower page loads
When Zim starts up, switching pages is nice and snappy. https://www.youtube.com/watch?v=tl4_t6dW1l0 After a while, loading even the smallest page has a noticeable delay. https://www.youtube.com/watch?v=euyGgEwLaIk Note the time required to display the new page in addition to the time before the cursor becomes active. This is running a recent Github version (commit e46a4b40a90d6903b6fa860bd72a4464dab7d704) with all add-ons disabled, including built-in add-ons. However, it was originally reproduced on older versions, so it is not a new effect. A few backtraces with GDB during a page switch shows most of the time is spent in this query from the "get_mytreeiter()" function in zim/notebook/index/pages.py: for i, row in enumerate(self.db.execute(''' SELECT * FROM pages WHERE parent=? ORDER BY sortkey, name LIMIT 20 OFFSET ? ''', (parent_id, offset) )): https://github.com/jaap-karssenberg/zim-desktop-wiki/blob/e46a4b40a90d6903b6fa860bd72a4464dab7d704/zim/notebook/index/pages.py#L804 This is being called from the main loop. Traceback (most recent call first): File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/notebook/index/pages.py", line 805, in get_mytreeiter (parent_id, offset) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", line 224, in on_get_iter return self.get_mytreeiter(treepath) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py", line 238, in on_iter_next return self.on_get_iter(treepath) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 703, in _run_main_loop gtk.main() File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 656, in _run_cmd self._run_main_loop(cmd) File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 626, in run self._run_cmd(cmd, args) # test seam File "/home/nathaniel/.local/lib/python2.7/site-packages/zim/main/__init__.py", line 845, in main The query itself is fast, so it seems strange that so much time is spent here. However, running Zim with "cprofile" for about a day shows there are roughly 1.3e6 calls to the sqlite3 execute method, so even though the time per call is less than a millisecond, the cumulative time is considerable. Wed Jan 17 10:33:24 2018zim.notebook.index.pages_1516168791.prof 43403639 function calls (42725089 primitive calls) in 86784.644 seconds Ordered by: internal time List reduced from 2168 to 4 due to restriction <4> ncalls tottime percall cumtime percall filename:lineno(function) 1 85453.595 85453.595 86765.186 86765.186 {gtk._gtk.main} 1347825 1154.6850.001 1154.6860.001 {method 'execute' of 'sqlite3.Connection' objects} 5138231 50.8810.000 1207.4540.000 /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/pageindex.py:231(on_iter_next) 11 43.9533.996 48.9924.454 /home/nathaniel/.local/lib/python2.7/site-packages/zim/gui/widgets.py:3152(run) By setting a breakpoint in GDB: break _pysqlite_query_execute and then using "ignore" and "info break", we can count how many queries are run. In my notebook of about 1,400 pages, there are queries amounting to roughly twice that on startup. When the pages begin loading slowly, it runs correspondingly more queries. Since "get_mytreeiter()" caches the page on each call, is it possible that the cache is being duplicated or otherwise not working properly, resulting in an excessive number of queries? If the cache is working as intended, could it be changed so that it does not become progressively slower? Should I file a bug report on Github, or should I wait until there is something more concrete to submit? Nathaniel Beaver P.S. Upon request, I am happy to provide full data and details on collection methods. Please let me know if there is something I overlooked or a flaw in the methodology. This is a follow-up to my message from December 29, 2017 with subject "profiling zim for performance?" ___ Mailing list: https://launchpad.net/~zim-wiki Post to : zim-wiki@lists.launchpad.net Unsubscribe : https://launchpad.net/~zim-wiki More help : https://help.launchpad.net/ListHelp