Hey,

We at Lanedo have been investigating search performance on the master
plain text copy of the code. At the moment, the search reads
file_index.txt line by line and matches each line against the search
term.

This approach takes a pretty consistent, fixed amount of time. On this
test system, with the 774M text file that gets generated, I measured
~9.1s for the search itself [1] and ~9.6s for the accumulated page
generation.
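
To illustrate why the cost stays flat, here is a rough sketch of a
line-by-line scan. It is not the actual DXR code; the function name
scan_file and the returned tuple layout are made up for illustration,
and it assumes plain substring matching.

```python
def scan_file(path, term):
    """Return (line_no, line) for every line of `path` containing `term`.

    Hypothetical helper: every call reads the whole file from start to
    end, so the run time depends on the file size, not on the term.
    """
    hits = []
    # errors="replace" keeps the scan going if the file has odd bytes.
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line_no, line in enumerate(fh, start=1):
            if term in line:
                hits.append((line_no, line.rstrip("\n")))
    return hits
```

Since there is no index, a search for a rare term costs about the same
as a search for a common one.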

One possible approach to improve this situation is using sqlite's FTS
module (seemingly included by default in recent sqlite3 versions, also
compilable standalone). This module has been used quite successfully in
the Tracker project [2] to enable full text search, and it can surely
improve performance in DXR, as both use cases are fairly similar.

So I've hacked up a script that dumps the plain text contents into an
FTS table in the sqlite DB, plus a search.cgi page that uses that FTS
table, as a proof of concept. For the same matches (admittedly not
nicely formatted, so the query could grow more complex) it takes a
fraction of the current time: the results are fetched in 0.05s
(accumulated page generation ~0.65s).
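
For reference, the general idea can be sketched as below. This is not
the actual script or search.cgi, just a minimal illustration. It
assumes Python's sqlite3 module is linked against an SQLite build with
FTS5 enabled, and the table name plain_text, its columns, and the
helpers build_index and search are all hypothetical.

```python
import sqlite3


def build_index(conn, rows):
    """Load (path, line_no, text) rows into a hypothetical FTS5 table."""
    # Assumption: FTS5 is compiled into the underlying SQLite library.
    # Only `content` is tokenized; the other columns are just stored.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS plain_text "
        "USING fts5(path UNINDEXED, line_no UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO plain_text (path, line_no, content) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def search(conn, term):
    """Return (path, line_no, content) rows whose content matches `term`."""
    # Quote the term so it is read as one FTS phrase, not query syntax.
    phrase = '"' + term.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT path, line_no, content FROM plain_text "
        "WHERE plain_text MATCH ? ORDER BY path, line_no",
        (phrase,),
    )
    return cur.fetchall()
```

Something like search(conn, "GfxContext") would then be answered from
the index instead of rereading the whole text. One caveat: FTS matches
whole tokens rather than arbitrary substrings, so the results are not
guaranteed to be identical to those of the line-by-line search.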

This is just a proof of concept at the moment, but it makes FTS a very
compelling option for improving performance.

  Carlos


[1] For these measurements, I've been searching for "GfxContext"
specifically, although as I said the timing is pretty consistent
regardless of the search term.
[2] http://projects.gnome.org/tracker/


_______________________________________________
dev-static-analysis mailing list
https://lists.mozilla.org/listinfo/dev-static-analysis
