I'm writing a small application for detecting source code plagiarism that
currently relies on a database to store lines of code.

The application has two primary functions: adding a new file to the database
and comparing a file to those that are already stored in the database.
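For context, here is roughly the shape I mean -- this is a sketch, not my
actual code, and the schema (a `files` table plus a `lines` table keyed by a
per-line hash) is just an assumption to make the two operations concrete:

```python
import hashlib
import sqlite3

# Hypothetical schema -- the real application's tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lines (file_id INTEGER, line_hash TEXT)")
conn.execute("CREATE INDEX idx_line_hash ON lines (line_hash)")

def add_file(conn, name, source_lines):
    """Store one hash per non-blank (whitespace-stripped) line."""
    cur = conn.execute("INSERT INTO files (name) VALUES (?)", (name,))
    file_id = cur.lastrowid
    rows = [(file_id, hashlib.sha1(line.strip().encode()).hexdigest())
            for line in source_lines if line.strip()]
    conn.executemany(
        "INSERT INTO lines (file_id, line_hash) VALUES (?, ?)", rows)
    conn.commit()
    return file_id

def match_file(conn, source_lines):
    """Return {file_id: number of stored lines matching the candidate}."""
    hashes = {hashlib.sha1(l.strip().encode()).hexdigest()
              for l in source_lines if l.strip()}
    placeholders = ",".join("?" * len(hashes))
    cur = conn.execute(
        f"SELECT file_id, COUNT(*) FROM lines "
        f"WHERE line_hash IN ({placeholders}) GROUP BY file_id",
        list(hashes))
    return dict(cur.fetchall())
```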

I started out using sqlite3, but wasn't satisfied with its performance. I
then tried psycopg2 with a local PostgreSQL server, and performance got even
worse. My simple benchmarks show that sqlite3 is on average 3.5 times faster
at inserting a file, and on average less than a tenth of a second slower
than psycopg2 at matching a file.
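One thing I've been wondering about is transaction handling, since that's a
common culprit in insert benchmarks. A sketch of the two patterns, using
sqlite3 and the same hypothetical `lines` table as above (in psycopg2 the
same idea applies: every execute() runs inside a transaction that only ends
at conn.commit(), and committing once per row forces PostgreSQL to fsync
each time, whereas psycopg2.extras.execute_values() plus a single commit
batches the whole file):

```python
import sqlite3

# Hypothetical table, standing in for whatever the application stores.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (file_id INTEGER, line_hash TEXT)")

def insert_per_row_commit(conn, rows):
    # One transaction (and, on a disk-backed database, one fsync) per row.
    # This pattern is often what makes inserts look slow in benchmarks.
    for row in rows:
        conn.execute(
            "INSERT INTO lines (file_id, line_hash) VALUES (?, ?)", row)
        conn.commit()

def insert_batched(conn, rows):
    # One transaction for the whole file: executemany, then a single commit.
    conn.executemany(
        "INSERT INTO lines (file_id, line_hash) VALUES (?, ?)", rows)
    conn.commit()
```

Both functions store the same rows; only the number of commits differs.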

I expected PostgreSQL to be a lot faster ... is there some peculiarity in
psycopg2 that could be causing the slowdown? Are these performance results
typical? Any suggestions on what to try from here? I don't think my
code/queries are inherently slow, but I'm not a DBA or a very accomplished
Python developer, so I could be wrong.

Any advice is appreciated.
--
http://mail.python.org/mailman/listinfo/python-list