#77: Bad performance of query_formatted()
-----------------------+-----------------------------
Reporter: justin | Owner: cito
Type: task | Status: new
Priority: major | Milestone: 5.1
Component: DB API 2 | Version: 5.0
Resolution: | Keywords: pg, performance
-----------------------+-----------------------------
Comment (by cito):
Also posted by Justin 2019-01-19:
1) in 5.0, document that relative to query, query_formatted has an
overhead "which can be significant for queries repeated many times", and
document that the mitigation is to use inline=True; or, use prepared
statements "available since 5.1". Note that for simple queries like
INSERT, the significant overhead is in pygres, but for complex queries
like JOINs/large inheritance trees/etc., most of the overhead is in planning.
2) For 5.1.1 (and maybe 5.0), something to mitigate the cost of
isinstance() in pg and pgdb.
3) In 5.1 (but probably not 5.0?), consider changing `query_formatted`
default to `inline=True`. In my test, this inserted 30% faster (!) even
with no 2nd patch.
{{{
$ python2.7 ./testinsert.py
diff 192.718273878
vs
$ python2.7 ./testinsert.py
diff 309.562824965
}}}
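For anyone who wants to reproduce this kind of comparison, here is a minimal timing sketch. It is not the testinsert.py used above (that script isn't shown here); the table `t`, its columns, the `FakeDB` stand-in and the `time_inserts` helper are all made up for illustration. Only `query_formatted(command, parameters, types, inline)` is meant to mirror the pg API. With the stand-in the timings are meaningless; pass a real `pg.DB` connection to measure anything.

```python
import time


class FakeDB:
    """Hypothetical stand-in for a pg.DB connection.

    It only records the calls, so the sketch runs without a server.
    """

    def __init__(self):
        self.calls = []

    def query_formatted(self, command, parameters=None, types=None,
                        inline=False):
        self.calls.append((command, parameters, inline))


def time_inserts(db, rows, inline):
    """Insert the rows one at a time and return the elapsed seconds."""
    start = time.perf_counter()
    for row in rows:
        db.query_formatted(
            "INSERT INTO t (id, name) VALUES (%s, %s)", row, inline=inline)
    return time.perf_counter() - start


rows = [(i, "name%d" % i) for i in range(1000)]
db = FakeDB()  # assumption: swap in pg.DB(...) to time a real server
elapsed_params = time_inserts(db, rows, inline=False)
elapsed_inline = time_inserts(db, rows, inline=True)
print("inline=False: %.6f s" % elapsed_params)
print("inline=True:  %.6f s" % elapsed_inline)
```

Running both variants in the same process against the same table keeps the comparison fair; on a real server, truncate the table between the two runs.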
That might be good to consider for other reasons: there's 1) PQexec vs 2)
PQexecParams. 1) supports multiple commands; but 2) allows binary
protocol (which pygres doesn't currently support). Binary protocol (or
anything using PQexecParams) will never support multiple commands.
If there aren't params, `query_formatted` currently calls `query` and
`PQexec`, to allow the possibility of including multiple commands. I
wonder whether (starting in v5.1) pygres should call PQexecParams() even
when there are no params. Otherwise it's odd that query_formatted would
call PQexec only sometimes, and it's an odd conditional which complicates
any future support for things like binary format. I realize that binary
format isn't going to happen anytime soon, if ever, but 5.1 is maybe an
opportunity to make that change.
Maybe multiple commands should be documented, and it's odd to write that
`query_formatted` supports multiple commands "if there are no params, or if
inline=True". It'd be better to be able to say "multiple commands are not
supported except when inline=True"; or, if that was the default, "multiple
commands are not supported if inline=False".
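To make the two rules concrete, here is a small sketch of the decision as I understand it. This is a model of the behaviour described in this message, not code taken from pg; the function names are made up. The strings name libpq's `PQexec` and `PQexecParams`.

```python
def current_libpq_call(parameters, inline=False):
    """Model of the current rule: PQexec if no params or inline=True."""
    if not parameters or inline:
        # Nothing is sent separately, so the plain query path is used.
        return "PQexec"
    return "PQexecParams"


def proposed_libpq_call(parameters, inline=False):
    """Model of the suggested rule: only inline=True selects PQexec."""
    return "PQexec" if inline else "PQexecParams"


def supports_multiple_commands(libpq_call):
    # PQexecParams accepts at most one command per call.
    return libpq_call == "PQexec"


for params in ((), (1,)):
    for inline in (False, True):
        cur = current_libpq_call(params, inline)
        new = proposed_libpq_call(params, inline)
        print("params=%-5r inline=%-5r current=%-12s proposed=%s"
              % (params, inline, cur, new))
```

The only case that differs is no params with inline=False: today it allows multiple commands, under the suggested rule it would not, which is what lets the documentation say "multiple commands are not supported except when inline=True".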
If pg always (or defaulted) used `inline=True`, we could always use
multiple commands (even with PQexecParams), and it would be similar to
pgdb. But maybe that's moving in the wrong direction, too. Or maybe that
should wait until binary protocol is on the table.
I realize this message is addressing multiple things and maybe not very
focused, but a few of these are kind of connected.
--
Ticket URL: <http://trac.pygresql.org:8000/pgtracker/ticket/77#comment:1>
PyGreSQL <http://www.pygresql.org/>
PyGreSQL Tracker