On Sat, Oct 3, 2015 at 4:34 AM, David Allouche <da...@allouche.net> wrote:

> On 2 Oct 2015, at 05:49, George Reilly <george.v.rei...@gmail.com> wrote:
>
> On Mon, Sep 28, 2015 at 1:04 AM, David Allouche <da...@allouche.net>
> wrote:
>
>> On 28 Sep 2015, at 05:21, George V. Reilly <george.v.rei...@gmail.com>
>> wrote:
>>
>> SQLTap is a very useful library that helps you in profiling SQLAlchemy
>> queries. It helps you understand where and when SQLAlchemy issues queries,
>> how often they are issued, how many rows are returned, and if you are
>> issuing queries with duplicate parameters. The last two are new in
>> tonight's 0.3.10 release, which incorporates improved reporting from me.
>>
>> https://github.com/inconshreveable/sqltap
>>
>>
>> What are the downsides of issuing queries with duplicate parameters?
>> Aside from "it makes it harder to read the logs".
>>
>> I mean, obviously, not "dozens of duplicate parameters": any query with
>> dozens of parameters probably has other problems.
>>
>> I could not find any rationale on the linked page.
>>
>
> I was investigating the performance of a single, pathologically slow API
> request to one of our web services. SQLTap told me that there were nearly
> 12,000 database queries over several minutes. Digging further, I found that
> about half of these requests could be eliminated by adding lazy="joined" to
> some relationships. There was one relationship where adding lazy="joined"
> eliminated some queries but the result rowsets were so large that the
> overall API request became noticeably slower. I updated SQLTap's report to
> show the number of rows returned by each query, which helped identify such
> problems.
>
> I also found that certain objects were being requested again and again;
> i.e., I was issuing queries with identical parameters. SQLAlchemy issues
> each query to the database, gets back an identical result (since they
> weren't being modified), and returns the existing object from its identity
> map. See
> http://docs.sqlalchemy.org/en/rel_1_0/orm/session_basics.html#is-the-session-a-cache.
> Making my app smarter about not asking again for objects that it had
> already loaded eliminated thousands more queries. This is why I added the
> duplicate query reporting, so that I could pinpoint opportunities for
> caching.
>
>
> Since your message addresses a wider scope than my question, I will answer
> similarly widely.
>
> You had a pathological API handler that used an anti patterns of ORMs:
> object graph traversal concealing requests in loops.  SQLTap helped you
> identify the problem.
>
> You applied the fix: eager loading through joins. Then you hit the related
> anti pattern: unnecessarily large result sets. And SQLTap helped you
> identify that problem too.
>
> That's nice, and it helps illustrate that SQLTap is useful for debugging
> database performance. But, so far, it does not answer my question.
>
> Then you found another anti pattern: redundant explicit queries. And this
> is what detection of "issuing queries with duplicate parameters" is about.
> It is not so much about "duplicate parameters", which I understood as
> "multiple parameters in a single query that have the same value", but
> rather about redundant queries that have identical SQL and parameters in a
> single handler.
>
> Maybe you can make the benefits of SQLTap more obvious by explaining its
> goals in this way: how it helps you debug common anti patterns in database
> code.
>
> Sorry if I sound pedantic: in a couple of weeks I will make a presentation
> about SQLAlchemy Tips and Tricks. So this is as much about clarifying my
> thinking than about clarifying the SQLTap presentation.
>
> Thank you for reading this much.
>

​CCing Alan Shreve, the author of SQLTap​.

I've used SQLAlchemy logging in the past to diagnose performance problems.
Set the loglevel to DEBUG and you get a ton of low-level information in
your log, including every row fetched from the database. The sheer volume
of logging is very hard to digest. SQLTap neatly summarizes that, telling
you what the different queries are. Many queries are triggered implicitly,
such as lazy relationships. You can't tell from the logfile how these
queries happened. SQLTap captures the callstack of every query, grouping
the report by those stacks, so you can say exactly which line in your code
is responsible and how often it happens.

SQLTap captures the SQL for the query and how long each query took. You can
experiment directly in your database client with EXPLAIN ANALYZE if you
think the queries are too slow, which can lead you to add indexes to your
database (tip: use Alembic) or to adjust your SQLAlchemy query.

Your original question was "
What are the downsides of issuing queries with duplicate parameters?"
​ I meant querying repeatedly with the same parameters each time, not
querying once with multiple copies of the same parameter.​

​Why go all the way to the database to answer a question that you
previously answered? If, in the context of your application, you know that
the results cannot have been modified, avoiding taking the hit of querying
the database is a good thing. It reduces the latency of your application
and it reduces load on the database. In my earlier example, I was
specifically talking about queries happening within one
session/transaction. It might make sense to cache for longer than a
session, but that's much harder for your application to get right.
-- 
/George V. Reilly  george.v.rei...@gmail.com  Twitter: @georgevreilly
<https://twitter.com/georgevreilly>
*www.GeorgeVReilly.com <http://www.georgevreilly.com/>*

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to