One more observation; I am not sure if it helps with the diagnosis.

If I query by time interval rather than by taxiid, the query is generally
faster - I believe this is due to the way the data were loaded. Yet there is
still a significant performance difference: the large table runs faster
than the small table.  More importantly, this time the planner proposes two
very different plans for the two tables. I am not sure why a bitmap index
scan was used for the small table...


See results below:

explain (analyze, buffers) select * from data2011_01 where time >=
'2011-01-16 12:00:00' and time < '2011-01-16 12:05:00' order by time;

(The same query was run against data2013_01w with the corresponding
2013-01-16 timestamps.)

For the small table:
"Sort  (cost=529024.61..529411.73 rows=154848 width=55) (actual
time=3761.329..3765.501 rows=149807 loops=1)"
"  Sort Key: "time""
"  Sort Method: quicksort  Memory: 27211kB"
"  Buffers: shared hit=39 read=4304"
"  ->  Bitmap Heap Scan on data2013_01w  (cost=3287.76..515676.33
rows=154848 width=55) (actual time=29.028..3714.592 rows=149807 loops=1)"
"        Recheck Cond: (("time" >= '2013-01-16 12:00:00'::timestamp without
time zone) AND ("time" < '2013-01-16 12:05:00'::timestamp without time
zone))"
"        Heap Blocks: exact=3929"
"        Buffers: shared hit=39 read=4304"
"        ->  Bitmap Index Scan on data2013_01w_ixtime  (cost=0.00..3249.05
rows=154848 width=0) (actual time=17.931..17.931 rows=149807 loops=1)"
"              Index Cond: (("time" >= '2013-01-16 12:00:00'::timestamp
without time zone) AND ("time" < '2013-01-16 12:05:00'::timestamp without
time zone))"
"              Buffers: shared hit=1 read=413"
"Planning time: 0.203 ms"
"Execution time: 3779.368 ms"


For the large table:
"Index Scan using data2011_01_ixtime on data2011_01  (cost=0.57..78012.46
rows=110204 width=55) (actual time=43.765..709.985 rows=104617 loops=1)"
"  Index Cond: (("time" >= '2011-01-16 12:00:00'::timestamp without time
zone) AND ("time" < '2011-01-16 12:05:00'::timestamp without time zone))"
"  Buffers: shared hit=79619 read=1482"
"Planning time: 0.131 ms"
"Execution time: 713.446 ms"
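If loading order really is the culprit, physically reordering the small table should close the gap. A rough way to test this (hedged: CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the whole table, so only run it when nothing else needs the table):

```sql
-- Rewrite the small table in index order, then refresh planner statistics.
-- (Cluster on the taxiid index instead if the taxiid queries matter more.)
CLUSTER data2013_01w USING data2013_01w_ixtime;
ANALYZE data2013_01w;
```

Re-running the EXPLAIN (ANALYZE, BUFFERS) queries afterward and comparing the shared read counts would show whether physical layout is the issue.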



On Mon, Feb 1, 2016 at 8:50 PM, Melvin Davidson <melvin6...@gmail.com>
wrote:

> OK, one more thing to try.
>
> Please send the output of the following query to a file and attach so I
> can review.
>
> SELECT c.relname as idxname,
>                 i.indisclustered,
>                 i.indisvalid,
>        i.indcollation
>     FROM pg_index i
>       JOIN pg_class c ON i.indexrelid = c.oid
>  WHERE i.indrelid IN (SELECT oid
>                                        FROM pg_class
>                                     WHERE relkind = 'r'
>                                           AND (relname = 'data2011_01'
>                                                OR relname = 'data2013_01w')
>                                    )
> ORDER BY 1;
>
>
> On Mon, Feb 1, 2016 at 5:51 PM, Yu Nie <niey...@gmail.com> wrote:
>
>> Yeah, it is really puzzling. :(
>>
>> No, I have never attempted to cluster the large table - as I mentioned
>> before, the two tables were created almost exactly the same way (through a
>> C++ API), except with different data files. I only use them for queries, and
>> have never done any update/insert/delete since their creation.
>>
>>
>>
>> On Mon, Feb 1, 2016 at 4:41 PM, Melvin Davidson <melvin6...@gmail.com>
>> wrote:
>>
>>> At this point I can see no possible explanation why this would occur.
>>>
>>> Just a thought though, is it possible that data2011_01 was clustered on
>>> the index at some point but data2013_01w has not been clustered?
>>> If you cluster data2013_01w on the index, does the performance change?
>>>
>>> On Mon, Feb 1, 2016 at 4:03 PM, Yu Nie <niey...@gmail.com> wrote:
>>>
>>>> Melvin,
>>>>
>>>> Please see attached for the requested results. I ran two queries (with
>>>> two adjacent taxiids) against each table.  Note that for the large table
>>>> one query is much faster than the other because the shared cache was used
>>>> for the second query.  This does not happen, however, for the small table.
>>>>
>>>> Many thanks for your willingness to help!
>>>>
>>>> Best, Marco
>>>>
>>>> On Mon, Feb 1, 2016 at 2:25 PM, melvin6925 <melvin6...@gmail.com>
>>>> wrote:
>>>>
>>>>> Fine. Please rerun both explains and my queries and send ALL output to
>>>>> a file (F8 in PGADMIN Sql). Then attach the file.
>>>>>
>>>>>
>>>>>
>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>> -------- Original message --------
>>>>> From: Yu Nie <niey...@gmail.com>
>>>>> Date: 2/1/2016 15:17 (GMT-05:00)
>>>>> To: melvin6925 <melvin6...@gmail.com>
>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>
>>>>> Yes, absolutely.
>>>>>
>>>>> On Mon, Feb 1, 2016 at 2:12 PM, melvin6925 <melvin6...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> And were _your_ queries run on the same day at the same time within a
>>>>>> few seconds of each other?
>>>>>>
>>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>>> -------- Original message --------
>>>>>> From: Yu Nie <niey...@gmail.com>
>>>>>> Date: 2/1/2016 15:01 (GMT-05:00)
>>>>>> To: melvin6925 <melvin6...@gmail.com>
>>>>>> Cc: Bill Moran <wmo...@potentialtech.com>,
>>>>>> pgsql-general@postgresql.org
>>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>>
>>>>>> Thanks, Bill and Melvin!
>>>>>>
>>>>>> Just some quick note/answers before I absorb all the information
>>>>>> provided by Bill.
>>>>>>
>>>>>> 1. I don't expect many users running queries against the tables,
>>>>>> especially the small table - since I just created it this morning,
>>>>>> and nobody knows about it except me.
>>>>>>
>>>>>> 2. The setting in the config:
>>>>>>
>>>>>>
>>>>>> shared_buffers = 512MB            # min 128kB
>>>>>> work_mem = 128MB                # min 64kB
>>>>>>
>>>>>> 3. I am running Windows 7 with 24 GB RAM, and my PostgreSQL is
>>>>>> 9.4.
>>>>>>
>>>>>> 4.  here is the query I ran:
>>>>>>
>>>>>> SELECT n.nspname,
>>>>>>        s.relname,
>>>>>>        c.reltuples::bigint,
>>>>>>        n_tup_ins,
>>>>>>        n_tup_upd,
>>>>>>        n_tup_del,
>>>>>>        date_trunc('second', last_vacuum) as last_vacuum,
>>>>>>        date_trunc('second', last_autovacuum) as last_autovacuum,
>>>>>>        date_trunc('second', last_analyze) as last_analyze,
>>>>>>        date_trunc('second', last_autoanalyze) as last_autoanalyze
>>>>>>        ,
>>>>>>        round( current_setting('autovacuum_vacuum_threshold')::integer
>>>>>> + current_setting('autovacuum_vacuum_scale_factor')::numeric * 
>>>>>> C.reltuples)
>>>>>> AS av_threshold
>>>>>>   FROM pg_stat_all_tables s
>>>>>>   JOIN pg_class c ON c.oid = s.relid
>>>>>>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>>>>>  WHERE s.relname NOT LIKE 'pg_%'
>>>>>>    AND s.relname NOT LIKE 'sql_%'
>>>>>>    AND s.relname IN  ('data2013_01w', 'data2011_01')
>>>>>>  ORDER by 1, 2;
>>>>>>
>>>>>> I copied the result from PGAdmin directly, here it is again.:
>>>>>>
>>>>>> public;"data2011_01";784236864;784236885;0;0;"";"";"";"2016-01-19
>>>>>> 17:31:08-06";156847423
>>>>>> public;"data2013_01w";300786432;300786444;0;0;"";"";"2016-02-01
>>>>>> 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>>>>
>>>>>> On Mon, Feb 1, 2016 at 1:45 PM, melvin6925 <melvin6...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Bill.
>>>>>>> Also, it's very important to include the headers with the queries!
>>>>>>>
>>>>>>> Marco,
>>>>>>> There is no top secret information that is requested, so please do
>>>>>>> not edit the output.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>>>>>> -------- Original message --------
>>>>>>> From: Bill Moran <wmo...@potentialtech.com>
>>>>>>> Date: 2/1/2016 14:41 (GMT-05:00)
>>>>>>> To: Yu Nie <niey...@gmail.com>
>>>>>>> Cc: Melvin Davidson <melvin6...@gmail.com>,
>>>>>>> pgsql-general@postgresql.org
>>>>>>> Subject: Re: [GENERAL] strange sql behavior
>>>>>>>
>>>>>>>
>>>>>>> Came a little late to the thread, see many comments inline below:
>>>>>>>
>>>>>>> On Mon, 1 Feb 2016 13:16:13 -0600
>>>>>>> Yu Nie <niey...@gmail.com> wrote:
>>>>>>>
>>>>>>> > Thanks  a lot for your reply. I ran the query you suggested and
>>>>>>> here are
>>>>>>> > the results
>>>>>>> >
>>>>>>> > Large table:
>>>>>>> "public";"data2011_01";784236885;0;0;"";"";"";"2016-01-19
>>>>>>> > 17:31:08-06";156847423
>>>>>>> > Small table:
>>>>>>> "public";"data2013_01w";300786444;0;0;"";"";"2016-02-01
>>>>>>> > 08:57:24-06";"2016-02-01 04:01:04-06";60157336
>>>>>>>
>>>>>>> You didn't do Melvin's query correctly. He specified 11 columns to
>>>>>>> select, but you only show 10. Since you don't show the query you
>>>>>>> actually ran, we don't know which of the numeric columns is missing.
>>>>>>>
>>>>>>> More information inline below:
>>>>>>>
>>>>>>> >
>>>>>>> > On Mon, Feb 1, 2016 at 1:00 PM, Melvin Davidson <
>>>>>>> melvin6...@gmail.com>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > > One thing to look at is the last time both tables were
>>>>>>> vacuumed/analyzed.
>>>>>>> > >
>>>>>>> > > SELECT n.nspname,
>>>>>>> > >        s.relname,
>>>>>>> > >        c.reltuples::bigint,
>>>>>>> > >        n_tup_ins,
>>>>>>> > >        n_tup_upd,
>>>>>>> > >        n_tup_del,
>>>>>>> > >        date_trunc('second', last_vacuum) as last_vacuum,
>>>>>>> > >        date_trunc('second', last_autovacuum) as last_autovacuum,
>>>>>>> > >        date_trunc('second', last_analyze) as last_analyze,
>>>>>>> > >        date_trunc('second', last_autoanalyze) as last_autoanalyze
>>>>>>> > >        ,
>>>>>>> > >        round(
>>>>>>> current_setting('autovacuum_vacuum_threshold')::integer +
>>>>>>> > > current_setting('autovacuum_vacuum_scale_factor')::numeric *
>>>>>>> C.reltuples)
>>>>>>> > > AS av_threshold
>>>>>>> > >   FROM pg_stat_all_tables s
>>>>>>> > >   JOIN pg_class c ON c.oid = s.relid
>>>>>>> > >   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>>>>>>> > >  WHERE s.relname NOT LIKE 'pg_%'
>>>>>>> > >    AND s.relname NOT LIKE 'sql_%'
>>>>>>> > >    AND s.relname IN ('your_small_table', 'your_large_table')
>>>>>>> > >  ORDER by 1, 2;
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > Also, please confirm the indexes for both tables are using the
>>>>>>> same method
>>>>>>> > > (btree?).
>>>>>>> > >
>>>>>>> > > On Mon, Feb 1, 2016 at 1:35 PM, Yu Nie <niey...@gmail.com>
>>>>>>> wrote:
>>>>>>> > >
>>>>>>> > >> Hi there,
>>>>>>> > >>
>>>>>>> > >> Recently I am working with a large amount of taxi GIS data and
>>>>>>> had
>>>>>>> > >> encountered some weird performance issues.  I am hoping someone
>>>>>>> in this
>>>>>>> > >> community can help me figure it out.
>>>>>>> > >>
>>>>>>> > >> The taxi data were loaded in 5-minute blocks into a table.  I
>>>>>>> have two
>>>>>>> > >> separate such tables, one stores a month of data with about 700
>>>>>>> million
>>>>>>> > >> rows, another stores about 10 days of data with about 300
>>>>>>> million rows.
>>>>>>> > >> The two tables have the exactly same schema and indexes. There
>>>>>>> are two
>>>>>>> > >> indexes: one on taxiid (text), and the other on the time stamp
>>>>>>> (date
>>>>>>> > >> time).  In order to process the data, I need to get all points
>>>>>>> for a single
>>>>>>> > >> taxi; to do that, I use something like:
>>>>>>> > >>  select * from table1 where taxiid = 'SZB00S41' order by time;
>>>>>>> > >> What puzzled me greatly is that this query runs consistently
>>>>>>> much faster
>>>>>>> > >> for the large table than for the small table, which seems to
>>>>>>> contradict
>>>>>>> > >> intuition.   At the end of this message you may find explain
>>>>>>> (analyze
>>>>>>> > >> buffer) results of two particular queries for the same taxiid
>>>>>>> (one for each
>>>>>>> > >> table). You can see that it took much longer (more than 20
>>>>>>> times) to get
>>>>>>> > >> 20k rows from the small table than to get 44 k rows from the
>>>>>>> large table.
>>>>>>> > >> Interestingly it seems that the planner does expect about 1/3
>>>>>>> work for the
>>>>>>> > >> small table query - yet for some reason, it took much longer to
>>>>>>> fetch the
>>>>>>> > >> rows from the small table.   Why is there such a huge performance
>>>>>>> > >> difference between the two seemingly identical queries executed on
>>>>>>> > >> two different tables?
>>>>>>> > >>
>>>>>>> > >> Is it because the data in the second table is on some
>>>>>>> mysteriously
>>>>>>> > >> "broken part" of the disk?  what else could explain such a
>>>>>>> bizarre
>>>>>>> > >> behavior?  Your help is greatly appreciated.
>>>>>>> > >>
>>>>>>> > >> The above behavior is consistent through all queries.   Another
>>>>>>> issue I
>>>>>>> > >> identified is that for the large table, the query can use the
>>>>>>> shared buffer
>>>>>>> > >> more effectively.  For example, after you query one taxiid and
>>>>>>> immediately
>>>>>>> > >> following that query run the same query for another taxi whose
>>>>>>> id ranks
>>>>>>> > >> right behind the first id, then shared hit buffers would be
>>>>>>> quite high (and
>>>>>>> > >> the query would run much faster); this however never works for
>>>>>>> the small
>>>>>>> > >> table.
>>>>>>> > >>
>>>>>>> > >> Thanks   a lot!
>>>>>>> > >>
>>>>>>> > >> Best, Marco
>>>>>>> > >>
>>>>>>> > >>
>>>>>>> > >> Results for the small table: it took 141 seconds to finish.  (The
>>>>>>> > >> 85256.31 figure is the planner's estimated cost, not a time; the
>>>>>>> > >> planning time was 0.144 ms.)
>>>>>>> > >>
>>>>>>> > >> "Sort  (cost=85201.05..85256.31 rows=22101 width=55) (actual
>>>>>>> > >> time=141419.499..141420.025 rows=20288 loops=1)"
>>>>>>> > >> "  Sort Key: "time""
>>>>>>> > >> "  Sort Method: quicksort  Memory: 3622kB"
>>>>>>> > >> "  Buffers: shared hit=92 read=19816"
>>>>>>> > >> "  ->  Bitmap Heap Scan on data2013_01w  (cost=515.86..83606.27
>>>>>>> > >> rows=22101 width=55) (actual time=50.762..141374.777 rows=20288
>>>>>>> loops=1)"
>>>>>>> > >> "        Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> "        Heap Blocks: exact=19826"
>>>>>>> > >> "        Buffers: shared hit=92 read=19816"
>>>>>>>
>>>>>>> ^^
>>>>>>> Note that despite this table being smaller, Postgres had to read
>>>>>>> 19816
>>>>>>> blocks from disk, which is 2.5x more than the larger table.
>>>>>>>
>>>>>>> > >> "        ->  Bitmap Index Scan on data2013_01w_ixtaxiid
>>>>>>> > >> (cost=0.00..510.33 rows=22101 width=0) (actual
>>>>>>> time=26.053..26.053
>>>>>>> > >> rows=20288 loops=1)"
>>>>>>> > >> "              Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> "              Buffers: shared hit=4 read=78"
>>>>>>> > >> "Planning time: 0.144 ms"
>>>>>>> > >> "Execution time: 141421.154 ms"
>>>>>>> > >>
>>>>>>> > >> Results for the large table: it took 5 seconds to finish.  (The
>>>>>>> > >> 252077.10 figure is the planner's estimated cost, not a time; the
>>>>>>> > >> planning time was 0.127 ms.)
>>>>>>> > >> "Sort  (cost=251913.32..252077.10 rows=65512 width=55) (actual
>>>>>>> > >> time=5038.571..5039.765 rows=44204 loops=1)"
>>>>>>> > >> "  Sort Key: "time""
>>>>>>> > >> "  Sort Method: quicksort  Memory: 7753kB"
>>>>>>> > >> "  Buffers: shared hit=2 read=7543"
>>>>>>> > >> "  ->  Bitmap Heap Scan on data2011_01  (cost=1520.29..246672.53
>>>>>>> > >> rows=65512 width=55) (actual time=36.935..5017.463 rows=44204
>>>>>>> loops=1)"
>>>>>>> > >> "        Recheck Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> "        Heap Blocks: exact=7372"
>>>>>>> > >> "        Buffers: shared hit=2 read=7543"
>>>>>>> > >> "        ->  Bitmap Index Scan on data2011_01_ixtaxiid
>>>>>>> > >> (cost=0.00..1503.92 rows=65512 width=0) (actual
>>>>>>> time=35.792..35.792
>>>>>>> > >> rows=44204 loops=1)"
>>>>>>>
>>>>>>> ^^
>>>>>>> Note that the larger table took LONGER to do the index work than the
>>>>>>> smaller table, which probably means there's nothing wrong with your
>>>>>>> disks or anything ... that's the behavior I would expect.
>>>>>>>
>>>>>>> > >> "              Index Cond: ((taxiid)::text = 'SZB00S41'::text)"
>>>>>>> > >> "              Buffers: shared hit=2 read=171"
>>>>>>> > >> "Planning time: 0.127 ms"
>>>>>>> > >> "Execution time: 5042.134 ms"
>>>>>>>
>>>>>>> So, what I'm seeing, is that Postgres is able to figure out _which_
>>>>>>> rows
>>>>>>> to fetch faster on the small table than the large table, which is
>>>>>>> what
>>>>>>> you would expect, since a smaller index should be faster than a
>>>>>>> large one.
>>>>>>>
>>>>>>> However, when it goes to actually fetch the row data, it takes
>>>>>>> significantly longer on the small table, despite the fact that it's
>>>>>>> only fetching 1/3 as many rows. It is, however, doing 2.5x as many
>>>>>>> disk reads to get those rows: For the large table, it reads 61MB from
>>>>>>> disk, but it reads 160MB to get all the data for the smaller table.
>>>>>>>
>>>>>>> How the data was inserted into each table could lead to similar data
>>>>>>> being clustered on common pages on the large table, while it's spread
>>>>>>> across many more pages on the small table.
>>>>>>>
>>>>>>> That still doesn't explain it all, though. 2.5x the disk
>>>>>>> activity normally wouldn't equate to 28x the time required. Unless
>>>>>>> your disks are horrifically slow? Does this server have a lot of
>>>>>>> other activity against the disks? I.e. are other people running
>>>>>>> queries that you would have to contend with, or is the server a VM
>>>>>>> sharing storage with other VMs, or even a combined use server that
>>>>>>> has to share disk access with (for example) a web or mail server
>>>>>>> as well? Is the performance difference consistently ~28x?
>>>>>>>
>>>>>>> Other things: what is shared_buffers set to? The queries would seem
>>>>>>> to indicate that this server has less than 1M of those two tables
>>>>>>> cached in memory at the time you ran those queries, which seems to
>>>>>>> suggest that either you've got shared_buffers set very low, or that
>>>>>>> there are a lot of other tables that other queries are accessing at
>>>>>>> the time you're running these. Perhaps installing pg_buffercache to
>>>>>>> have a look at what's using your shared_buffers would be helpful.
>>>>>>>
>>>>>>> --
>>>>>>> Bill Moran
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Melvin Davidson*
>>> I reserve the right to fantasize.  Whether or not you
>>> wish to share my fantasy is entirely up to you.
>>>
>>
>>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>
