[PERFORM] query not using index

2013-12-19 Thread Johann Spies
I would appreciate some help optimising the following query:

WITH subject_journals AS (
    SELECT A.sq
    FROM isi.rissue A,
         isi.rsc_joern_link C
    WHERE C.sc_id IN ('d0963875-e438-4923-b3fa-f462e8975221',
                      '04e14284-09c8-421a-b1ad-c8238051601a',
                      '04e2189f-cd2a-44f0-b98d-52f6bb5dcd78',
                      'f5521c65-ec49-408a-9a42-8a69d47703cd',
                      '2e47ae2f-2c4d-433e-8bdf-9983eeeafc42',
                      '5d3639b1-04c2-4d94-a99a-5323277fd2b7')
      AND C.rj_id = A.uuid),
subject_articles AS (
    SELECT B.article_id AS art_id
    FROM isi.isi_l1_publication B,
         subject_journals A,
         isi.ritem C
    WHERE A.sq = B.journal_id
      AND B.publication_year <= '2012'
      AND B.publication_year >= '2000'
      AND C.ut = B.article_id
      AND C.dt IN ('@ Article', 'Review')),
country_articles AS (
    SELECT A.art_id
    FROM isi.art_country_link A
    WHERE A.countrycode = 'ZA')

SELECT art_id FROM subject_articles
INTERSECT
SELECT art_id FROM country_articles;

EXPLAIN ANALYZE shows that it is not using the indexes on either
isi.isi_l1_publication or isi.ritem (both tables with more than 43 million
records):

"HashSetOp Intersect  (cost=10778065.50..11227099.44 rows=200 width=48)
(actual time=263120.868..263279.467 rows=4000 loops=1)"
"  Output: "*SELECT* 1".art_id, (0)"
"  Buffers: shared hit=627401 read=4347235, temp read=234498 written=234492"
"  CTE subject_journals"
"->  Hash Join  (cost=12846.55..17503.27 rows=28818 width=8) (actual
time=99.762..142.439 rows=30291 loops=1)"
"  Output: a.sq"
"  Hash Cond: ((c.rj_id)::text = (a.uuid)::text)"
"  Buffers: shared hit=12232"
"  ->  Bitmap Heap Scan on isi.rsc_joern_link c
(cost=1020.92..5029.23 rows=28818 width=37) (actual time=4.238..15.806
rows=30291 loops=1)"
"Output: c.id, c.rj_id, c.sc_id"
"Recheck Cond: ((c.sc_id)::text = ANY
('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04c2-4
(...)"
"Buffers: shared hit=3516"
"->  Bitmap Index Scan on rsc_joern_link_sc_id_idx
(cost=0.00..1013.72 rows=28818 width=0) (actual time=3.722..3.722
rows=30291 loops=1)"
"  Index Cond: ((c.sc_id)::text = ANY
('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04
(...)"
"  Buffers: shared hit=237"
"  ->  Hash  (cost=10098.06..10098.06 rows=138206 width=45) (actual
time=95.495..95.495 rows=138206 loops=1)"
"Output: a.sq, a.uuid"
"Buckets: 16384  Batches: 1  Memory Usage: 10393kB"
"Buffers: shared hit=8716"
"->  Seq Scan on isi.rissue a  (cost=0.00..10098.06
rows=138206 width=45) (actual time=0.005..58.225 rows=138206 loops=1)"
"  Output: a.sq, a.uuid"
"  Buffers: shared hit=8716"
"  CTE subject_articles"
"->  Merge Join  (cost=9660996.21..9896868.27 rows=13571895 width=16)
(actual time=229449.020..259557.073 rows=2513896 loops=1)"
"  Output: b.article_id"
"  Merge Cond: ((a.sq)::text = (b.journal_id)::text)"
"  Buffers: shared hit=519891 read=4347235, temp read=234498
written=234492"
"  ->  Sort  (cost=2711.01..2783.05 rows=28818 width=32) (actual
time=224.901..230.615 rows=30288 loops=1)"
"Output: a.sq"
"Sort Key: a.sq"
"Sort Method: quicksort  Memory: 2188kB"
"Buffers: shared hit=12232"
"->  CTE Scan on subject_journals a  (cost=0.00..576.36
rows=28818 width=32) (actual time=99.764..152.459 rows=30291 loops=1)"
"  Output: a.sq"
"  Buffers: shared hit=12232"
"  ->  Materialize  (cost=9658285.21..9722584.29 rows=12859816
width=24) (actual time=229223.851..253191.308 rows=14664245 loops=1)"
"Output: b.article_id, b.journal_id"
"Buffers: shared hit=507659 read=4347235, temp read=234498
written=234492"
"->  Sort  (cost=9658285.21..9690434.75 rows=12859816
width=24) (actual time=229223.846..251142.167 rows=14072645 loops=1)"
"  Output: b.article_id, b.journal_id"
"  Sort Key: b.journal_id"
"  Sort Method: external merge  Disk: 467704kB"
"

Re: [PERFORM] query not using index

2013-12-23 Thread Johann Spies
On 19 December 2013 16:48, Tom Lane  wrote:

> Johann Spies  writes:
> > I would appreciate some help optimising the following query:
>
> It's a mistake to imagine that indexes are going to help much with
> a join of this size.  Hash or merge join is going to be a lot better
> than nestloop.  What you need to do is make sure those will perform
> as well as possible, and to that end, it'd likely help to raise
> work_mem.  I'm not sure if you can sanely put it high enough to
> make the query operate totally in memory --- it looks like you'd
> need work_mem of 500MB or more to prevent any of the sorts or
> hashes from spilling to disk, and keep in mind that this query
> is going to use several times work_mem because there are multiple
> sorts/hashes going on.  But if you can transiently dedicate a lot
> of RAM to this query, that should help some.  I'd suggest increasing
> work_mem via a SET command in the particular session running this
> query --- you don't want such a high value to be the global default.
>

Thanks Tom.  Raising work_mem from 384MB to 512MB made a significant
difference.

You said "hash or merge join is going to be a lot better than nestloop".
Is that purely in the hands of the query planner, or is there something I
can do to get the planner to choose those options apart from raising
work_mem?
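
(For reference, the session-scoped form Tom suggested would look something
like the following; the enable_nestloop toggle is only a way to test
alternative plans, not something to leave off in production:)

SET work_mem = '512MB';            -- affects only the current session

SET enable_nestloop = off;         -- diagnostic: discourage nested loops
EXPLAIN (ANALYZE, BUFFERS) ... ;   -- the query above
RESET enable_nestloop;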

Regards
Johann


-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


[PERFORM] The same query - much different runtimes

2014-04-07 Thread Johann Spies
While waiting for a query (started through a web interface) to finish, I
ran the same query using psql over an ssh connection and it completed in a
fraction of the time.

I have configured the server to log queries taking more than five seconds,
and in the log the query I was waiting for was logged as:

2014-04-07 12:01:38 SAST LOG:  duration: 466754.684 ms  plan:
    Query Text: SELECT isi_alt_names.code
      FROM rresearch, isi_alt_names
      WHERE UPPER(rresearch.ny) = 'GUANGZHOU'
        AND UPPER(rresearch.nu) = 'PEOPLES R CHINA'
        AND isi_alt_names.rsc_id = rresearch.id
        AND isi_alt_names.code IS NOT NULL
      ORDER BY rresearch.id, isi_alt_names.id LIMIT 2 OFFSET 0;
Limit  (cost=384216.93..384216.94 rows=2 width=15)
  ->  Sort  (cost=384216.93..384244.77 rows=11137 width=15)
Sort Key: rresearch.id, isi_alt_names.id
->  Nested Loop  (cost=138757.99..384105.56 rows=11137
width=15)
  ->  Bitmap Heap Scan on rresearch
(cost=138757.99..161224.50 rows=11337 width=4)
Recheck Cond: ((upper((ny)::text) =
'GUANGZHOU'::text) AND (upper((nu)::text) = 'PEOPLES R CHINA'\
::text))
->  BitmapAnd  (cost=138757.99..138757.99
rows=11337 width=0)
  ->  Bitmap Index Scan on
rresearch_ny_idx  (cost=0.00..4930.62 rows=215233 width=0)
Index Cond: (upper((ny)::text) =
'GUANGZHOU'::text)
  ->  Bitmap Index Scan on
rresearch_nu_idx  (cost=0.00..133821.46 rows=6229156 width=0)
Index Cond: (upper((nu)::text) =
'PEOPLES R CHINA'::text)
  ->  Index Scan using isi_alt_countrynames_rsc_id_idx
on isi_alt_names  (cost=0.00..19.65 rows=1 width=1\
5)
Index Cond: (rsc_id = rresearch.id)
Filter: (code IS NOT NULL)


While this was still running, I changed the query only to include the
schema name (the web-based query relied on search_path) and ran EXPLAIN
ANALYZE on it.  The output was:

"Limit  (cost=384288.35..384288.36 rows=2 width=15) (actual
time=2945.338..2945.340 rows=2 loops=1)"
"  Output: isi_alt_names.code, rresearch.id, isi_alt_names.id"
"  Buffers: shared hit=1408146"
"  ->  Sort  (cost=384288.35..384316.20 rows=11137 width=15) (actual
time=2945.338..2945.338 rows=2 loops=1)"
"Output: isi_alt_names.code, rresearch.id, isi_alt_names.id"
"Sort Key: rresearch.id, isi_alt_names.id"
"Sort Method: top-N heapsort  Memory: 25kB"
"Buffers: shared hit=1408146"
"->  Nested Loop  (cost=138757.99..384176.98 rows=11137 width=15)
(actual time=1530.875..2876.376 rows=241920 loops=1)"
"  Output: isi_alt_names.code, rresearch.id, isi_alt_names.id"
"  Buffers: shared hit=1408146"
"  ->  Bitmap Heap Scan on isi.rresearch
(cost=138757.99..161224.50 rows=11337 width=4) (actual
time=1530.848..1750.169 rows=241337 loops=1)"
"Output: rresearch.id, rresearch.cn, rresearch.nf,
rresearch.nc, rresearch.nd, rresearch.nn, rresearch.ny, rresearch.np,
rresearch.nu, rresearch.nz, rresearch.uuid, rresearch.tsv"
"Recheck Cond: ((upper((rresearch.ny)::text) =
'GUANGZHOU'::text) AND (upper((rresearch.nu)::text) = 'PEOPLES R
CHINA'::text))"
"Buffers: shared hit=195242"
"->  BitmapAnd  (cost=138757.99..138757.99 rows=11337
width=0) (actual time=1484.363..1484.363 rows=0 loops=1)"
"  Buffers: shared hit=31173"
"  ->  Bitmap Index Scan on rresearch_ny_idx
(cost=0.00..4930.62 rows=215233 width=0) (actual time=60.997..60.997
rows=241354 loops=1)"
"Index Cond: (upper((rresearch.ny)::text) =
'GUANGZHOU'::text)"
"Buffers: shared hit=1124"
"  ->  Bitmap Index Scan on rresearch_nu_idx
(cost=0.00..133821.46 rows=6229156 width=0) (actual time=1350.819..1350.819
rows=6434248 loops=1)"
"Index Cond: (upper((rresearch.nu)::text) =
'PEOPLES R CHINA'::text)"
"Buffers: shared hit=30049"
"  ->  Index Scan using isi_alt_countrynames_rsc_id_idx on
isi.isi_alt_names  (cost=0.00..19.65 rows=1 width=15) (actual
time=0.003..0.004 rows=1 loops=241337)"
"Output: isi_alt_names.rsc_id, isi_alt_names.code,
isi_alt_names.id, isi_alt_names.institution"
"Index Cond: (isi_alt_names.rsc_id = rresearch.id)"
"Filter: (isi_alt_names.code IS NOT NULL)"
"Buffers: shared hit=1212904"
"Total runtime: 2945.400 ms"

I then ran the query itself and the result was produced in about the same
time (2945 ms).

What can cause such a huge discrepancy?  I have checked and there was no
other process blocking the query.
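
(A query along these lines - the 'waiting' column still exists in
pg_stat_activity on this 9.x release - is one way to verify that nothing
is blocked or waiting:)

SELECT pid, state, waiting, now() - query_start AS runtime,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE waiting OR state <> 'idle'
ORDER BY query_start;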

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)

[PERFORM] Specifications for a new server

2014-05-06 Thread Johann Spies
I am busy reading Gregory Smith's PostgreSQL 9.0 High Performance, and
when the book was written he seemed to me a bit sceptical about SSDs.  I
suspect the reliability of SSDs has improved significantly since then.

Our present server (128 GB RAM, 2.5 TB disk space and 12 CPU cores - RAID
10) will become a development server, and we are going to buy a new
server.

At the moment the 'base' directory uses 1.5 TB of disk space and there is
still more data to come.

The database contains bibliometric data that receives weekly updates, but
other than that not much changes, apart from cleaning of data by a few
people.

Some of the queries can take many hours to finish.

On our present system there are sometimes more than 300 GB of temporary
files, which I suspect will not be the case on the new system with its
much larger RAM.

Analysis of the SAR logs showed too much iowait on the CPUs of the old
system, which has lower-spec CPUs than the ones being considered for the
new system.

We are possibly looking at the following hardware:

CPU: 2 x Ivy Bridge 8C E5-2667V2 3.3G 25M 8GT/s QPI - 16 cores
RAM: 24 x 32GB DDR3-1866 2Rx4 LP ECC REG RoHS - 768 GB

with enough disk space - about 4.8 TB on RAID 10.

My question is about the possible advantage and usage of SSDs in the new
server.  At the moment I am considering using 2 x 200GB SSDs for a
separate partition for temporary files and 2 x 100GB for the operating
system.

So my questions:

1. Will the SSDs in this case be worth the cost?
2. What would be the best way to utilize them in the system?

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] Specifications for a new server

2014-05-08 Thread Johann Spies
On 6 May 2014 13:07, Michael Stone  wrote:

> On Tue, May 06, 2014 at 11:13:42AM +0200, Johann Spies wrote:
>
>> Analysis or the SAR-logs showed that there were too much iowait in the
>> CPU's on
>> the old system which has a lower spec CPU than the ones considered for
>> the new
>> system.
>>
>
> iowait means the cpu is doing nothing but waiting for data from the disk.
> buying faster cpus means that they will be able to spend more time waiting
> for data from the disk. you'd probably get much better bang for the buck
> upgrading the storage subsystem than throwing more money at cpus.
>
>
In that case I apologise for making the wrong assumption.  People more
experienced than I am analysed the logs and told me that, to their
surprise, the CPUs were under pressure.  I just assumed that the iowait
was the problem, having looked at the logs myself.


> If you're talking about SSDs for the OS, that's a complete waste; there is
> essentially no I/O relating to the OS once you've booted.
>
>
I also thought this might be overkill, but I was not sure.


>
>  So my questions:
>>
>> 1. Will the SSD's in this case be worth the cost?
>> 2.  What will the best way to utilize them in the system?
>>
>
> The best way to utilize them would probably be to spend less on the CPU
> and RAM and more on the storage, and use SSD either for all of the storage
> or for specific items that have a high level of I/O (such as the indexes).
> Can't be more specific than that without a lot more information about the
> database, how it is utilized, and what's actually slow.
>
>

I understand your remark about the CPU in the light of my wrong assumption
earlier, but I do not understand your remark about the RAM.  The fact that
temporary files of up to 250 GB are created at times during complex
queries is, to me, an indication of too little RAM.

Question: How do I dedicate a partition to indexes?  Where do I configure
PostgreSQL to write them to a particular area?

Regards
Johann


Re: [PERFORM] Specifications for a new server

2014-05-08 Thread Johann Spies
On 8 May 2014 10:11, Johann Spies  wrote:

>
> Question: How do I dedicate a partition to indexes?  Where do I configure
> PostgreSQL to write them to a particular area?
>
>
>
I just discovered TABLESPACE, which answered my question.
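
For the archives, the pattern is straightforward (the tablespace name,
mount point and index names below are made up):

CREATE TABLESPACE ssd_space LOCATION '/mnt/ssd/pg_tblspc';
CREATE INDEX my_idx ON my_table (my_col) TABLESPACE ssd_space;
-- or move an existing index:
ALTER INDEX my_idx SET TABLESPACE ssd_space;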

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] pgtune + configurations with 9.3

2014-11-23 Thread Johann Spies
I have done some tests using pgbench-tools with different configurations
on our new server with 768 GB RAM, and it seems that for our purpose 32 GB
of shared_buffers gives the best results.
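
(For comparison: min(ram/4, 8GB) on this machine would be min(768/4 =
192 GB, 8 GB) = 8 GB, so 32 GB is already four times the usual rule of
thumb mentioned below.)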

Regards
Johann

On 17 November 2014 at 07:17, Stuart Bishop  wrote:

> On 15 November 2014 06:00, Mark Kirkwood 
> wrote:
>
> > It is probably time to revisit this 8GB limit with some benchmarking. We
> > don't really have a hard and fast rule that is known to be correct, and
> that
> > makes Alexey's job really difficult. Informally folk (including myself at
> > times) have suggested:
> >
> > min(ram/4, 8GB)
> >
> > as the 'rule of thumb' for setting shared_buffers. However I was recently
>
> It would be nice to have more benchmarking and improve the rule of
> thumb. I do, however, believe this is orthogonal to fixing pgtune
> which I think should be using the current rule of thumb (which is
> overwhelmingly min(ram/4, 8GB) as you suggest).
>
>
>
> > benchmarking a machine with a lot of ram (1TB) and entirely SSD storage
> [1],
> > and that seemed quite happy with 50GB of shared buffers (better
> performance
> > than with 8GB). Now shared_buffers was not the variable we were
> > concentrating on so I didn't get too carried away and try much bigger
> than
> > about 100GB - but this seems like a good thing to come out with some
> numbers
> > for i.e pgbench read write and read only tps vs shared_buffers 1 -> 100
> GB
> > in size.
>
> I've always thought the shared_buffers setting would need to factor in
> things like CPU speed and memory access, since the rationale for the
> 8GB cap has always been the cost to scan the data structures. And the
> kernel would factor in too, since the PG specific algorithms are in
> competition with the generic OS algorithms. And size of the hot set,
> since this gets pinned in shared_buffers. Urgh, so many variables.
>
> --
> Stuart Bishop 
> http://www.stuartbishop.net/
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>



-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] pgtune + configurations with 9.3

2014-11-26 Thread Johann Spies
Hello Graeme,

It's probably helpful if everyone sharing this information can post their
> measurement process / settings and the results as completely as possible,
> for comparison and reference.
>

Apologies.  The only parameter I varied in postgresql.conf between the
tests was shared_buffers.  The full set of non-default settings is:

shared_buffers = 32GB                       # min 128k
shared_preload_libraries = 'auto_explain'   # (change requires restart)
vacuum_cost_delay = 5                       # 0-100 milliseconds
wal_sync_method = open_sync                 # the default is the first option
wal_buffers = -1                            # min 32kB, -1 sets based on shared_buffers
checkpoint_completion_target = 0.9          # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 30s                    # 0 disables
default_statistics_target = 100             # range 1-10000
log_line_prefix = '%t '                     # special values:
log_statement = 'all'                       # none, ddl, mod, all
log_timezone = 'localtime'
autovacuum_vacuum_scale_factor = 0.1        # fraction of table size before vacuum
autovacuum_vacuum_cost_delay = 5ms          # default vacuum cost delay for
datestyle = 'iso, dmy'
timezone = 'localtime'
lc_messages = 'en_ZA.UTF-8'                 # locale for system error message
lc_monetary = 'en_ZA.UTF-8'                 # locale for monetary formatting
lc_numeric = 'en_ZA.UTF-8'                  # locale for number formatting
lc_time = 'en_ZA.UTF-8'                     # locale for time formatting
default_text_search_config = 'pg_catalog.english'
auto_explain.log_min_duration = '6s'        # Gregory Smith page 180
effective_cache_size = 512GB                # pgtune wizard 2014-09-25
work_mem = 4608MB                           # pgtune wizard 2014-09-25
checkpoint_segments = 16                    # pgtune wizard 2014-09-25
max_connections = 80                        # pgtune wizard 2014-09-25

And pgbench-tools - the default configuration:

BASEDIR=`pwd`
PGBENCHBIN=`which pgbench`
TESTDIR="tests"
SKIPINIT=0
TABBED=0
OSDATA=1
TESTHOST=localhost
TESTUSER=`whoami`
TESTPORT=5432
TESTDB=pgbench
RESULTHOST="$TESTHOST"
RESULTUSER="$TESTUSER"
RESULTPORT="$TESTPORT"
RESULTDB=results
MAX_WORKERS=""
SCRIPT="select.sql"
SCALES="1 10 100 1000"
SETCLIENTS="1 2 4 8 16 32"
SETTIMES=3
RUNTIME=60
TOTTRANS=""
SETRATES=""


The server:

# See Gregory Smith, PostgreSQL 9.0 High Performance, pages 81-82, for the
# next lines
vm.swappiness=0
vm.overcommit_memory=2
vm.dirty_ratio = 2
vm.dirty_background_ratio=1
# Maximum shared segment size in bytes
kernel.shmmax = 406622322688
# Maximum number of shared memory segments in pages
kernel.shmall = 99273028

$ free
 total   used   free sharedbuffers cached
Mem: 794184164  792406416148  0 123676  788079892
-/+ buffers/cache:4202848  789981316
Swap:  7906300  07906300

I have attached the resulting graphs.

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)



Re: [PERFORM] pgtune + configurations with 9.3

2014-11-26 Thread Johann Spies
Another apology:

My PostgreSQL version is 9.3, and here are more up-to-date PNGs.

On 26 November 2014 at 15:34, Johann Spies  wrote:

> Hello Graeme,
>
> It's probably helpful if everyone sharing this information can post their
>> measurement process / settings and the results as completely as possible,
>> for comparison and reference.
>>
>
> Apologies.  I have only changed one parameter in postgresql.conf for the
> tests and that was shared_buffers:
>
>  shared_buffers = 32GB   # min 128k
> shared_preload_libraries = 'auto_explain'   # (change requires
> restart)
> vacuum_cost_delay = 5   # 0-100 milliseconds
> wal_sync_method = open_sync # the default is the first option
> wal_buffers = -1# min 32kB, -1 sets based
> on shared_buffers
> checkpoint_completion_target = 0.9  # checkpoint target
> duration, 0.0 - 1.0
> checkpoint_warning = 30s# 0 disables
> default_statistics_target = 100 # range 1-1
> log_line_prefix = '%t ' # special values:
> log_statement = 'all'   # none, ddl, mod, all
> log_timezone = 'localtime'
> autovacuum_vacuum_scale_factor = 0.1# fraction of table size
> before vacuum
> autovacuum_vacuum_cost_delay = 5ms  # default vacuum cost
> delay for
> datestyle = 'iso, dmy'
> timezone = 'localtime'
> lc_messages = 'en_ZA.UTF-8' # locale for
> system error message
> lc_monetary = 'en_ZA.UTF-8' # locale for
> monetary formatting
> lc_numeric = 'en_ZA.UTF-8'  # locale for
> number formatting
> lc_time = 'en_ZA.UTF-8' # locale for time
> formatting
> default_text_search_config = 'pg_catalog.english'
> auto_explain.log_min_duration = '6s' # Gregory Smith page 180
> effective_cache_size = 512GB # pgtune wizard 2014-09-25
> work_mem = 4608MB # pgtune wizard 2014-09-25
> checkpoint_segments = 16 # pgtune wizard 2014-09-25
> max_connections = 80 # pgtune wizard 2014-09-25
>
> And pgbench-tools - the default configuration:
>
> BASEDIR=`pwd`
> PGBENCHBIN=`which pgbench`
> TESTDIR="tests"
> SKIPINIT=0
> TABBED=0
> OSDATA=1
> TESTHOST=localhost
> TESTUSER=`whoami`
> TESTPORT=5432
> TESTDB=pgbench
> RESULTHOST="$TESTHOST"
> RESULTUSER="$TESTUSER"
> RESULTPORT="$TESTPORT"
> RESULTDB=results
> MAX_WORKERS=""
> SCRIPT="select.sql"
> SCALES="1 10 100 1000"
> SETCLIENTS="1 2 4 8 16 32"
> SETTIMES=3
> RUNTIME=60
> TOTTRANS=""
> SETRATES=""
>
>
> The server:
>
> # See Gregory Smith: High Performans Postgresql 9.0 pages 81,82 for the
> next lines
> vm.swappiness=0
> vm.overcommit_memory=2
> vm.dirty_ratio = 2
> vm.dirty_background_ratio=1
> # Maximum shared segment size in bytes
> kernel.shmmax = 406622322688
> # Maximum number of shared memory segments in pages
> kernel.shmall = 99273028
>
> $ free
>  total   used   free sharedbuffers cached
> Mem: 794184164  792406416148  0 123676  788079892
> -/+ buffers/cache:4202848  789981316
> Swap:  7906300  07906300
>
> I have attached the resulting graphs.
>
> Regards
> Johann
>
> --
> Because experiencing your loyal love is better than life itself,
> my lips will praise you.  (Psalm 63:3)
>



-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)



[PERFORM] Slow query - lots of temporary files.

2015-06-10 Thread Johann Spies
I stopped this query after about 16 hours.  At the same time I ran
EXPLAIN ANALYZE on the same query to find out why it took so long.  These
two processes generated 173 GB of temporary files in
/var/lib/postgresql/9.4/main/base/pgsql_tmp.

COPY (
  SELECT A.ut,
         B.go AS funding_org,
         B.gn AS grant_no,
         C.gt AS thanks,
         D.au
  FROM isi.funding_text C,
       isi.rauthor D,
       isi.africa_uts A
       LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
  WHERE (C.ut IS NOT NULL OR B.ut IS NOT NULL)
    AND D.rart_id = C.ut
    AND C.ut = B.ut
  GROUP BY A.ut, GO, gn, gt, au
  ORDER BY funding_org
) TO '/tmp/africafunding2.csv' WITH csv QUOTE '"' DELIMITER ',';


A modified version of this query finished in 1 min 27 sec:

COPY (
  SELECT 'UT'||A.ut,
         B.go AS funding_org,
         B.gn AS grant_no,
         C.gt AS thanks
  FROM isi.africa_uts A
       LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
       LEFT JOIN isi.funding_text C ON (A.ut = C.ut)
  WHERE (C.ut IS NOT NULL OR B.ut IS NOT NULL)
  GROUP BY A.ut, GO, gn, gt
) TO '/tmp/africafunding.csv' WITH csv QUOTE '"' DELIMITER ',';

As I said, the EXPLAIN ANALYZE of the problematic query contributed to the
173 GB of temporary files and had not finished after about 16 hours.

A plain EXPLAIN of the SELECT part produces this:

"Sort  (cost=4781458203.46..4798118612.44 rows=6664163593 width=390)"
"  Output: a.ut, b.go, b.gn, c.gt, (array_to_string(array_agg(d.au),
';'::text)), b.go, b.gn, d.au"
"  Sort Key: b.go"
"  ->  GroupAggregate  (cost=2293037801.73..2509623118.51
rows=6664163593 width=390)"
"Output: a.ut, b.go, b.gn, c.gt,
array_to_string(array_agg(d.au), ';'::text), b.go, b.gn, d.au"
"Group Key: a.ut, b.go, b.gn, c.gt, d.au"
"->  Sort  (cost=2293037801.73..2309698210.72 rows=6664163593
width=390)"
"  Output: a.ut, c.gt, b.go, b.gn, d.au"
"  Sort Key: a.ut, b.go, b.gn, c.gt, d.au"
"  ->  Merge Join  (cost=4384310.92..21202716.78
rows=6664163593 width=390)"
"Output: a.ut, c.gt, b.go, b.gn, d.au"
"Merge Cond: ((c.ut)::text = (d.rart_id)::text)"
"->  Merge Join  (cost=635890.84..1675389.41
rows=6069238 width=412)"
"  Output: c.gt, c.ut, a.ut, b.go, b.gn, b.ut"
"  Merge Cond: ((c.ut)::text = (b.ut)::text)"
"  Join Filter: ((c.ut IS NOT NULL) OR (b.ut
IS NOT NULL))"
"  ->  Merge Join  (cost=635476.30..675071.77
rows=1150354 width=348)"
"Output: c.gt, c.ut, a.ut"
"Merge Cond: ((a.ut)::text = (c.ut)::text)"
"->  Index Only Scan using
africa_ut_idx on isi.africa_uts a  (cost=0.42..19130.19 rows=628918
width=16)"
"  Output: a.ut"
"->  Sort  (cost=632211.00..640735.23
rows=3409691 width=332)"
"  Output: c.gt, c.ut"
"  Sort Key: c.ut"
"  ->  Seq Scan on
isi.funding_text c  (cost=0.00..262238.91 rows=3409691 width=332)"
"Output: c.gt, c.ut"
"  ->  Index Scan using funding_org_ut_idx on
isi.funding_org b  (cost=0.56..912582.50 rows=9835492 width=64)"
"Output: b.id, b.ut, b.go, b.gn"
"->  Materialize  (cost=0.57..17914892.46
rows=159086560 width=26)"
"  Output: d.id, d.rart_id, d.au, d.ro, d.ln,
d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv"
"  ->  Index Scan using rauthor_rart_id_idx on
isi.rauthor d  (cost=0.57..17517176.06 rows=159086560 width=26)"
"Output: d.id, d.rart_id, d.au, d.ro,
d.ln, d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv"

Any idea why adding the rauthor table to the query is so problematic?

My system:

768 GB RAM
shared_buffers: 32GB
work_mem: 4608MB

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] Slow query - lots of temporary files.

2015-06-10 Thread Johann Spies
On 10 June 2015 at 15:02, Claudio Freire  wrote:

>
> The joins are different on both versions, and the most likely culprit
> is the join against D. It's probably wrong, and the first query is
> building a cartesian product.
>
> Without more information about the schema it's difficult to be sure though.
>

Thanks for your reply.  I will experiment further with different joins;
one sketch along the lines you suggest is below.
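
(Untested, and only my assumption of what "fixing the join against D"
might look like: all joins made explicit, with rauthor anchored on
africa_uts.ut rather than on funding_text.ut.)

SELECT a.ut,
       b.go AS funding_org,
       b.gn AS grant_no,
       c.gt AS thanks,
       array_to_string(array_agg(d.au), ';') AS authors
FROM isi.africa_uts a
LEFT JOIN isi.funding_org  b ON b.ut = a.ut
LEFT JOIN isi.funding_text c ON c.ut = a.ut
LEFT JOIN isi.rauthor      d ON d.rart_id = a.ut
WHERE c.ut IS NOT NULL OR b.ut IS NOT NULL
GROUP BY a.ut, b.go, b.gn, c.gt
ORDER BY b.go;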

Here is the schema of the involved tables:

nkb=# \d isi.funding_text
Table "isi.funding_text"
 Column | Type  |
Modifiers
+---+---
 id | integer   | not null default
nextval('isi.funding_text_id_seq'::regclass)
 ut | character varying(15) |
 gt | citext|
Indexes:
"funding_text_pkey" PRIMARY KEY, btree (id)
"funding_text_ut_idx" btree (ut)
Foreign-key constraints:
"funding_text_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)

nkb=# \d isi.funding_org
Table "isi.funding_org"
 Column | Type  |
Modifiers
+---+--
 id | integer   | not null default
nextval('isi.funding_org_id_seq'::regclass)
 ut | character varying(15) |
 go | citext|
 gn | character varying |
Indexes:
"funding_org_pkey" PRIMARY KEY, btree (id)
"funding_org_ut_idx" btree (ut)
Foreign-key constraints:
"funding_org_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)


Table "isi.africa_uts"
 Column | Type  |
Modifiers
+---+-
 ut | character varying(15) |
 id | integer   | not null default
nextval('isi.africa_uts_id_seq'::regclass)
Indexes:
"africa_uts_pkey" PRIMARY KEY, btree (id)
"africa_ut_idx" btree (ut)
Foreign-key constraints:
"africa_uts_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)


 Table "isi.rauthor"
 Column  |  Type  |
Modifiers
-++--
 id  | integer| not null default
nextval('isi.rauthor_id_seq'::regclass)
 rart_id | character varying(15)  |
 au  | character varying(75)  |
 ro  | character varying(30)  |
 ln  | character varying(200) |
 af  | character varying(200) |
 ras | character varying(4)   |
 ad  | integer|
 aa  | text   |
 em  | character varying(250) |
 ag  | character varying(75)  |
 tsv | tsvector   |
Indexes:
"rauthor_pkey" PRIMARY KEY, btree (id) CLUSTER
"rauthor_ad_idx" btree (ad)
"rauthor_au_idx" btree (au)
"rauthor_lower_idx" btree (lower(au::text))
"rauthor_lower_lower1_idx" btree (lower(ln::text), lower(af::text))
"rauthor_rart_id_idx" btree (rart_id)
"rauthor_tsv_idx" gin (tsv)
Referenced by:
TABLE "level1.person" CONSTRAINT "person_auth_id_fkey" FOREIGN KEY
(auth_id) REFERENCES isi.rauthor(id) ON DELETE CASCADE
Triggers:
tsvectorupdate_for_rauthor BEFORE INSERT OR UPDATE ON isi.rauthor FOR
EACH ROW EXECUTE PROCEDURE isi.update_rauthor_tsv()

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] Slow query - lots of temporary files.

2015-06-12 Thread Johann Spies
On 10 June 2015 at 16:50, Tomas Vondra  wrote:

>
>
> The problematic piece of the explain plan is this:
>
>  ->  Merge Join  (cost=4384310.92..21202716.78 rows=6664163593
>   width=390)"
>Output: a.ut, c.gt, b.go, b.gn, d.au"
>Merge Cond: ((c.ut)::text = (d.rart_id)::text)"
>
> That is, the planner expects ~6.7 billion rows, each ~390B wide. That's
> ~2.5TB of data that needs to be stored to disk (so that the sort can
> process it).
>
> The way the schema is designed might be one of the issues - ISTM the 'ut'
> column is somehow universal, mixing values referencing different columns in
> multiple tables. Not only that's utterly misleading for the planner (and
> may easily cause issues with huge intermediate results), but it also makes
> formulating the queries very difficult. And of course, the casting between
> text and int is not very good either.
>
> Fix the schema to follow relational best practices - separate the values
> into multiple columns, and most of this will go away.
>

Thanks for your reply, Tomas.

I do not understand what the problem with the 'ut' column is.  It is a
unique identifier in the first table (africa_uts) and is used in the other
tables to establish joins, and it has the same type definition in all the
tables.  Is the problem the similar name?  The 'ut' columns in the
different tables all refer to the same data.  I do not do any casting of
integers into text in this case, and I don't know why the planner is
doing it.  The field 'rart_id' in isi.rauthor is just another name for
'ut' in the other tables and has the same datatype.

I do not understand your remark "separate the values into multiple
columns".  I cannot see which values could be separated into different
columns in the schema.  Do you mean in the query?  Why?


Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


[PERFORM] Long running query: How to monitor the progress

2015-08-25 Thread Johann Spies
Working with 9.4.

We are in the process of unpacking complicated XML data into tables.  The
XML data is already in a table with two fields (id, xml) - 47+ million
records.

Some of our queries to extract the data and insert it into other tables
run for days, and in one case we have created a table with 758 million
unique records.

Now my question: is there a way to monitor the progress of a long-running
query like this?
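
(As far as I know 9.4 has no per-query progress counter; the closest I get
is watching the backend and its temporary files, for example:)

-- from the shell, watch the temporary files grow:
--   watch -n 60 'du -sh /var/lib/postgresql/9.4/main/base/pgsql_tmp'

-- from psql, confirm the backend is still active and for how long:
SELECT pid, state, now() - query_start AS running_for,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start;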

I have recently read that it is probably better for processes like this to
copy the result of the query to a CSV file and then import it again with
COPY as an insert.  Next time I will try that.
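
(A minimal sketch of that approach - the file path is a placeholder and
the SELECT stands for the big extraction query below:)

COPY (SELECT ... /* the extraction query */)
  TO '/tmp/extract.csv' WITH (FORMAT csv);

COPY table_a_link (uid, gn_id) FROM '/tmp/extract.csv' WITH (FORMAT csv);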

The following query has been running for 6 days now and is still running
(I have anonymized it a little bit) on a server with 768 GB RAM.  It has
created 44 temporary files so far:

INSERT INTO table_a_link (uid, gn_id)
WITH p AS (
    SELECT ARRAY[ARRAY['t','some_xpath']] AS some_xpath),
q AS (
    SELECT (xpath('//t:UID/text()', xml, some_xpath))[1] AS uid,
           unnest(xpath('//t:grant', xml, some_xpath)) AS gr
    FROM source.xml_data a, p
    WHERE xpath_exists('//t:grant/t:grant_agency', xml, some_xpath)),
r AS (
    SELECT CASE WHEN xpath_exists('//t:grant_ids', gr, some_xpath)
                THEN unnest(xpath('//t:grant_ids', gr, some_xpath))
                ELSE NULL
           END AS gr_ids
    FROM q, p),
y AS (
    SELECT A.uuid AS fo_id,
           /* unnest(xpath('//t:grant_agency/text()', gr, ns))::citext AS agency, */
           CASE WHEN xpath_exists('//t:grant_id', gr_ids, some_xpath)
                THEN unnest(xpath('//t:grant_id/text()', gr_ids, some_xpath))::citext
                ELSE NULL
           END AS grant_no,
           uid::varchar(19)
    FROM wos.funding_org A, p, q
    LEFT JOIN r ON (xpath('//t:grant/t:grant_ids/t:grant_id/text()', gr,
                          ARRAY[ARRAY['t','some_xpath']])::citext =
                    xpath('//t:grant_id/text()', gr_ids,
                          ARRAY[ARRAY['t','some_xpath']])::citext)
    WHERE A.funding_org =
          (xpath('//t:grant_agency/text()', gr, some_xpath))[1]::citext)

SELECT DISTINCT y.uid, B.uuid AS gn_id
FROM y, table_b B
WHERE y.fo_id = B.fo_id
  AND y.grant_no IS NOT DISTINCT FROM b.grant_no;


Regards.
Johann


[PERFORM] Delete, foreign key, index usage

2017-04-04 Thread Johann Spies
While updating our database, which includes a lot of deletions in which
many foreign key references are involved, we found that in the case of two
tables the indexes are ignored, and that slows the process down a lot.

Here are stats about those two tables:

relname      seq_scan  seq_tup_read  idx_scan  idx_tup_fetch  n_tup_ins  n_tup_upd  n_tup_del  n_tup_hot_upd  n_live_tup  n_dead_tup  n_mod_since_analyze
belongs_to        227   52539487559         0              0        771          0       1459              0         125        1459                 2230
publication       229   11502854612         0              0        254          0        229              0          60         229                  483
Publication (which has a foreign key (ut) and more than 50 million
records) references the top of the chain of references.  This field (ut)
is also the primary key of publication.

In the case of belongs_to (about 231,393,000 records), which references
the same table (article), ut has an index.

All other tables in this dependency chain report 100% or near-100% usage
of the indexes, e.g.

citation_2010_2014  0  0  226  1882  2510  0  1910  0  816  1910  4420

The indexes are on an SSD and we have set random_page_cost to 1 for
those queries.
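
(Applied per session, roughly like this:)

BEGIN;
SET LOCAL random_page_cost = 1.0;   -- applies only inside this transaction
-- ... run the delete here ...
COMMIT;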

The definition of belongs_to:

CREATE TABLE wos_2017_1.belongs_to
(
  suborg_id uuid,
  organisation_id uuid,
  address_id uuid,
  ut citext,
  uuid uuid NOT NULL,
  id integer NOT NULL DEFAULT
nextval('wos_2017_1.belongs2_id_seq'::regclass),
  pref_name_id uuid,
  addr_no smallint,
  reprint_addr_no smallint,
  CONSTRAINT belongs2_pkey PRIMARY KEY (uuid),
  CONSTRAINT belongs_to_address_id_fkey FOREIGN KEY (address_id)
  REFERENCES wos_2017_1.address (uuid) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_pref_name_id_fkey FOREIGN KEY (pref_name_id)
  REFERENCES wos_2017_1.org_pref_name (uuid) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_suborg_id_fkey FOREIGN KEY (suborg_id)
  REFERENCES wos_2017_1.suborg (uuid) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_ut_fkey FOREIGN KEY (ut)
  REFERENCES wos_2017_1.article (ut) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT belongs2_id_key UNIQUE (id),
  CONSTRAINT
belongs2_ut_suborg_id_organisation_id_address_id_addr_no_re_key UNIQUE (ut,
suborg_id, organisation_id, address_id, addr_no, reprint_addr_no,
pref_name_id)
)
WITH (
  OIDS=FALSE
);
with indexes on address_id, organisation_id, pref_name_id, ut

I have also tried to set enable_seqscan to false for these queries, but
still no usage of the indexes.

Why would that be?

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [PERFORM] Delete, foreign key, index usage

2017-04-05 Thread Johann Spies
On 4 April 2017 at 14:07, Johann Spies  wrote:

> Why would that be?

To answer my own question: after experimenting a lot we found that
9.6 uses a parallel seq scan that is actually a lot faster than using
the index on these large tables.

This was, to us, a surprise!

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)




Re: [PERFORM] Delete, foreign key, index usage

2017-04-23 Thread Johann Spies
> On Wed, Apr 5, 2017 at 6:40 AM, Johann Spies  wrote:
>>
>> On 4 April 2017 at 14:07, Johann Spies  wrote:
>>
>> > Why would that be?
>>
>> To answer my own question.  After experimenting a lot we found that
>> 9.6 uses a parallel seqscan that is actually a lot faster than using
>> the index on these large tables.

Further experimenting resulted in a solution which we do not understand:

The table 'publication' had the field 'ut' as primary key, and the ut
index was not used.

So we built an additional btree index on publication(ut) - which was
ignored as well.
Then we built a gin index on publication(ut), and now it is being used.

The same happened on the other table (belongs_to), where the btree
index was ignored by the planner but the gin index was used.

As a result our deletes run between 25 and 60 times faster than earlier,
with a maximum of about 20 records per hour in comparison with a
maximum of 4500 earlier.

In the case of both tables, ut has a foreign key reference to the
article table.

Why would the planner prefer to use the gin index rather than the btree
index in this case?

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)




Re: [PERFORM] Delete, foreign key, index usage

2017-04-24 Thread Johann Spies
On 24 April 2017 at 15:17, Tomas Vondra  wrote:
> On 04/24/2017 08:48 AM, Johann Spies wrote:
>>
>>
>> Why would the planner prefer the use the gin index and not the btree
>> index in this case?
>>
>
> You'll need to show what queries are you running - that's a quite important
> piece of information, and I don't see it anywhere in this thread. Seeing
> explain plans would also be helpful.

It is a simple "delete from wos_2017_1.article;" which causes a domino
effect of deletes due to foreign keys. In the case of one table with more
than 50 million records where the primary key was also the foreign
key, the process only started to use the index when we built a gin
index.  In the case of the "belongs_to" table (shown in my first
email) we first built a btree index on the foreign key - and it was
ignored.  Only after the gin index was created did it use the index.

Regards.
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)




Re: [PERFORM] Delete, foreign key, index usage

2017-05-25 Thread Johann Spies
On 4 April 2017 at 14:07, Johann Spies  wrote:

> While updating our database which includes a lot of deletions where a lot
> of foreign key references are involved we found that in the case of two
> tables the indexes are ignored and it slow down the process a lot.
> ...
>


>
> Why would that be?
>

After a long time we found the problem: the primary/foreign key fields had
different types, varchar and citext.  In the case of the two tables where
the indexes were ignored, the indexes were built on the 'citext' type
while the queries assumed varchar, as was the case in the other tables
using the same field.

Lesson learnt:  Check your types in every field in every table - and we
have many tables.
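
(One quick way to audit this, using the standard information_schema - here
for the key columns involved:)

SELECT table_schema, table_name, column_name, data_type, udt_name
FROM information_schema.columns
WHERE column_name IN ('ut', 'rart_id')
ORDER BY table_schema, table_name;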

Regards
Johann


-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)