Re: [GENERAL] Processor usage/tuning question

2014-10-13 Thread Israel Brewster

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---



On Oct 10, 2014, at 1:04 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 10/8/14, 3:17 PM, Israel Brewster wrote:
 Except that the last data point received is still valid as the aircraft's 
 current location, even if it came in several hours ago - which we may well 
 care about. That is, some users may need to see where a given aircraft (or 
 the entire fleet) is, even if an aircraft hasn't updated in a while. That 
 said, I did discuss this with my higher-ups, and got the ok to take it down 
 to four hours.
 
 Note that in your explain output nothing is filtering by time at all; are you 
 sure you posted the right explain?
 
 I don't think PostgreSQL is going to be able to reason very effectively 
 about a ROW_NUMBER() in an inner table and then a row<=5 in the outer one 
 being equivalent to a LIMIT query for which it could walk an index and then 
 stop once it finds 5 of them.
 
 Does this need to be issued as a single query?  Why not issue 55 different 
 queries?  It seems like the client is likely going to need to pick the 
 returned list back out by tail number anyway, so both the client and the 
 server might be happier with separate queries.
 
 Good point. Doing that brought the entire execution time down to around 
 60ms. Just ever so slightly better than the ~1200ms I was getting before. 
 :-) I just have an unreasonable aversion to making multiple queries. I guess 
 this is a prime example of why I need to overcome that :-)
 
 Do you actually need the last 5 points? If you could get away with just the 
 most recent point, SELECT DISTINCT ON might do a better job of this in a 
 single query.

At the moment, unfortunately yes - I have to do some calculations based on the 
past few data points. At some point I should be able to re-work the system such 
that said calculations are done when the points are saved, rather than when 
they are retrieved, which would be beneficial for a number of reasons. However, 
until I can get that done I need multiple points here.

 
 As for the concern about issuing multiple queries, if you code this into a 
 database function it should still be quite fast because there won't be any 
 round-trip between your application and the database.

I've had phenomenally bad luck with coding queries into database functions. I 
had a number of functions written at one point that allowed me to do things 
like select table.function, other_column FROM table - until I noticed that 
said queries ran significantly slower than just doing the query I had encoded 
in the function as a sub-query instead. I was doing these same sub-queries in a 
bunch of different places, so I figured it would clarify things if I could just 
code them into a DB function that I called just like a column. It's been a 
while since I looked at those, however, so I can't say why they were slow. This 
usage may not suffer from the same problem.
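
For reference, the pattern looked something like this (a rough sketch from
memory - made-up names, not the original function):

CREATE FUNCTION last_pointtime(d data) RETURNS timestamp AS $$
  SELECT max(pointtime) FROM data WHERE tail = d.tail;
$$ LANGUAGE sql STABLE;

-- Called attribute-style, as if it were a column; the embedded query
-- runs once per output row, which may be why those queries were slow:
SELECT data.last_pointtime, tail FROM data;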

 
 Something else to consider is having a second table that only keeps the last 
 X aircraft positions. I would do this by duplicating every insert into that 
 table via a trigger, and then have a separate process that ran once a minute 
 to delete any records other than the newest X. Because that table would 
 always be tiny, queries against it should be blazing fast. Do note that you'll 
 want to vacuum that table frequently, like right after each delete.

Indeed. I think I'm happy with the performance of the multiple queries, but 
this would doubtless be the best option (from a performance standpoint), as 
the table would be small and my select would be essentially SELECT * FROM 
TABLE, with a potential WHERE ... IN... clause.

Thanks for all the help!

 -- 
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com
 
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Processor usage/tuning question

2014-10-10 Thread Jim Nasby

On 10/8/14, 3:17 PM, Israel Brewster wrote:

Except that the last data point received is still valid as the aircraft's 
current location, even if it came in several hours ago - which we may well care 
about. That is, some users may need to see where a given aircraft (or the 
entire fleet) is, even if an aircraft hasn't updated in a while. That said, I 
did discuss this with my higher-ups, and got the ok to take it down to four 
hours.


Note that in your explain output nothing is filtering by time at all; are you 
sure you posted the right explain?


I don't think PostgreSQL is going to be able to reason very effectively about a 
ROW_NUMBER() in an inner table and then a row<=5 in the outer one being 
equivalent to a LIMIT query for which it could walk an index and then stop 
once it finds 5 of them.

Does this need to be issued as a single query?  Why not issue 55 different 
queries?  It seems like the client is likely going to need to pick the returned 
list back out by tail number anyway, so both the client and the server might be 
happier with separate queries.


Good point. Doing that brought the entire execution time down to around 60ms. 
Just ever so slightly better than the ~1200ms I was getting before. :-) I just 
have an unreasonable aversion to making multiple queries. I guess this is a 
prime example of why I need to overcome that :-)


Do you actually need the last 5 points? If you could get away with just the 
most recent point, SELECT DISTINCT ON might do a better job of this in a single 
query.
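
Something like this, against the table described later in the thread (an
untested sketch):

SELECT DISTINCT ON (tail)
       tail, pointtime, lat, lng, altitude, heading, speed, source
FROM data
WHERE tail in (list of tails) and pointtime >= now()::timestamp - interval '24 hours'
ORDER BY tail, pointtime DESC;

The ORDER BY here matches the existing (tail, pointtime DESC) index.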

As for the concern about issuing multiple queries, if you code this into a 
database function it should still be quite fast because there won't be any 
round-trip between your application and the database.
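
A minimal sketch of that idea (hypothetical names, untested):

CREATE FUNCTION last_points(tails text[], cutoff timestamp)
RETURNS SETOF data AS $$
DECLARE
  t text;
BEGIN
  FOREACH t IN ARRAY tails LOOP
    -- Each iteration is the simple per-tail query, with no client
    -- round-trip between iterations:
    RETURN QUERY
      SELECT * FROM data
      WHERE tail = t AND pointtime >= cutoff
      ORDER BY pointtime DESC
      LIMIT 5;
  END LOOP;
END;
$$ LANGUAGE plpgsql STABLE;

-- e.g. SELECT * FROM last_points(ARRAY['tail1','tail2'], now()::timestamp - interval '4 hours');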

Something else to consider is having a second table that only keeps the last X 
aircraft positions. I would do this by duplicating every insert into that table 
via a trigger, and then have a separate process that ran once a minute to 
delete any records other than the newest X. Because that table would always be 
tiny, queries against it should be blazing fast. Do note that you'll want to 
vacuum that table frequently, like right after each delete.
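
A rough sketch of that arrangement (untested; assumes keeping the newest 5
rows per tail, and uses the id primary key to identify rows):

CREATE TABLE data_recent (LIKE data);

CREATE FUNCTION copy_to_recent() RETURNS trigger AS $$
BEGIN
  INSERT INTO data_recent VALUES (NEW.*);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER data_copy_recent AFTER INSERT ON data
  FOR EACH ROW EXECUTE PROCEDURE copy_to_recent();

-- Run from cron every minute or so, then vacuum:
DELETE FROM data_recent
WHERE id NOT IN (
  SELECT id FROM (
    SELECT id, ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) AS rn
    FROM data_recent) s
  WHERE rn <= 5);
VACUUM data_recent;
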
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Processor usage/tuning question

2014-10-10 Thread Jeff Janes
On Fri, Oct 10, 2014 at 2:04 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 10/8/14, 3:17 PM, Israel Brewster wrote:

 Except that the last data point received is still valid as the aircraft's
 current location, even if it came in several hours ago - which we may well
 care about. That is, some users may need to see where a given aircraft (or
 the entire fleet) is, even if an aircraft hasn't updated in a while. That
 said, I did discuss this with my higher-ups, and got the ok to take it down
 to four hours.


 Note that in your explain output nothing is filtering by time at all; are
 you sure you posted the right explain?


I thought that as well, but then I noticed that the depesz site truncates
the data in the html view.  Switch to the text view and the filter is
there, as part of the index condition, after the long ANY list.
http://explain.depesz.com/s/H5w

Cheers,

Jeff


Re: [GENERAL] Processor usage/tuning question

2014-10-08 Thread Jeff Janes
On Tue, Oct 7, 2014 at 12:06 PM, israel isr...@eraalaska.net wrote:


 Thank you all for the advice. It looks like the load is due to a query
 that is taking around 1300ms to complete - a query that is run by every
 client connected (probably half a dozen or so, although I don't have
 specific numbers), every fifteen seconds or so. As you can imagine, that
 keeps the server rather busy :-) Specifically, it looks like the time is
 due to a sort (PARTITION BY tail ORDER BY pointtime DESC) that operates on
 around 100,000 rows.

 The lovely details:

 The query in question is the following:

 SELECT *
 FROM (SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'),
 lat,lng,altitude,heading,speed,source,pointtime,
   ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) as row
 FROM data
 WHERE tail in (list of about 55 values or so) and pointtime>='timestamp
 of 24 hours prior to current UTC time'::timestamp) s1
 WHERE s1.row<=5
 ORDER BY tail, pointtime DESC

 In English, it boils down to: get the five most recent data points for each
 listed tail number. I look at the last 24 hours of data because it is quite
 possible that a tail number may have no recent data points.


How many different tail numbers do you have in the last 24 hours?  Based on
the numbers you provide, it sounds like the list of 55 tail numbers is
pretty much all of them that it could expect to find anyway.



 One obvious optimization is to look at a smaller time range. This will
 definitely speed up the query, but at the risk of not getting any data
 points for one or more of the requested tail numbers (there is already this
 risk, but looking back 24 hours keeps it fairly small for us).


But then, do you care?  If you haven't heard from an airplane in 24 hours,
it seems like either you don't care, or you care very very much and don't
need the database to remind you.




 The table description:
 tracking=# \d data
                                Table "public.data"
   Column   |            Type             |                     Modifiers
 -----------+-----------------------------+----------------------------------------------------
  id        | bigint                      | not null default nextval('data_id_seq'::regclass)
  tail      | character varying(16)       | not null
  timerecp  | timestamp without time zone | not null default now()
  altitude  | integer                     |
  pointtime | timestamp without time zone |
  lat       | numeric(7,5)                | not null
  lng       | numeric(8,5)                | not null
  speed     | integer                     |
  heading   | integer                     |
  source    | character varying(64)       |
  syncd     | boolean                     | default false
 Indexes:
     "data_pkey" PRIMARY KEY, btree (id)
     "pointtime_idx" btree (pointtime)
     "syncd_idx" btree (syncd)
     "tail_idx" btree (tail)
     "tailtime_idx" btree (tail, pointtime DESC)
     "timerecp_idx" btree (timerecp)

 tracking=#

 Adding the two-column sorted index didn't seem to affect the query time
 much.


I don't think PostgreSQL is going to be able to reason very effectively
about a ROW_NUMBER() in an inner table and then a row<=5 in the outer one
being equivalent to a LIMIT query for which it could walk an index and then
stop once it finds 5 of them.

Does this need to be issued as a single query?  Why not issue 55 different
queries?  It seems like the client is likely going to need to pick the
returned list back out by tail number anyway, so both the client and the
server might be happier with separate queries.
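
Each per-tail query could then be a plain index walk, something like this
untested sketch (tail number made up):

SELECT tail, pointtime, lat, lng, altitude, heading, speed, source
FROM data
WHERE tail = 'N123AB'
  AND pointtime >= now()::timestamp - interval '24 hours'
ORDER BY pointtime DESC
LIMIT 5;

With the (tail, pointtime DESC) index each such query can stop after
reading just 5 rows.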



 The table currently contains 1303951 rows, and any given 24 hour period has
 around 110,000 rows.

 The results of the explain analyze command can be seen here:
 http://explain.depesz.com/s/H5w (nice site, btw. I'll have to be sure to
 bookmark it), where it clearly shows that the sort on
 data.tail, data.pointtime is the largest time sink (if I am reading it right).



The sort does seem pretty slow.  What is your encoding and collation?
Could you use the C collation if you are not already?
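
Something like this shows them (a sketch):

SELECT datname, pg_encoding_to_char(encoding) AS encoding, datcollate, datctype
FROM pg_database
WHERE datname = current_database();

Sorting text under a non-C collation such as en_US.UTF-8 can be several
times slower than sorting under C.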

Cheers,

Jeff


Re: [GENERAL] Processor usage/tuning question

2014-10-08 Thread Israel Brewster

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---


On Oct 8, 2014, at 11:34 AM, Jeff Janes jeff.ja...@gmail.com wrote:

On Tue, Oct 7, 2014 at 12:06 PM, israel isr...@eraalaska.net wrote:
Thank you all for the advice. It looks like the load is due to a query that is taking around 1300ms to complete - a query that is run by every client connected (probably half a dozen or so, although I don't have specific numbers), every fifteen seconds or so. As you can imagine, that keeps the server rather busy :-) Specifically, it looks like the time is due to a sort (PARTITION BY tail ORDER BY pointtime DESC) that operates on around 100,000 rows.

The lovely details:

The query in question is the following:

SELECT *
FROM (SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'), lat,lng,altitude,heading,speed,source,pointtime,
   ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) as row
FROM data
WHERE tail in (list of about 55 values or so) and pointtime>='timestamp of 24 hours prior to current UTC time'::timestamp) s1
WHERE s1.row<=5
ORDER BY tail, pointtime DESC

In English, it boils down to: get the five most recent data points for each listed tail number. I look at the last 24 hours of data because it is quite possible that a tail number may have no recent data points.

How many different tail numbers do you have in the last 24 hours? Based on the numbers you provide, it sounds like the list of 55 tail numbers is pretty much all of them that it could expect to find anyway.

In this case yes - that 55 or so is everything (worst case, but fairly common scenario). The system is set up such that the user can select which tail number(s) they want to see, thus the tail in list construct. It's just that they often select all tail numbers.
One obvious optimization is to look at a smaller time range. This will definitely speed up the query, but at the risk of not getting any data points for one or more of the requested tail numbers (there is already this risk, but looking back 24 hours keeps it fairly small for us).

But then, do you care? If you haven't heard from an airplane in 24 hours, it seems like either you don't care, or you care very very much and don't need the database to remind you.

Except that the last data point received is still valid as the aircraft's current location, even if it came in several hours ago - which we may well care about. That is, some users may need to see where a given aircraft (or the entire fleet) is, even if an aircraft hasn't updated in a while. That said, I did discuss this with my higher-ups, and got the ok to take it down to four hours.

The table description:
tracking=# \d data
  Table "public.data"
 Column |  Type  |  Modifiers
---+-+---
id| bigint   | not null default nextval('data_id_seq'::regclass)
tail   | character varying(16)   | not null
timerecp | timestamp without time zone | not null default now()
altitude | integer  |
pointtime | timestamp without time zone |
lat   | numeric(7,5)| not null
lng   | numeric(8,5)| not null
speed  | integer  |
heading | integer  |
source  | character varying(64)   |
syncd  | boolean  | default false
Indexes:
  "data_pkey" PRIMARY KEY, btree (id)
  "pointtime_idx" btree (pointtime)
  "syncd_idx" btree (syncd)
  "tail_idx" btree (tail)
  "tailtime_idx" btree (tail, pointtime DESC)
  "timerecp_idx" btree (timerecp)

tracking=#

Adding the two-column sorted index didn't seem to affect the query time much.

I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in an inner table and then a row<=5 in the outer one being equivalent to a LIMIT query for which it could walk an index and then stop once it finds 5 of them.

Does this need to be issued as a single query? Why not issue 55 different queries? It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.

Good point. Doing that brought the entire execution time down to around 60ms. Just ever so slightly better than the ~1200ms I was getting before. :-) I just have an unreasonable aversion to making multiple queries. I guess this is a prime example of why I need to overcome that :-)

The table currently contains 1303951 rows, and 

Re: [GENERAL] Processor usage/tuning question

2014-10-07 Thread Emanuel Calvo

On 03/10/14 at 16:24, Israel Brewster wrote:
 I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at
 some stats today, I saw that it was handling about 4-5
 transactions/second (according to the SELECT
 sum(xact_commit+xact_rollback) FROM pg_stat_database; query), and an
 instance of the postmaster process was consistently showing 40%-80%
 utilization to handle this. I didn't think anything of that (the
 machine has plenty of capacity) until I mentioned it to a friend of
 mine, who said that utilization level seemed high for that many
 transactions. So if that level of utilization IS high, what might I
 need to tune to bring it down to a more reasonable level?


First you need to find out what the bottleneck is. It is possible that the
processor shows that percentage due to high disk load.

For example, bad queries tend to behave like this: they consume a lot of
CPU due to large seq scans on your tables, or a bunch of not-so-small
seq scans. User-defined functions can also lead to performance issues if
the code isn't well optimized.


 Some details:
 Processors: 2x4core 2.5 GHz Xeon
 Total Memory: 16GB
 Hard Disk: SSD raid 10
 wa value from top is typically 0.0%, sometimes up to 0.1%

 The database consists (primarily) of a single table with 5 indexes and
 11 columns. The majority of transactions are probably single-row
 inserts (live location data from numerous aircraft). Current record
 count is 1,282,706, and kept fairly static on a day-to-day basis by a
 cleanup routine that runs each night and deletes old records (if that
 makes a difference). This database is streamed to a secondary hot
 read-only spare using streaming replication. The replica is using less
 than 1% processor on average.

So it looks like writes aren't the issue. You'll need to check the
I/O rate on your machine.
I'd recommend that you share the output of the sysstat commands. Also,
check the RSS of each process with the command:
 ps -u yourDBuser uf
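
From inside Postgres, the statistics views can also hint at where the CPU
goes; for example (a sketch, assuming the stats collector is enabled):

-- Tables with heavy sequential-scan activity are prime suspects:
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC
LIMIT 10;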




-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Processor usage/tuning question

2014-10-07 Thread israel

On 10/03/2014 6:28 pm, Andy Colson wrote:

On 10/03/2014 04:40 PM, Alan Hodgson wrote:

On Friday, October 03, 2014 11:24:31 AM Israel Brewster wrote:
I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at some
stats today, I saw that it was handling about 4-5 transactions/second
(according to the SELECT sum(xact_commit+xact_rollback) FROM
pg_stat_database; query), and an instance of the postmaster process was
consistently showing 40%-80% utilization to handle this. I didn't think
anything of that (the machine has plenty of capacity) until I mentioned it
to a friend of mine, who said that utilization level seemed high for that
many transactions. So if that level of utilization IS high, what might I
need to tune to bring it down to a more reasonable level?


You probably have some read queries not properly indexed that are
sequentially scanning that 1.2 million row table over and over again.
Enable slow query logging and see what's going on.


Yep, do that... and then:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-Andy


Thank you all for the advice. It looks like the load is due to a query 
that is taking around 1300ms to complete - a query that is run by every 
client connected (probably half a dozen or so, although I don't have 
specific numbers), every fifteen seconds or so. As you can imagine, that 
keeps the server rather busy :-) Specifically, it looks like the time is 
due to a sort (PARTITION BY tail ORDER BY pointtime DESC) that operates 
on around 100,000 rows.


The lovely details:

The query in question is the following:

SELECT *
FROM (SELECT tail, to_char(pointtime,'MM/DD/YYYY HH24:MI:SS'),
             lat,lng,altitude,heading,speed,source,pointtime,
             ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) as row
      FROM data
      WHERE tail in (list of about 55 values or so) and
            pointtime>='timestamp of 24 hours prior to current UTC time'::timestamp) s1
WHERE s1.row<=5
ORDER BY tail, pointtime DESC

In English, it boils down to: get the five most recent data points for 
each listed tail number. I look at the last 24 hours of data because it 
is quite possible that a tail number may have no recent data points.


One obvious optimization is to look at a smaller time range. This will 
definitely speed up the query, but at the risk of not getting any data 
points for one or more of the requested tail numbers (there is already 
this risk, but looking back 24 hours keeps it fairly small for us).


The table description:
tracking=# \d data
                               Table "public.data"
  Column   |            Type             |                     Modifiers
-----------+-----------------------------+----------------------------------------------------
 id        | bigint                      | not null default nextval('data_id_seq'::regclass)
 tail      | character varying(16)       | not null
 timerecp  | timestamp without time zone | not null default now()
 altitude  | integer                     |
 pointtime | timestamp without time zone |
 lat       | numeric(7,5)                | not null
 lng       | numeric(8,5)                | not null
 speed     | integer                     |
 heading   | integer                     |
 source    | character varying(64)       |
 syncd     | boolean                     | default false
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "pointtime_idx" btree (pointtime)
    "syncd_idx" btree (syncd)
    "tail_idx" btree (tail)
    "tailtime_idx" btree (tail, pointtime DESC)
    "timerecp_idx" btree (timerecp)

tracking=#

Adding the two-column sorted index didn't seem to affect the query time 
much.


The table currently contains 1303951 rows, and any given 24 hour period 
has around 110,000 rows.


The results of the explain analyze command can be seen here: 
http://explain.depesz.com/s/H5w (nice site, btw. I'll have to be sure to 
bookmark it), where it clearly shows that the sort on 
data.tail, data.pointtime is the largest time sink (if I am reading it 
right).


Postgres version is PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, 
compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit


This is the first time I have dug into this particular query. I want to 
say it wasn't this slow in my testing, but then the server wasn't in 
use in my testing either, and I probably had a lot less data (everything 
works, so it's been a while since I looked). Hardware is dual quad-core 
2.5GHz Xeon processors, 16 GB RAM, and an SSD RAID 10 holding the 
database. All this is new as of about 4 months ago.


And to recap the postgres memory settings:
shared_buffers: 4GB
effective_cache_size: 12GB

So, basically, what it boils down to is: is there a way to speed up that 
sort? I want to say I've seen a number of similar questions here 
recently, so I'll spend some time perusing those.


Thanks again!


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Processor usage/tuning question

2014-10-06 Thread Jim Nasby

On 10/3/14, 2:24 PM, Israel Brewster wrote:

I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at some 
stats today, I saw that it was handling about 4-5 transactions/second 
(according to the SELECT sum(xact_commit+xact_rollback) FROM pg_stat_database; 
query), and an instance of the postmaster process was consistently showing 
40%-80% utilization to handle this.

Are you sure it's the actual postmaster doing that and not just another 
backend? There's fairly little that the postmaster itself is responsible for, 
other than spawning new backend connections. If it really is the postmaster, 
the first thing I'd check is if you've got something that's spamming the 
database with new connection requests.
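
A quick way to check (a sketch; both views exist in 9.3):

-- A flood of short-lived connections shows up as churn here:
SELECT state, count(*) FROM pg_stat_activity GROUP BY state;

-- And the per-database backend and transaction counters are here:
SELECT datname, numbackends, xact_commit, xact_rollback FROM pg_stat_database;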

--
Jim Nasby, Data Architect, Blue Treble
Data in Trouble? Get it in Treble! http://BlueTreble.com



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Processor usage/tuning question

2014-10-03 Thread Israel Brewster
I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at some stats today, I saw that it was handling about 4-5 transactions/second (according to the SELECT sum(xact_commit+xact_rollback) FROM pg_stat_database; query), and an instance of the postmaster process was consistently showing 40%-80% utilization to handle this. I didn't think anything of that (the machine has plenty of capacity) until I mentioned it to a friend of mine, who said that utilization level seemed high for that many transactions. So if that level of utilization IS high, what might I need to tune to bring it down to a more reasonable level?

Some details:
Processors: 2x4core 2.5 GHz Xeon
Total Memory: 16GB
Hard Disk: SSD raid 10
wa value from top is typically 0.0%, sometimes up to 0.1%

The database consists (primarily) of a single table with 5 indexes and 11 columns. The majority of transactions are probably single-row inserts (live location data from numerous aircraft). Current record count is 1,282,706, and kept fairly static on a day-to-day basis by a cleanup routine that runs each night and deletes old records (if that makes a difference). This database is streamed to a secondary hot read-only spare using streaming replication. The replica is using less than 1% processor on average.

I have followed the directions here: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server however I'm sure there are nuances I missed, or values that could be better selected for my usage case.

Some current postgres config values:
shared_buffers: 4GB
effective_cache_size: 12GB

Not sure what other information I should provide, so let me know what useful data I missed!

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---




Re: [GENERAL] Processor usage/tuning question

2014-10-03 Thread Alan Hodgson
On Friday, October 03, 2014 11:24:31 AM Israel Brewster wrote:
 I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at some
 stats today, I saw that it was handling about 4-5 transactions/second
 (according to the SELECT sum(xact_commit+xact_rollback) FROM
 pg_stat_database; query), and an instance of the postmaster process was
 consistently showing 40%-80% utilization to handle this. I didn't think
 anything of that (the machine has plenty of capacity) until I mentioned it
 to a friend of mine, who said that utilization level seemed high for that
 many transactions. So if that level of utilization IS high, what might I
 need to tune to bring it down to a more reasonable level?
 

You probably have some read queries not properly indexed that are sequentially 
scanning that 1.2 million row table over and over again. Enable slow query 
logging and see what's going on.
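
On 9.3 this can be enabled per-database without editing postgresql.conf,
e.g. (a sketch, using the tracking database name from elsewhere in the
thread; threshold in milliseconds, superuser required, applies to new
sessions):

ALTER DATABASE tracking SET log_min_duration_statement = 1000;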



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Processor usage/tuning question

2014-10-03 Thread Andy Colson

On 10/03/2014 04:40 PM, Alan Hodgson wrote:

On Friday, October 03, 2014 11:24:31 AM Israel Brewster wrote:

I have a Postgresql 9.3.5 server running on CentOS 6.5. In looking at some
stats today, I saw that it was handling about 4-5 transactions/second
(according to the SELECT sum(xact_commit+xact_rollback) FROM
pg_stat_database; query), and an instance of the postmaster process was
consistently showing 40%-80% utilization to handle this. I didn't think
anything of that (the machine has plenty of capacity) until I mentioned it
to a friend of mine, who said that utilization level seemed high for that
many transactions. So if that level of utilization IS high, what might I
need to tune to bring it down to a more reasonable level?



You probably have some read queries not properly indexed that are sequentially
scanning that 1.2 million row table over and over again. Enable slow query
logging and see what's going on.





Yep, do that... and then:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general