Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Scott Marlowe
On Thu, Aug 21, 2014 at 5:29 PM, Josh Berkus  wrote:
> On 08/21/2014 04:08 PM, Steve Crawford wrote:
>> On 08/21/2014 03:51 PM, Josh Berkus wrote:
>>> On 08/21/2014 02:26 PM, Scott Marlowe wrote:
 I'm running almost the exact same setup in production as a spare. It
 has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
 it's a spare node I might be able to do some testing on it as well.
 It's running a 3.2 kernel right now. I could probably get a later
 model kernel on it even.
>>> You know about the IO performance issues with 3.2, yes?
>>>
>> Were those 3.2 only and since fixed or are there issues persisting in
>> 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13.
>
> The issues I know of were fixed in 3.9.
>
I thought they were fixed in 3.8.something? We're running 3.8 on our
production servers but IO is not an issue for us.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Joshua D. Drake


On 08/21/2014 04:29 PM, Josh Berkus wrote:


On 08/21/2014 04:08 PM, Steve Crawford wrote:

On 08/21/2014 03:51 PM, Josh Berkus wrote:

On 08/21/2014 02:26 PM, Scott Marlowe wrote:

I'm running almost the exact same setup in production as a spare. It
has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
it's a spare node I might be able to do some testing on it as well.
It's running a 3.2 kernel right now. I could probably get a later
model kernel on it even.

You know about the IO performance issues with 3.2, yes?


Were those 3.2 only and since fixed or are there issues persisting in
3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13.


The issues I know of were fixed in 3.9.



Correct. If you run trusty backports you are good to go.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans."


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Mark Kirkwood

On 22/08/14 11:29, Josh Berkus wrote:

On 08/21/2014 04:08 PM, Steve Crawford wrote:

On 08/21/2014 03:51 PM, Josh Berkus wrote:

On 08/21/2014 02:26 PM, Scott Marlowe wrote:

I'm running almost the exact same setup in production as a spare. It
has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
it's a spare node I might be able to do some testing on it as well.
It's running a 3.2 kernel right now. I could probably get a later
model kernel on it even.

You know about the IO performance issues with 3.2, yes?


Were those 3.2 only and since fixed or are there issues persisting in
3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13.


The issues I know of were fixed in 3.9.



There is a 3.11 kernel series for Ubuntu 12.04 Precise.

Regards

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Josh Berkus
On 08/21/2014 04:08 PM, Steve Crawford wrote:
> On 08/21/2014 03:51 PM, Josh Berkus wrote:
>> On 08/21/2014 02:26 PM, Scott Marlowe wrote:
>>> I'm running almost the exact same setup in production as a spare. It
>>> has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
>>> it's a spare node I might be able to do some testing on it as well.
>>> It's running a 3.2 kernel right now. I could probably get a later
>>> model kernel on it even.
>> You know about the IO performance issues with 3.2, yes?
>>
> Were those 3.2 only and since fixed or are there issues persisting in
> 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13.

The issues I know of were fixed in 3.9.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Steve Crawford

On 08/21/2014 03:51 PM, Josh Berkus wrote:

On 08/21/2014 02:26 PM, Scott Marlowe wrote:

I'm running almost the exact same setup in production as a spare. It
has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
it's a spare node I might be able to do some testing on it as well.
It's running a 3.2 kernel right now. I could probably get a later
model kernel on it even.

You know about the IO performance issues with 3.2, yes?

Were those 3.2 only and since fixed or are there issues persisting in 
3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13.


Cheers,
Steve



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Josh Berkus
On 08/21/2014 02:26 PM, Scott Marlowe wrote:
> I'm running almost the exact same setup in production as a spare. It
> has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
> it's a spare node I might be able to do some testing on it as well.
> It's running a 3.2 kernel right now. I could probably get a later
> model kernel on it even.

You know about the IO performance issues with 3.2, yes?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Graeme B. Bell
> HT off is common knowledge for better benchmarking result

It's wise to use the qualifer 'for better benchmarking results'. 

It's worth keeping in mind here that a benchmark is not the same as normal 
production use. 

For example, where I work we do lots of long-running queries in parallel over a 
big range of datasets rather than many short-term transactions as fast as 
possible. Our biggest DB server is also used for GDAL work and R at the same 
time*. Pretty far from pgbench; not everyone is constrained by locks.

I suppose that if your code is basically N copies of the same function, 
hyper-threading isn't likely to help much because it was introduced to allow 
different parts of the processor to be used in parallel when you're running 
hetarogenous code. 

But if you're hammering just one part of the CPU... well, adding another layer 
of logical complexity for your CPU to manage probably isn't going to do much 
good.

Should HT be on or off when you're running 64 very mixed types of long-term 
queries which involve variously either heavy use of real number calculations or 
e.g. logic/string handling, and different data sets? It's a much more complex 
question than simply maxing out your pgbench scores. 

I don't have the data now unfortunately, but I remember seeing a benefit for HT 
on our 4 core e3 when running GDAL/Postgis work in parallel last year. It's not 
surprising though; the GDAL calls are almost certainly using different 
functions of the processor compared to postgres and there should be very little 
lock contention. In light of this interesting data I'm now leaning towards 
proposing HT off for our mapservers (which receive short, similar requests over 
and over), but for the hetaragenous servers, I think I'll keep it on for now.

Graeme. 



* unrelated. There's also huge advantages for us in keeping these different 
programs running on the same machine since we found we can get much better 
transfer rates through unix sockets than with TCP over the network.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Scott Marlowe
On Thu, Aug 21, 2014 at 3:26 PM, Scott Marlowe  wrote:
> On Thu, Aug 21, 2014 at 3:02 PM, Josh Berkus  wrote:
>> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
>>
>>> I am also
>>> unclear exactly what you tested, as I didn't see it mentioned in the
>>> email --- CPU type, CPU count, and operating system would be the minimal
>>> information required.
>>
>> Ooops!  I thought I'd posted that earlier, but I didn't.
>>
>> The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4
>> of them for a total of 40 cores or 80 HT cores.
>>
>> OS is RHEL with 2.6.32-431.3.1.el6.x86_64.
>
> I'm running almost the exact same setup in production as a spare. It
> has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
> it's a spare node I might be able to do some testing on it as well.
> It's running a 3.2 kernel right now. I could probably get a later
> model kernel on it even.
>
> --
> To understand recursion, one must first understand recursion.

To update this last post, the machine I have is running ubuntu 12.04.1
right now, and I have kernels 3.2, 3.5, 3.8, 3.11, and 3.13 available
to put on it. We're looking at removing it from our current production
cluster so I could likely do all kinds of crazy tests on it.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Scott Marlowe
On Thu, Aug 21, 2014 at 3:02 PM, Josh Berkus  wrote:
> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
>
>> I am also
>> unclear exactly what you tested, as I didn't see it mentioned in the
>> email --- CPU type, CPU count, and operating system would be the minimal
>> information required.
>
> Ooops!  I thought I'd posted that earlier, but I didn't.
>
> The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4
> of them for a total of 40 cores or 80 HT cores.
>
> OS is RHEL with 2.6.32-431.3.1.el6.x86_64.

I'm running almost the exact same setup in production as a spare. It
has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since
it's a spare node I might be able to do some testing on it as well.
It's running a 3.2 kernel right now. I could probably get a later
model kernel on it even.

-- 
To understand recursion, one must first understand recursion.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Bruce Momjian
On Thu, Aug 21, 2014 at 02:17:13PM -0700, Josh Berkus wrote:
> >> Actually, I don't know that anyone has posted the benefits of HT.  Link?
> >>  I want to compare results so that we can figure out what's different
> >> between my case and theirs.  Also, it makes a big difference if there is
> >> an advantage to turning HT on for some workloads.
> > 
> > I had Greg Smith test my system when it was installed, tested it, and
> > recommended hyper-threading.  The system is Debian Squeeze
> > (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores.
> 
> Can you post some numerical results?
> 
> I'm serious.  It's obviously easier for our users if we can blanket
> recommend turning HT off; that's a LOT easier for them than "you might
> want to turn HT off if these conditions ...".  So I want to establish
> that HT is a benefit sometimes if it is.
> 
> I personally have never seen HT be a benefit.  I've seen it be harmless
> (most of the time) but never beneficial.

I know that when hyperthreading was introduced that it was mostly a
negative, but then this was improved, and it might have gotten bad
again.  I am afraid results are based on the type of CPU, so I am not
sure we can know a general answer.

I know I asked Greg Smith, and I assume he would know.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Josh Berkus
On 08/21/2014 02:11 PM, Bruce Momjian wrote:
> On Thu, Aug 21, 2014 at 02:02:26PM -0700, Josh Berkus wrote:
>> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
>>> On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote:
 On a read-write test, it's 10% faster with HT off as well.

 Further, from their production machine we've seen that having HT on
 causes the machine to slow down by 5X whenever you get more than 40
 cores (as in 100% of real cores or 50% of HT cores) worth of activity.

 So we're definitely back to "If you're using PostgreSQL, turn off
 Hyperthreading".
>>>
>>> Not sure how you can make such a blanket statement when so many people
>>> have tested and shown the benefits of hyper-threading.  
>>
>> Actually, I don't know that anyone has posted the benefits of HT.  Link?
>>  I want to compare results so that we can figure out what's different
>> between my case and theirs.  Also, it makes a big difference if there is
>> an advantage to turning HT on for some workloads.
> 
> I had Greg Smith test my system when it was installed, tested it, and
> recommended hyper-threading.  The system is Debian Squeeze
> (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores.

Can you post some numerical results?

I'm serious.  It's obviously easier for our users if we can blanket
recommend turning HT off; that's a LOT easier for them than "you might
want to turn HT off if these conditions ...".  So I want to establish
that HT is a benefit sometimes if it is.

I personally have never seen HT be a benefit.  I've seen it be harmless
(most of the time) but never beneficial.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Bruce Momjian
On Thu, Aug 21, 2014 at 02:02:26PM -0700, Josh Berkus wrote:
> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
> > On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote:
> >> On a read-write test, it's 10% faster with HT off as well.
> >>
> >> Further, from their production machine we've seen that having HT on
> >> causes the machine to slow down by 5X whenever you get more than 40
> >> cores (as in 100% of real cores or 50% of HT cores) worth of activity.
> >>
> >> So we're definitely back to "If you're using PostgreSQL, turn off
> >> Hyperthreading".
> > 
> > Not sure how you can make such a blanket statement when so many people
> > have tested and shown the benefits of hyper-threading.  
> 
> Actually, I don't know that anyone has posted the benefits of HT.  Link?
>  I want to compare results so that we can figure out what's different
> between my case and theirs.  Also, it makes a big difference if there is
> an advantage to turning HT on for some workloads.

I had Greg Smith test my system when it was installed, tested it, and
recommended hyper-threading.  The system is Debian Squeeze
(2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Josh Berkus
On 08/20/2014 07:40 PM, Bruce Momjian wrote:
> On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote:
>> On a read-write test, it's 10% faster with HT off as well.
>>
>> Further, from their production machine we've seen that having HT on
>> causes the machine to slow down by 5X whenever you get more than 40
>> cores (as in 100% of real cores or 50% of HT cores) worth of activity.
>>
>> So we're definitely back to "If you're using PostgreSQL, turn off
>> Hyperthreading".
> 
> Not sure how you can make such a blanket statement when so many people
> have tested and shown the benefits of hyper-threading.  

Actually, I don't know that anyone has posted the benefits of HT.  Link?
 I want to compare results so that we can figure out what's different
between my case and theirs.  Also, it makes a big difference if there is
an advantage to turning HT on for some workloads.

> I am also
> unclear exactly what you tested, as I didn't see it mentioned in the
> email --- CPU type, CPU count, and operating system would be the minimal
> information required.

Ooops!  I thought I'd posted that earlier, but I didn't.

The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4
of them for a total of 40 cores or 80 HT cores.

OS is RHEL with 2.6.32-431.3.1.el6.x86_64.

I've emailed a kernel hacker who works at Intel for comment; for one
thing, I'm wondering if the older kernel version is a problem for a
system like this.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Marti Raudsepp
On Thu, Aug 21, 2014 at 7:19 PM, Eli Naeher  wrote:
> However, when I try to do a
> test self-join using it, Postgres does two seq scans across the whole table,
> even though I have indexes on both id and previous_stop_event:
> http://explain.depesz.com/s/ctck. Any idea why those indexes are not being
> used?

Because the planner thinks seq scan+hash join is going to be faster
than incurring the overhead of index scans for other kinds of plans.

You can try out alternative plan types by running 'set
enable_hashjoin=off' in your session. If it does turn out to be
faster, then it usually means you haven't set planner tunables right
(random_page_cost, effective_cache_size and possibly cpu_tuple_cost).

Regards,
Marti


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Eli Naeher
Oops, I forgot to include the test self-join query I'm using. It is simply:

SELECT se1.stop_time AS curr, se2.stop_time AS prev
FROM stop_event se1
JOIN stop_event se2 ON se1.previous_stop_event = se2.id;



On Thu, Aug 21, 2014 at 11:19 AM, Eli Naeher  wrote:

> Upping work_mem did roughly halve the time, but after thinking about
> Shaun's suggestion, I figured it's better to calculate this stuff once and
> then store it. So here is how the table looks now:
>
>   Table "public.stop_event"
>Column|Type |
>  Modifiers
>
> -+-+-
>  stop_time   | timestamp without time zone | not null
>  stop| integer | not null
>  bus | integer | not null
>  direction   | integer | not null
>  route   | integer | not null
>  id  | bigint  | not null default
> nextval('stop_event_id_seq'::regclass)
>  previous_stop_event | bigint  |
> Indexes:
> "stop_event_pkey" PRIMARY KEY, btree (id)
> "stop_event_previous_stop_event_idx" btree (previous_stop_event)
> Foreign-key constraints:
> "stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES
> direction(id)
> "stop_event_previous_stop_event_fkey" FOREIGN KEY
> (previous_stop_event) REFERENCES stop_event(id)
> "stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id)
> "stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id)
> Referenced by:
> TABLE "stop_event" CONSTRAINT "stop_event_previous_stop_event_fkey"
> FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id)
>
> previous_stop_event simply references the previous (by stop_time) stop
> event for the combination of stop, route, and direction. I have
> successfully populated this column for my existing test data. However, when
> I try to do a test self-join using it, Postgres does two seq scans across
> the whole table, even though I have indexes on both id and
> previous_stop_event: http://explain.depesz.com/s/ctck. Any idea why those
> indexes are not being used?
>
> Thank you again,
> -Eli
>
> On Thu, Aug 21, 2014 at 9:05 AM, Shaun Thomas 
>> wrote:
>>
>>> On 08/21/2014 08:29 AM, Eli Naeher wrote:
>>>
>>>  With around 1.2 million rows, this takes 20 seconds to run. 1.2 million
 rows is only about a week's worth of data, so I'd like to figure out a
 way to make this faster.

>>>
>>> Well, you'll probably be able to reduce the run time a bit, but even
>>> with really good hardware and all in-memory processing, you're not going to
>>> see significant run-time improvements with that many rows. This is one of
>>> the reasons reporting-specific structures, such as fact tables, were
>>> designed to address.
>>>
>>> Repeatedly processing the same week/month/year aggregate worth of
>>> several million rows will just increase linearly with each iteration as
>>> data size increases. You need to maintain up-to-date aggregates on the
>>> metrics you actually want to measure, so you're only reading the few
>>> hundred rows you introduce every update period. You can retrieve those kind
>>> of results in a few milliseconds.
>>>
>>> --
>>> Shaun Thomas
>>> OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
>>> 312-676-8870
>>> stho...@optionshouse.com
>>>
>>> __
>>>
>>> See http://www.peak6.com/email_disclaimer/ for terms and conditions
>>> related to this email
>>>
>>
>>
>


Re: [PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Eli Naeher
Upping work_mem did roughly halve the time, but after thinking about
Shaun's suggestion, I figured it's better to calculate this stuff once and
then store it. So here is how the table looks now:

  Table "public.stop_event"
   Column|Type |
 Modifiers
-+-+-
 stop_time   | timestamp without time zone | not null
 stop| integer | not null
 bus | integer | not null
 direction   | integer | not null
 route   | integer | not null
 id  | bigint  | not null default
nextval('stop_event_id_seq'::regclass)
 previous_stop_event | bigint  |
Indexes:
"stop_event_pkey" PRIMARY KEY, btree (id)
"stop_event_previous_stop_event_idx" btree (previous_stop_event)
Foreign-key constraints:
"stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES
direction(id)
"stop_event_previous_stop_event_fkey" FOREIGN KEY (previous_stop_event)
REFERENCES stop_event(id)
"stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id)
"stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id)
Referenced by:
TABLE "stop_event" CONSTRAINT "stop_event_previous_stop_event_fkey"
FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id)

previous_stop_event simply references the previous (by stop_time) stop
event for the combination of stop, route, and direction. I have
successfully populated this column for my existing test data. However, when
I try to do a test self-join using it, Postgres does two seq scans across
the whole table, even though I have indexes on both id and
previous_stop_event: http://explain.depesz.com/s/ctck. Any idea why those
indexes are not being used?

Thank you again,
-Eli

On Thu, Aug 21, 2014 at 9:05 AM, Shaun Thomas 
> wrote:
>
>> On 08/21/2014 08:29 AM, Eli Naeher wrote:
>>
>>  With around 1.2 million rows, this takes 20 seconds to run. 1.2 million
>>> rows is only about a week's worth of data, so I'd like to figure out a
>>> way to make this faster.
>>>
>>
>> Well, you'll probably be able to reduce the run time a bit, but even with
>> really good hardware and all in-memory processing, you're not going to see
>> significant run-time improvements with that many rows. This is one of the
>> reasons reporting-specific structures, such as fact tables, were designed
>> to address.
>>
>> Repeatedly processing the same week/month/year aggregate worth of several
>> million rows will just increase linearly with each iteration as data size
>> increases. You need to maintain up-to-date aggregates on the metrics you
>> actually want to measure, so you're only reading the few hundred rows you
>> introduce every update period. You can retrieve those kind of results in a
>> few milliseconds.
>>
>> --
>> Shaun Thomas
>> OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
>> 312-676-8870
>> stho...@optionshouse.com
>>
>> __
>>
>> See http://www.peak6.com/email_disclaimer/ for terms and conditions
>> related to this email
>>
>
>


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Shaun Thomas

On 08/20/2014 06:14 PM, Mark Kirkwood wrote:


Notwithstanding the above results, my workmate Matt made an interesting
observation: the scaling graph for (our) 60 core box (HT off), looks
just like the one for our 32 core box with HT *on*.


Hmm. I know this sounds stupid and unlikely, but has anyone actually 
tested PostgreSQL on a system with more than 64 legitimate cores? The 
work Robert Haas did to fix the CPU locking way back when showed 
significant improvements up to 64, but so far as I know, nobody really 
tested beyond that.


I seem to remember similar choking effects when pre-9.2 systems 
encountered high CPU counts. I somehow doubt Intel would allow their HT 
architecture to regress so badly from Nehalem, which is almost 
3-generations old at this point. This smells like something in the 
software stack, up to and including the Linux kernel.


--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Shaun Thomas

On 08/21/2014 08:29 AM, Eli Naeher wrote:


With around 1.2 million rows, this takes 20 seconds to run. 1.2 million
rows is only about a week's worth of data, so I'd like to figure out a
way to make this faster.


Well, you'll probably be able to reduce the run time a bit, but even 
with really good hardware and all in-memory processing, you're not going 
to see significant run-time improvements with that many rows. This is 
one of the reasons reporting-specific structures, such as fact tables, 
were designed to address.


Repeatedly processing the same week/month/year aggregate worth of 
several million rows will just increase linearly with each iteration as 
data size increases. You need to maintain up-to-date aggregates on the 
metrics you actually want to measure, so you're only reading the few 
hundred rows you introduce every update period. You can retrieve those 
kind of results in a few milliseconds.


--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Marti Raudsepp
On Thu, Aug 21, 2014 at 4:29 PM, Eli Naeher  wrote:
> Clearly the bulk of the time is spent sorting the rows in the original
> table, and then again sorting the results of the subselect. But I'm afraid I
> don't really know what to do with this information. Is there any way I can
> speed this up?

"Sort Method: external merge  Disk: 120976kB"

The obvious first step is to bump up work_mem to avoid disk-based
sort. Try setting it to something like 256MB in your session and see
how it performs then. This may also allow the planner to choose
HashAggregate instead of sort.

It not always straightforward how to tune correctly. It depends on
your hardware, concurrency and query complexity, here's some advice:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#work_mem_maintainance_work_mem

Also you could create an index on (route, direction, stop, stop_time)
to avoid the inner sort entirely.

And it seems that you can move the "INNER JOIN stop" to the outer
query as well, not sure if that will change much.

Try these and if it's still problematic, report back with a new EXPLAIN ANALYZE

Regards,
Marti


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Window functions, partitioning, and sorting performance

2014-08-21 Thread Eli Naeher
I have a table called stop_event (a stop event is one bus passing one bus
stop at a given time for a given route and direction), and I'd like to get
the average interval for each stop/route/direction combination.

A few hundred new events are written to the table once every minute. No
rows are ever updated (or deleted, except in development).

stop_event looks like this:

  Table "public.stop_event"
  Column   |Type | Modifiers
---+-+---
 stop_time | timestamp without time zone | not null
 stop  | integer | not null
 bus   | integer | not null
 direction | integer | not null
 route | integer | not null
Foreign-key constraints:
"stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES
direction(id)
"stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id)
"stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id)

And my query looks like this:

SELECT (floor(date_part(E'epoch', avg(interval))) / 60)::INTEGER,
   route,
   direction,
   name,
   st_asgeojson(stop_location)::JSON
FROM
  (SELECT (stop_time - (lag(stop_time) OVER w)) AS interval,
  route,
  direction,
  name,
  stop_location
   FROM stop_event
   INNER JOIN stop ON (stop_event.stop = stop.id)
   WINDOW w AS (PARTITION BY route, direction, stop ORDER BY stop_time))
AS all_intervals
WHERE (interval IS NOT NULL)
GROUP BY route,
 direction,
 name,
 stop_location;

With around 1.2 million rows, this takes 20 seconds to run. 1.2 million
rows is only about a week's worth of data, so I'd like to figure out a way
to make this faster. The EXPLAIN ANALYZE is at
http://explain.depesz.com/s/ntC.

Clearly the bulk of the time is spent sorting the rows in the original
table, and then again sorting the results of the subselect. But I'm afraid
I don't really know what to do with this information. Is there any way I
can speed this up? Is my use of an aggregate key for stop_event causing
problems? Would using a synthetic key help?

Thank you for any help you can provide,
-Eli


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-21 Thread Mark Kirkwood

On 21/08/14 11:14, Mark Kirkwood wrote:


You didn't mention what
cpu this is for (or how many sockets etc), would be useful to know.



Just to clarify - while you mentioned that the production system was 40 
cores, it wasn't immediately obvious that the same system was the source 
of the measurements you posted...sorry if I'm being a mixture of 
pedantic and dense - just trying to make sure it is clear what 
systems/cpus etc we are talking about (with this in mind it never hurts 
to quote cpu and mobo model numbers)!


Cheers

Mark




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance