Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On Thu, Aug 21, 2014 at 5:29 PM, Josh Berkus wrote: > On 08/21/2014 04:08 PM, Steve Crawford wrote: >> On 08/21/2014 03:51 PM, Josh Berkus wrote: >>> On 08/21/2014 02:26 PM, Scott Marlowe wrote: I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. >>> You know about the IO performance issues with 3.2, yes? >>> >> Were those 3.2 only and since fixed or are there issues persisting in >> 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. > > The issues I know of were fixed in 3.9. > I thought they were fixed in 3.8.something? We're running 3.8 on our production servers but IO is not an issue for us. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/21/2014 04:29 PM, Josh Berkus wrote: On 08/21/2014 04:08 PM, Steve Crawford wrote: On 08/21/2014 03:51 PM, Josh Berkus wrote: On 08/21/2014 02:26 PM, Scott Marlowe wrote: I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. You know about the IO performance issues with 3.2, yes? Were those 3.2 only and since fixed or are there issues persisting in 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. The issues I know of were fixed in 3.9. Correct. If you run trusty backports you are good to go. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, @cmdpromptinc "If we send our children to Caesar for their education, we should not be surprised when they come back as Romans." -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 22/08/14 11:29, Josh Berkus wrote: On 08/21/2014 04:08 PM, Steve Crawford wrote: On 08/21/2014 03:51 PM, Josh Berkus wrote: On 08/21/2014 02:26 PM, Scott Marlowe wrote: I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. You know about the IO performance issues with 3.2, yes? Were those 3.2 only and since fixed or are there issues persisting in 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. The issues I know of were fixed in 3.9. There is a 3.11 kernel series for Ubuntu 12.04 Precise. Regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/21/2014 04:08 PM, Steve Crawford wrote: > On 08/21/2014 03:51 PM, Josh Berkus wrote: >> On 08/21/2014 02:26 PM, Scott Marlowe wrote: >>> I'm running almost the exact same setup in production as a spare. It >>> has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since >>> it's a spare node I might be able to do some testing on it as well. >>> It's running a 3.2 kernel right now. I could probably get a later >>> model kernel on it even. >> You know about the IO performance issues with 3.2, yes? >> > Were those 3.2 only and since fixed or are there issues persisting in > 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. The issues I know of were fixed in 3.9. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/21/2014 03:51 PM, Josh Berkus wrote: On 08/21/2014 02:26 PM, Scott Marlowe wrote: I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. You know about the IO performance issues with 3.2, yes? Were those 3.2 only and since fixed or are there issues persisting in 3.2+? The 12.04 LTS release of Ubuntu Server was 3.2 but the 14.04 is 3.13. Cheers, Steve -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/21/2014 02:26 PM, Scott Marlowe wrote: > I'm running almost the exact same setup in production as a spare. It > has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since > it's a spare node I might be able to do some testing on it as well. > It's running a 3.2 kernel right now. I could probably get a later > model kernel on it even. You know about the IO performance issues with 3.2, yes? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
> HT off is common knowledge for better benchmarking result It's wise to use the qualifer 'for better benchmarking results'. It's worth keeping in mind here that a benchmark is not the same as normal production use. For example, where I work we do lots of long-running queries in parallel over a big range of datasets rather than many short-term transactions as fast as possible. Our biggest DB server is also used for GDAL work and R at the same time*. Pretty far from pgbench; not everyone is constrained by locks. I suppose that if your code is basically N copies of the same function, hyper-threading isn't likely to help much because it was introduced to allow different parts of the processor to be used in parallel when you're running hetarogenous code. But if you're hammering just one part of the CPU... well, adding another layer of logical complexity for your CPU to manage probably isn't going to do much good. Should HT be on or off when you're running 64 very mixed types of long-term queries which involve variously either heavy use of real number calculations or e.g. logic/string handling, and different data sets? It's a much more complex question than simply maxing out your pgbench scores. I don't have the data now unfortunately, but I remember seeing a benefit for HT on our 4 core e3 when running GDAL/Postgis work in parallel last year. It's not surprising though; the GDAL calls are almost certainly using different functions of the processor compared to postgres and there should be very little lock contention. In light of this interesting data I'm now leaning towards proposing HT off for our mapservers (which receive short, similar requests over and over), but for the hetaragenous servers, I think I'll keep it on for now. Graeme. * unrelated. There's also huge advantages for us in keeping these different programs running on the same machine since we found we can get much better transfer rates through unix sockets than with TCP over the network. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On Thu, Aug 21, 2014 at 3:26 PM, Scott Marlowe wrote: > On Thu, Aug 21, 2014 at 3:02 PM, Josh Berkus wrote: >> On 08/20/2014 07:40 PM, Bruce Momjian wrote: >> >>> I am also >>> unclear exactly what you tested, as I didn't see it mentioned in the >>> email --- CPU type, CPU count, and operating system would be the minimal >>> information required. >> >> Ooops! I thought I'd posted that earlier, but I didn't. >> >> The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4 >> of them for a total of 40 cores or 80 HT cores. >> >> OS is RHEL with 2.6.32-431.3.1.el6.x86_64. > > I'm running almost the exact same setup in production as a spare. It > has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since > it's a spare node I might be able to do some testing on it as well. > It's running a 3.2 kernel right now. I could probably get a later > model kernel on it even. > > -- > To understand recursion, one must first understand recursion. To update this last post, the machine I have is running ubuntu 12.04.1 right now, and I have kernels 3.2, 3.5, 3.8, 3.11, and 3.13 available to put on it. We're looking at removing it from our current production cluster so I could likely do all kinds of crazy tests on it. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On Thu, Aug 21, 2014 at 3:02 PM, Josh Berkus wrote: > On 08/20/2014 07:40 PM, Bruce Momjian wrote: > >> I am also >> unclear exactly what you tested, as I didn't see it mentioned in the >> email --- CPU type, CPU count, and operating system would be the minimal >> information required. > > Ooops! I thought I'd posted that earlier, but I didn't. > > The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4 > of them for a total of 40 cores or 80 HT cores. > > OS is RHEL with 2.6.32-431.3.1.el6.x86_64. I'm running almost the exact same setup in production as a spare. It has 4 of those CPUs, 256G RAM, and is currently set to use HT. Since it's a spare node I might be able to do some testing on it as well. It's running a 3.2 kernel right now. I could probably get a later model kernel on it even. -- To understand recursion, one must first understand recursion. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On Thu, Aug 21, 2014 at 02:17:13PM -0700, Josh Berkus wrote: > >> Actually, I don't know that anyone has posted the benefits of HT. Link? > >> I want to compare results so that we can figure out what's different > >> between my case and theirs. Also, it makes a big difference if there is > >> an advantage to turning HT on for some workloads. > > > > I had Greg Smith test my system when it was installed, tested it, and > > recommended hyper-threading. The system is Debian Squeeze > > (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores. > > Can you post some numerical results? > > I'm serious. It's obviously easier for our users if we can blanket > recommend turning HT off; that's a LOT easier for them than "you might > want to turn HT off if these conditions ...". So I want to establish > that HT is a benefit sometimes if it is. > > I personally have never seen HT be a benefit. I've seen it be harmless > (most of the time) but never beneficial. I know that when hyperthreading was introduced that it was mostly a negative, but then this was improved, and it might have gotten bad again. I am afraid results are based on the type of CPU, so I am not sure we can know a general answer. I know I asked Greg Smith, and I assume he would know. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/21/2014 02:11 PM, Bruce Momjian wrote: > On Thu, Aug 21, 2014 at 02:02:26PM -0700, Josh Berkus wrote: >> On 08/20/2014 07:40 PM, Bruce Momjian wrote: >>> On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote: On a read-write test, it's 10% faster with HT off as well. Further, from their production machine we've seen that having HT on causes the machine to slow down by 5X whenever you get more than 40 cores (as in 100% of real cores or 50% of HT cores) worth of activity. So we're definitely back to "If you're using PostgreSQL, turn off Hyperthreading". >>> >>> Not sure how you can make such a blanket statement when so many people >>> have tested and shown the benefits of hyper-threading. >> >> Actually, I don't know that anyone has posted the benefits of HT. Link? >> I want to compare results so that we can figure out what's different >> between my case and theirs. Also, it makes a big difference if there is >> an advantage to turning HT on for some workloads. > > I had Greg Smith test my system when it was installed, tested it, and > recommended hyper-threading. The system is Debian Squeeze > (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores. Can you post some numerical results? I'm serious. It's obviously easier for our users if we can blanket recommend turning HT off; that's a LOT easier for them than "you might want to turn HT off if these conditions ...". So I want to establish that HT is a benefit sometimes if it is. I personally have never seen HT be a benefit. I've seen it be harmless (most of the time) but never beneficial. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On Thu, Aug 21, 2014 at 02:02:26PM -0700, Josh Berkus wrote: > On 08/20/2014 07:40 PM, Bruce Momjian wrote: > > On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote: > >> On a read-write test, it's 10% faster with HT off as well. > >> > >> Further, from their production machine we've seen that having HT on > >> causes the machine to slow down by 5X whenever you get more than 40 > >> cores (as in 100% of real cores or 50% of HT cores) worth of activity. > >> > >> So we're definitely back to "If you're using PostgreSQL, turn off > >> Hyperthreading". > > > > Not sure how you can make such a blanket statement when so many people > > have tested and shown the benefits of hyper-threading. > > Actually, I don't know that anyone has posted the benefits of HT. Link? > I want to compare results so that we can figure out what's different > between my case and theirs. Also, it makes a big difference if there is > an advantage to turning HT on for some workloads. I had Greg Smith test my system when it was installed, tested it, and recommended hyper-threading. The system is Debian Squeeze (2.6.32-5-amd64), CPUs are dual Xeon E5620, 8 cores, 16 virtual cores. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/20/2014 07:40 PM, Bruce Momjian wrote: > On Wed, Aug 20, 2014 at 12:13:50PM -0700, Josh Berkus wrote: >> On a read-write test, it's 10% faster with HT off as well. >> >> Further, from their production machine we've seen that having HT on >> causes the machine to slow down by 5X whenever you get more than 40 >> cores (as in 100% of real cores or 50% of HT cores) worth of activity. >> >> So we're definitely back to "If you're using PostgreSQL, turn off >> Hyperthreading". > > Not sure how you can make such a blanket statement when so many people > have tested and shown the benefits of hyper-threading. Actually, I don't know that anyone has posted the benefits of HT. Link? I want to compare results so that we can figure out what's different between my case and theirs. Also, it makes a big difference if there is an advantage to turning HT on for some workloads. > I am also > unclear exactly what you tested, as I didn't see it mentioned in the > email --- CPU type, CPU count, and operating system would be the minimal > information required. Ooops! I thought I'd posted that earlier, but I didn't. The processors in question is the Intel(R) Xeon(R) CPU E7- 4850, with 4 of them for a total of 40 cores or 80 HT cores. OS is RHEL with 2.6.32-431.3.1.el6.x86_64. I've emailed a kernel hacker who works at Intel for comment; for one thing, I'm wondering if the older kernel version is a problem for a system like this. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Window functions, partitioning, and sorting performance
On Thu, Aug 21, 2014 at 7:19 PM, Eli Naeher wrote: > However, when I try to do a > test self-join using it, Postgres does two seq scans across the whole table, > even though I have indexes on both id and previous_stop_event: > http://explain.depesz.com/s/ctck. Any idea why those indexes are not being > used? Because the planner thinks seq scan+hash join is going to be faster than incurring the overhead of index scans for other kinds of plans. You can try out alternative plan types by running 'set enable_hashjoin=off' in your session. If it does turn out to be faster, then it usually means you haven't set planner tunables right (random_page_cost, effective_cache_size and possibly cpu_tuple_cost). Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Window functions, partitioning, and sorting performance
Oops, I forgot to include the test self-join query I'm using. It is simply: SELECT se1.stop_time AS curr, se2.stop_time AS prev FROM stop_event se1 JOIN stop_event se2 ON se1.previous_stop_event = se2.id; On Thu, Aug 21, 2014 at 11:19 AM, Eli Naeher wrote: > Upping work_mem did roughly halve the time, but after thinking about > Shaun's suggestion, I figured it's better to calculate this stuff once and > then store it. So here is how the table looks now: > > Table "public.stop_event" >Column|Type | > Modifiers > > -+-+- > stop_time | timestamp without time zone | not null > stop| integer | not null > bus | integer | not null > direction | integer | not null > route | integer | not null > id | bigint | not null default > nextval('stop_event_id_seq'::regclass) > previous_stop_event | bigint | > Indexes: > "stop_event_pkey" PRIMARY KEY, btree (id) > "stop_event_previous_stop_event_idx" btree (previous_stop_event) > Foreign-key constraints: > "stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES > direction(id) > "stop_event_previous_stop_event_fkey" FOREIGN KEY > (previous_stop_event) REFERENCES stop_event(id) > "stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id) > "stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id) > Referenced by: > TABLE "stop_event" CONSTRAINT "stop_event_previous_stop_event_fkey" > FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id) > > previous_stop_event simply references the previous (by stop_time) stop > event for the combination of stop, route, and direction. I have > successfully populated this column for my existing test data. However, when > I try to do a test self-join using it, Postgres does two seq scans across > the whole table, even though I have indexes on both id and > previous_stop_event: http://explain.depesz.com/s/ctck. Any idea why those > indexes are not being used? > > Thank you again, > -Eli > > On Thu, Aug 21, 2014 at 9:05 AM, Shaun Thomas >> wrote: >> >>> On 08/21/2014 08:29 AM, Eli Naeher wrote: >>> >>> With around 1.2 million rows, this takes 20 seconds to run. 1.2 million rows is only about a week's worth of data, so I'd like to figure out a way to make this faster. >>> >>> Well, you'll probably be able to reduce the run time a bit, but even >>> with really good hardware and all in-memory processing, you're not going to >>> see significant run-time improvements with that many rows. This is one of >>> the reasons reporting-specific structures, such as fact tables, were >>> designed to address. >>> >>> Repeatedly processing the same week/month/year aggregate worth of >>> several million rows will just increase linearly with each iteration as >>> data size increases. You need to maintain up-to-date aggregates on the >>> metrics you actually want to measure, so you're only reading the few >>> hundred rows you introduce every update period. You can retrieve those kind >>> of results in a few milliseconds. >>> >>> -- >>> Shaun Thomas >>> OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 >>> 312-676-8870 >>> stho...@optionshouse.com >>> >>> __ >>> >>> See http://www.peak6.com/email_disclaimer/ for terms and conditions >>> related to this email >>> >> >> >
Re: [PERFORM] Window functions, partitioning, and sorting performance
Upping work_mem did roughly halve the time, but after thinking about Shaun's suggestion, I figured it's better to calculate this stuff once and then store it. So here is how the table looks now: Table "public.stop_event" Column|Type | Modifiers -+-+- stop_time | timestamp without time zone | not null stop| integer | not null bus | integer | not null direction | integer | not null route | integer | not null id | bigint | not null default nextval('stop_event_id_seq'::regclass) previous_stop_event | bigint | Indexes: "stop_event_pkey" PRIMARY KEY, btree (id) "stop_event_previous_stop_event_idx" btree (previous_stop_event) Foreign-key constraints: "stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES direction(id) "stop_event_previous_stop_event_fkey" FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id) "stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id) "stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id) Referenced by: TABLE "stop_event" CONSTRAINT "stop_event_previous_stop_event_fkey" FOREIGN KEY (previous_stop_event) REFERENCES stop_event(id) previous_stop_event simply references the previous (by stop_time) stop event for the combination of stop, route, and direction. I have successfully populated this column for my existing test data. However, when I try to do a test self-join using it, Postgres does two seq scans across the whole table, even though I have indexes on both id and previous_stop_event: http://explain.depesz.com/s/ctck. Any idea why those indexes are not being used? Thank you again, -Eli On Thu, Aug 21, 2014 at 9:05 AM, Shaun Thomas > wrote: > >> On 08/21/2014 08:29 AM, Eli Naeher wrote: >> >> With around 1.2 million rows, this takes 20 seconds to run. 1.2 million >>> rows is only about a week's worth of data, so I'd like to figure out a >>> way to make this faster. >>> >> >> Well, you'll probably be able to reduce the run time a bit, but even with >> really good hardware and all in-memory processing, you're not going to see >> significant run-time improvements with that many rows. This is one of the >> reasons reporting-specific structures, such as fact tables, were designed >> to address. >> >> Repeatedly processing the same week/month/year aggregate worth of several >> million rows will just increase linearly with each iteration as data size >> increases. You need to maintain up-to-date aggregates on the metrics you >> actually want to measure, so you're only reading the few hundred rows you >> introduce every update period. You can retrieve those kind of results in a >> few milliseconds. >> >> -- >> Shaun Thomas >> OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 >> 312-676-8870 >> stho...@optionshouse.com >> >> __ >> >> See http://www.peak6.com/email_disclaimer/ for terms and conditions >> related to this email >> > >
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/20/2014 06:14 PM, Mark Kirkwood wrote: Notwithstanding the above results, my workmate Matt made an interesting observation: the scaling graph for (our) 60 core box (HT off), looks just like the one for our 32 core box with HT *on*. Hmm. I know this sounds stupid and unlikely, but has anyone actually tested PostgreSQL on a system with more than 64 legitimate cores? The work Robert Haas did to fix the CPU locking way back when showed significant improvements up to 64, but so far as I know, nobody really tested beyond that. I seem to remember similar choking effects when pre-9.2 systems encountered high CPU counts. I somehow doubt Intel would allow their HT architecture to regress so badly from Nehalem, which is almost 3-generations old at this point. This smells like something in the software stack, up to and including the Linux kernel. -- Shaun Thomas OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 312-676-8870 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Window functions, partitioning, and sorting performance
On 08/21/2014 08:29 AM, Eli Naeher wrote: With around 1.2 million rows, this takes 20 seconds to run. 1.2 million rows is only about a week's worth of data, so I'd like to figure out a way to make this faster. Well, you'll probably be able to reduce the run time a bit, but even with really good hardware and all in-memory processing, you're not going to see significant run-time improvements with that many rows. This is one of the reasons reporting-specific structures, such as fact tables, were designed to address. Repeatedly processing the same week/month/year aggregate worth of several million rows will just increase linearly with each iteration as data size increases. You need to maintain up-to-date aggregates on the metrics you actually want to measure, so you're only reading the few hundred rows you introduce every update period. You can retrieve those kind of results in a few milliseconds. -- Shaun Thomas OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 312-676-8870 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Window functions, partitioning, and sorting performance
On Thu, Aug 21, 2014 at 4:29 PM, Eli Naeher wrote: > Clearly the bulk of the time is spent sorting the rows in the original > table, and then again sorting the results of the subselect. But I'm afraid I > don't really know what to do with this information. Is there any way I can > speed this up? "Sort Method: external merge Disk: 120976kB" The obvious first step is to bump up work_mem to avoid disk-based sort. Try setting it to something like 256MB in your session and see how it performs then. This may also allow the planner to choose HashAggregate instead of sort. It not always straightforward how to tune correctly. It depends on your hardware, concurrency and query complexity, here's some advice: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#work_mem_maintainance_work_mem Also you could create an index on (route, direction, stop, stop_time) to avoid the inner sort entirely. And it seems that you can move the "INNER JOIN stop" to the outer query as well, not sure if that will change much. Try these and if it's still problematic, report back with a new EXPLAIN ANALYZE Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Window functions, partitioning, and sorting performance
I have a table called stop_event (a stop event is one bus passing one bus stop at a given time for a given route and direction), and I'd like to get the average interval for each stop/route/direction combination. A few hundred new events are written to the table once every minute. No rows are ever updated (or deleted, except in development). stop_event looks like this: Table "public.stop_event" Column |Type | Modifiers ---+-+--- stop_time | timestamp without time zone | not null stop | integer | not null bus | integer | not null direction | integer | not null route | integer | not null Foreign-key constraints: "stop_event_direction_id_fkey" FOREIGN KEY (direction) REFERENCES direction(id) "stop_event_route_fkey" FOREIGN KEY (route) REFERENCES route(id) "stop_event_stop" FOREIGN KEY (stop) REFERENCES stop(id) And my query looks like this: SELECT (floor(date_part(E'epoch', avg(interval))) / 60)::INTEGER, route, direction, name, st_asgeojson(stop_location)::JSON FROM (SELECT (stop_time - (lag(stop_time) OVER w)) AS interval, route, direction, name, stop_location FROM stop_event INNER JOIN stop ON (stop_event.stop = stop.id) WINDOW w AS (PARTITION BY route, direction, stop ORDER BY stop_time)) AS all_intervals WHERE (interval IS NOT NULL) GROUP BY route, direction, name, stop_location; With around 1.2 million rows, this takes 20 seconds to run. 1.2 million rows is only about a week's worth of data, so I'd like to figure out a way to make this faster. The EXPLAIN ANALYZE is at http://explain.depesz.com/s/ntC. Clearly the bulk of the time is spent sorting the rows in the original table, and then again sorting the results of the subselect. But I'm afraid I don't really know what to do with this information. Is there any way I can speed this up? Is my use of an aggregate key for stop_event causing problems? Would using a synthetic key help? Thank you for any help you can provide, -Eli
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 21/08/14 11:14, Mark Kirkwood wrote: You didn't mention what cpu this is for (or how many sockets etc), would be useful to know. Just to clarify - while you mentioned that the production system was 40 cores, it wasn't immediately obvious that the same system was the source of the measurements you posted...sorry if I'm being a mixture of pedantic and dense - just trying to make sure it is clear what systems/cpus etc we are talking about (with this in mind it never hurts to quote cpu and mobo model numbers)! Cheers Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance