tivities. That's all there was
time for with PG10. This is highly likely to be improved sometime in
the future.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
her columns"
So, if you were to write the query as:
explain analyze select col, k from test_tbl group by col, k order by k
limit 1000;
It should run much more quickly, although still not as fast as it could be.
[1] https://www.postgresql.org/docs/9.6/static/release-9-6.html
--
Davi
ed version of PostgreSQL?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 4 October 2017 at 08:48, Ben Nachtrieb wrote:
> I have 2 cores and my max_parallel_workers_per_gather = 2 and
> max_worker_processes = 8, but my load averages are between 8 and 5 with
> scheduled at 1/189 to 5/195. Are these so high because I increased
> max_parallel_workers_per_gather? My unde
ases, then the only variable part is validation_status_code. Perhaps
validation_status_code = 'P' is much more common in QA than in
production.
You can look at the stats as gathered by ANALYZE with:
\x on
select * from pg_stats where tablename = 'event__' and at
that we could have seen if those are accurate or not.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
us into a single column. "CASE ... WHEN
... THEN ... ELSE ... END" is quite helpful for doing stuff like that. For
now I'll just leave them as two columns.
SELECT status, payment_status, count(*)
FROM ud_document
WHERE uniqueid <> '201708141701018'
GROUP BY 1, 2;
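For reference, a rough sketch of the CASE approach mentioned above (the
combined status labels here are made up):

SELECT CASE
         WHEN status = 'complete' AND payment_status = 'paid' THEN 'done'
         WHEN payment_status = 'paid' THEN 'paid only'
         ELSE 'pending'
       END AS combined_status,
       count(*)
FROM ud_document
WHERE uniqueid <> '201708141701018'
GROUP BY 1;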
David J.
re no non-DISTINCT output columns, so we needn't check. Note that
* subquery_is_pushdown_safe already reported that we can't use volatile
* quals if there's DISTINCT or DISTINCT ON.)
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Suppo
hat
particular unlogged table since the data files are known to be accurate.
David J.
the somewhat
specialized nature of the problem, a response should be forthcoming even
though it's taking a bit longer than usual.
David J.
possibly partial)
plans and picking the best one - flagging those plan steps that can
leverage parallelism for possible execution.
David J.
eleased>=to_timestamp(0)
ORDER BY released desc
LIMIT 1) b;
or you could write a function which just runs that query. Although,
with the above or the function method, if you give this enough
authors, then it'll eventually become slower than the problem query.
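A sketch of the function approach (the table and column names here are
assumptions based on the fragment above):

-- Hypothetical helper returning the latest release for one author.
CREATE FUNCTION latest_release(p_author_id int) RETURNS timestamptz
LANGUAGE sql STABLE AS $$
  SELECT released
  FROM books
  WHERE author_id = p_author_id
    AND released >= to_timestamp(0)
  ORDER BY released DESC
  LIMIT 1;
$$;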
Perhaps if you kno
S ; 100 is the default, and 10000 is the maximum.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
other random_page_cost defined on it which is causing them never to be
preferred.
* you've actually got indexes
Also, you might like to try to EXPLAIN DELETE FROM wos_2017_1.article
WHERE ut = ''; to see if the planner makes use of the
index for that. If that's not choosing the
>> To understand recursion, one must first understand recursion.
This makes no sense unless you also provide the base case.
David
gresql.org/docs/9.6/static/pgbench.html
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
www.postgresql.org/docs/9.6/static/sql-grant.html
David J.
.
>
> xxx 172.23.110.175
> yyy 172.23.110.178
> zzz 172.23.110.177
> aaa 172.23.110.176
> bbb 172.23.111.180
> ccc 172.23.115.26
>
SELECT ... WHERE substring(ip_addr::text, 1, 10) = '172.23.110'
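Since the ::text cast suggests ip_addr is of type inet, a network-containment
test is another option; a sketch with a hypothetical table name:

SELECT * FROM hosts WHERE ip_addr <<= '172.23.110.0/24'::inet;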
David J.
On Wed, Mar 1, 2017 at 5:24 PM, Jeff Janes wrote:
> On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta > > wrote:
>>
>>> plain analyze
>>> select t
On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta
wrote:
> plain analyze
> select tmp_san_1.id
> from tmp_san_1
>left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
> where tmp_san_2.id is null;
>
> Does it help if you check for "tmp_san_2.text is null"?
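For comparison, the same anti-join can be written with NOT EXISTS (a sketch
using the table names from the quoted query):

SELECT tmp_san_1.id
FROM tmp_san_1
WHERE NOT EXISTS (SELECT 1
                  FROM tmp_san_2
                  WHERE tmp_san_2.text = tmp_san_1.text);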
David J.
on the visibility map or the heap).
https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html
David J.
;
IIRC the only reason the first query cares to use the index is because it
can perform an Index Only Scan and thus avoid touching the heap at all. If
it cannot avoid touching the heap the planner is going to just use a
sequential scan to retrieve the records directly from the heap and save the
index lookup step.
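A minimal sketch of the effect (hypothetical table and index; the second query
needs a column the index doesn't cover, so the heap must be visited):

CREATE TABLE items (id int, category text, price numeric);
CREATE INDEX items_cat_price_idx ON items (category, price);

-- Can be satisfied by an index-only scan (subject to the visibility map):
EXPLAIN SELECT category, price FROM items WHERE category = 'books';

-- Must fetch "id" from the heap, so a plain index scan or seq scan is used:
EXPLAIN SELECT category, price, id FROM items WHERE category = 'books';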
David J.
so I'm surprised you didn't consider
> just having a format along the lines of: magic+offset+page,
> magic+offset+page, magic+offset+page, etc...
>
> I'd have to defer to David on this, but I think he was considering
> having some kind of a bitmap to indicate which pag
On Wed, Jan 18, 2017 at 4:23 PM, Tom Lane wrote:
> "David G. Johnston" writes:
> > I'm feeling a bit dense here but even after having read a number of
> these
> > kinds of interchanges I still can't get it to stick. I think part of the
> > probl
r -
and probably examples of both as well since it's not clear when it can occur.
Some TLC to the docs here would be welcomed.
David J.
On Tue, Dec 27, 2016 at 10:38 AM, Valerii Valeev
wrote:
> Thank you David,
>
> I used same rationale to convince my colleague — it didn’t work :)
> Sort of “pragmatic” person who does what seems working no matter what
> happens tomorrow.
> So I’m seeking for better under
en obtaining rows from the master table. If this
is the case then you've gotten away from the expected usage of partitions
and so need to do things that aren't in the manual to make them work.
David J.
David J.
ed view data gets saved to the physical table thus making
the table clustered on whatever order by is specified.
David J.
temp table in function
>
>
I have no difficulty using arrays in functions.
As for "other methods" - you can use CTE (WITH) to create a truly local
table - updating the catalogs by using a temp table is indeed quite
expensive.
WITH vals AS ( VALUES (1, 'lw'), (2, 'lw2') )
SELECT * FROM vals;
David J.
On Thu, Jul 21, 2016 at 2:24 PM, Claudio Freire
wrote:
> That cross join doesn't look right. It has no join condition.
That is the definition of a "CROSS JOIN"...
David J.
On 07/08/2016 07:44 AM, vincent wrote:
>
>
> Op 7/8/2016 om 12:23 PM schreef Jean-David Beyer:
>> Why all this concern about how long a disk (or SSD) drive can stay up
>> after a power failure?
>>
>> It seems to me that anyone interested in maintaining an
seconds, my natural gas fueled backup
generator picks up the load very quickly.
Am I overlooking something?
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key:166D840A 0C610C8B Registered Machine 1935521.
/( )\ Shrewsbury, New Jersey http://linuxcounter.net
an additional index on "guid::text".
>
Or, better, persuade the app to label the value "public.push_guid" since that
is the column's type...a type you haven't defined for us.
If you get to add explicit casts this should be easy...but I'm not familiar
with the framework you are using.
David J.
ding to the docs)
You need to get to a point where you are seeing feedback from the
pg_restore process. Once you get it telling you what it is doing (or
trying to do) then diagnosing can begin.
David J.
re col like '' and col not like ''
Or a CTE (with)
With likeqry as ( select where like )
Select from likeqry where not like
(sorry for brevity but not at a pc)
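Spelled out, the CTE form above might look like this (table, column, and
patterns are placeholders):

WITH likeqry AS (
    SELECT *
    FROM some_table
    WHERE col LIKE 'foo%'
)
SELECT *
FROM likeqry
WHERE col NOT LIKE 'foobar%';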
David J.
On Tue, Jun 7, 2016 at 8:42 AM, Nicolas Paris wrote:
>
>
> 2016-06-07 14:39 GMT+02:00 David G. Johnston :
>
>> On Tue, Jun 7, 2016 at 8:36 AM, Nicolas Paris
>> wrote:
>>
>>> 2016-06-07 14:31 GMT+02:00 David G. Johnston >> >:
>>>
On Tue, Jun 7, 2016 at 8:36 AM, Nicolas Paris wrote:
> 2016-06-07 14:31 GMT+02:00 David G. Johnston :
>
>> On Tue, Jun 7, 2016 at 7:44 AM, Nicolas Paris
>> wrote:
>>
>>> Hello,
>>>
>>> I run a query transforming huge tables to a json document
> SQL ERROR[54000]
> ERROR: array size exceeds the maximum allowed (1073741823)
>
> https://www.postgresql.org/about/
Maximum Field Size: 1 GB
It doesn't matter that the data is never actually placed into a physical
table.
David J.
index even if it compiles (not tested):
CREATE FUNCTION similarity_80(col text, val text)
RETURNS boolean
SET pg_trgm.similarity_threshold = 0.80
LANGUAGE sql
AS $$
SELECT col % val;
$$;
David J.
t; (cost=0.56..199919.49 rows=7161253 width=15) (actual time=0.016..1625.031
> rows=7160921 loops=1)
>Index Cond: (es09codemp = 1)
>Heap Fetches: 51499
>Buffers: shared hit=75342
> Planning time: 50.129 ms
> Execution time: 380419.435 ms
>
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
e nice if you could
test the patch on the current git head to see if this fixes your
problem. The findings would be quite interesting for me. Please note
this patch is for test environments only at this stage, not for
production use.
--
David Rowley http://www.2ndQuadrant.com/
633764 11994442 1849232 2014935 4563638 132955919 7
>
>
Ok...again it's beyond my present experience, but it's what the planner
thinks about the distribution, and not what is actually present, that
matters.
David J.
query the planner thinks it needs 1.5 million of the rows and
will have to check each of them for visibility. It decided that scanning
the entire table was more efficient.
The LIMIT 1 in both queries should not be necessary. The planner is smart
enough to stop once it finds what it is looking for. In fact the LIMIT's
presence may be a contributing factor...but I cannot say for sure.
A better query seems like it would be:
WITH active_sites AS (
SELECT DISTINCT site_id FROM datavalues
)
SELECT *
FROM sites
JOIN active_sites USING (site_id);
David J.
as more aiming for some
agreement for generic infrastructure to do exactly as you describe.
[1] https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
-
ut at the same time understand the annoyingly poor optimisation in this
area.
Although please remember the patch I proposed was merely a first draft
proposal. Not for production use.
[1]
http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakj
be equal, and since pc.oid = 1, then pa.attrelid
must also be 1.
If we have 2 items in the IN() clause, then this no longer happens:
# explain select * from pg_class pc inner join pg_attribute pa on pc.oid =
pa.attrelid where pc.oid in(1,2);
QUERY PLAN
Nested Loop (cost=8.84..54.84 rows=15 width=422)
-> Bitmap Heap Scan on pg_class pc (cost=8.56..14.03 rows=2 width=223)
Recheck Cond: (oid = ANY ('{1,2}'::oid[]))
-> Bitmap Index Scan on pg_class_oid_index (cost=0.00..8.56
rows=2 width=0)
Index Cond: (oid = ANY ('{1,2}'::oid[]))
-> Index Scan using pg_attribute_relid_attnum_index on pg_attribute pa
(cost=0.28..20.33 rows=8 width=203)
Index Cond: (attrelid = pc.oid)
(7 rows)
In your case the claim_id = 'e8a38718-7997-4304-bbfa-138deb84aa82'::uuid
was pushed down into the subqueries, thus giving them less work to do, and
also the flexibility of using indexes on claim_id in the tables contained
within the subqueries. PostgreSQL currently does not push any inequality
predicates down at all.
A few months ago I did a little bit of work to try and lift this
restriction, although I only made it cover the >=, >, < and <= operators as
a first measure.
Details here:
http://www.postgresql.org/message-id/flat/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com#cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com
If you didn't have the VIEW, you could manually push these predicates into
each subquery. However this is not really possible to do with the VIEW.
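For illustration only (hypothetical table and predicate), manually pushing an
inequality predicate means repeating it inside the subquery instead of relying
on the planner:

-- Predicate only on the outer query; inequalities are not pushed down:
SELECT *
FROM (SELECT claim_id, created, sum(amount) AS total
      FROM claim_lines
      GROUP BY claim_id, created) s
WHERE s.created >= '2015-01-01';

-- The same predicate written inside the subquery, so an index on
-- claim_lines.created can be used:
SELECT *
FROM (SELECT claim_id, created, sum(amount) AS total
      FROM claim_lines
      WHERE created >= '2015-01-01'
      GROUP BY claim_id, created) s;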
Perhaps something could be done with a function and using dynamic SQL to
craft a query manually, or you could just get rid of the view and have the
application build the query. If that's not an option then maybe you could
respond to the thread above to mention that you've been hit by this
problem and would +1 some solution to fix it, and perhaps cross link to
this thread. I did have a little bit of a hard time in convincing people
that this was in fact a fairly common problem in the above thread, so it
would be nice to see people who have hit this problem respond to that.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
25..0.26 rows=1
> width=116) (actual time=0.401..0.402 rows=1 loops=1)"
> "Planning time: 0.058 ms"
> "Execution time: 0.423 ms"
>
>
I'm doubting the query inside of the function is the problem here...it is
the function usage itself. Calling a function has overhead in that the
body of the function needs to be processed. This only has to happen once per
session. The first call of the function incurs this overhead while
subsequent calls do not.
Pending others correcting me...I'm fairly certain of my conclusions, though
I'm somewhat inexperienced in doing this kind of diagnostics.
David J.
stem is designed to return data from the heap, not an index.
While it possibly can in some instances, if you need to return data you
should store it directly in the table.
David J.
ion tree representing runtime application
* of type-input conversion functions. (As of PG 7.3
* that is actually possible, but not clear that we want
* to rely on it.)
*
* The token location is attached to the run-time
* typecast, not to the Const, for the convenience of
* pg_stat_statements (w
Sorry Igor - yes wrong plan.
Here's the new one ...
(running a wee bit slower this morning - still 20x faster than before,
however)
http://explain.depesz.com/s/64YM
QUERY PLAN
---
Ok - wow.
Adding that index, I get the same estimate of 1 row, but a runtime of
~450ms.
A 23000ms improvement.
http://explain.depesz.com/s/TzF8h
This is great. So as a general rule of thumb, if I see a Join Filter
removing an excessive number of rows, I can check if that condition can be
added t
Thanks very much Tom.
Doesn't seem to quite do the trick. I created both those indexes (or the
missing one at least)
Then I ran analyse on stocksales_ib and branch_purchase_order.
I checked there were stats held in pg_stats for both indexes, which there
were.
But the query plan still predicts 1 ro
s.order_no,' ','')
) x
group by po_id,product_code
Explain:
http://explain.depesz.com/s/r7v
Can anyone suggest a better approach for improving the plan for this type
of query?
select version();
version
---
PostgreSQL 9.3.10 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3
20120306 (Red Hat 4.6.3-2), 64-bit
Regards,
--
David
rtitioning data has to be injected into the
query explicitly so that it is already in place before the planner receives
the query. Anything within the query requiring "execution" is handled by
the executor and at that point the chance to exclude partitions has come
and gone.
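A sketch of the distinction (hypothetical partitioned table with check
constraints on logdate):

-- The literal is known at planning time, so partitions can be excluded:
SELECT * FROM measurements WHERE logdate >= DATE '2016-01-01';

-- The value is only known at execution time, so no partitions are excluded:
SELECT * FROM measurements
WHERE logdate >= (SELECT max(run_date) FROM batch_runs);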
David J.
Physical memory is 61GB at the moment.
work_mem is 1,249,104kB
>>
> What's physical memory and setting of work_mem?
>
> merlin
>
--
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
T: +44 (0)1463 896484
23 October 2015 at 17:15, Jeff Janes wrote:
> On Fri, Oct 23, 2015 at 7:29 AM, David Osborne wrote:
>
>
>> Hi,
>>
>> Wondering if anyone could suggest how we could improve the performance of
>> this type of query?
>> The intensive part is the summing of inte
"mytable_seller_idx" btree (seller)
*Version:*
> SELECT version() ;
version
--
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3
20120306 (Red Hat 4.6.3-2), 64-bit
(1 row)
This is running on an AWS RDS instance.
Thanks for any pointers
--
David
rly. I ran
> everything again and there's a marked difference between 9.3 and 9.4.
> The parallel copy times still inflate, but only from 1.4s to 2.5s at 4
> procs. Though it gets a bit dicey after that.
>
>
>
Do the times still inflate in the same way if you perform the COPY b
> slow to use? Is it because it involves the *date()* function call that it
> makes it difficult for the planner to guess the data distribution in the
> DOCUMENT table?
>
What happens if you pre-compute the date condition and hard code it?
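For example (the column name and literal are assumptions):

-- Instead of a condition computed at run time, e.g.
--   WHERE doc_date >= date(now() - interval '30 days')
-- try the pre-computed literal:
SELECT * FROM document WHERE doc_date >= DATE '2015-09-15';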
David J.
ified. I'm not sure why the nested loop executor is
not intelligent enough to do this...
The important number in these plans is "loops", not "rows"
David J.
ure if the planner could be smarter because you are asking a
question it is not particularly suited to estimating - namely cross-table
correlations. Rethinking the model is likely to give you a better outcome
long-term though it does seem like there should be room for improvement
within the stated query and model.
As Tomas said you likely will benefit from increased working memory in
order to make materializing and hashing/bitmapping favorable compared to a
nested loop.
David J.
On Fri, Aug 21, 2015 at 8:07 AM, Stephane Bailliez
wrote:
>
> On Thu, Aug 20, 2015 at 8:19 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>>
>> SELECT [...]
>> FROM (SELECT reference_id, [...] FROM table_where_referenced_id_is_a_pk
t;UNION ALL" query you proposed.
David J.
-> Index Scan using t1_a_idx on t1 (cost=0.42..4.44 rows=1
>> width=8)
>>Index Cond: (a <= 1)
>>-> Seq Scan on t2 (cost=0.00..17.50 rows=1 width=4)
>> Filter: (a <= 1)
>> (8 rows)
>>
>>
>>
>
> Hi David,
badly because of this:
-> Seq Scan on fr13t1 (cost=0.00..25072.50
rows=311800 width=21) (actual time=0.007..115.766 rows=311800 loops=1)
Filter: (fr01codemp = '1'::smallint)
Buffers: shared hit=21175
Just how selec
-with-window-function/
>
>
>
I've not looked at the query in any detail, but that particular patch won't
help as it only allows pushdown of predicate into subqueries with window
functions where the predicate is part of all of the subquery's PARTITION BY
clauses.
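To illustrate the restriction with made-up names: under that patch the outer
predicate must be on a column that appears in every window's PARTITION BY
clause before it can be pushed into the subquery:

SELECT *
FROM (SELECT dept_id,
             emp_id,
             rank() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS r
      FROM employees) s
WHERE s.dept_id = 42;  -- dept_id is in the PARTITION BY, so it could be pushed down

-- A predicate on emp_id or r could change which rows each window sees,
-- so it cannot be pushed into the subquery.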
The query in qu
performance is better with 32 cores that 64
> cores?.
>
>
You might be interested in
http://www.postgresql.org/message-id/53f4f36e.6050...@agliodbs.com
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL
On Wed, Jul 15, 2015 at 4:53 PM, Michael Nolan wrote:
> On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco
> wrote:
>
>>
>> Thanks David, my example was a big simplification, but I appreciate your
>> guidance. The different event types have differing amounts of related d
us the update would likely end up writing an
entirely new record upon each event category recording.
David J.
On Wed, Jul 15, 2015 at 1:56 PM, Robert DiFalco
wrote:
>
>
> On Wed, Jul 15, 2015 at 10:33 AM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Wednesday, July 15, 2015, Robert DiFalco
>> wrote:
>>
>>> First off I apologize if
On Wednesday, July 15, 2015, Robert DiFalco
wrote:
> First off I apologize if this is question has been beaten to death. I've
> looked around for a simple answer and could not find one.
>
> Given a database that will not have it's PKEY or indices modified, is it
> generally faster to INSERT or UP
On Wed, Jul 15, 2015 at 12:16 PM, Robert DiFalco
wrote:
> First off I apologize if this is question has been beaten to death. I've
> looked around for a simple answer and could not find one.
>
> Given a database that will not have it's PKEY or indices modified, is it
> generally faster to INSERT
changes at worst would take
effect after a reboot - though most are used on the very next query that
runs.
The vacuum would indeed likely account for the gains - there being
significantly fewer dead/invisible rows to have to scan over and discard
while retrieving the live rows that fulfill your query.
David J.
You should repost this directly and not through Nabble. It has wrapped
your code in raw tags which the PostgreSQL mailing list software strips.
On Wednesday, June 3, 2015, ben.play wrote:
> Hi all,
>
> We have a big database (more than 300 Gb) and we run a lot of queries each
> minute.
>
> Howe
ou insert 273 rows at once, you are doing it
as 273 transactions instead of one?
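A minimal sketch of batching the inserts (table and values are placeholders):

-- One transaction instead of 273:
BEGIN;
INSERT INTO measurements (sensor_id, reading) VALUES (1, 20.5);
INSERT INTO measurements (sensor_id, reading) VALUES (2, 21.0);
-- ... remaining rows ...
COMMIT;

-- Or a single multi-row INSERT:
INSERT INTO measurements (sensor_id, reading)
VALUES (1, 20.5), (2, 21.0), (3, 19.8);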
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key:166D840A 0C610C8B Registered Machine 1935521.
/( )\ Shrewsbury, New Jersey http://linuxcounter.net
^^-^^ 09:00:01 up 3 days, 9:57
On Thursday, May 21, 2015, Bosco Rama wrote:
> On 05/20/15 20:22, David G. Johnston wrote:
> > On Monday, May 18, 2015, er.tejaspate...@gmail.com <
> > er.tejaspate...@gmail.com > wrote:
> >
> >> If I have to find upcoming birthdays in current week a
not affected by month boundaries but do start on Monday.
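A sketch of one way to express that (table and column names are assumptions;
date_trunc('week', ...) yields the Monday of the current week):

SELECT p.*
FROM people p
WHERE make_date(extract(year FROM current_date)::int,
                extract(month FROM p.birthdate)::int,
                extract(day FROM p.birthdate)::int)
      BETWEEN date_trunc('week', current_date)::date
          AND date_trunc('week', current_date)::date + 6;
-- Note: make_date() errors for Feb 29 birthdays in non-leap years;
-- a real query would need to handle that case.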
David J.
Shishkin wrote:
>
> > On 01 May 2015, at 13:54, David Osborne wrote:
> >
> > Hi,
> >
> > We have a query which finds the latest row_id for a particular code.
> >
> > We've found a backwards index scan is much slower than a forward one, to
>
'::bpchar)
Buffers: shared hit=14723
-> Bitmap Index Scan on table_code_idx (cost=0.00..5206.91
rows=278731 width=0) (actual time=23.298..23.298 rows=275909 loops=1)
Index Cond: (code = 'XX'::bpchar)
Buffers: shared hit=765
Total runtime: 184.043 ms
(13 rows)
http://explain.depesz.com/s/E9VE
Thanks in advance for any help.
Regards,
--
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
On Mon, Apr 13, 2015 at 7:45 PM, Jim Nasby wrote:
> On 4/13/15 7:32 PM, David G. Johnston wrote:
>
> That particular use-case would probably best be served with a separate
>> replication channel which pushes data files from the primary to the
>> slaves and allows for t
On Mon, Apr 13, 2015 at 4:49 PM, Jeff Janes wrote:
> On Mon, Apr 13, 2015 at 1:49 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Monday, April 13, 2015, Matheus de Oliveira
>> wrote:
>>
>>>
>>> On Mon, Apr 13, 2015 at 4:31
Instead, not knowing whether there were changes since
the last checkpoint, the system truncated the relation.
What use case is there for a behavior where the last checkpoint's data is
left in the relation upon restarting, not knowing whether other data could
have been written subsequently?
David J.
ndition probably using only 1 or 2 columns
instead of all five.
I'm not familiar with the caching constraint or the data, so it's hard to
make more specific suggestions.
David J.
stick the 50 UUIDs in some table,
analyze it, then perform a join between the 2 tables, using IN() or
EXISTS()... Is that any faster?
Also how well does it perform with: set enable_bitmapscan = off; ?
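A sketch of the temp-table approach (the table names and UUIDs are
placeholders):

CREATE TEMP TABLE wanted (id uuid PRIMARY KEY);
INSERT INTO wanted (id) VALUES
    ('00000000-0000-0000-0000-000000000001'),
    ('00000000-0000-0000-0000-000000000002');  -- ...and the other 48 values
ANALYZE wanted;

SELECT t.*
FROM big_table t
JOIN wanted w ON w.id = t.id;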
Regards
David Rowley
RAL. More detailed reports may at least bring
exposure to what is being used in the wild and garner interest from other
parties in improving things. Unfortunately this report is too limited to
really make a dent; lacking even the name of the ORM that is being used and
the entire queries that are bein
personally choose only between having different databases for each
client or using a "client_id" column in conjunction with a multi-tenant
database. Those are the two logical models; everything else (e.g.
partitioning) are physical implementation details.
David J.
s query as well as pagination-oriented queries are two that
come to mind. I think the material would fit well in the tutorial section
but having some kind of quick synopsis and cross reference in the
performance chapter would aid someone who's looking to solve a problem and
not in general edu
case of
queries perform better at execution time?
Is there a valid reason why you don't just write the query with the <=
operator?
Regards
David Rowley
t noticed last night that someone is working on them.
Regards
David Rowley
asking if there's some way to get PostgreSQL to run the 1st query faster?
Or are you asking if both queries are equivalent?
Regards
David Rowley
David G Johnston wrote
>
> Laurent Martelli wrote
>> Le 20/10/2014 15:58, Tom Lane a écrit :
>>> Laurent Martelli <
>> laurent.martelli@
>> > writes:
>>>> Do we agree that both queries are identical ?
>>> No, they *aren't* id
o) will generate sub-optimal plans that can be
rewritten using relational algebra and better optimized for having done so.
But such work takes resources that would be expended for every single query
while manually rewriting the sub-optimal query solves the problem
once-and-for-all.
David J.
7;s only the contract_contract table that's looked at. This
is because PostgreSQL sees that none of the columns from the 2 tables
which are being left joined to are used, and also that the columns that
you're joining to on these tables are unique, therefore joining to them
cannot duplicate any rows, and since these are left joined, if there was no
matching row, then it wouldn't filter out rows from the contract_contract
table, as it would with INNER JOINs. The planner sees that these left joins
are pointless, so just removes them from the plan.
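A minimal illustration of that join removal (the account table and columns
are made up; only contract_contract columns are selected and the join column
on the left-joined table is unique):

-- The planner can drop the left join entirely: no columns of "a" are used,
-- a.id is unique, and a missing match would not filter out any rows.
EXPLAIN
SELECT c.*
FROM contract_contract c
LEFT JOIN account_account a ON a.id = c.account_id;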
Regards
David Rowley
> GroupAggregate (cost=0.42..228012.62 rows=208120 width=15) (actual
> time=0.042..2089.367 rows=20 loops=1)
>Buffers: shared hit=53146
>-> Index Scan using t1_name on t1 (cos
will perform better since fewer rows
must be evaluated in the less efficient count(DISTINCT) expression - the
time saved there more than offsetting the cost of effectively passing over
that subset of the data a second time.
HashAggregate(1M rows) + Aggregate(200k rows) < Aggregate(1M rows)
Huang, Suya wrote
> Can someone figure out why the first query runs so slow comparing to the
> second one? They generate the same result...
Try: EXPLAIN (ANALYZE, BUFFERS)
I believe you are only seeing caching effects.
David J.
each twice - and note that (each(...).*) does not work to avoid the
double-call - you have to use a subquery / a CTE one to ensure that it is
not collapsed (offset 0 should work too but I find the CTE one a little
cleaner personally).
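A sketch of the CTE form (assuming an hstore column h on a table tbl):

WITH expanded AS (
    SELECT each(h) AS kv   -- each() is evaluated only once per row here
    FROM tbl
)
SELECT (kv).key, (kv).value
FROM expanded;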
David J.
On Wed, Aug 27, 2014 at 1:02 AM, Alex Goncharov <
alex.goncharov@gmail.com> wrote:
> > Thank you, Kevin -- this is helpful.
>
> Thank you David, too.
>
>
> > But it still leaves questions for me.
>
> Still...
>
>
> Alex Goncharov wrote:
>
, and another import was to be attempted,
ideally the allocated space could be reused.
I'm not sure what a reasonable formula would be, especially at the TB
scale, but roughly 2x the size of the imported (uncompressed) file would be
a good starting point (table + WAL). You likely would want m
t RAM you might as well commit and free up that RAM
for the next batch.
David J.