Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-30 Thread Spiros Ioannou
That would probably need a replica of the data, which is not possible
(the tablespace is 4TB).








*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 30 July 2015 at 21:47, Scott Marlowe  wrote:

> You might want to try pgreplay: http://laurenz.github.io/pgreplay/
>
> On Thu, Jul 30, 2015 at 7:23 AM, Spiros Ioannou 
> wrote:
>
>> I'm very sorry but we don't have a synthetic load generator for our
>> testing setup, only production and that is on SLA. I would be happy to test
>> the next release though.
>>
>>
>>
>>
>>
>>
>>
>>
>> *Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*
>>
>> On 29 July 2015 at 13:42, Heikki Linnakangas  wrote:
>>
>>> On 07/28/2015 11:36 PM, Heikki Linnakangas wrote:
>>>
>>>> A-ha, I succeeded in reproducing this now on my laptop, with pgbench! It
>>>> seems to be important to have a very large number of connections:
>>>>
>>>> pgbench -n -c400 -j4 -T600 -P5
>>>>
>>>> That got stuck after a few minutes. I'm using commit_delay=100.
>>>>
>>>> Now that I have something to work with, I'll investigate this more
>>>> tomorrow.
>>>>
>>>
>>> Ok, it seems that this is caused by the same issue that I found with my
>>> synthetic test case, after all. It is possible to get a lockup because of
>>> it.
>>>
>>> For the archives, here's a hopefully easier-to-understand explanation of
>>> how the lockup happens. It involves three backends. A and C are inserting
>>> WAL records, while B is flushing the WAL with commit_delay. The byte
>>> positions 2000, 2100, 2200, and 2300 are offsets within a WAL page. 2000
>>> points to the beginning of the page, while the others are later positions
>>> on the same page. WaitToFinish() is an abbreviation for
>>> WaitXLogInsertionsToFinish(). "Update pos X" means a call to
>>> WALInsertLockUpdateInsertingAt(X). "Reserve A-B" means a call to
>>> ReserveXLogInsertLocation, which returned StartPos A and EndPos B.
>>>
>>> Backend A                Backend B                 Backend C
>>> ---------                ---------                 ---------
>>> Acquire InsertLock 2
>>> Reserve 2100-2200
>>>                          Calls WaitToFinish()
>>>                            reservedUpto is 2200
>>>                            sees that Lock 1 is
>>>                            free
>>>                                                    Acquire InsertLock 1
>>>                                                    Reserve 2200-2300
>>>                                                    GetXLogBuffer(2200)
>>>                                                     page not in cache
>>>                                                     Update pos 2000
>>>                                                     AdvanceXLInsertBuffer()
>>>                                                      run until about to
>>>                                                      acquire WALWriteLock
>>> GetXLogBuffer(2100)
>>>  page not in cache
>>>  Update pos 2000
>>>  AdvanceXLInsertBuffer()
>>>   Acquire WALWriteLock
>>>   write out old page
>>>   initialize new page
>>>   Release WALWriteLock
>>> finishes insertion
>>> release InsertLock 2
>>>                          WaitToFinish() continues
>>>                            sees that lock 2 is
>>>                            free. Returns 2200.
>>>
>>>                          Acquire WALWriteLock
>>>                          Call WaitToFinish(2200)
>>>                            blocks on Lock 1,
>>>                            whose initializedUpto
>>>                            is 2000.
>>>
>>> At this point, there is a deadlock between B and C. B is waiting for C
>>> to release the lock or update its insertingAt value past 2200, while C is
>>> waiting for WALWriteLock, which is held by B.
>>>
>>> To fix that, let's fix GetXLogBuffer() to always advertise the exact
>>> position, not the beginning of the page (except when inserting the first
>>> record on the page, just after the page header, see comments).
>>>
>>> This fixes the problem for me. I've been running pgbench for about 30
>>> minutes without lockups now, while without the patch it locked up within a
>>> couple of minutes. Spiros, can you easily test this patch in your
>>> environment? Would be nice to get a confirmation that this fixes the
>>> problem for you too.
>>>
>>> - Heikki
>>>
>>>
>>
>
>
> --
> To understand recursion, one must first understand recursion.
>


Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-28 Thread Spiros Ioannou
>
> Hmm. With that CTE query, were there other queries running at the same
> time?


Yes, a multitude of INSERTs and one COMMIT, as I remember. I've noticed that
in every stuck situation there was always one stuck COMMIT running. We do
mainly INSERT and SELECT, very rarely UPDATE or DELETE,
but the stuck queries also happen without the CTE. The trigger on insert
runs non-stop, though.

I think I've noticed a different behaviour with 9.4 and the CTE, but I'm
only about 50% sure of this, and it may or may not be relevant:
in 9.3, while the CTE was running, I think the trigger-filled table was
locked and no inserts could be made to the primary table (which copies all
inserts to the trigger table). In 9.4, inserts can still be made, at a lower
rate, even while the CTE runs. As I said, I'm not 100% sure the behaviour
was like that in 9.3.


I was able to debug the synthetic test case I created, but unfortunately I
> don't think it explains the lock up you're seeing after all.
>
> It's possible for WaitXLogInsertionsToFinish() to move backwards, in this
> scenario:
>
> 1. Backend A acquires WALInsertLock 2, and reserves xlog between byte
> positions 2100 - 2200
> 2. Backend B calls WaitXLogInsertionsToFinish(), which blocks on backend
> A, which hasn't advertised any location yet.
> 3. Backend C acquires WALInsertLock 1, and reserves xlog between byte
> positions 2200 - 2300
> 4. Backend C calls GetXLogBuffer(), and sees that the page is not in cache
> yet. (It does not call WALInsertLockUpdateInsertingAt() yet, because it's a
> bit slow or context-switched out)
> 5. Backend A initializes the page, completes inserting its WAL record, and
> releases its WALInsertLock.
> 6. Backend B gets unblocked, seeing that the lock held by A is now free.
> It calculated 2200 as the return value, which was the latest reserved WAL
> position. (Backend C started after it began, so it didn't have to wait for
> it)
> 7. Backend C calls WALInsertLockUpdateInsertingAt(), with a WAL position
> pointing to the beginning of the page, 2000.
>
> If you now call WaitXLogInsertionsToFinish() again, it will return
> 2000, because backend C is the only backend holding a lock, and its
> advertised position is 2000. But the previous call calculated 2200.
> GetXLogBuffer() always advertises a WAL position at the beginning of the
> requested page, but that's a bit bogus.
>
> However, AFAICS that is actually harmless. Backend C is not blocked. The
> page it's looking for is certainly in cache at this point, so it can
> continue without blocking. So I don't think this explains your lockup.


Thanks for the investigation. The only thing I can offer is that it's still
running fine with commit_delay = 0. Perhaps a warning should be put in the
documentation until the cause is found.
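
The workaround above can be applied without a restart; a minimal sketch
(ALTER SYSTEM needs superuser and 9.4+; editing postgresql.conf and reloading
works the same way):

ALTER SYSTEM SET commit_delay = 0;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();             -- commit_delay does not require a restart
SHOW commit_delay;                   -- verify the new value took effect
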
I've found a similar issue online:
http://dba.stackexchange.com/questions/96957/postgresql-9-4-1-stuck-all-queries-when-making-multi-updates
thanks
-Spiros


Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-23 Thread Spiros Ioannou
Well, so far with commit_delay=0 there have been no problems. I will report back,
of course, if something happens, but I believe that the problem may indeed be
solved/masked with that setting.

Rough description of our setup, or how to reproduce:
* Timeseries data in a table, say, "measurements"; size: 3-4TB, about 1000
inserts/second
* The measurements table also has an insert trigger that copies each row into
measurements_a (for daily export purposes); a sketch of such a trigger is below.
Just the above would cause a stuck query after a few days.
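
A minimal sketch of such an insert trigger, using the illustrative table names
from the description above (the real schema differs):

CREATE OR REPLACE FUNCTION measurements_copy_trig_func() RETURNS trigger AS $$
BEGIN
    -- copy every new row into the daily-export staging table
    INSERT INTO measurements_a VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurements_copy_trig
    AFTER INSERT ON measurements
    FOR EACH ROW EXECUTE PROCEDURE measurements_copy_trig_func();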

Now for exporting we run the following CTE query (measurements_b is an
empty table, measurements_a has about 5GB):
WITH d_rows AS (DELETE FROM measurement_events_a RETURNING *)
  INSERT INTO measurement_events_b SELECT * FROM d_rows;
The above caused the problem to appear every time, after 10-20 minutes.

Regards,
-Spiros















*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 20 July 2015 at 17:02, Andres Freund  wrote:

> On 2015-07-20 17:00:52 +0300, Spiros Ioannou wrote:
> > FYI we have an 9.3.5 with commit_delay = 4000 and commit_siblings = 5
> with
> > a 8TB dataset which seems fine. (Runs on different - faster hardware
> > though).
>
> 9.4 has a different xlog insertion algorithm (scaling much better), so
> that unfortunately doesn't say very much...
>


Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-20 Thread Spiros Ioannou
FYI, we have a 9.3.5 instance with commit_delay = 4000 and commit_siblings = 5
and an 8TB dataset, which seems fine. (It runs on different, faster hardware
though.)








*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 20 July 2015 at 15:01, Andres Freund  wrote:

> Heikki,
>
> On 2015-07-20 13:27:12 +0200, Andres Freund wrote:
> > On 2015-07-20 13:22:42 +0200, Andres Freund wrote:
> > > Hm. The problem seems to be the WaitXLogInsertionsToFinish() call in
> > > XLogFlush().
> >
> > These are the relevant stack traces:
> > db9lock/debuglog-commit.txt
> > #2  0x7f7405bd44f4 in LWLockWaitForVar (l=0x7f70f2ab6680,
> valptr=0x7f70f2ab66a0, oldval=, newval=0x)
> at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:1011
> > #3  0x7f7405a0d3e6 in WaitXLogInsertionsToFinish
> (upto=121713318915952) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1755
> > #4  0x7f7405a0e1d3 in XLogFlush (record=121713318911056) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2849
> >
> > db9lock/debuglog-insert-8276.txt
> > #1  0x7f7405b77d91 in PGSemaphoreLock (sema=0x7f73ff6531d0,
> interruptOK=0 '\000') at pg_sema.c:421
> > #2  0x7f7405bd4849 in LWLockAcquireCommon (val=,
> valptr=, mode=, l=) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:626
> > #3  LWLockAcquire (l=0x7f70ecaaa1a0, mode=LW_EXCLUSIVE) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:467
> > #4  0x7f7405a0dcca in AdvanceXLInsertBuffer (upto=,
> opportunistic=) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2161
> > #5  0x7f7405a0e301 in GetXLogBuffer (ptr=121713318928384) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1848
> > #6  0x7f7405a0e9c9 in CopyXLogRecordToWAL (EndPos=,
> StartPos=, rdata=0x71c21b90, isLogSwitch= out>, write_len=) at
> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1494
> > #7  XLogInsert (rmid=, info=,
> rdata=) at /tmp/buildd/postgre
>
>
> XLogFlush() has the following comment:
>
>     /*
>      * Re-check how far we can now flush the WAL. It's generally not
>      * safe to call WaitXLogInsertionsToFinish while holding
>      * WALWriteLock, because an in-progress insertion might need to
>      * also grab WALWriteLock to make progress. But we know that all
>      * the insertions up to insertpos have already finished, because
>      * that's what the earlier WaitXLogInsertionsToFinish() returned.
>      * We're only calling it again to allow insertpos to be moved
>      * further forward, not to actually wait for anyone.
>      */
>     insertpos = WaitXLogInsertionsToFinish(insertpos);
>
> but I don't think that's valid reasoning. WaitXLogInsertionsToFinish()
> calls LWLockWaitForVar(oldval = InvalidXLogRecPtr), which will block if
> there's an exclusive locker and some backend hasn't yet set
> initializedUpto. Which seems like a possible state?
>


Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-20 Thread Spiros Ioannou
Debian 6, x64, HP Proliant, dbsize: 3TB on EMC AX-4i (iscsi), 16 cores,
24GB RAM.
kernel: Linux vserver11 2.6.32-5-xen-amd64 #1 SMP Sun Dec 7 22:52:42 UTC
2014 x86_64 GNU/Linux
I attach postgresql.conf










*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 20 July 2015 at 16:04, Andres Freund  wrote:

> Hi,
>
> On 2015-07-20 15:58:33 +0300, Spiros Ioannou wrote:
> > Happened again, another backtrace from a COMMIT process. I changed the
> > commit_delay to 0 (it was 4000 to help with our storage) and will report
> > back.
>
> What hardware & OS is this happening on?
>
> Regards,
>
> Andres
>


postgresql.conf
Description: Binary data



Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-20 Thread Spiros Ioannou
Happened again, another backtrace from a COMMIT process. I changed the
commit_delay to 0 (it was 4000 to help with our storage) and will report
back.

Already logging to debuglog.txt.
#0  0x7f47b0789ec7 in semop () from /lib/libc.so.6
#1  0x7f47b2513d91 in PGSemaphoreLock (sema=0x7f47abfe7ed0,
interruptOK=0 '\000') at pg_sema.c:421
#2  0x7f47b25704f4 in LWLockWaitForVar (l=0x7f449f452680,
valptr=0x7f449f4526a0, oldval=, newval=0x)
at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:1011
#3  0x7f47b23a93e6 in WaitXLogInsertionsToFinish (upto=121780827906336)
at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1755
#4  0x7f47b23aa1d3 in XLogFlush (record=121780827898328) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2849
#5  0x7f47b2398f17 in RecordTransactionCommit () at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xact.c:1196
#6  0x7f47b239921b in CommitTransaction () at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xact.c:1953
#7  0x7f47b239a8e5 in CommitTransactionCommand () at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xact.c:2732
#8  0x7f47b257c105 in finish_xact_command () at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/tcop/postgres.c:2437
#9  0x7f47b2580a10 in exec_execute_message (max_rows=,
portal_name=) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/tcop/postgres.c:1974
#10 PostgresMain (argc=, argv=,
dbname=0x7f47b49b5730 "ifms_db", username=) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/tcop/postgres.c:4137
#11 0x7f47b25280f5 in BackendRun (port=) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/postmaster/postmaster.c:4164
#12 BackendStartup (port=) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/postmaster/postmaster.c:3829
#13 ServerLoop () at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/postmaster/postmaster.c:1597
#14 PostmasterMain (argc=, argv=) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/postmaster/postmaster.c:1244
#15 0x7f47b24ba474 in main (argc=5, argv=0x7f47b49b4920) at
/tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/main/main.c:228
Detaching from program: /usr/lib/postgresql/9.4/bin/postgres, process 15601









*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 20 July 2015 at 15:21, Heikki Linnakangas  wrote:

> On 07/20/2015 03:01 PM, Andres Freund wrote:
>
>> Heikki,
>>
>> On 2015-07-20 13:27:12 +0200, Andres Freund wrote:
>>
>>> On 2015-07-20 13:22:42 +0200, Andres Freund wrote:
>>>
>>>> Hm. The problem seems to be the WaitXLogInsertionsToFinish() call in
>>>> XLogFlush().
>>>>
>>>
>>> These are the relevant stack traces:
>>> db9lock/debuglog-commit.txt
>>> #2  0x7f7405bd44f4 in LWLockWaitForVar (l=0x7f70f2ab6680,
>>> valptr=0x7f70f2ab66a0, oldval=, newval=0x)
>>> at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:1011
>>> #3  0x7f7405a0d3e6 in WaitXLogInsertionsToFinish
>>> (upto=121713318915952) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1755
>>> #4  0x7f7405a0e1d3 in XLogFlush (record=121713318911056) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2849
>>>
>>> db9lock/debuglog-insert-8276.txt
>>> #1  0x7f7405b77d91 in PGSemaphoreLock (sema=0x7f73ff6531d0,
>>> interruptOK=0 '\000') at pg_sema.c:421
>>> #2  0x7f7405bd4849 in LWLockAcquireCommon (val=,
>>> valptr=, mode=, l=) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:626
>>> #3  LWLockAcquire (l=0x7f70ecaaa1a0, mode=LW_EXCLUSIVE) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:467
>>> #4  0x7f7405a0dcca in AdvanceXLInsertBuffer (upto=,
>>> opportunistic=) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2161
>>> #5  0x7f7405a0e301 in GetXLogBuffer (ptr=121713318928384) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1848
>>> #6  0x7f7405a0e9c9 in CopyXLogRecordToWAL (EndPos=,
>>> StartPos=, rdata=0x71c21b90, isLogSwitch=>> out>, write_len=) at
>>> /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1494
>>> #7  XLogInsert (rmid=, info=,
>>> rdata=) at /tmp/buildd/postgre
>>>
>>
>>
>> XLog

Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-20 Thread Spiros Ioannou
Hi Tom,
thank you for your input. The DB was stuck again, I attach all logs and
stack traces.

A stack trace from a COMMIT, an INSERT, an UPDATE, the wal writer, the
writer, and a sequence.

Stracing the commit was stuck at: semop(3145761, {{12, -1, 0}}, 1

Please tell me what else I could do to help with debugging.

Is there a way to downgrade PostgreSQL?








*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 20 July 2015 at 01:01, Tom Lane  wrote:

> Spiros Ioannou  writes:
> > We upgraded to 9.4.4 from 8.4 We have tried 9.3.x in the past with
> success.
> > Unfortunately in 9.4 gets stuck 2-3 times a week.
>
> > Lots of queries seem stuck, that is they have waiting:false. Queries can
> be
> > COMMIT or INSERT.
> > No blocking locks seem to exist.
> > strace reveals that all stuck processes they are in semop
> > semop(2064417, {{0, -1, 0}}, 1^C 
>
> Stack traces of the stuck processes might help determine what's going on.
>
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> Please try to capture traces for both the COMMIT and INSERT cases,
> as they're almost certainly different.
>
> regards, tom lane
>


db9lock.tar.gz
Description: GNU Zip compressed data



Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-19 Thread Spiros Ioannou
OS is Debian 6.


[GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-19 Thread Spiros Ioannou
We upgraded to 9.4.4 from 8.4. We had tried 9.3.x in the past with success.
Unfortunately, 9.4 gets stuck 2-3 times a week.

Lots of queries seem stuck, that is, they have waiting:false. The queries can be
COMMIT or INSERT.
No blocking locks seem to exist.
strace reveals that all the stuck processes are in semop:
semop(2064417, {{0, -1, 0}}, 1^C

ipcs reveals no shortage of semaphores.
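
A sketch of the kind of check that shows the symptom on 9.4 (the stuck backends
are active but not waiting on a heavyweight lock; the 5-minute threshold here is
arbitrary):

SELECT pid, state, waiting, now() - query_start AS runtime,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 minutes'
ORDER BY runtime DESC;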

Any ideas are very welcome; our service is now in a very sad state.







*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*


Re: [GENERAL] How to access large objects in Postgresql c-language functions?

2015-01-26 Thread Spiros Ioannou
Thanks, that's interesting, I will look into that.
Regards,
-Spiros


Re: [GENERAL] How to access large objects in Postgresql c-language functions?

2015-01-26 Thread Spiros Ioannou
But these need a connection to the PostgreSQL backend. It seems overkill
for PostgreSQL to connect to itself; I would prefer something more optimal.
It seems for now that reading libpq's source code is the only way to go.
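
A hedged aside: the large-object calls also exist as server-side SQL functions,
so they can be reached without a second connection (for example through SPI from
C code). A rough sketch; 16403 is a placeholder OID, and 131072/262144 are the
INV_WRITE/INV_READ flag values:

BEGIN;
SELECT lo_open(16403, 131072 | 262144);  -- returns a descriptor, 0 in a fresh transaction
SELECT loread(0, 8192);                  -- read up to 8192 bytes via that descriptor
SELECT lowrite(0, '\x0102'::bytea);      -- overwrite bytes at the current offset
SELECT lo_close(0);
COMMIT;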

Regards,

Spiros


[GENERAL] How to access large objects in Postgresql c-language functions?

2015-01-26 Thread Spiros Ioannou
I'm writing a C-language user-defined function for PostgreSQL 9.3.5. I
didn't use PL/pgSQL or PL/Python because part of it needs to be written in
C for speed.

The function gets the OID of an existing large object as an argument. How can
I access and modify the large object's data from within the C-language
function? I can't find any docs for that!

Thanks,

-Spiros


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-23 Thread Spiros Ioannou
Thank you for your input.

When doing an "ORDER BY measurement_time DESC" I had hoped for the query
planner to query the most recent "child" table first, i.e. the table
holding the current month's data, and then move on to the oldest table, since
it knows the partitioning is based on measurement_time. Instead it always
queries the oldest table first and then moves on to the current one.

We did the partitioning to be able to drop old data and save disk space, as
you said, not for performance reasons, although we hoped that having
smaller indexes would also help.

Best regards,
-Spiros









*Spiros Ioannou, IT Manager, inAccess | www.inaccess.com <http://www.inaccess.com> | M: +30 6973-903808 | T: +30 210-6802-358*

On 22 January 2015 at 17:37, Tom Lane  wrote:

> Spiros Ioannou  writes:
> > It is ORDER BY measurement_time, not measurement_id, and measurement_time
> > is used to create the partition. So the planner should know the correct
> > order, but instead it seems to query tables in the wrong order.
>
> The planner does not know that, and even if it attempted to figure it out
> by comparing the child tables' constraints, it could not generate a plan
> that considered only one child table as you incorrectly imagine.  What if
> the "latest" table turned out to be empty at runtime?
>
> The obtained plan with a Merge Append atop Index Scan Backwards nodes
> seems perfectly reasonable to me.  This will result in fetching only the
> latest row within each partition, so that the work involved is O(number of
> partitions) not O(total number of rows).
>
> If you're not happy with that, reconsider how many partitions you really
> need.  Newbies almost invariably create far more partitions than is a good
> idea for performance.  In my view, if you've got more than a couple dozen,
> you're doing it wrong.  Partitioning is, in general, not a benefit for
> query performance (except in a few very narrow, specialized cases); and
> the more partitions you have the worse the penalty.  Partitioning only
> helps for data management, in particular being able to drop old data in
> bulk rather than through expensive DELETE WHERE queries.  How often do
> you do that, and do you really need to be able to do it at a small
> granularity?
>
> regards, tom lane
>


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-22 Thread Spiros Ioannou
>
>
> > EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
> > measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
> > measurement_time DESC LIMIT 1;
> >
> > This seems to fail, scanning all tables. Do you think this can be
> improved
> > at all ? The query plan of the above query is as follows:
>
> The combination of sorting by measurement_source_id and LIMIT
> hinders constraint exclusion, because the order of the column
> across the whole inheritance tree is not known to the planner. And the
> plan below also doesn't show whether constraint exclusion worked
> or not, for the same reason. But I suppose it worked.
>

It is ORDER BY measurement_time, not measurement_id, and measurement_time
is used to create the partition. So the planner should know the correct
order, but instead it seems to query tables in the wrong order.


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-21 Thread Spiros Ioannou
@Rob Sargent: sorry Rob, not sure what you are asking.

@Kyotaro HORIGUCHI
thanks for your reply and time Kyotaro,

Using the following query
EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
measurement_time >= '2015-01-01 00:00:00+0' LIMIT 1;

produces this plan:

 Limit  (cost=0.00..4.02 rows=1 width=67) (actual time=49.125..49.125
rows=1 loops=1)
   ->  Append  (cost=0.00..3644.05 rows=907 width=67) (actual
time=49.122..49.122 rows=1 loops=1)
 ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
width=966) (actual time=0.003..0.003 rows=0 loops=1)
   Filter: ((measurement_time >= '2015-01-01
02:00:00+02'::timestamp with time zone) AND (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
 ->  Bitmap Heap Scan on *measurement_events_p2015_01*
 (cost=41.73..3546.10 rows=894 width=54) (actual time=49.119..49.119 rows=1
loops=1)
   Recheck Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01 02:00:00+02'::timestamp with time zone))
   ->  Bitmap Index Scan on *measurement_events_p2015_01_pkey*
 (cost=0.00..41.51 rows=894 width=0) (actual time=41.836..41.836 rows=997
loops=1)
 Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01 02:00:00+02'::timestamp with time zone))
 ->  Index Scan using *measurement_events_p2015_02_pkey* on
*measurement_events_p2015_02*  (cost=0.14..8.16 rows=1 width=966) (never
executed)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01 02:00:00+02'::timestamp with time zone))
 ->  Index Scan using *measurement_events_p2015_03_pkey* on
*measurement_events_p2015_03*  (cost=0.14..8.16 rows=1 width=966) (never
executed)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01 02:00:00+02'::timestamp with time zone))
... (cut for brevity)

1) Do you know if this means that the query will stop at the first match (since
it is LIMIT 1), or will it search all tables regardless of the results?

2) To improve on the above, do you (or anyone else) have any input on this:

to get the latest value from all tables, we were using the following query
(before partitioning):

EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
measurement_time DESC LIMIT 1;

This seems to fail, scanning all tables. Do you think this can be improved
at all? The query plan of the above query is as follows:

--
 Limit  (cost=5.57..9.99 rows=1 width=921) (actual time=5.361..5.361 rows=1
loops=1)
   ->  Merge Append  (cost=5.57..451374.16 rows=102155 width=921) (actual
time=5.359..5.359 rows=1 loops=1)
 Sort Key: measurement_events.measurement_time
 ->  Index Scan Backward using measurement_events_pkey on
*measurement_events*  (cost=0.12..8.14 rows=1 width=966) (actual
time=0.004..0.004 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 ->  Index Scan Backward using *measurement_events_p2014_01_pkey*
on *measurement_events_p2014_01*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.002..0.002 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 ->  Index Scan Backward using *measurement_events_p2014_02_pkey*
on *measurement_events_p2014_02*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.001..0.001 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 ->  Index Scan Backward using *measurement_events_p2014_03_pkey*
on *measurement_events_p2014_03*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.001..0.001 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)

. (cut for brevity)


[GENERAL] partitioning query planner almost always scans all tables

2015-01-19 Thread Spiros Ioannou
Hello group,
we have a timeseries table, and we tried to partition it by month (with
pg_partman). It seems the query planner always reads all the child tables
regardless of the WHERE clause, except when the WHERE condition is an equality.

*the parent table:*

ifms_db=# \dS measurement_events
  Table "public.measurement_events"
Column |   Type   | Modifiers
---+--+---
 measurement_source_id | uuid | not null
 measurement_time  | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value | character varying(200)   | not null
 quality   | character varying(500)   | not null
Indexes:
"measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id,
measurement_time)
Triggers:
measurement_events_part_trig BEFORE INSERT ON measurement_events FOR
EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
Number of child tables: 25 (Use \d+ to list them.)


One of the children tables (2014_3)

ifms_db=# \dS measurement_events_p2014_03
  Table "public.measurement_events_p2014_03"
Column |   Type   | Modifiers
---+--+---
 measurement_source_id | uuid | not null
 measurement_time  | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value | character varying(200)   | not null
 quality   | character varying(500)   | not null
Indexes:
"measurement_events_p2014_03_pkey" PRIMARY KEY, btree
(measurement_source_id, measurement_time)
Check constraints:
"measurement_events_p2014_03_partition_check" CHECK (measurement_time
>= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time
< '2014-04-01 00:00:00+03'::timestamp with time zone)
Inherits: measurement_events



The query:
# explain analyze  select * from measurement_events where
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
measurement_time >= DATE '2015-01-01' limit 1;


 Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1
loops=1)
   ->  Append  (cost=0.00..2696.08 rows=655 width=87) (actual
time=0.376..0.376 rows=1 loops=1)
 ->  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
width=966) (actual time=0.001..0.001 rows=0 loops=1)
   Filter: ((measurement_time >= '2015-01-01'::date) AND
(measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
 ->  Index Scan using measurement_events_p2014_01_pkey on
measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.005..0.005 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_02_pkey on
measurement_events_p2014_02  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_03_pkey on
measurement_events_p2014_03  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_04_pkey on
measurement_events_p2014_04  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_05_pkey on
measurement_events_p2014_05  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_06_pkey on
measurement_events_p2014_06  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan using measurement_events_p2014_07_pkey on
measurement_events_p2014_07  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >=
'2015-01-01'::date))
 ->  Index Scan 
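
A hedged note, not from the thread: whether the 2014 children can be pruned at
plan time depends on constraint_exclusion and on the literal's type. One likely
explanation for the plan above is that a DATE constant compared against a
timestamptz column goes through a time-zone-dependent (stable) conversion, which
the planner cannot use to refute the CHECK constraints when building the plan,
whereas a comparison against a timestamptz constant can be proven at plan time.
Two quick checks, under those assumptions:

SHOW constraint_exclusion;   -- should be 'partition' (the default) or 'on'

EXPLAIN ANALYZE
SELECT * FROM measurement_events
WHERE measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'
  AND measurement_time >= TIMESTAMPTZ '2015-01-01 00:00:00+00'
LIMIT 1;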

Re: [GENERAL] Largely inconsistent query execution speed, involving psql_tmp

2014-07-09 Thread Spiros Ioannou
There are indices:
table:measurement_events"measurement_events_pkey" PRIMARY KEY, btree
(measurement_source_id, measurement_time)
table:measurement_sources"measurement_sources_pkey" PRIMARY KEY, btree
(measurement_source_id)

-Spiros


On 8 July 2014 18:10, Andy Colson  wrote:

> On 7/8/2014 4:47 AM, Spiros Ioannou wrote:
>
>> While executing the following query through psql :
>>
>> SELECT me.* FROM measurement_events me JOIN msrcs_timestamps mt ON
>> me.measurement_source_id=mt.measurement_source_id WHERE
>>   measurement_time > last_update_time
>>
>> there are two behaviors observed by postgresql (8.4):
>> 1) Either the query performs lots of reads on the database and completes
>> in about 4 hours (that is the normal-expected behavior)
>> 2) Either the query starts filling-up pgsql_tmp and this causes large
>> write I/O on the server, and the query never actually completes on a
>> reasonable time (we stop it after 10h).
>>
>> For some strange reason, behaviour 2 is always observed when running
>> psql through a bash script, while behavior 1 is only observed while
>> running psql interactively from command line (but not always).
>>
>> explain:
>> # explain SELECT me.* FROM measurement_events me JOIN msrcs_timestamps
>> mt ON me.measurement_source_id=mt.measurement_source_id WHERE
>>   measurement_time > last_update_time;
>>QUERY PLAN
>> 
>> --
>>   Hash Join  (cost=10111.78..422893652.69 rows=2958929695 width=103)
>> Hash Cond: (me.measurement_source_id = mt.measurement_source_id)
>> Join Filter: (me.measurement_time > mt.last_update_time)
>> ->  Seq Scan on measurement_events me  (cost=0.00..234251772.85
>> rows=8876789085 width=103)
>> ->  Hash  (cost=5733.57..5733.57 rows=350257 width=24)
>>   ->  Seq Scan on msrcs_timestamps mt  (cost=0.00..5733.57
>> rows=350257 width=24)
>> (6 rows)
>>
>>
>> We have tried so far fiddling with work_mem up to 512M - no difference.
>> Any suggestions?
>>
>>
>>
>> Thanks for any help,
>> -Spiros Ioannou
>> inaccess
>>
>>
> Is there any reason you don't have an index?
>
> One, or both, of these will help:
>
> create index measurement_events_pk on measurement_events(
> measurement_source_id);
>
> create index msrcs_timestamps_pk on msrcs_timestamps(measurement_
> source_id);
>
>
>
> measurement_events has 8 billion rows, so expect it to take a while, but
> its a one time cost and should _dramatically_ increase your query
> performance.
>
> -Andy
>
>
>


[GENERAL] Largely inconsistent query execution speed, involving psql_tmp

2014-07-08 Thread Spiros Ioannou
While executing the following query through psql :

SELECT me.* FROM measurement_events me JOIN msrcs_timestamps mt ON
me.measurement_source_id=mt.measurement_source_id WHERE  measurement_time >
last_update_time

there are two behaviors observed with postgresql (8.4):
1) either the query performs lots of reads on the database and completes in
about 4 hours (that is the normal, expected behavior), or
2) the query starts filling up pgsql_tmp, which causes large write
I/O on the server, and the query never actually completes in a reasonable
time (we stop it after 10h).

For some strange reason, behaviour 2 is always observed when running psql
through a bash script, while behaviour 1 is only observed when running psql
interactively from the command line (but not always).

explain:
# explain SELECT me.* FROM measurement_events me JOIN msrcs_timestamps mt
ON me.measurement_source_id=mt.measurement_source_id WHERE
 measurement_time > last_update_time;
  QUERY PLAN

--
 Hash Join  (cost=10111.78..422893652.69 rows=2958929695 width=103)
   Hash Cond: (me.measurement_source_id = mt.measurement_source_id)
   Join Filter: (me.measurement_time > mt.last_update_time)
   ->  Seq Scan on measurement_events me  (cost=0.00..234251772.85
rows=8876789085 width=103)
   ->  Hash  (cost=5733.57..5733.57 rows=350257 width=24)
 ->  Seq Scan on msrcs_timestamps mt  (cost=0.00..5733.57
rows=350257 width=24)
(6 rows)


So far we have tried fiddling with work_mem up to 512MB, with no difference.
Any suggestions?
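
One thing that may be worth ruling out (an assumption, not something established
in this thread) is that the scripted session ends up with a different work_mem
than the interactive one, e.g. through a per-role or per-database setting. This
can be checked from inside each session:

SHOW work_mem;
SELECT name, setting, unit, source FROM pg_settings WHERE name = 'work_mem';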



Thanks for any help,
-Spiros Ioannou
inaccess


Re: [GENERAL] While only running pg_dump, postgresql performs writes inside base/ directory. How can we stop this?

2013-12-13 Thread Spiros Ioannou
- autovacuum is not off in the pre-snapshot (production) database; it
auto-runs every 2 weeks to avoid transaction ID wraparound.
- I tried modifying pg_dump to use a more relaxed transaction isolation level
(READ UNCOMMITTED) instead of the default (serializable), but this didn't
help.
Thanks for the info.




On 10 December 2013 21:13, Jeff Janes  wrote:

> On Tue, Dec 10, 2013 at 12:43 AM, Spiros Ioannou wrote:
>
>> Hi Jeff,
>> autovacuum is off on the DB running on the filesystem snapshot.
>>
>
>
> Which probably makes sense on the snapshot, but is it also off on the
> pre-snapshot database?
>
>
>> What "hint bits" do you suppose it is setting? It's running only one COPY
>> command for days. Do you have any suggestions to make it more "read only" ?
>>
>
>
> When a query sees a tuple that is still listed as part of an open
> transaction, it needs to figure out whether that transaction has now
> completed, and if so whether it committed or aborted.  This can be quite a
> bit of work to do, so once complete it sets a hint bit locally to that
> tuple, so that the next visitor doesn't have to repeat the work.  I don't
> believe that there is any way to turn this off, unless you want to run your
> own custom build of PostgreSQL.
>
> Cheers,
>
> Jeff
>


Re: [GENERAL] While only running pg_dump, postgresql performs writes inside base/ directory. How can we stop this?

2013-12-10 Thread Spiros Ioannou
Hi Jeff,
autovacuum is off on the DB running on the filesystem snapshot.
What "hint bits" do you suppose it is setting? It's running only one COPY
command for days. Do you have any suggestions to make it more "read only" ?


On 2 December 2013 18:14, Jeff Janes  wrote:

> On Mon, Dec 2, 2013 at 6:22 AM, Spiros Ioannou wrote:
>
>> We are dumping a 4TB postgres database using pg_dump. The source is on a
>> copy-on-write snapshot filesystem, and the target is an external drive.
>>
>> While the pg_dump is running we get about 4-5MB/sec writes on the
>> filesystem where postgresql data is (iostat), and specifically inside the
>> base/ directory.
>>
>> Since the only query running on this DB is the COPY initiated from
>> pg_dump which only reads data, who is writing data to postgres datafiles?
>>
>
> It is probably setting "hint bits" on the tuples it is visiting.
>
>
>>
>> We need to find a way to make postgres stop writing because the source
>> copy-on-write filesystem gets filled up before the dump is finished.
>>
>
> Vacuum the system before taking the snapshot.
>
>>
>> Postgresql version is 8.4, vacuum is not running (autovacuum = off), OS
>> is debian 6.
>>
>
> Since autovacuum is off, what are you doing about vacuuming?
>
> Cheers,
>
> Jeff
>
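
A hedged sketch of the "vacuum before the snapshot" advice quoted above: visiting
the tuples on the production cluster sets their hint bits there, so the read-only
dump run against the snapshot no longer needs to dirty pages on the copy-on-write
filesystem:

VACUUM VERBOSE;   -- plain VACUUM visits the tuples and sets hint bits as a side effect
CHECKPOINT;       -- flush dirty buffers before taking the filesystem snapshot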


[GENERAL] While only running pg_dump, postgresql performs writes inside base/ directory. How can we stop this?

2013-12-02 Thread Spiros Ioannou
We are dumping a 4TB postgres database using pg_dump. The source is on a
copy-on-write snapshot filesystem, and the target is an external drive.

While the pg_dump is running we get about 4-5MB/sec writes on the
filesystem where postgresql data is (iostat), and specifically inside the
base/ directory.

Since the only query running on this DB is the COPY initiated from pg_dump
which only reads data, who is writing data to postgres datafiles?

We need to find a way to make postgres stop writing because the source
copy-on-write filesystem gets filled up before the dump is finished.

Postgresql version is 8.4, vacuum is not running (autovacuum = off), OS is
debian 6.


[GENERAL] postgres slower than grep?

2001-07-10 Thread Spiros Ioannou

Hello all,

before you start reading, keep in mind that this is not a post to advertise
one DB over another; I just want to know what is happening, which is why I did
the comparisons with another DB.
To the point:

This is my table:

   Table "table1"
 Attribute |Type | Modifier 
---+-+--
 ClientID  | integer | 
 firstname | varchar(5)  | 
 surname   | varchar(22) | 
 area  | varchar(3)  | 
 neigh | varchar(27) | 
 dimos | varchar(50) | 
 tel   | varchar(7)  | 

The Facts:
--
The table exported to a text file is about 330MB, and contains about 5
million lines.

The same table takes 670MB in postgres. Why? 
(I tried mysql, and it uses 340MB).

issuing the following query:
select * from table1 where tel='7485842';
takes about 1min and 40 seconds to complete. (query has one result)

explain of the above returns:
Seq Scan on table1  (cost=0.00..147835.01 rows=23 width=76)

issuing a grep on the text file takes about 25 seconds! *(for the whole grep
to finish - worst case)*

issuing the same query in mysql takes about 25 seconds.

-to test the disk speed, I doubled the size of the text file by copying 
 it twice and the same grep query took 51 seconds (as expected)

when creating an index the query completes of course in no time

yes, I tried VACUUM
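
For reference, the index in question would have been something along these lines
(the index name is illustrative):

CREATE INDEX table1_tel_idx ON table1 (tel);
VACUUM ANALYZE table1;   -- refresh planner statistics so the index is considered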

Postgres is version 7.0
system is solaris 2.7
hard disk is 1rpm, ultraSCSI
cpu is UltraSparcIIi 333Mhz
physical memory is 384MB

and now the questions
-
-shouldn't postgres be at least as fast as the grep?
-shouldn't the table data file be smaller, to reduce disk activity? Why is it
 double the size of the same MySQL data file, or of the text file? I also
 noticed that the index file for column "tel" is about 130MB while MySQL's
 was 64MB; is this normal?



Thank you for your time,

Spiros  Ioannou 
e-mail:[EMAIL PROTECTED] 
---
Image Video & Multimedia Systems Lab.   
Department of Electrical & Computer Eng.
National Technical University of Athens
