date:20171030

[GENERAL] From the "SQL is verbose" department, WINDOW RANGE specifications

2017-10-30 Thread David G. Johnston

The default range specification is:

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

It seems like a common second choice is to want:

RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

Why did they have to make something so common take 49 characters that, for
seldom-using users, is nearly impossible to remember?

The following would seem to suffice:

{ RANGE | ROWS } ALL

I'd be happy to use non-portable syntax here...

As an aside, I'd vote to add the entire WINDOW syntax specification to the
Synopsis.  The main parameters section can remain as-is in order to aid in
reading comprehension - but having to search out the parameters area just
to remind oneself of the extremely verbose syntax is a bit annoying.  All
of the other sections are represented in both the main synopsis and the
parameters in this manner and I think WINDOW doesn't warrant an exception
(especially vis-a-via both the from_item and grouping_element
specifications).

David J.

Re: [GENERAL] Make "(composite).function_name" syntax work without search_path changes?

2017-10-30 Thread Tom Lane

"David G. Johnston"  writes:
> The system knows that the datatype being inspected is "altschema.alttype" -
> would it be reasonable for the system to check for a function named "label"
> in the same schema as the target type, "altschema", with the target
> argument type and invoke it if present?

The rule is that (v).label is equivalent to label(v), therefore it will
only find function "label" if that's in your search path.  I am very
much not excited about randomly enlarging the search path depending on
syntax --- quite aside from the difficulty of documenting it clearly,
that seems like a great recipe for security hazards.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] query not scaling

2017-10-30 Thread Laurenz Albe

On Thu, 2017-10-26 at 19:07 -0600, Rob Sargent wrote:
>    ->  Nested Loop  (cost=3799.40..44686205.23 rows=1361304413 width=40)
> (actual time=55.443..89684.451 rows=75577302 loops=1)

>  ->  Hash Join  (cost=3798.98..43611.56 rows=823591 width=32)
> (actual time=55.393..1397.509 rows=823591 loops=1)

>  ->  Index Scan using marker_chrom_basepos_idx on base.marker m
>     (cost=0.42..37.67 rows=1653 width=20)
> (actual time=0.010..0.075 rows=92 loops=823591)
>  Index Cond: ((m.chrom = 22) AND (m.basepos >= s.startbase) AND
(m.basepos <= s.endbase))

I think your biggest problem is the join condition
   on m.basepos between s.startbase and s.endbase

That forces a nested loop join, which cannot be performed efficiently.

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] pgaduit - is there a way to audit a role

2017-10-30 Thread rakeshkumar464

Is there a way to audit a group like as follows

alter role db_rw set pgaudit.log = 'read,write,function,ddl'  

and then any user part of db_rw role can be audited automatically.  It does
not seem to work if I connect to the db as rakesh who is part of db_rw role.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Make "(composite).function_name" syntax work without search_path changes?

2017-10-30 Thread David G. Johnston

CREATE SCHEMA altschema;
CREATE TYPE altschema.alttype AS ( altid text, altlabel text );

CREATE FUNCTION altschema.label(item altschema.alttype)
RETURNS text
LANGUAGE sql
AS $$
SELECT (item).altlabel;
$$;

WITH vals (v) AS (
SELECT ('1', 'One')::altschema.alttype
)
SELECT (v).label
FROM vals;

-- column "label" not found in data type altschema.alttype

SET search_path TO altschema;

WITH vals (v) AS (
SELECT ('1', 'One')::altschema.alttype
)
SELECT (v).label
FROM vals;

-- success

The system knows that the datatype being inspected is "altschema.alttype" -
would it be reasonable for the system to check for a function named "label"
in the same schema as the target type, "altschema", with the target
argument type and invoke it if present?

At this point I'm just writing: altschema.label(v) which is adequate but
not as clean.  I'm consciously trying to write queries that don't require
application schemas in the search path: including the joyous
operator(altschema.@@) syntax in some situations.  I suppose inference
could be considered in that situation as well.

David J.

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread rakeshkumar464

By mask I mean pgaudit should log where ssn = '123-456-7891' as where ssn =
'?' 



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread rakeshkumar464

Yes all who interact with HIPAA data are trained for HIPAA SOP.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread rakeshkumar464

No they do select.  
It is fine in HIPAA to view data which are protected, if it is part of your
job.  What is not fine is being careless with that protected data and let
unauthorized person view that data.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread Rob Sargent




On 10/30/2017 03:35 PM, John R Pierce wrote:

On 10/30/2017 10:55 AM, rakeshkumar464 wrote:

Is there a way in pgaudit to mask literal sqls like the below:

insert into table (col1,col2) values(1,2)
select * from table where col1 = 1

These sqls are typed by our QA folks using pgadmin. pgaudit records this
verbatim which runs afoul of our HIPAA requirement.  Prepared 
statements are

not an issue since pgaudit provides a way to suppress values.


if you have a HIPAA requirement that says 'dont run manual sql 
statements', then, well, DONT.


why are QA folks making changes on production databases, anyways?   
thats not within their domain.   QA should be working on development 
or staging databases.




I suspect the QA types are testing against production and using/seeing 
real names, etc with queries which create /transitory/ tables.  I wonder 
if the QA folks have been HIPAA certified?  Probable better to get them 
redacted data for testing.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread John R Pierce


On 10/30/2017 10:55 AM, rakeshkumar464 wrote:

Is there a way in pgaudit to mask literal sqls like the below:

insert into table (col1,col2) values(1,2)
select * from table where col1 = 1

These sqls are typed by our QA folks using pgadmin. pgaudit records this
verbatim which runs afoul of our HIPAA requirement.  Prepared statements are
not an issue since pgaudit provides a way to suppress values.


if you have a HIPAA requirement that says 'dont run manual sql 
statements', then, well, DONT.


why are QA folks making changes on production databases, anyways?   
thats not within their domain.   QA should be working on development or 
staging databases.




--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Roles inherited from a role which is the owner of a database can drop it?

2017-10-30 Thread David G. Johnston

On Mon, Oct 30, 2017 at 12:25 PM, Ivan Voras  wrote:

>
> 3. But they do log in with "developer" roles which are inherited from the
> owner role.
>
> [...]

> I've tried it on a dummy database and it apparently works as described
> here. Is this by design?
>
>
Not quite following but ownership is an inheritable permission; and even
if it was not SET ROLE is all that would be required.  Any owner can drop
an object that it owns.

> What are the best practices for this sort of scenario where there is a
> single owner of all the schema (which is large), where developers need
> access to everything but cannot do something as drastic as dropping the dbs
> (and possibly tables)?
>

Don't let developers into production databases...

Trusted people (and/or software) should be provided membership into
ownership groups.  Developers should provide these people/programs with
vetted scripts to execute against production.  Developers can do whatever
they want on their local database instance with full schema-modifying
privileges.

"developers need access to everything" - there is a lot of nuance and
detail behind that fragment that is needed if one is going to develop a
data access and change management policy.

David J.

Re: [GENERAL] pg_audit to mask literal sql

2017-10-30 Thread Arthur Zakirov

On Mon, Oct 30, 2017 at 10:55:17AM -0700, rakeshkumar464 wrote:
> Is there a way in pgaudit to mask literal sqls like the below:
> 
> insert into table (col1,col2) values(1,2)
> select * from table where col1 = 1 
> 
> These sqls are typed by our QA folks using pgadmin. pgaudit records this
> verbatim which runs afoul of our HIPAA requirement.  Prepared statements are
> not an issue since pgaudit provides a way to suppress values.
> 

I doubt that. But I'm not sure that I understood you correctly. What do
you mean by "mask"? Some additional examples may be useful too.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Roles inherited from a role which is the owner of a database can drop it?

2017-10-30 Thread Ivan Voras

Hello,

I just want to verify that what I'm observing is true, and if it is, I'd
like to know how to avoid it:

1. There are databases owned by a certain role which is a superuser
2. Nobody logs in with the superuser role unless necessary
3. But they do log in with "developer" roles which are inherited from the
owner role. These developer roles are not superusers themselves, but have
the CREATEDB flag
4. The developer roles can still drop the databases.

I've tried it on a dummy database and it apparently works as described
here. Is this by design?

If it is, is there a way to prevent the developer roles from dropping the
databases?

What are the best practices for this sort of scenario where there is a
single owner of all the schema (which is large), where developers need
access to everything but cannot do something as drastic as dropping the dbs
(and possibly tables)?

[GENERAL] pg_audit to mask literal sql

2017-10-30 Thread rakeshkumar464

Is there a way in pgaudit to mask literal sqls like the below:

insert into table (col1,col2) values(1,2)
select * from table where col1 = 1 

These sqls are typed by our QA folks using pgadmin. pgaudit records this
verbatim which runs afoul of our HIPAA requirement.  Prepared statements are
not an issue since pgaudit provides a way to suppress values.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] the database system is shutting down - terminating walsender process due to replication timeout

2017-10-30 Thread Zarko Aleksic

Greetings everyone,


I'm looking for a bit of help understanding a particular behavior we are seeing 
with our PostgreSQL 9.6. After issuing a service shutdown command with 
"systemctl stop" on RHEL 7 our PostgreSQL instance started behaving weirdly. 
For the first time it wouldn't shutdown so easily / quickly.


>From the logs we could see that standby nodes that were trying to connect were 
>rejected due to database being shutdown. After wal_sender_timeout and 
>wal_receiver_timeout (default 60s) were reached the database finally shut 
>down. It seems that walsender process was preventing the shutdown of the 
>master database - until timeout was reached, a behavior we didn't experience 
>before.


Does anyone know why would this happen?


We have 1 standby node in our primary site (same subnet as master DB), and two 
standbys in a remote site. WAL archiving is enabled to the remote site with 
rsync command that worked normally during this time and generally completes 
within a couple of seconds - definitely less than a minute. So we kind of ruled 
out WAL archiving.

Would shutting down remote site standbys prevent this kind of delay, they 
usually have a couple of seconds lag in terms of replication (pg_xlog) location?


Here is the postgres log from master that was being shutdown.

2017-10-26 22:04:01 CDT [1701]: [6-1]  user=,db= LOG:  received fast shutdown 
request
2017-10-26 22:04:01 CDT [1701]: [7-1]  user=,db= LOG:  aborting any active 
transactions
2017-10-26 22:04:01 CDT [1711]: [2-1]  user=,db= LOG:  dbms_aq launcher 
shutting down
2017-10-26 22:04:01 CDT [1708]: [2-1]  user=,db= LOG:  autovacuum launcher 
shutting down
2017-10-26 22:04:01 CDT [1705]: [9971-1]  user=,db= LOG:  shutting down
2017-10-26 22:04:01 CDT [1705]: [9972-1]  user=,db= LOG:  checkpoint starting: 
shutdown immediate
2017-10-26 22:04:01 CDT [1705]: [9973-1]  user=,db= LOG:  checkpoint complete: 
wrote 54 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 1 
recycled; write=0.017 s, sync=0.001 s, total=0.026 s; sync files=12, 
longest=0.000 s, average=0.000 s; distance=1100 kB, estimate=3307 kB
2017-10-26 22:04:01 CDT [1705]: [9974-1]  user=,db= LOG:  database system is 
shut down

2017-10-26 22:04:36 CDT [8763]: [1-1] "Local site standby IP"(49606) 
user=replication_user ,db=[unknown] FATAL:  the database system is shutting down

[OMITTED - a bunch of "FATAL:  the database system is shutting down" messages]

2017-10-26 22:05:00 CDT [2669]: [1-1] "Disaster recovery site standby IP" 
(55498) user=replication_user,db=[unknown] LOG:  terminating walsender process 
due to replication timeout

2017-10-26 22:05:00 CDT [1465]: [1-1] "Disaster recovery site standby 
IP"(36948) user=replication_user,db=[unknown] LOG:  terminating walsender 
process due to replication timeout
2017-10-26 22:05:01 CDT [1701]: [8-1]  user=,db= LOG:  database system is shut 
down

Standbys had the same message regarding walreceiver being terminated due to 
replication timeout.

Any help is greatly appreciated.

Thanks,
Zarko

Re: [GENERAL] gin index trouble

2017-10-30 Thread Rob Sargent




On 10/30/2017 10:56 AM, Peter Geoghegan wrote:

On Mon, Oct 30, 2017 at 9:45 AM, Rob Sargent  wrote:

Peter, you beat me to the punch.  I was just about to say "Having read the
referenced message I thought I would add that we never delete from this
table."  In this particular case it was written to record by record, in a
previous execution and at the time of the error it was only being read.  (In
case you've been following, the failed execution would have added ~1M
"segments", each which references an entry in the gin'd table "probandsets"
- but like a rookie I'm looking up each probandset(2^16) individually.
Re-working that NOW.)

It's not surprising that only a SELECT statement could see this
problem. I guess that it's possible that only page deletions used for
the pending list are involved here.

I'm not sure how reliably you can recreate the problem, but if it
doesn't take too long then it would be worth seeing what effect
turning off the FASTUPDATE storage parameter for the GIN index has.
That could prevent the problem from recurring, and would support my
theory about what's up here. (It wouldn't fix the corruption, though.)

Of course, what I'd much prefer is a self-contained test case. But if
you can't manage that, or if reproducing the issue takes hours, then
this simpler experiment might be worthwhile.

I can reload the gin'd table repeatedly in a dev environment. Does 
select * from  order by  expose the corruption or 
does the load itself necessarily fail at the moment of corruption?




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan

On Mon, Oct 30, 2017 at 9:45 AM, Rob Sargent  wrote:
> Peter, you beat me to the punch.  I was just about to say "Having read the
> referenced message I thought I would add that we never delete from this
> table."  In this particular case it was written to record by record, in a
> previous execution and at the time of the error it was only being read.  (In
> case you've been following, the failed execution would have added ~1M
> "segments", each which references an entry in the gin'd table "probandsets"
> - but like a rookie I'm looking up each probandset(2^16) individually.
> Re-working that NOW.)

It's not surprising that only a SELECT statement could see this
problem. I guess that it's possible that only page deletions used for
the pending list are involved here.

I'm not sure how reliably you can recreate the problem, but if it
doesn't take too long then it would be worth seeing what effect
turning off the FASTUPDATE storage parameter for the GIN index has.
That could prevent the problem from recurring, and would support my
theory about what's up here. (It wouldn't fix the corruption, though.)

Of course, what I'd much prefer is a self-contained test case. But if
you can't manage that, or if reproducing the issue takes hours, then
this simpler experiment might be worthwhile.

-- 
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Rob Sargent




On 10/30/2017 10:32 AM, Peter Geoghegan wrote:

On Fri, Oct 27, 2017 at 3:15 PM, Rob Sargent  wrote:

I’ve hit this same message

Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN
page is of different type

in a couple of contexts and I’m starting to get worried.

I’ve rebuilt the index, but will that help?
Is there a way to see what the ‘different type’ is?
Is it caught/clean-up by vacuum analyse or some such?

Is there a lot of churn on this table? Do you either heavily update or
heavily delete rows in the table? Does vacuum tend to run on the table
rather frequently?

Peter, you beat me to the punch.  I was just about to say "Having read 
the referenced message I thought I would add that we never delete from 
this table."  In this particular case it was written to record by 
record, in a previous execution and at the time of the error it was only 
being read.  (In case you've been following, the failed execution would 
have added ~1M "segments", each which references an entry in the gin'd 
table "probandsets" - but like a rookie I'm looking up each 
probandset(2^16) individually. Re-working that NOW.)



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] UPDATE syntax change

2017-10-30 Thread Tom Lane

Adam Brusselback  writes:
> --works
> UPDATE tst_table
> SET (b, c) = ('help me', 'please')
> WHERE a = 0;
> --does not work
> UPDATE tst_table
> SET (b) = ('help me')
> WHERE a = 0;

> So there was a change made, and you now cannot use the multi-column
> syntax if you're only updating a single column.  Was this intentional?

You still can, but you have to write ROW() explicitly.  This conforms
to the standard, which our old behavior didn't.

It was probably an oversight not to list this change as a compatibility
issue.  I'll go fix that ...

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] UPDATE syntax change (column-list UPDATE syntax fails with single column)

2017-10-30 Thread Adam Brusselback

Appreciate the link, didn't come up when I was googling the issue.

As you said, a mention in the release notes would have been helpful.

Thanks,
-Adam


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Backup strategy using 'wal_keep_segments'

2017-10-30 Thread Stephen Frost

Greetings,

* Rhhh Lin (ruanline...@hotmail.com) wrote:
> A colleague recently suggested that instead of implementing an 
> 'archive_command' to push archivable WALs to a secondary location (for 
> further backup to tape for example), we could instead persist the WAL files 
> in their current location by setting the "wal_keep_segments" parameter to an 
> extreme value e.g. 1000 and have the 'archive_command' do nothing.

Michael's points are good and I wouldn't recommend using this archive
command either, but what isn't clear to me is what you're actaully
trying to solve by using such a method..?  You haven't said anywhere
what's wrong with archive_command (I know that there certainly are some
things wrong with it, of course, but there are solutions to a number of
those issues that isn't a hack like this ...).

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [GENERAL] UPDATE syntax change (column-list UPDATE syntax fails with single column)

2017-10-30 Thread Justin Pryzby

On Mon, Oct 30, 2017 at 12:29:03PM -0400, Adam Brusselback wrote:
> I have some queries that were working in 9.6 which suddenly broke when
> moving to 10.
> 
> Digging in, the error i'm getting is: ERROR:  source for a
> multiple-column UPDATE item must be a sub-SELECT or ROW() expression

> So there was a change made, and you now cannot use the multi-column
> syntax if you're only updating a single column.  Was this intentional?

I found the same while testing during beta:
https://www.postgresql.org/message-id/flat/20170719174507.GA19616%40telsasoft.com#20170719174507.ga19...@telsasoft.com

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan

On Fri, Oct 27, 2017 at 3:15 PM, Rob Sargent  wrote:
> I’ve hit this same message
>
> Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN
> page is of different type
>
> in a couple of contexts and I’m starting to get worried.
>
> I’ve rebuilt the index, but will that help?
> Is there a way to see what the ‘different type’ is?
> Is it caught/clean-up by vacuum analyse or some such?

Is there a lot of churn on this table? Do you either heavily update or
heavily delete rows in the table? Does vacuum tend to run on the table
rather frequently?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] UPDATE syntax change

2017-10-30 Thread Adam Brusselback

Hey all, just getting around to updating my development environment to
Postgres 10, and there was something I found while in testing.

Version info: PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc
(Debian 6.3.0-18) 6.3.0 20170516, 64-bit

I have some queries that were working in 9.6 which suddenly broke when
moving to 10.

Digging in, the error i'm getting is: ERROR:  source for a
multiple-column UPDATE item must be a sub-SELECT or ROW() expression

Test script to replicate:
--so we have something to test with
CREATE TEMPORARY TABLE tst_table (a serial primary key, b text, c text);

--works
UPDATE tst_table
SET (b, c) = ('help me', 'please')
WHERE a = 0;
--does not work
UPDATE tst_table
SET (b) = ('help me')
WHERE a = 0;

So there was a change made, and you now cannot use the multi-column
syntax if you're only updating a single column.  Was this intentional?

I looked through my codebase, and luckily I have only a couple places
where that syntax was used.  Was just an unexpected change for me as I
couldn't find anything in the release notes about it, nor could I find
any mention of it in the docs.  It also didn't issue a warning in 9.6,
so there was nothing to tell me that the syntax was incorrect and
would change later.

Thanks,
-Adam


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Tom Lane

Rob Sargent  writes:
>> If you can make a test case that (eventually) hits that, we'd be
>> interested to see it ...

> Any hint(s) on what might trigger this sort of thing?  I could duplicate 
> the upload, but I doubt you want the 800K records, 200M input file even 
> if it did regenerate the problem.

It's possible you could duplicate the failure with synthetic data
generated by a not-very-long script.  That would beat uploading
a large data file, not to mention possibly needing to sanitize
your data.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan

On Mon, Oct 30, 2017 at 7:35 AM, Tom Lane  wrote:
> Rob Sargent  writes:
>> I’ve hit this same message
>> Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN 
>> page is of different type
>> in a couple of contexts and I’m starting to get worried.
>
> If you can make a test case that (eventually) hits that, we'd be
> interested to see it ...

I suspect that this is the 9.6 bug that I described on that recent
-bugs thread [1]. It's just another symptom of the same problem.

It's certainly true that we saw a mix of undetectable
deadlocks/lock-ups (as seen on that -bugs thread) and corruption (as
seen on this thread) before commit e2c79e14 tried to address those
problems. Jeff Janes reported both symptoms in the thread leading up
to that commit during the beta period for 9.6. My guess is that that
commit was insufficient, and that we now continue to see the same mix
of symptoms for what is essentially the same bug.

[1] 
https://postgr.es/m/CAH2-WzmtLXbs8+c19t1T=rj0kyp7vk9q8hqjulgdldvmuee...@mail.gmail.com
-- 
Peter Geoghegan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Rob Sargent




If you can make a test case that (eventually) hits that, we'd be
interested to see it ...



Any hint(s) on what might trigger this sort of thing?  I could duplicate 
the upload, but I doubt you want the 800K records, 200M input file even 
if it did regenerate the problem.


Would select * from  order by  show the message?

and I um, er, enabled gin on uuid by copying from a thread in this list, as 
follows:
create operator class _uuid_ops
default for type _uuid
using gin as
operator 1 &&(anyarray, anyarray)
,operator 2 @>(anyarray, anyarray)
,operator 3 <@(anyarray, anyarray)
,operator 4 =(anyarray, anyarray)
,function 1 uuid_cmp(uuid, uuid)
,function 2 ginarrayextract(anyarray, internal, internal)
,function 3 ginqueryarrayextract(anyarray, internal, smallint, internal, 
internal, internal, internal)
,function 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, 
internal, internal, internal)
,storage uuid;

You should not have needed to do that, I think, as the standard
anyarray GIN opclass should've handled it.  Having said that,
I don't immediately see anything broken about this definition,
so it seems like it should've worked.


Good to hear.

Thanks.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] gin index trouble

2017-10-30 Thread Tom Lane

Rob Sargent  writes:
> I’ve hit this same message 
> Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN 
> page is of different type
> in a couple of contexts and I’m starting to get worried. 

If you can make a test case that (eventually) hits that, we'd be
interested to see it ...

> and I um, er, enabled gin on uuid by copying from a thread in this list, as 
> follows:
> create operator class _uuid_ops
> default for type _uuid
> using gin as
> operator 1 &&(anyarray, anyarray)
> ,operator 2 @>(anyarray, anyarray)
> ,operator 3 <@(anyarray, anyarray)
> ,operator 4 =(anyarray, anyarray)
> ,function 1 uuid_cmp(uuid, uuid)
> ,function 2 ginarrayextract(anyarray, internal, internal)
> ,function 3 ginqueryarrayextract(anyarray, internal, smallint, internal, 
> internal, internal, internal)
> ,function 4 ginarrayconsistent(internal, smallint, anyarray, integer, 
> internal, internal, internal, internal)
> ,storage uuid;

You should not have needed to do that, I think, as the standard
anyarray GIN opclass should've handled it.  Having said that,
I don't immediately see anything broken about this definition,
so it seems like it should've worked.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] starting PG command line options vs postgresql.con

2017-10-30 Thread Michael Paquier

On Mon, Oct 30, 2017 at 2:08 PM, David G. Johnston
 wrote:
> On Mon, Oct 30, 2017 at 6:48 AM, rakeshkumar464 
> wrote:
>>
>> I would prefer using postgresql.conf.  what is the consensus in this forum
>> regarding command line vs postgresql.conf.
>
> I suspect that most people administering a PostgreSQL database would expect
> that the configuration file would be changed in lieu of passing options via
> the command line.

Disagreement here. For one, it makes pg_upgrade more complicated
because it would need to track and then rewrite postgresql.conf, or
just copy it temporarily. The current way of doing things gives the
best of both worlds.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Comparing epoch to timestamp

2017-10-30 Thread Alexander Farber

# EXPLAIN SELECT ARRAY_AGG(hashed)

FROM words_nouns

  WHERE added > to_timestamp(0)

 UNION

   SELECT ARRAY_AGG(hashed)

  FROM words_verbs

WHERE added > to_timestamp(0)

   ;
   QUERY PLAN


 Unique  (cost=2361.99..2362.00 rows=2 width=32)
   ->  Sort  (cost=2361.99..2361.99 rows=2 width=32)
 Sort Key: (array_agg(words_nouns.hashed))
 ->  Append  (cost=1517.06..2361.98 rows=2 width=32)
   ->  Aggregate  (cost=1517.06..1517.07 rows=1 width=32)
 ->  Seq Scan on words_nouns  (cost=0.00..1517.05
rows=1 width=32)
   Filter: (added > '1970-01-01
01:00:00+01'::timestamp with time zone)
   ->  Aggregate  (cost=844.88..844.89 rows=1 width=32)
 ->  Seq Scan on words_verbs  (cost=0.00..844.88 rows=1
width=32)
   Filter: (added > '1970-01-01
01:00:00+01'::timestamp with time zone)
(10 rows)

Re: [GENERAL] starting PG command line options vs postgresql.con

2017-10-30 Thread David G. Johnston

On Mon, Oct 30, 2017 at 6:48 AM, rakeshkumar464 
wrote:

> I would prefer using postgresql.conf.  what is the consensus in this forum
> regarding command line vs postgresql.conf.

I suspect that most people administering a PostgreSQL database would
expect that the configuration file would be changed in lieu of passing
options via the command line.

Also if conflicting, which one
> takes priority.
>

https://www.postgresql.org/docs/9.6/static/config-setting.html#AEN32498

David J.

Re: [GENERAL] starting PG command line options vs postgresql.con

2017-10-30 Thread Tom Lane

rakeshkumar464  writes:
> I am new to Docker env and I see that PG, as a container is started with
> [ lots of command-line parameters ]

> I would prefer using postgresql.conf.  what is the consensus in this forum
> regarding command line vs postgresql.conf.  Also if conflicting, which one
> takes priority.

The command line takes priority, IIRC, which means that nothing set on
the command line can be overridden without a restart.

I like to specify -p on the command line so that it's easy to tell which
postmaster is which in "ps" listings (of course, this only matters if
you're running multiple postmasters).  Otherwise it's better to leave
as much as you can to postgresql.conf.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] starting PG command line options vs postgresql.con

2017-10-30 Thread rakeshkumar464

I am new to Docker env and I see that PG, as a container is started with
parameters like this:
docker run -it \
--detach \
--name name \
--restart=unless-stopped \
-p 5432:5432 \

-e PGDATA=/var/lib/postgresql/data/pg10 
-N 500 \
-B 3GB \
-S 6291kB \
-c listen_addresses=* \
-c effective_cache_size=9GB \
-c maintenance_work_mem=768MB \
-c min_wal_size=2GB \
-c max_wal_size=4GB \

I would prefer using postgresql.conf.  what is the consensus in this forum
regarding command line vs postgresql.conf.  Also if conflicting, which one
takes priority.

thanks



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Fwd: SPI_palloc problem

2017-10-30 Thread Aron Widforss

On Sun, Oct 29, 2017, at 07:40 PM, Aron Widforss wrote:
> I mailed this from my main email address instead of the one I'm
> subscribed to pgsql-general with. So, here goes.
> 
> - Original message -
> From: Aron Widforss 
> To: pgsql-general@postgresql.org
> Subject: SPI_palloc problem
> Date: Sun, 29 Oct 2017 19:35:30 +0100
> 
> Good evening,
> 
> I'm really new to PostgreSQL, and even C, but am trying to put together
> a small pathfinding module to learn about it. Right now I'm just writing
> a function to load my graph into a convenient data structure to use
> later. I'm currently just loading the data of each vertice into my data
> structure and then printing the loaded position out with elog(). It
> works, but when I try to use SPI_palloc instead of palloc it disconnects
> every time.
> 
> I suspect that my bug is really trivial, so maybe someone on this list
> can just see what it is in the code and save me some time. :)
> 
> Regards,
> Aron Widforss
> 
> Here is my version, example data and program:
> 
> pathfinding-hike=# SELECT version();
>  version 
> -
>  PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
>  5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit
> (1 row)
> 
> 
> 
> pathfinding-hike=# SELECT * FROM toc LIMIT 10;
>  node |   pos| connected  |length | costs | 
>   lines
> --+--++---+---+
> 1 | {268582,6528365} | {2}| {297} | {297} |
> {1}
> 2 | {268773,6528472} | {1}| {297} | {297} |
> {1}
> 3 | {269103,6534918} | {15}   | {676} | {676} |
> {2}
> 4 | {269436,6534931} | {12}   | {263} | {263} |
> {4}
> 5 | {269521,6534450} | {11}   | {373} | {373} |
> {6}
> 6 | {269535,6536629} | {28,22,7}  | {894,1508,66} | {894,1508,66} |
> {7,3,8}
> 7 | {269542,6536693} | {6,9,9}| {66,657,218}  | {66,657,218}  |
> {8,5,9}
> 8 | {269585,6535610} | {14}   | {70}  | {70}  |
> {10}
> 9 | {269631,6536886} | {7,7,36}   | {657,218,584} | {657,218,584} |
> {5,9,11}
>10 | {269642,6534754} | {12,20,11} | {137,140,6}   | {137,140,6}   |
>{12,13,14}
> (10 rows)
> 
> 
> 
> #include "postgres.h"
> #include "utils/array.h"
> #include "utils/lsyscache.h"
> #include "catalog/pg_type.h"
> #include "executor/spi.h"
> #include "utils/builtins.h"
> 
> int check_err(int code);
> int check_err(int code)
> {
> if(code < 0) {
> elog(ERROR, "%m");
> }
> return code;
> }
> 
> int32_t * get_int_arr(HeapTuple tuple, TupleDesc rowdesc, int fnumber);
> int32_t * get_int_arr(HeapTuple tuple, TupleDesc rowdesc, int fnumber)
> {
> Datum raw_arr;
> ArrayType *pg_arr;
> Datum *c_arr;
> bool  isnull;
> /* Parameters for data alignment */
> Oid   eltype;
> int16 typlen;
> bool  typbyval;
> char  typalign;
> /* Info about unwrapped array */
> int32_t   *int_arr;
> int   arr_len;
> int   arr_dim;
> 
> /* Get array raw-format */
> raw_arr = SPI_getbinval(tuple, rowdesc, fnumber, );
> check_err(SPI_result);
> if(isnull) {
> elog(ERROR, "Cannot deconstruct null array");
> }
> 
> /* Get array db-format */
> pg_arr  = DatumGetArrayTypeP(raw_arr);
> arr_dim = ARR_NDIM(pg_arr);
> eltype  = ARR_ELEMTYPE(pg_arr);
> //TODO: More thorough type checking has to be done in plpgsql
> if(eltype != INT4OID) {
> elog(ERROR, "Array not of type Integer");
> }
> 
> /* Get array as C-array (length prepended to 1st element) */
> get_typlenbyvalalign(eltype, , , );
> if(arr_dim != 1) {
> elog(ERROR, "Cannot interpret multidimensional arrays");
> }
> deconstruct_array(pg_arr, eltype, typlen, typbyval, typalign,
>   _arr, NULL, _len);
> int_arr = palloc((arr_len+1)*sizeof(int32_t));
> for(int i = 0; i int_arr[i+1] = DatumGetInt32(c_arr[i]);
> }
> int_arr[0] = arr_len+1;
> 
> return int_arr;
> }
> 
> uint32_t * get_uint_arr(HeapTuple tuple, TupleDesc rowdesc, int
> fnumber);
> uint32_t * get_uint_arr(HeapTuple tuple, TupleDesc rowdesc, int fnumber)
> {
> int32_t  *int_arr;
> int_arr = get_int_arr(tuple, rowdesc, fnumber);
> 
> for(int i = 1; i < int_arr[0]; i++) {
> if(int_arr[i] < 0) {
> elog(ERROR, "Unsigned int stored as negative in database");
> }
> }
> 
> return (uint32_t *)int_arr;
> }
>  
> PG_MODULE_MAGIC;
> 
> PG_FUNCTION_INFO_V1(astarc);
> Datum
> astarc(PG_FUNCTION_ARGS)
> {
> int rows;
> 
>

Re: [GENERAL] Comparing epoch to timestamp

2017-10-30 Thread Alexander Farber

Sorry, I probably had to call:

# EXPLAIN SELECT ARRAY_AGG(hashed)

FROM words_nouns

  WHERE EXTRACT(EPOCH FROM added) > 0

 UNION

   SELECT ARRAY_AGG(hashed)

  FROM words_verbs

WHERE EXTRACT(EPOCH FROM added) > 0

   ;
 QUERY PLAN

-
 Unique  (cost=2707.03..2707.04 rows=2 width=32)
   ->  Sort  (cost=2707.03..2707.03 rows=2 width=32)
 Sort Key: (array_agg(words_nouns.hashed))
 ->  Append  (cost=1740.53..2707.02 rows=2 width=32)
   ->  Aggregate  (cost=1740.53..1740.54 rows=1 width=32)
 ->  Seq Scan on words_nouns  (cost=0.00..1684.66
rows=22348 width=32)
   Filter: (date_part('epoch'::text, added) >
'0'::double precision)
   ->  Aggregate  (cost=966.45..966.46 rows=1 width=32)
 ->  Seq Scan on words_verbs  (cost=0.00..936.05
rows=12157 width=32)
   Filter: (date_part('epoch'::text, added) >
'0'::double precision)
(10 rows)

but still not sure if this statement is ok performancewise...

Regards
Alex

[GENERAL] gin index trouble

2017-10-30 Thread Rob Sargent

I’ve hit this same message 
Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN page 
is of different type
in a couple of contexts and I’m starting to get worried. 
I’ve rebuilt the index, but will that help? 
Is there a way to see what the ‘different type’ is? 
Is it caught/clean-up by vacuum analyse or some such?

I’ve had good results using “<@" and “@>” and believe I've defended the use of 
an array, but I can’t loose three days worth of simulations to this dang wrong 
sibling.

select version(); — will use production release of 10 next week.
  version   


 PostgreSQL 10beta3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 
20150623 (Red Hat 4.8.5-11), 64-bit


The only gin index I have is in this table definition:
\d sui.probandset
   Table "sui.probandset"
   Column|   Type   | Modifiers 
-+--+---
 id  | uuid | not null
 name| text | 
 probands| uuid[]   | not null
 meioses | integer  | 
 min_kincoef | double precision | 
 max_kincoef | double precision | 
 people_id   | uuid | not null
Indexes:
"probandset_pkey" PRIMARY KEY, btree (id)
"probandsetunique" gin (probands)
Check constraints:
"sortedset" CHECK (issorteduuids(probands))
Foreign-key constraints:
"probandset_people_id_fkey" FOREIGN KEY (people_id) REFERENCES 
base.people(id)
Referenced by:
TABLE "sui.probandset_group_member" CONSTRAINT 
"probandset_group_member_member_id_fkey" FOREIGN KEY (member_id) REFERENCES 
sui.probandset(id)
TABLE "sui.segment" CONSTRAINT "segment_probandset_id_fkey" FOREIGN KEY 
(probandset_id) REFERENCES sui.probandset(id)

and I um, er, enabled gin on uuid by copying from a thread in this list, as 
follows:
create operator class _uuid_ops
default for type _uuid
using gin as
operator 1 &&(anyarray, anyarray)
,operator 2 @>(anyarray, anyarray)
,operator 3 <@(anyarray, anyarray)
,operator 4 =(anyarray, anyarray)
,function 1 uuid_cmp(uuid, uuid)
,function 2 ginarrayextract(anyarray, internal, internal)
,function 3 ginqueryarrayextract(anyarray, internal, smallint, internal, 
internal, internal, internal)
,function 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, 
internal, internal, internal)
,storage uuid;

[GENERAL] Comparing epoch to timestamp

2017-10-30 Thread Alexander Farber

Hello,

in PostgreSQL 9.5 I have a table with 67000 records:

 # \d words_nouns
   Table "public.words_nouns"
 Column  |   Type   | Modifiers
-+--+---
 word| text | not null
 hashed  | text | not null
 added   | timestamp with time zone |
 removed | timestamp with time zone |
Indexes:
"words_nouns_pkey" PRIMARY KEY, btree (word)
Check constraints:
"words_nouns_word_check" CHECK (word ~ '^[А-Я]{2,}$'::text AND word !~
'[ЖШ]Ы'::text AND word !~ '[ЧЩ]Я'::text AND word !~ 'Ц[ЮЯ]'::text)
Triggers:
words_nouns_trigger BEFORE INSERT OR UPDATE ON words_nouns FOR EACH ROW
EXECUTE PROCEDURE words_trigger()

And a similar one words_verbs with 36000 records.

Is it a good idea to define the following custom function:

CREATE OR REPLACE FUNCTION words_get_added(
in_visited integer,
OUT out_json jsonb
) RETURNS jsonb AS
$func$
DECLARE
_added text[];
BEGIN
-- create array with words added to dictionary since in_visited
timestamp
IF in_visited > 0 THEN
_added := (
SELECT ARRAY_AGG(hashed)
FROM words_nouns
WHERE EXTRACT(EPOCH FROM added) > in_visited
UNION
SELECT ARRAY_AGG(hashed)
FROM words_verbs
WHERE EXTRACT(EPOCH FROM added) > in_visited
);

IF  CARDINALITY(_added) > 0 THEN
out_json := jsonb_build_object('added', _added);
END IF;
END IF;
END
$func$ LANGUAGE plpgsql;

or should I better transform in_visited to a timestamp with timezone and
compare to that?

I have tried the following, but am not sure how to interpret the result:

# explain select * from words_get_added(0);
 QUERY PLAN
-
 Function Scan on words_get_added  (cost=0.25..0.26 rows=1 width=32)
(1 row)

Thank you
Alex

37 matches

Mail list logo