Re: [GENERAL] Postgres 10.1 fails to start: server did not start in time

2017-11-11 Thread Adam Brusselback
Hey Christoph, I tried starting it with init (service postgresql
start), and pg_ctlcluster.

I modified the pg_ctl.conf and set the timeout higher so I could just
get my cluster back up and running properly, so I can't give you the
info on what systemctl status says at the moment.
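For reference, the change was just raising pg_ctl's wait timeout in
pg_ctl.conf (assuming the stock Debian layout, that would be
/etc/postgresql/10/main/pg_ctl.conf), something along the lines of:

pg_ctl_options = '-t 3600'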


On Sat, Nov 11, 2017 at 8:23 AM, Christoph Berg  wrote:
> Re: Tom Lane 2017-11-10 <8027.1510347...@sss.pgh.pa.us>
>> > The recovery succeeds, but when I go to start the cluster on the
>> > standby, it begins to replay the WAL, and does so for about 30
>> > seconds.  Then I get a line in my log saying:
>>
>> >> pg_ctl: server did not start in time
>
> Hi Adam,
>
> how did you start the server? Via pg_ctlcluster, the init system, or
> directly via pg_ctl?
>
>> > Followed by:
>> >> 2017-11-10 20:27:35.907 UTC [7132] LOG:  received smart shutdown request
>> >> ERROR [063]: : terminated on signal [SIGTERM]
>>
>> ... pg_ctl itself wouldn't decide to forcibly shut down the server
>> if the timeout expired.  It merely stops waiting and tells you so.
>> It seems like this must represent misdesign of whatever start script
>> you're using.  I think you need to complain to the Debian packagers
>> about that.
>
> pg_ctlcluster doesn't shut down if startup fails, but to be sure, we'd
> need to see the full log of whatever initiated the startup. If you are
> using systemd, what does `systemctl status postgresql@10-main` report?
> If that doesn't have anything, also check journalctl.
>
> Christoph




Re: [GENERAL] Postgres 10.1 fails to start: server did not start in time

2017-11-10 Thread Adam Brusselback
> You might want to increase pg_ctl's wait timeout for this situation,
> since the default's evidently too little.  However ...
Got it, thanks.

> ... pg_ctl itself wouldn't decide to forcibly shut down the server
> if the timeout expired.  It merely stops waiting and tells you so.
> It seems like this must represent misdesign of whatever start script
> you're using.  I think you need to complain to the Debian packagers
> about that.

Hmm, interesting.  I installed from this repo:
http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main

I was assuming someone in the Postgres project was involved in
packaging it.  Do you know who I should reach out to in that case?




[GENERAL] Postgres 10.1 fails to start: server did not start in time

2017-11-10 Thread Adam Brusselback
Hey all,
I am in the process of upgrading to Postgres 10, and am having trouble
getting my streaming replica working.

OS: Debian 9.2
Version: 10.1

I have my primary backed up using pgbackrest, and I restore that to my
replica.  It generates a recovery.conf with a restore_command that pulls
the WAL from the pgbackrest server.

The recovery succeeds, but when I go to start the cluster on the
standby, it begins to replay the WAL, and does so for about 30
seconds.  Then I get a line in my log saying:

> pg_ctl: server did not start in time

Followed by:

> 2017-11-10 20:27:35.907 UTC [7132] LOG:  received smart shutdown request
> ERROR [063]: : terminated on signal [SIGTERM]
> 2017-11-10 20:27:35.911 P00   INFO: archive-get command end: terminated on 
> signal [SIGTERM]
> ERROR [063]: : terminated on signal [SIGTERM]
> 2017-11-10 20:27:35.912 P00   INFO: archive-get command end: terminated on 
> signal [SIGTERM]
> 2017-11-10 20:27:35.978 UTC [7142] LOG:  shutting down
> 2017-11-10 20:27:36.151 UTC [7132] LOG:  database system is shut down

This happens whether I have the server configured as a standby or not.

Any help would be very appreciated.

Thanks,
-Adam




Re: [GENERAL] Migrating money column from MS SQL Server to Postgres

2017-11-09 Thread Adam Brusselback
> Since you are migrating data into a staging table in PostgreSQL, you may set
> the field data type as TEXT for each field where you have noticed or
> anticipate issues.
> Then after population perform the datatype transformation query on the given
> fields to determine the actual field value that could not be gracefully
> transformed.

This is the approach I have come to as the most successful for data migrations.

I will use tools like Kettle / Talend to get data into a staging table
with every column as text, then use SQL to migrate that to a properly
typed table.  Works much better than trying to work within the
constraints of these tools.
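As a rough sketch of what that second step looks like (table and column
names made up), first find the values that won't cast cleanly, then cast
the rest into the typed table:

SELECT amount
FROM staging_invoice
WHERE amount !~ '^-?[0-9,]*\.?[0-9]+$';

INSERT INTO invoice (invoice_no, amount, invoice_date)
SELECT invoice_no,
       replace(amount, ',', '')::numeric,
       invoice_date::date
FROM staging_invoice
WHERE amount ~ '^-?[0-9,]*\.?[0-9]+$';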




Re: [GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Adam Brusselback
> If that's the correct theory, yes.  Did you match up the OID yet?
Yes, I did just now.  The OID matches the TOAST table for the temp
table: contract_actual_direct.

It really surprises me that I haven't seen this before, considering I
know for a fact that some of my other functions are far more likely to
have their data stored TOASTed, and they use the same DROP TABLE pattern
at the end of the function.

Now I suppose I'll have to figure out what to do going forward.
Dropping on commit is not an option, because some of these functions
need to be able to be run multiple times within a transaction.




Re: [GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Adam Brusselback
Huh, so in the other cases where the function works fine, it's likely that
the data all just fits within the regular table and doesn't have to be
TOAST'ed?

So this is something that isn't changed in PG10, and I could have
encountered in 9.6, and just by chance didn't?

This is a pattern I've used in quite a few (at least 50) functions, so it's
surprising I've not seen this issue until now.

Thanks,
-Adam


Re: [GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Adam Brusselback
Alright I figured it out.

The OID does not match any of the temp tables, so not sure what's up there.

I have the function RETURN QUERY,
and then I drop all my temp tables.

If I don't drop the tmp_base table at the end of the function, it will
work just fine.  If I keep the drop at the end in there, it'll blow up
every time.

This seriously seems like a bug to me.




Re: [GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Adam Brusselback
I believe it's one of the temp tables. The oid changes each time the
function is run.

I'll put some logging in place to identify the exact temp table it is
though.


[GENERAL] Possible bug: could not open relation with OID [numbers] SQL State: XX000

2017-11-01 Thread Adam Brusselback
Hey all,
First off: PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc
(Debian 6.3.0-18) 6.3.0 20170516, 64-bit

I have something going on, and i'm not sure what is causing it.  I
recently upgraded our development environment to PG10, and the error
in the subject appeared with one of my analytical functions.

It creates some temporary tables, joins them together, and then spits
out a result.  If I run it for one "contract_id", it'll work just
fine, then I run it for another similar "contract_id", it'll throw the
error in the subject.

I attached the function.

Any help would be appreciated.
Thanks,
-Adam
-- Function: gosimple.contract_exposure_direct(uuid, boolean, boolean, uuid[])

-- DROP FUNCTION gosimple.contract_exposure_direct(uuid, boolean, boolean, uuid[]);

CREATE OR REPLACE FUNCTION gosimple.contract_exposure_direct(
    IN p_contract_id uuid,
    IN p_stacked_ind boolean,
    IN p_limit_actual_ind boolean,
    IN p_valid_companies uuid[])
  RETURNS TABLE(
    direct_contract_ids uuid[], direct_rates numeric[], company_id uuid,
    company_name text, company_number text, product_id uuid,
    product_name text, product_number text, uom_type_id uuid,
    uom_type_description text, this_direct_rate numeric,
    this_estimated_quantity numeric, this_spend numeric,
    other_direct_rate numeric, total_direct_rate numeric, total_spend numeric,
    target_rate numeric, claim_ids uuid[], claim_amounts numeric[],
    total_claim_volume numeric, total_claim_amount numeric) AS
$BODY$
BEGIN
    DROP TABLE IF EXISTS tmp_params;
    DROP TABLE IF EXISTS tmp_valid_companies;
    DROP TABLE IF EXISTS direct_ids;
    DROP TABLE IF EXISTS tmp_price;
    DROP TABLE IF EXISTS direct_info;
    DROP TABLE IF EXISTS tmp_rates;
    DROP TABLE IF EXISTS tmp_base;
    DROP TABLE IF EXISTS contract_actual_direct;

    IF p_contract_id IS NULL
    THEN
        RAISE EXCEPTION 'p_contract_id cannot be null.';
    END IF;

    IF p_valid_companies IS NOT NULL AND p_limit_actual_ind = true
    THEN
        RAISE EXCEPTION 'Cannot use p_valid_companies and p_limit_actual_ind together.';
    END IF;

    CREATE TEMPORARY TABLE tmp_params AS
    SELECT cf.contractee_company_id, cf.contractee_grouping_id, crv.date_range
    FROM contract_amend_version cav
    INNER JOIN contract_renew_version crv
        ON cav.contract_renew_version_id = crv.contract_renew_version_id
    INNER JOIN contract_family cf
        ON crv.contract_family_id = cf.contract_family_id
    WHERE true
        AND cav.contract_amend_version_id = p_contract_id
    GROUP BY 1, 2, 3;

    RAISE NOTICE 'tmp_params created: %', clock_timestamp();

    CREATE TEMPORARY TABLE contract_actual_direct AS
    SELECT cad.contract_id, cad.product_id, cad.company_id, cad.claim_ids,
        cad.claim_amounts, cad.total_claim_volume, cad.total_claim_amount
    FROM gosimple.contract_actual_direct(p_contract_id, p_valid_companies) cad;

    ANALYZE contract_actual_direct;

    RAISE NOTICE 'contract_actual_direct created: %', clock_timestamp();

    CREATE TEMPORARY TABLE tmp_valid_companies AS
    SELECT DISTINCT unnest(p_valid_companies) as company_id
    WHERE p_valid_companies IS NOT NULL

    UNION ALL

    SELECT DISTINCT contract_actual_direct.company_id
    FROM contract_actual_direct
    WHERE p_limit_actual_ind = true

    UNION ALL

    SELECT DISTINCT contractee_view_hierarchy.company_id
    FROM contractee_view_hierarchy
    WHERE contract_id = p_contract_id
        AND p_valid_companies IS NULL
        AND p_limit_actual_ind = false;

    ANALYZE tmp_valid_companies;

    RAISE NOTICE 'tmp_valid_companies created: %', clock_timestamp();

    CREATE TEMPORARY TABLE direct_ids AS
    SELECT p_contract_id::uuid as contract_id

    UNION ALL

    SELECT cav.contract_amend_version_id as contract_id
    FROM contract_amend_version cav
    INNER JOIN contract_renew_version crv
        ON cav.contract_renew_version_id = crv.contract_renew_version_id
    INNER JOIN contract_family cf
        ON crv.contract_family_id = cf.contract_family_id
    INNER JOIN tmp_params
        ON true
    WHERE true
        AND cf.contractee_company_id IS NOT DISTINCT FROM tmp_params.contractee_company_id
        AND cf.contractee_grouping_id IS NOT DISTINCT FROM tmp_params.contractee_grouping_id
        AND crv.date_range && tmp_params.date_range
        AND cav.contract_state IN ('APPROVED', 'ACTIVE', 'EXPIRED')
        AND cav.contract_amend_version_id != p_contract_id
        AND EXISTS (
            SELECT 1
            FROM tmp_valid_companies tvc
            INNER JOIN contractee_view_hierarchy conv

Re: [GENERAL] UPDATE syntax change (column-list UPDATE syntax fails with single column)

2017-10-30 Thread Adam Brusselback
Appreciate the link, didn't come up when I was googling the issue.

As you said, a mention in the release notes would have been helpful.

Thanks,
-Adam




[GENERAL] UPDATE syntax change

2017-10-30 Thread Adam Brusselback
Hey all, just getting around to updating my development environment to
Postgres 10, and there was something I found while in testing.

Version info: PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc
(Debian 6.3.0-18) 6.3.0 20170516, 64-bit

I have some queries that were working in 9.6 which suddenly broke when
moving to 10.

Digging in, the error i'm getting is: ERROR:  source for a
multiple-column UPDATE item must be a sub-SELECT or ROW() expression

Test script to replicate:
--so we have something to test with
CREATE TEMPORARY TABLE tst_table (a serial primary key, b text, c text);

--works
UPDATE tst_table
SET (b, c) = ('help me', 'please')
WHERE a = 0;
--does not work
UPDATE tst_table
SET (b) = ('help me')
WHERE a = 0;

So there was a change made, and you now cannot use the multi-column
syntax if you're only updating a single column.  Was this intentional?
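For anyone else who hits this, both of these forms still work in 10 (the
single-column case just requires either dropping the parentheses or
writing ROW explicitly):

UPDATE tst_table
SET b = 'help me'
WHERE a = 0;

UPDATE tst_table
SET (b) = ROW('help me')
WHERE a = 0;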

I looked through my codebase, and luckily I have only a couple places
where that syntax was used.  Was just an unexpected change for me as I
couldn't find anything in the release notes about it, nor could I find
any mention of it in the docs.  It also didn't issue a warning in 9.6,
so there was nothing to tell me that the syntax was incorrect and
would change later.

Thanks,
-Adam




Re: [GENERAL] Alternative to pgAdmin Postgres Manager that support pgagent (jobs)

2017-10-21 Thread Adam Brusselback
Happy to hear jpgAgent is working alright for you.  If you have any
questions with it feel free to ask me.

If you do want to help with pgAutomator, that sounds like something
you could start to learn on.  jpgAgent is pretty much feature complete
as far as my needs go, and no one has requested any additional
features, so i'd rather spend any new time on pgAutomator.

So one thing I know I need, is to figure out how to package a Postgres
extension, and get it onto pgxn.  The database portion of pgAutomator
is pretty complete at this point, so it'd be nice to learn how to
package it up even if I don't end up publishing it until I get a UI in
place.  I'll have to look, but i'm sure there are plenty of small
tasks that can be done with the agent itself as well.




[GENERAL] Logical replication - behavior of REFRESH PUBLICATION's copy_data option

2017-10-20 Thread Adam LaMore
Hi all,

I'm exploring the new PG10 logical replication feature and trying to
understand how ALTER SUBSCRIPTION ... REFRESH PUBLICATION works.

My planned publication will have over 100 tables, some of which are quite
large. If I add a table to the publication, I understand that I have to use
the above command on the subscriber to refresh its subscription. The
default value of the copy_data option is true, but I'm unclear if that will
(A) re-copy all of the data for all of the tables in the publication; (B)
copy the data for only the newly added table; or (C) some other behavior.
So does anyone know which one it is?

If (A) is true, then I assume I can avoid that by setting copy_data =
false. If I do that, how do I copy only the data for the newly added table
from the publisher to the subscriber? Perhaps an out-of-band dump and
restore of just that table? And should that be before or after the call to
"REFRESH PUBLICATION?

Thanks in advance,
Adam


Re: [GENERAL] Alternative to pgAdmin Postgres Manager that support pgagent (jobs)

2017-10-19 Thread Adam Brusselback
I'm currently in the same boat that I wish there was something better
for running jobs against Postgres than pgAgent.
Using pgAdmin to manage my numerous jobs isn't the best experience
i've ever had to say the least, but it does work.  No other tool I
have used will manage pgAgent jobs.

I worked to alleviate some of my pain with pgAgent, by re-writing the
entire agent portion: https://github.com/GoSimpleLLC/jpgAgent
It uses the same schema as pgAgent, and you can manage it with pgAdmin
just the same, it just has way more features, and has been extremely
stable for me.

One thing I have wanted to do for a while, is to write an entire
alternative job scheduler / management tool separate from pgAdmin
which will be dedicated to just that single purpose.
I put in a good bit of work on that, but haven't had time to go back
and get it to a point where I can actually release it:
https://github.com/Tostino/pgAutomator
The goal is to have a tool much closer to what is available for SQL
Server, or Oracle compared to what we have today.  The backend stuff
is mostly working, I just need to find time to write a frontend to
manage it.

Anyways, sorry for a bit of rambling.  My main point was that this is
something that has been a personal pain point, and I haven't found
anything I liked better up to this point in time.
I'm slowly working on remedying that, but until the time my
alternative is ready, i'm sticking with (j)pgAgent, and pgAdmin to
manage it.

Thanks,
-Adam




Re: [GENERAL] looking for a globally unique row ID

2017-09-16 Thread Adam Brusselback
Here is the last discussion I saw on it:
 
https://www.postgresql.org/message-id/flat/90261791-b731-a516-ab2a-dafb97df4464%40postgrespro.ru#90261791-b731-a516-ab2a-dafb97df4...@postgrespro.ru



Re: [GENERAL] looking for a globally unique row ID

2017-09-15 Thread Adam Brusselback
>
> I cannot image a single postgres index covering more than one physical
> table. Are you really asking for that?


While not available yet, that is a feature that has had discussion before.
Global indexes are what i've seen it called in those discussions.  One of
the main use cases is to provide uniqueness across multiple tables, which
would also allow things like foreign keys on partitioned tables.


Re: [GENERAL] Porting libpq to QNX 4.25

2017-08-25 Thread Adam Brusselback
>
> Do you believe the only path is Windows 10?


Not Peter...But I believe there are better choices considering if this has
to be on any sort of network, you're going to be potentially vulnerable to
any number of never-to-be-patched security holes.  Using XP in this day and
age is a hard argument to make in my eyes.


Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-08-09 Thread Adam Sjøgren
On 2017-06-21 Adam Sjøgren  wrote:

> Adam Sjøgren  wrote:

>> Meanwhile, I can report that I have upgraded from 9.3.14 to 9.3.17 and
>> the errors keep appearing in the log.

Just to close this, for the record: We haven't seen the errors since
2017-06-30. We upgraded to 9.3.17 (latest 9.3 point-release at the time
of writing) on 2017-06-10.

Whether this means that the affected rows gradually got overwritten
after switching to .17 and thus got fixed, or if something subtle in our
workflow changed, so we aren't hitting this anymore, or something else
entirely is the answer, we're not sure.

We didn't get to trying Alvaro Herrera's suggestion of removing
6c243f90ab6904f27fa990f1f3261e1d09a11853 before the errors stopped
appearing "by themselves".


  Best regards,

Adam

-- 
 "My Dear Babbage. I am in much dismay at having  Adam Sjøgren
  got into so amazing a quagmire & botheration with a...@novozymes.com
  these Numbers, that I cannot possibly get the
  thing done today."




Re: [GENERAL] How to get transaction started always in WRITE mode.

2017-07-25 Thread Adam Šlachta
Hello Scott and other potential readers/writers,
 
> Login to the database with psql as the same user that your java app connects 
> with try: 
> 
> show default_transaction_read_only;
>  This can be set per-user, it's possible you're getting tripped up there.
 
show default_transaction_read_only;
OFF -> this was run under the same user as our application while the 
application was running;
 
> Also, what happens if you run: 
>  
> select pg_is_in_recovery();
> This can happen if you connect to a postgres slave instead of a master.  Make 
> sure you're always connecting to a master node for executing writes.  
 
select pg_is_in_recovery() returns “F”, however we have not configured 
master/slave (I guess this is mostly used for replication right?).
 
> Is it possible that your code / connect layer is setting 
> default_transaction_read_only to TRUE when the app connects?
 
Well, this might be hard to tell, since we would have to be sure that none of 
those layers (hibernate/spring) “somehow” sets read-only to true. We have 
tried to set it up correctly, but most probably that’s where the root cause of 
the problem is.
Do you know how to find out, e.g. in the PostgreSQL logs, how the read-only 
flag is set up for the current transaction? We have tried to enable full 
logging (postgresql.conf), however reading it is quite tough and we did not 
get any closer to the solution.
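 
For completeness, these are the checks I can run from the application's own 
connection (just a sketch):
 
-- what the current transaction actually ended up with
SELECT current_setting('transaction_read_only');
-- the session/cluster default
SELECT current_setting('default_transaction_read_only');
-- any per-role or per-database overrides
SELECT * FROM pg_db_role_setting;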
 
Thank you for your help so far,
Adam




[GENERAL] How to get transaction started always in WRITE mode.

2017-07-25 Thread Adam Šlachta
Hello,
 
In short: Is there any way to set up PostgreSQL 9.6 so that it always starts 
transactions in WRITE mode?
 
Our related configuration:
"default_transaction_isolation" --> "read committed"
"default_transaction_read_only" --> "off"
 
 
Longer description (for those who are interested, since it is not only 
PostgreSQL related):
 
We are facing problems with "cannot execute  in a 
read-only transaction" exception (org.postgresql.util.PSQLException).
It is very likely the problem is caused by our code; however, at the moment 
the fastest solution, before we solve things properly, would be to set up 
WRITE mode for all started transactions at the database-setup level.
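 
What I understand by a database-setup-level change would be something like the 
following (names are placeholders), although as far as I know an explicit SET 
TRANSACTION READ ONLY issued by the driver would still override these defaults:
 
ALTER DATABASE our_db SET default_transaction_read_only = off;
ALTER ROLE our_app_user SET default_transaction_read_only = off;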
 
SW we use:
-> Java 8
-> Hibernate 5.1.2
-> spring-data-jpa 1.10.4.RELEASE
-> spring-beans, spring-core, other spring stuff of version 4.2.8.RELEASE
 
Related configuration (I don't want to spam here with long list of 
configuration files so I pick-up what I consider important):
Hibernate -> first & second level cache switched OFF
SessionFactory -> org.springframework.orm.hibernate5.LocalSessionFactoryBean
transactionManager -> org.springframework.orm.jpa.JpaTransactionManager
Spring @Transactional(read-only) hint -> where we could, we set it to "false"
Our typical @Repository extends 
org.springframework.data.jpa.repository.JpaRepository, which uses 
implementation from 
org.springframework.data.jpa.repository.support.SimpleJpaRepository.
 
Thank you very much for any hints.
Adam Slachta





Re: [GENERAL] Materialised view - refresh

2017-07-11 Thread Adam Brusselback
You can use something like cron, windows task scheduler (if you're running
windows), pgagent (or jpgagent), pg_cron, or any of the others.

I personally use (and wrote) jpgagent, at the time pgagent was the only
alternative and it was very unstable for me.  Here is the link if
interested: https://github.com/GoSimpleLLC/jpgAgent


Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-21 Thread Adam Sjøgren
Adam Sjøgren  wrote:

> Meanwhile, I can report that I have upgraded from 9.3.14 to 9.3.17 and
> the errors keep appearing in the log.

Just a quick update with more observations:

All the errors in the postgres.log from one of the tables are triggered
by a stored procedure that gathers data to put in a field used for full
text search - this stored procedure is called by a before update trigger
on the table. We have only seen it in the log, but not been able to
reproduce it.

We have, however, now got a row in the other big table where we can get
the error just by running a SELECT * on the row, in psql:

  user@server db=# select * from ourschema.table_a where id = 6121931;
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 339846807 in 
pg_toast_10919630
  user@server db=# 

Which is both nice - we can show the error on demand - but also more
worrying, I guess, because that means the problem is "on disk".

Running this in a stored procedure over the record in question:

>   SELECT *
>   INTO rec
>   FROM table_a where id = badid;
>   detoast := substr(rec.fts::text,1,2000);
>   exception
>   when others then
>   raise notice 'data for table_a id: % is corrupt', badid;
>   continue;

also shows the error:

  user@server db=# SELECT ourschema.check_sequence(6121931, 6121931);
  NOTICE:  data for table_a id: 6121931 is corrupt
   check_sequence 
  ----------------
   
  (1 row)

We are running this over the entire (160M+ row) table now, to see if any
other rows are affected.

So, we can reproduce the error message, but we can't reproduce the
problem from scratch.

Any ideas on what to look at, given a non-transient problem-row?

Our next step will be to try to switch to 9.3.17 with
6c243f90ab6904f27fa990f1f3261e1d09a11853 reverted as suggested by Alvaro
Herrera last week.


  Best regards,

Adam

-- 
 "Lägg ditt liv i min handAdam Sjøgren
  Sälj din själ till ett band"  a...@novozymes.com





Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-11 Thread Adam Sjøgren
Alvaro Herrera  wrote:

> ADSJ (Adam Sjøgren) wrote:
>
>> Our database has started reporting errors like this:
>> 
>>   2017-05-31 13:48:10 CEST ERROR:  unexpected chunk number 0 (expected 1) 
>> for toast value 14242189 in pg_toast_10919630

> Does the problem still reproduce if you revert commit
> 6c243f90ab6904f27fa990f1f3261e1d09a11853?

I will try and get back to you with the results (building new .deb
packages as I type this).


Meanwhile, I can report that I have upgraded from 9.3.14 to 9.3.17 and
the errors keep appearing in the log.

We have tried running a function similar to the one described in
http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html as
suggested by Adrian Klaver, but we haven't been able to get any errors
from that.

This is the function we have run over our two tables:

  CREATE OR REPLACE FUNCTION check_table_a(from_id int, to_id int)
  RETURNS VOID LANGUAGE PLPGSQL AS
  $f$
  declare
  curid INT := 0;
  rec RECORD;
  badid INT;
  detoast TEXT;
  begin
  FOR badid IN SELECT id FROM table_a where id >= from_id and id <= to_id LOOP
  curid = curid + 1;
  if curid % 100000 = 0 then
  raise notice '% rows inspected (%, %,%)', curid, badid, from_id, to_id;
  end if;
  begin
  SELECT *
  INTO rec
  FROM table_a where id = badid;
  detoast := substr(rec.fts::text,1,2000);
  exception
  when others then
  raise notice 'data for table_a id: % is corrupt', badid;
  continue;
  end;
  end loop;
  end;
  $f$;

  -- The other function has:
  --
  detoast := substr(vcontent.document,1,2000);
  --
  -- and is otherwise identical.

But no 'data for table... is corrupt' is printed.

We are only substr()'ing one field (which we know is big) for each row.
Should we do so for _all_ fields? Is there an elegant way to do so?


  Best regards,

Adam

-- 
 "Lägg ditt liv i min handAdam Sjøgren
  Sälj din själ till ett band"  a...@novozymes.com




Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-08 Thread Adam Sjøgren
Harry writes:

> The second vacuum causes an ERROR identical to that you are reporting
> below (unexpected chunk number n (expected n) for toast value...).
> However it may take up to ten attempts to replicate it.

Interesting.

> Out of interest, are you using any tablespaces other than pg_default?
> I can only replicate the issue when using separately mounted
> tablespaces.

No, we are using pg_default only.

I hope your finding can be reproduced, it would be really interesting to
see.


  Best regards,

Adam

-- 
 "Lägg ditt liv i min hand    Adam Sjøgren
  Sälj din själ till ett band"  a...@novozymes.com




Re: [GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-08 Thread Adam Sjøgren
Achilleas writes:

> First try to find which tables those toast relations refer to :
> select 10919630::regclass , 10920100::regclass ;
> Are those critical tables? Can you restore them somehow?

They are our two big tables, containing the bulk of our data (one with
168M rows, the other with 320M rows).

They are constantly being updated, but if I can identify the affected
rows, I can restore a backup on another machine and cherry pick them
from there.

> Also you may consider
> REINDEX TABLE pg_toast.pg_toast_10920100;
> REINDEX TABLE pg_toast.pg_toast_10919630;
> REINDEX TABLE ;
> REINDEX TABLE ;
>
> also VACUUM the above tables.

Yes, but I'd like to find out why it happens, because cleaning up
and having the corruption reoccur is not so fun.

> You might want to write a function which iterates over the damaged
> table's rows in order to identify the damaged row(s). And then do some
> good update to create a new version.

Yes - we started by doing a quick pg_dump, but I guess we should switch
to something that can tell us exactly what rows hit the problem.

Anyone has a handy little script lying around?


  Thanks for the response!

Adam

-- 
 "Lägg ditt liv i min handAdam Sjøgren
  Sälj din själ till ett band"  a...@novozymes.com





[GENERAL] ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2017-06-07 Thread Adam Sjøgren
Our database has started reporting errors like this:

  2017-05-31 13:48:10 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 14242189 in pg_toast_10919630
  ...
  2017-06-01 11:06:56 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 19573520 in pg_toast_10919630

(157 times, for different toast values, same pg_toast_nnn). pg_toast_10919630
corresponds to a table with around 168 million rows.

These went away, but the next day we got similar errors from another
table:

  2017-06-02 05:59:50 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 47060150 in pg_toast_10920100
  ...
  2017-06-02 06:14:54 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 47226455 in pg_toast_10920100

(Only 4 this time.) pg_toast_10920100 corresponds to a table which holds
around 320 million rows (these are our two large tables).

The next day we got 6 such errors and the day after 10 such errors. On
June 5th we got 94, yesterday we got 111, of which one looked a little
different:

  2017-06-06 17:32:21 CEST ERROR:  unexpected chunk size 1996 (expected 1585) 
in final chunk 0 for toast value 114925100 in pg_toast_10920100

and today the logs have 65 lines, ending with these:

  2017-06-07 14:49:53 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 131114834 in pg_toast_10920100
  2017-06-07 14:53:41 CEST ERROR:  unexpected chunk number 0 (expected 1) for 
toast value 131149566 in pg_toast_10920100

The database is 10 TB on disk (SSDs) and runs on a 48 core server with 3
TB RAM on Ubuntu 14.04 (Linux 3.18.13).

We are updating rows in the database a lot/continuously.

There are no apparent indications of hardware errors (like ECC) in
dmesg, nor any error messages logged by the LSI MegaRAID controller, as
far as I can tell.

We are running PostgreSQL 9.3.14 currently.

The only thing I could see in the release notes since 9.3.14 that might
be related is this:

 "* Avoid very-low-probability data corruption due to testing tuple
visibility without holding buffer lock (Thomas Munro, Peter Geoghegan,
Tom Lane)"

Although reading more about it, it doesn't sound like it would exhibit
the symptoms we see?

We have recently increased the load (to around twice the number of
cores), though, which made me think we could be triggering corner cases
we haven't hit before.

We will be upgrading to PostgreSQL 9.3.17 during the weekend, but I'd like to 
hear
if anyone has seen something like this, or have some ideas of how to
investigate/what the cause might be.


  Best regards,

Adam

-- 
 "Lägg ditt liv i min hand    Adam Sjøgren
  Sälj din själ till ett band"  a...@novozymes.com




Re: [GENERAL] Caching and Blobs in PG? Was: Can PG replace redis, amqp, s3 in the future?

2017-05-07 Thread Adam Brusselback
>
> there's also pg_agent which is a cron-like extension, usually bundled with
> pg_admin but also available standalone
>
> https://www.pgadmin.org/docs4/dev/pgagent.html
>
>
> --
> john r pierce, recycling bits in santa cruz
>

In addition to that, there is also  jpgAgent:
https://github.com/GoSimpleLLC/jpgAgent

It uses the same schema as pgagent in the database, and just replaces the
actual agent portion of it with a compatible re-write.  Has been way more
stable for us since we switched to it, as well as providing features we
needed like email notifications and parallel running of steps.

Disclosure: I wrote it for my company... started on it well before all the
alternatives like pg_cron, pg_bucket, etc came out.


[GENERAL] pg_dump recording privileges on foreign data wrappers

2017-03-30 Thread Adam Mackler
If I grant a privilege on a foreign data wrapper like this:

GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO myuser;

from within psql, then a dump of the database produced using pg_dump
seems to lack an equivalent GRANT statement, even though it contains
the CREATE EXTENSION statement for that foreign data wrapper.

Am I understanding correctly that when I feed that output of pg_dump
back into psql it will result in a database that has the foreign data
wrapper but without the privileges that were set in the database that
was dumped?  Is that really what is supposed to happen?  Is there a
way to get pg_dump to output the necessary statements such that
running the dump back through psql results in the same privileges that
I started with?

I am using version 9.5.6.

Thanks very much,
-- 
Adam Mackler




Re: [GENERAL] Search on very big (partitioned) table

2017-02-20 Thread Adam Brusselback
Do you have non overlapping check constraints on the partitions by date to
allow the planner to exclude the child tables from needing to be looked at?
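i.e. something along these lines on each child table (names made up), with
constraint_exclusion left at its default of 'partition':

CREATE TABLE measurement_2017_01 (
    CHECK (measured_at >= DATE '2017-01-01' AND measured_at < DATE '2017-02-01')
) INHERITS (measurement);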


Re: [GENERAL] create trigger in postgres to check the password strength

2017-02-03 Thread Adam Brusselback
Oh sorry, I misunderstood. Didn't realize you meant database users and not
an application user table implemented in Postgres.  I'll let others answer
that then because I'm not aware of a way to do that.


Re: [GENERAL] create trigger in postgres to check the password strength

2017-02-03 Thread Adam Brusselback
Whoops, accidentally sent this to only Pawan instead of the list:
>
>
Hey there, so I would highly suggest you avoid arbitrary password strength
policies like that.  I wrote a library for my company which we use for
password strength estimation, but it is written in Java.  I've been
thinking about how to port it to pl/pgsql so it could easily be packaged as
an extension and used natively in Postgres, but I just haven't had time to
get around to that yet.  Here it is for reference:
https://github.com/GoSimpleLLC/nbvcxz

If you're actually interested in having an extension which works like the
above, and want to work on porting it, i'd be more than happy to jump in
and help out where I can. I just don't have the free cycles to do it my
self at the moment.

Now on to your original question...Why wouldn't it be possible to create a
trigger on your users table to check the password being inserted, raise an
error if it does not meet your requirement, or hash it if it does and
continue the insert?  Seems pretty straightforward other than the
complexity of actually estimating how secure a password is.
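Just to sketch the shape of it (made-up table name, a trivial length check
standing in for real strength estimation, pgcrypto for the hashing):

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE OR REPLACE FUNCTION check_and_hash_password()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    -- stand-in for a real strength estimator
    IF length(NEW.password) < 12 THEN
        RAISE EXCEPTION 'password does not meet the strength requirement';
    END IF;
    -- store only the hash
    NEW.password := crypt(NEW.password, gen_salt('bf'));
    RETURN NEW;
END;
$$;

CREATE TRIGGER app_user_password_check
BEFORE INSERT OR UPDATE OF password ON app_user
FOR EACH ROW EXECUTE PROCEDURE check_and_hash_password();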


[GENERAL] Reading VARTAT EXTERNAL ONDISK value from logical replication decoder

2017-01-27 Thread Adam Dratwiński
Hello everyone,

In my application I am using this logical replication decoder:

https://github.com/xstevens/decoderbufs

A year ago I had a problem with some values being nil because decoding of
EXTERNAL ONDISK values was not implemented. I managed to hotfix it, and it
kind of worked until now.

Could someone more experienced help me to decode this value? I am talking
about this part of code:

https://github.com/xstevens/decoderbufs/blob/master/src/decoderbufs.c#L527-L528

Cheers
Adam


Re: [GENERAL] Column Tetris Automatisation

2017-01-15 Thread Adam Brusselback
I for one would love having something like this available.  I also know
I've seen divorcing the physical column order from the logical column order
discussed in the past, which seems like it'd be useful here as well, so as
not to break the workflow of those who do use ordinal positions for columns.


Re: [GENERAL] CRM where pg is a first class citizen?

2016-12-13 Thread Adam Brusselback
On Tue, Dec 13, 2016 at 3:36 PM, George Weaver  wrote:

> I've never used it but what about:
>
> https://developer.sugarcrm.com/2012/08/03/like-postgresql-and-want-to-use-it-with-sugarcrm-check-out-a-new-community-project/
>
> Cheers,
> George


Looks like not much came out of it:
https://community.sugarcrm.com/ideas/1503

They seem hesitant to support PG because of "market demand" which kinda
sucks, since actions like that are part of the reason for Postgres' lower
adoption rate than MySQL.


[GENERAL] Postgres create database freezes - trying to run with replication

2016-11-01 Thread Adam Carrgilson
 slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 pgsqlb    |        | physical  |        |          | t      |      11809 |      |              | 0/6000888   |
 pgsqlc    |        | physical  |        |          | t      |      12050 |      |              | 0/6000888   |
(2 rows)


What can be done to unfreeze this connection and correctly create my
applications database correctly?


Many Thanks,

Adam.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-29 Thread Adam Brusselback
On Thu, Sep 29, 2016 at 8:10 AM, Nguyễn Trần Quốc Vinh  wrote:

> Dear,
>
> As it was recommended, we pushed our projects into github:
> https://github.com/ntqvinh/PgMvIncrementalUpdate.
>
> 1) Synchronous incremental update
> - For-each-row triggers are generated for all changing events on all
> underlying tables.
>
> 2) Asynchronous (deferred) incremental update
> - Triggers are generated for all changing events on all underlying tables
> to collect all changed rows
> - Other codes are generated for each matview-query to update the matview.
>
>  We hope that our projects may be helpful for someone!
>

Very interesting. Does this support materialized views with recursive
queries? What about left joins? (not) exists? Aggregates? Window functions?
In reading up on the implementations in other databases, I was surprised by
some of the limitations imposed by DB2 / Oracle / Sql Server.

I'm trying to look through the code base to answer my questions, but it's
large enough that it may be easier to just ask first.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
On Mon, Sep 26, 2016 at 3:21 PM, Kevin Grittner  wrote:

> On Mon, Sep 26, 2016 at 2:04 PM, Rakesh Kumar
>  wrote:
>
> > Does PG have a concept of MV log, from where it can detect the
> > delta changes and apply  incremental changes quickly.
>
> That is what I am trying to work toward with the patch I cited in
> an earlier post.  Once some variation of that is in, the actual
> incremental maintenance can be build on top of it.  To see an
> example of what would be done with such a delta relation for a
> simple MV, using the count algorithm, see below:
>

Well I feel like I've learned a ton already reading through the links you
provided earlier and that example above.

I'm very interested in getting this into core. I'll look into what I need
to do to review. Not crazy familiar with C, as I mainly do Java
development. I'll see if I can help in any way though.

The main reason I was working on an alternative is because I need something
now rather than in a couple years, but I've been dealing with manually
creating the few I do need for my database. What I proposed above was just
me thinking about what could be done with things as they are. Obviously
it's junk compared to a real solution in-core.  Would you consider my
approach even worth trying, or should I just suck it up and do things
manually for now and put that effort into getting incremental refresh into
core?


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
On Mon, Sep 26, 2016 at 2:35 PM, Rob Sargent  wrote:

> Of course 9.5 is the current release so the answer is Yes, since 9.5

It seems like there is some confusion about what we're talking about. I am
talking about incremental updates to a sort of "fake" materialized view
(implemented as a table).  This is totally different than what we currently
have implemented for actual materialized views (REFRESH CONCURRENTLY).
Concurrent refresh just minimizes the time the view is locked by building a
second table in the background and merging the changes between them.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
I require eagerly refreshed materialized views for my use case, which is
something Postgres does not currently support.  I need my updates to a
table the view refers to to be visible within the same transaction, and often
it is a single change to one row which will only affect a single row in the
view.  If I used materialized views as they're currently implemented, that
would run the entire query and replace what needs to change, but it has to
run that entire query to do so.  For my use case, that is totally out of
the question to do for every insert / update / delete that could affect the
view.

For example, if I had an account balance materialized view that pretty much
summed transactions for a specific user from two different tables and
displayed a "balance" for every user, I would want that to only run the
query for the refresh for the specific user(s) that just had data
inserted/updated/deleted.  Not every user in the system after every
statement.
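
As a stripped-down sketch of what I mean (made-up names, a single source
table, the refresh keyed per account):

CREATE TABLE account_balance (
    account_id uuid PRIMARY KEY,
    balance    numeric NOT NULL DEFAULT 0
);

CREATE OR REPLACE FUNCTION refresh_account_balance()
RETURNS trigger LANGUAGE plpgsql AS $$
DECLARE
    v_account uuid;
BEGIN
    -- note: an UPDATE that moves a row to a different account would need
    -- both OLD and NEW refreshed; kept to one key here for brevity
    IF TG_OP = 'DELETE' THEN
        v_account := OLD.account_id;
    ELSE
        v_account := NEW.account_id;
    END IF;

    INSERT INTO account_balance (account_id, balance)
    SELECT v_account, coalesce(sum(t.amount), 0)
    FROM account_transaction t
    WHERE t.account_id = v_account
    ON CONFLICT (account_id) DO UPDATE SET balance = EXCLUDED.balance;

    RETURN NULL;
END;
$$;

CREATE TRIGGER account_transaction_balance_refresh
AFTER INSERT OR UPDATE OR DELETE ON account_transaction
FOR EACH ROW EXECUTE PROCEDURE refresh_account_balance();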

I've pretty much implemented this manually for some specific views which
performed horribly in Postgres (but would work fine in SQL Server for
example). I am looking to do this in a generic way so it's easier to
implement when necessary, and can be packaged as an extension for others
who may need to use it.

Obviously if we had better support for statement level triggers (so we
could reference all the NEW / OLD values touched by a statement) this would
be lower overhead, but that is one of the things holding up incrementally
refreshed materialized views from being implemented in the first place. I
just thought up a way to do it which gets around not having better
statement level triggers and wanted to see where I could get with things as
they are.


[GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
Hello all,
I am working on a plan to implement incrementally refreshed materialized
"views" with the existing functionality in Postgres.

Below is the plan for doing that:

Trigger based eagerly updated materialized tables for Postgres 9.5
>
>
>
> High level plan:
>
> Have a view definition stored in the database which we can use for
> reference.  Create functions which will read that view definition, and
> create a materialized table with all the same columns as the reference
> view, create triggers on all tables the view depends on to keep the
> materialized table fresh within a transaction.  All queries would hit the
> materialized table, the view is just there so we know what dependencies to
> track, and have an easy way to update the materialized table.
>
>
>
> How do we actually do the refresh?
>
> 1.   A refresh key is defined for the materialized table.
>
> 2.   Every dependent table must roll up to that refresh key so we
> know what rows to refresh.
>
> 3.   That key should be able to be referenced in the views where
> clause performantly so we can refresh just the rows that match the refresh
> key using the view.
>
> 4.   The refresh will be done by deleting any existing rows with the
> key, and inserting new ones with the key from the view.
>
> How do we know what to refresh?
>
> 1.   A before statement trigger to create a temp table to log all
> changes.
>
> 2.   A for each row trigger to log the rows modified by DML.
>
> a.   This should be done at the refresh key level.
>
>i.  We
> need to figure out a way to generate queries to roll up things multiple
> levels on the dependency chain until we get to the refresh key.  Not sure
> at all how to do that.
>
> 3.   An after statement trigger to run a refresh on the materialized
> table, looking at only the rows touched by the DML.
>
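
As a concrete sketch of step 4 above (placeholder names, with the changed
keys collected by the statement-level trigger into an array), each refresh
boils down to:

DELETE FROM header_total_mat
WHERE header_id = ANY (_changed_header_ids);

INSERT INTO header_total_mat
SELECT *
FROM header_total_view
WHERE header_id = ANY (_changed_header_ids);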

I am however stuck on: How do we know what to refresh?  -> Step 2
Pretty much, I need to figure out how to follow the joins in the view back
to whatever key was defined as the "refresh key" for each dependent table.
I know about the information_schema.view_column_usage, but I don't think
that'll get me everything I need.

I'd really appreciate any help with this, as i'd love a better way to get
eagerly refreshed materialized views in Postgres rather than doing
everything manually as I have to now.

If I can provide any more info please let me know.
Thanks,
-Adam


Re: [GENERAL] Vacuum Full - Questions

2016-08-31 Thread Adam Brusselback
Yes that very well could happen because the size of the table changed, as
well as stats being more accurate now. Just because you have a seq scan
doesn't mean the planner is making a bad choice.


Re: [GENERAL] Foreign key against a partitioned table

2016-08-23 Thread Adam Brusselback
I have wondered if there were any plans to enhance fkey support for
partitioned tables now that more work is being done on partitioning (I know
there has been a large thread on declarative partitioning on hackers,
though I haven't followed it too closely).

Foreign keys are all implemented through triggers on the backend anyway, so it
does seem totally possible to have them work for partitioned tables if the code
is aware that a table is partitioned and needs to look in all inherited
tables as well as the one specified.


Re: [GENERAL] C++ port of Postgres

2016-08-15 Thread Adam Brusselback
Just wondering what the end goal is for this project... Is it to just
maintain an up to date Postgres fork that will compile with a C++ compiler?
Is it to get a conversation going for a direction for Postgres itself to
move?  The former I don't see gaining much traction or doing all that much
for the state of the project. The latter possibly could if the community
gets on board.

Thanks,
-Adam


Re: [GENERAL] OT hardware recommend

2016-06-18 Thread Adam Brusselback
Agreed with Joshua, a single ssd will have way more performance than all 15
of those for random io for sure, and probably be very close on sequential.
That said, a raid controller able to handle all 15 drives (or multiple that
handle a subset of the drives) is likely to be more expensive than a single
good ssd, or a couple / few pretty good ssd's.

It's really amazing how much solid state drives have shifted the database
bottleneck away from disk.


[GENERAL] UUID and Enum columns in exclusion constraints

2016-06-17 Thread Adam Brusselback
Just wondering what others have done for using enum or uuid columns in
exclusion constraints?

I have a solution now, but I just wanted to see what others have ended up
doing as well and see if what i'm doing is sane.  If i'm doing something
unsafe, or you know of a better way, please chime in.

For enum columns, I use a function in the constraint to convert the enum
value to an oid.  The function is defined as such:

> CREATE OR REPLACE FUNCTION enum_to_oid(
> _enum_schema text,
> _enum_name text,
> _enum anyenum)
>   RETURNS oid AS
> $BODY$
> SELECT e.oid
> FROM pg_type t
> INNER JOIN pg_enum e
> ON t.oid = e.enumtypid
> INNER JOIN pg_catalog.pg_namespace n
> ON n.oid = t.typnamespace
> WHERE true
> AND n.nspname = _enum_schema
> AND t.typname = _enum_name
> AND e.enumlabel = _enum::text;
> $BODY$
>   LANGUAGE sql STABLE;


For uuid columns, I use another function in the constraint to convert it to
a bytea type defined here:

> CREATE OR REPLACE FUNCTION uuid_to_bytea(_uuid uuid)
>   RETURNS bytea AS
> $BODY$
>  select decode(replace(_uuid::text, '-', ''), 'hex');
> $BODY$
>   LANGUAGE sql IMMUTABLE;


And i'd use these functions in the constraint like this:

> CONSTRAINT claim_product_reason_code_active_range_excl EXCLUDE
>   USING gist (uuid_to_bytea(claim_product_id) WITH =,
> enum_to_oid('enum'::text, 'claim_reason_type'::text, claim_reason_type)
> WITH =, enum_to_oid('enum'::text, 'claim_reason_code'::text,
> claim_reason_code) WITH =, active_range WITH &&)


And as a closing note on this, I really can't wait until these are
supported types for gist indexes.  It would be great not to have to play
games like this to have exclusion constraints on my tables just because I
am using uuids instead of ints, and enums instead of lookup tables (when an
enum really fits the problem well).


Re: [GENERAL] first_value/last_value

2016-05-18 Thread Adam Brusselback
Here is an example that works in a single query.  Since you have two
different orders you want the data back in, you need to use subqueries to
get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS
SELECT generate_series as bar
FROM generate_series(1, 100);

CREATE INDEX idx_foo_bar ON foo (bar);


SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;


Re: [GENERAL] first_value/last_value

2016-05-18 Thread Adam Brusselback
Is there a reason you can't do that now with a limit 1/order by/union all?
Just have it ordered one way on the first query and the other on the
bottom. That will give you two rows that are the first / last in your set
based on whatever column you order on.
On May 18, 2016 8:47 PM, "Tom Smith"  wrote:

> Hello:
>
> Is there a plan for 9.7  to enable using the two aggregate function
> as non-window function?   i.e.  enabling getting the first/last row
> in single sql without using window features.
> There is actually a  C-extension for first()/last().
> I am wondering  if 9.7 would make them built-in function like max/min
>
> Thanks
>


Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-18 Thread Adam Brusselback
> On Tue, May 17, 2016 at 1:54 PM, Raymond O'Donnell  wrote:

> > Having said all that, I've rarely had any trouble with pgAdmin 3 on

> > Windows 7 and XP, Ubuntu and Debian; just a very occasional crash (maybe

> > one every six months).

So just to chime in, it has not been at all that stable for my team and me.
It's not bad when connections are all stable or you're connecting to a
local instance, but if the connection between you and the server isn't
perfect... it's one of the least stable pieces of software I've ever used.

I'm sure for those whose servers are hosted locally and who connect
through a LAN, or those with a decent provider, it works much better than
for me.
My team is all remote though, and we don't have an office. Each of my
employees connects using their own internet connection, with varying
reliability.  Servers hosted on a cloud provider.  Crashes happen multiple
times a day for most.

Just wanted to share my experience.


Re: [GENERAL] Foreign key triggers

2016-05-14 Thread Adam Brusselback
Yes, foreign keys are implemented using triggers. Here is a blog post
explaining a little more:
http://bonesmoses.org/2014/05/14/foreign-keys-are-not-free/

I would assume it still has to do a seq scan on every referencing table for
every record, even if the table is empty, since there are no indexes.  It
is not free to do ~25k seq scans on that many tables.
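
e.g., with a made-up parent "customer" and child "invoice" referencing it,
the usual fix is just:

CREATE INDEX ON invoice (customer_id);

so the FK check against each referencing table can use an index scan instead
of a seq scan.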


Re: [GENERAL] NULL concatenation

2016-05-12 Thread Adam Pearson
Hello Sridhar,

  Have you tried the 'coalesce' function to handle the nulls?
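
For example, with the variables from your PL/pgSQL block below:

txt := coalesce(txt1, '') || coalesce(txt2, '') || coalesce(txt3, '');
-- or, since concat() treats NULL arguments as empty strings:
txt := concat(txt1, txt2, txt3);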


Kind Regards,

Adam Pearson


From: pgsql-general-ow...@postgresql.org  
on behalf of Sridhar N Bamandlapally 
Sent: 12 May 2016 09:47
To: PG-General Mailing List; PostgreSQL-hackers
Subject: [GENERAL] NULL concatenation

Hi

In migration, am facing issue with NULL concatenation in plpgsql,
by concatenating NULL between any where/position to Text / Varchar, the total 
string result is setting value to NULL


In Oracle:

declare
txt1 VARCHAR2(100) := 'ABCD';
txt2 VARCHAR2(100) := NULL;
txt3 VARCHAR2(100) := 'EFGH';
txt VARCHAR2(100) := NULL;
begin
  txt:= txt1 || txt2 || txt3;
  dbms_output.put_line (txt);
end;
/

abcdefgh   ===>return value



In Postgres

do $$
declare
txt1 text := 'ABCD';
txt2 text := NULL;
txt3 text := 'EFGH';
txt text := NULL;
begin
txt:= txt1 || txt2 || txt3;
raise notice '%', txt;
end$$ language plpgsql;

NOTICE:  <NULL>   ===> return value


SQL-Server also does same like Oracle

Is there any alternative way to get the same behavior in PostgreSQL?

Please

Thanks
Sridhar
OpenText



Re: [GENERAL] Columnar store as default for PostgreSQL 10?

2016-04-25 Thread Adam Brusselback
>It is not difficult to simulate column store in a row store system if
>you're willing to decompose your tables into (what is essentially)
>BCNF fragments.  It simply is laborious for designers and programmers.

I could see a true column store having much better performance than
tricking a row based system into it.  Just think of the per-row overhead we
currently have at 28 bytes per row.  Breaking up data manually like that
may help a little, but if you don't have a very wide table to begin with,
it could turn out you save next to nothing by doing so.  A column store
wouldn't have this issue, and could potentially have much better
performance.
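
If anyone wants to eyeball that overhead, a rough back-of-the-envelope
check (the table name is a throwaway) is something like:

CREATE TABLE narrow_test (id int);
INSERT INTO narrow_test SELECT generate_series(1, 100000);
-- the payload is only 4 bytes per row; the rest is tuple header,
-- item pointer, and page-level overhead
SELECT pg_total_relation_size('narrow_test') / 100000.0 AS bytes_per_row;
DROP TABLE narrow_test;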


Re: [GENERAL] Plan to support predicate push-down into subqueries with aggregates?

2016-03-10 Thread Adam Brusselback
Rob,
I understand that if I were to replicate the logic in that view for every
use case I had for those totals, this would not be an issue. But that would
very much complicate some of my queries to the point of absurdity if I
wanted to write them in a way which would push everything down properly.
The issue is that I need to be able to join to that view from other,
smaller sets of data and not have it calculate the totals for every
"header" in my system, just to throw away 99% of them.

My application is for contract and rebate management, so as David said,
basically accounting.  We allow users to set up contracts to sell products
to their customers, and then track and verify the rebates they get back are
correct, and that they're not overpaying.
The equivalent of the header_total view is used in quite a few different
places in my application.  In one case, the state of one object in my
system (called a deduction) is derived from 5 different data points, 2 of
which are from my equivalent of the "header total" view.  Things like the
state for the deduction object are derived from a query that I encapsulated
inside a view.  You can see how this proliferates.

In the end, after I switched to materialized tables with this data stored,
querying things that relied on this aggregated data got much much faster.
In my largest client's database, the query to get the deduction state for a
single deduction went from 5 seconds, down to 2ms or so.  Unsurprisingly,
if I wanted the deduction state for every deduction in the system, versus
only one, it was 5 seconds either way before, as it had to aggregate all of
that detail-level data no matter what and then throw most of it away when I
wanted just a single deduction.

I would much rather not have to use the materialized tables to get good
performance, and just use views to get this data instead.  I don't like
having so many triggers to maintain the materialized tables; it leaves too
much room for inconsistent data, whereas a view querying the underlying
data directly leaves no room for inconsistency.

I understand that not everyone has the same use case as I do, but I can see
from my tests that the "dark side" does seem to be able to optimize for
it.  This doesn't seem like an uncommon use case to me (think of banking
software, with an account table and a transaction table, and the account
balance derived in a view), and I'd love to see it supported.

I suppose there are two different ways this could go to improve my
situation: 1) better optimization for this type of query. 2) materialized
views getting more features like refreshing when the underlying data is
changed, as detailed here (road map part).
Either would be great in my book!
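
For what it's worth, option 2 today boils down to doing the refresh by
hand.  A sketch against the header_total view from my test script (9.4+
syntax):

CREATE MATERIALIZED VIEW header_total_mv AS
SELECT * FROM header_total;

-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX ON header_total_mv (header_id);

-- avoids locking out readers, but still recomputes the whole thing
REFRESH MATERIALIZED VIEW CONCURRENTLY header_total_mv;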

The feedback is very appreciated, I was just trying to see with this post
if there was any plan / ideas / anything at all in regards to this type of
use case (or better ways of doing it that I hadn't thought of).


Re: [GENERAL] Plan to support predicate push-down into subqueries with aggregates?

2016-03-09 Thread Adam Brusselback
I responded yesterday, but it seems to have gotten caught up because it was
too big with the attachments... Here it is again.

Sorry about not posting correctly, hopefully I did it right this time.

So I wanted to see if Sql Server (2014) could handle this type of query
differently than Postgres (9.5.1), so I got an instance of SQL Server
Express installed and ported the test script to it.

I updated my Postgres script so the data is the same in each server.  The
end result is Sql Server seems to be able to optimize all of these queries
MUCH better than Postgres.
I disabled parallelism in Sql Server to make the comparison fair.

I've attached the explain analyze results for Postgres, and the execution
plan for Sql Server (in picture form... don't know a better way)

Results are:
--Sql Server:15ms average
--Postgres: 6ms average
SELECT *
FROM header
INNER JOIN header_total
ON header.header_id = header_total.header_id
WHERE header.header_id = 26;


--Sql Server: 15ms average
--Postgres: 1250ms average
SELECT *
FROM header
INNER JOIN header_total
ON header.header_id = header_total.header_id
WHERE header.header_id < 27
AND header.header_id > 24;


--Sql Server: 567ms average
--Postgres: 1265ms average
SELECT *
FROM header
INNER JOIN header_total
ON header.header_id = header_total.header_id
WHERE header.description like '%5%';


--Sql Server: 15ms average
--Postgres: 1252ms average
SELECT *
FROM header_total
WHERE header_total.header_id IN (
SELECT header_id
FROM header
WHERE header.header_id < 27
AND header.header_id > 24);

Here are the sql server execution plans as links rather than attachments:
https://drive.google.com/file/d/0BzWRjbj6CQLeb29JZ0lMMnp4QTA/view?usp=sharing
https://drive.google.com/file/d/0BzWRjbj6CQLeM2t0MmZDdE03OHc/view?usp=sharing
https://drive.google.com/file/d/0BzWRjbj6CQLeV0hjRmM5NE9CTWc/view?usp=sharing
https://drive.google.com/file/d/0BzWRjbj6CQLeNmdlQWpHYU1BVHM/view?usp=sharing
query1:

'Nested Loop Left Join  (cost=13.22..1022.98 rows=1 width=125) (actual 
time=3.021..3.024 rows=1 loops=1)'
'  Join Filter: (header_1.header_id = detail_2.header_id)'
'  ->  Nested Loop Left Join  (cost=11.78..1013.93 rows=1 width=89) (actual 
time=2.978..2.981 rows=1 loops=1)'
'Join Filter: (header_1.header_id = detail_1.header_id)'
'->  Nested Loop  (cost=0.55..4.99 rows=1 width=49) (actual 
time=0.028..0.030 rows=1 loops=1)'
'  ->  Index Scan using header_pkey on header  (cost=0.28..2.49 
rows=1 width=33) (actual time=0.022..0.023 rows=1 loops=1)'
'Index Cond: (header_id = 26)'
'  ->  Index Scan using header_pkey on header header_1  
(cost=0.28..2.49 rows=1 width=16) (actual time=0.002..0.003 rows=1 loops=1)'
'Index Cond: (header_id = 26)'
'->  GroupAggregate  (cost=11.23..1008.92 rows=1 width=32) (actual 
time=2.944..2.945 rows=1 loops=1)'
'  Group Key: detail_1.header_id'
'  ->  Bitmap Heap Scan on detail_1  (cost=11.23..999.21 rows=969 
width=32) (actual time=0.301..2.067 rows=1000 loops=1)'
'Recheck Cond: (header_id = 26)'
'Heap Blocks: exact=1000'
'->  Bitmap Index Scan on idx_detail_1_header_id  
(cost=0.00..10.99 rows=969 width=0) (actual time=0.156..0.156 rows=1000 
loops=1)'
'  Index Cond: (header_id = 26)'
'  ->  GroupAggregate  (cost=1.44..9.01 rows=1 width=20) (actual 
time=0.033..0.033 rows=1 loops=1)'
'Group Key: detail_2.header_id'
'->  Bitmap Heap Scan on detail_2  (cost=1.44..8.95 rows=7 width=20) 
(actual time=0.013..0.027 rows=7 loops=1)'
'  Recheck Cond: (header_id = 26)'
'  Heap Blocks: exact=7'
'  ->  Bitmap Index Scan on idx_detail_2_header_id  
(cost=0.00..1.44 rows=7 width=0) (actual time=0.010..0.010 rows=7 loops=1)'
'Index Cond: (header_id = 26)'
'Planning time: 0.420 ms'
'Execution time: 3.127 ms'

query2:

'Hash Right Join  (cost=28367.90..28394.20 rows=3 width=129) (actual 
time=1245.897..1246.442 rows=2 loops=1)'
'  Hash Cond: (detail_1.header_id = header_1.header_id)'
'  ->  HashAggregate  (cost=28164.00..28176.50 rows=1000 width=32) (actual 
time=1235.943..1236.304 rows=1000 loops=1)'
'Group Key: detail_1.header_id'
'->  Seq Scan on detail_1  (cost=0.00..18164.00 rows=100 width=32) 
(actual time=0.008..215.064 rows=100 loops=1)'
'  ->  Hash  (cost=203.86..203.86 rows=3 width=89) (actual time=9.895..9.895 
rows=2 loops=1)'
'Buckets: 1024  Batches: 1  Memory Usage: 9kB'
'->  Hash Right Join  (cost=177.58..203.86 rows=3 width=89) (actual 
time=8.835..9.891 rows=2 loops=1)'
'  Hash Cond: (detail_2.header_id = header_1.header_id)'
'  ->  HashAggregate  (cost=167.50..180.00 rows=1000 width=20) 
(actual time=8.743..9.407 rows=1000 loops=1)'
'Group Key: detail_2.header_id'
'->  Seq Scan on detail_2  (cost=0.00..115.00 row

Re: [GENERAL] Plan to support predicate push-down into subqueries with aggregates?

2016-03-08 Thread Adam Brusselback
Thanks Tom, appreciate the reply.

Sorry if I didn't call it the correct thing.  I just know that when I try
to encapsulate this aggregate logic in a view, I am unable to use that view
in a query that only touches a subset of the data without incurring a
performance hit: the view does seq scans on all of the rows in the detail_1
and detail_2 tables and then throws out 99% of the results once the filter
is applied.
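
If I follow the suggestion correctly, "writing out both constraints" means
something like this for the range query from my test script, stating the
filter against both the table and the view explicitly:

SELECT *
FROM header
INNER JOIN header_total
USING (header_id)
WHERE header.header_id < 200
AND header_total.header_id < 200;  --the extra, hand-written constraint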

I had initially started creating functions that would take an array of ids
as a parameter, and manually push them down in the subqueries.  That got
really really messy though, and we moved away from doing that to having the
aggregates eagerly materialized to a table with triggers.


Are there any other options for making this type of query faster?  It could
be that I just am totally missing a better way to do this.  I do really
want to be able to contain that logic within a view of some sort though, as
a bunch of other stuff is built on top of that.  Having to push that
aggregate query into all of those other queries would be hell.

Thanks,
-Adam

On Tue, Mar 8, 2016 at 5:17 PM, Tom Lane  wrote:

> Adam Brusselback  writes:
> > I was wondering if there were any plans to support predicate push-down
> > optimization for subqueries (and views) with aggregates?
>
> Right offhand I would say that that's a complete mischaracterization
> of the problem.  I've not tried to run your test case, but you say
>
> > --Quick, is able to push down because it's a simple equality check
> > SELECT *
> > FROM header
> > INNER JOIN header_total
> > USING (header_id)
> > WHERE header.header_id = 26;
> >
> > --Slow, no pushdown
> > SELECT *
> > FROM header
> > INNER JOIN header_total
> > USING (header_id)
> > WHERE header.header_id < 200;
> >
> > --Slow, no pushdown
> > SELECT *
> > FROM header
> > INNER JOIN header_total
> > USING (header_id)
> > WHERE header.description like '%5%';
>
> There's no preference for equalities over other kinds of predicates
> as far as subquery pushdown is concerned.  I think what your real
> problem is is that in the first case, the system will derive the
> additional condition "header_total.header_id = 26", while in the
> second case it will not deduce "header_total.header_id < 200".
> That's because the machinery for deducing such implied constraints
> works only with equalities.  That's not very likely to change anytime
> soon, and even if it did, the inference would only extend to operators
> that are members of the same btree family as the join equality operator.
> Your example with a LIKE clause is always going to be out in the cold,
> because there is no principled basis for the planner to decide that
> "a = b" means that "a LIKE x" and "b LIKE x" will give the same result.
> It hasn't got enough information about the behavior of LIKE to know
> if that's safe or not.  (It does, on the other hand, know very well that
> SQL equality operators don't necessarily guarantee bitwise identity.)
>
> So I'd suggest just modifying your queries to write out both constraints
> explicitly.
>
> regards, tom lane
>


[GENERAL] Plan to support predicate push-down into subqueries with aggregates?

2016-03-08 Thread Adam Brusselback
Hi all.
I was wondering if there were any plans to support predicate push-down
optimization for subqueries (and views) with aggregates?

I was recently bit by this, as I had incorrectly assumed that this was an
optimization that was in place, and designed quite a bit around that
assumption, only to get hit with terrible performance when more data got
loaded into the system.

Currently I had to solve the issue by having aggregate tables which store
the data, which is maintained by triggers on IUD.  This gets messy quick,
as I have some aggregates which are dependent on 5-6 other tables.  I'd
love to be able to just store the logic for calculating aggregates in a
view, and use that at query time instead of having to deal with it like
this.

I have written a test case script that explains the type of queries I am
talking about, and the issues.  It's heavily simplified compared to a real
system, but it'll do.  Please see attached.

Thanks,
-Adam
CREATE SCHEMA test;

SET search_path = 'test';

CREATE TABLE header (
  header_id serial primary key,
  description text not null,
  amount numeric not null
  );

CREATE TABLE detail_1 (
  detail_1_id serial primary key,
  header_id integer not null references header (header_id),
  quantity numeric not null,
  rate numeric not null
  );
  
CREATE TABLE detail_2 (
  detail_2_id serial primary key,
  header_id integer not null references header (header_id),
  amount numeric not null
  );
  
  INSERT INTO header (description, amount)
  SELECT 'header record ' || generate_series, random() * 100
  FROM generate_series(1, 1000);
  
  INSERT INTO detail_1 (header_id, quantity, rate)
  SELECT header_id, random() * 50, random() * 10
  FROM header
  INNER JOIN generate_series(1, 800)
  ON random() < 0.5;
  
  INSERT INTO detail_2 (header_id, amount)
  SELECT header_id, random() * 120
  FROM header
  INNER JOIN generate_series(1, 7)
  ON random() < 0.5;

CREATE VIEW header_total AS 
SELECT header.header_id
, coalesce(detail_1.amount, 0) AS detail_1_amount
, coalesce(detail_1.detail_1_count, 0) AS detail_1_count
, coalesce(detail_2.amount, 0) AS detail_2_amount
, coalesce(detail_2.detail_2_count, 0) AS detail_2_count
, coalesce(detail_1.amount, 0) + coalesce(detail_2.amount, 0) as detail_total
, header.amount = coalesce(detail_1.amount, 0) + coalesce(detail_2.amount, 0) 
as balanced
FROM header
LEFT JOIN (
SELECT header_id
, sum(rate * quantity) as amount 
, count(detail_1_id) as detail_1_count
FROM detail_1
GROUP BY header_id
) detail_1
ON header.header_id = detail_1.header_id
LEFT JOIN (
SELECT header_id
, sum(amount) as amount 
, count(detail_2_id) as detail_2_count
FROM detail_2
GROUP BY header_id
) detail_2
ON header.header_id = detail_2.header_id;

CREATE INDEX idx_detail_1_header_id ON detail_1 (header_id);
CREATE INDEX idx_detail_2_header_id ON detail_2 (header_id);

ANALYZE header;
ANALYZE detail_1;
ANALYZE detail_2;

--Quick, is able to push down because it's a simple equality check
SELECT *
FROM header
INNER JOIN header_total
USING (header_id)
WHERE header.header_id = 26;

--Slow, no pushdown
SELECT *
FROM header
INNER JOIN header_total
USING (header_id)
WHERE header.header_id < 200;


--Slow, no pushdown
SELECT *
FROM header
INNER JOIN header_total
USING (header_id)
WHERE header.description like '%5%';

DROP SCHEMA test;


Re: [GENERAL] "plan should not reference subplan's variable" when using row level security

2016-02-24 Thread Adam Guthrie
On 24 February 2016 at 20:27, Stephen Frost  wrote:
> Yeah, looks like a bug to me.  My gut reaction is that we're pulling up
> a subquery in a way that isn't possible and that plan shouldn't be
> getting built/considered.

Thanks - shall I go ahead and submit a bug report?

>
> As a work-around, until we fix it, you could create an sql function to
> check for the existance of the id in 'a' and use that in the policy
> definition.

I've also discovered that using the following policy instead

CREATE POLICY a_select ON b FOR SELECT
USING ( a_id IN (SELECT id FROM a) );

also seems to work around the issue.
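
For completeness, the function-based workaround Stephen mentions would look
roughly like this (untested sketch; the function and policy names are made
up, and it would replace the EXISTS policy from my repro script):

CREATE FUNCTION a_id_exists(integer) RETURNS boolean AS
$$ SELECT EXISTS (SELECT 1 FROM a WHERE a.id = $1) $$
LANGUAGE sql STABLE;

CREATE POLICY a_select_fn ON b FOR SELECT
USING ( a_id_exists(a_id) );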




[GENERAL] "plan should not reference subplan's variable" when using row level security

2016-02-24 Thread Adam Guthrie
Hi,

Whilst trying to use row level security with a subquery in the USING
expression, I'm receiving an error "plan should not reference
subplan's variable"

A simple sql file to reproduce:



CREATE TABLE a (
id  INTEGER PRIMARY KEY
);

CREATE TABLE b (
id  INTEGER PRIMARY KEY,
a_idINTEGER,
textTEXT
);

CREATE POLICY a_select ON b FOR SELECT
USING ( EXISTS(SELECT FROM a WHERE a.id = b.a_id) );

ALTER TABLE b ENABLE ROW LEVEL SECURITY;

INSERT INTO a (id) VALUES (1);

INSERT INTO b (id, a_id, text) VALUES (1, 1, 'one');

GRANT ALL ON ALL TABLES IN SCHEMA public TO test;

SET ROLE test;

SELECT * FROM b;

UPDATE b SET text = 'ONE' WHERE id = 1;



gives error:

psql:/tmp/test.sql:26: ERROR:  plan should not reference subplan's variable

Is this a bug or am I doing something wrong?

Any help much appreciated,

Adam




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adam Brusselback
Personally I always set the natural key with a not null and unique
constraint, but create an artificial key for it as well.  As an example, if
we had a product table, the product_sku is defined as not null with a
unique constraint on it, while product_id is the primary key which all
other tables reference as a foreign key.
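
Roughly like this, just to illustrate (table and column names are examples):

CREATE TABLE product (
  product_id serial PRIMARY KEY,     --surrogate key, referenced by other tables
  product_sku text NOT NULL UNIQUE,  --natural key, still enforced
  description text
);

CREATE TABLE order_line (
  order_line_id serial PRIMARY KEY,
  product_id integer NOT NULL REFERENCES product (product_id),
  quantity numeric NOT NULL
);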

In a many-to-many situation, I prefer to use a two-column composite key.
I've never run into a case where I needed to reference a single row in such
a table without already knowing about both sides of the relation.

Just my $0.02
-Adam

On Tue, Aug 25, 2015 at 12:15 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Aug 25, 2015 at 11:40 AM, Melvin Davidson 
> wrote:
>
>> Consider:
>> SELECT c.registration_no,
>>c.car_make,
>>p.part_no
>>FROM car c
>>JOIN parts p ON ( p.registration_no = c.registration_no)
>>  WHERE registration_no = ;
>>
>>  versus:
>>  SELECT c.registration_no,
>>c.car_make,
>>p.part_no
>>FROM car c
>>JOIN parts p ON ( p.id = c.id)
>>  WHERE registration_no = ;
>>
>>  Why join on id when registration_no is better?
>>
>>
> I believe you are mistaken if you think there are absolute rules you can
> cling to here.  But even then I would lean toward calling primary keys an
> internal implementation detail that should be under the full control of the
> database in which they are directly used.  Artifical "natural" keys I would
> lean toward turning into, possibly unique, attributes.  Inherent "natural"
> keys get some consideration for using directly.
>
> The issue arise more, say, in a many-to-many situation.  Do you define the
> PK of the linking table as a two-column composite key or do you introduce
> ​a third, serial, field to stand in for the pair?
>
> David J.
>
>


Re: [GENERAL] Using the database to validate data

2015-07-27 Thread Adam Brusselback
e_id
> , price
> , reduction
> , redistributor_company_id
> , freight)
> SELECT
>   s.client_id
> , s.row_id
> , s.sale_number
> , cmpd.company_id
> , s.invoice_number
> , s.invoice_date::date
> , s.order_date::date
> , s.ship_date::date
> , s.sale_date::date
> , p.product_id
> , s.quantity::numeric
> , uom.uom_type_id
> , s.price::numeric
> , s.reduction::numeric
> , cmpr.company_id
> , s.freight
> FROM import_sale s
> INNER JOIN company cmpd
> ON cmpd.company_number = s.company_number
> LEFT JOIN company cmpr
> ON cmpr.company_number = s.redistributor_company_number
> INNER JOIN product p
> ON s.product_number = p.product_number
> INNER JOIN uom_type uom
> ON uom.uom_type_cd = s.quantity_uom
> WHERE NOT EXISTS (
> SELECT 1
> FROM import_sale_error se
> WHERE se.row_id = s.row_id)
> --new sale_number
> AND (NOT EXISTS (
> SELECT 1
> FROM sale s2
> WHERE s.row_id = s2.source_row_id
> AND s.client_id = s2.client_id)
> --existing sale_number with changes
> OR EXISTS (
> SELECT 1
> FROM sale s2
> WHERE s.processed_ind = false
> AND s.sale_number = s2.sale_number
> AND s.client_id = s2.client_id
> AND (cmpd.company_id != s2.company_id
> OR s.invoice_number != s2.invoice_number
> OR s.invoice_date::date != s2.invoice_date
> OR s.order_date::date != s2.order_date
> OR s.ship_date::date != s2.ship_date
> OR s.sale_date::date != s2.sale_date
> OR p.product_id != s2.product_id
> OR s.quantity::numeric != s2.quantity
> OR uom.uom_type_id != s2.uom_type_id
> OR  s.price::numeric != s2.price
> OR s.reduction::numeric != s2.reduction
> OR cmpr.company_id != s2.redistributor_company_id)
> );
>
>
> -
> --Update processed_ind
> UPDATE import_sale AS s
> SET processed_ind = true
> WHERE NOT EXISTS (
> SELECT 1
> FROM import_sale_error se
> WHERE se.row_id = s.row_id);
>
> $BODY$
>   LANGUAGE sql VOLATILE
>   COST 1000;


So all of that together is a pretty solid system for importing data and
showing a user exactly what went wrong with any bad rows they sent.

The other good part of this is that it's all set-based.  This process will
run through 20,000 lines on a single-core server in around 5-10 seconds.
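
For anyone curious, the import_sale_error rows referenced above come from a
series of set-based validation inserts earlier in the function, roughly
along these lines (the error table's message column here is just
illustrative):

INSERT INTO import_sale_error (row_id, error_message)
SELECT s.row_id, 'unknown company_number: ' || s.company_number
FROM import_sale s
LEFT JOIN company cmpd
ON cmpd.company_number = s.company_number
WHERE cmpd.company_id IS NULL;

INSERT INTO import_sale_error (row_id, error_message)
SELECT s.row_id, 'quantity is not a number: ' || s.quantity
FROM import_sale s
WHERE s.quantity !~ '^-?[0-9]+(\.[0-9]+)?$';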

At one of my old jobs, we had something that did the same type of
validations / inserts, but did it row by row in a cursor (not written by
me), and that took a good 5 min (if I remember correctly) to process 20,000
lines.  This was also on a server running Sql Server on a 32 core machine.

Anyways, good luck!
-Adam

On Fri, Jul 24, 2015 at 9:55 AM, JPLapham  wrote:

> Zdeněk Bělehrádek wrote
> > What about creating a SAVEPOINT before each INSERT, and if the INSERT
> > returns
> > an error, then ROLLBACK TO SAVEPOINT? This way you will have all the
> > insertable data in your table, and you can still ROLLBACK the whole
> > transaction, or COMMIT it if there were no errors.
> >
> > It will probably be quite slow, but if you have only thousands of lines,
> > it
> > should be fast enough for your usecase IMHO.
> >
> > -- Zdeněk Bělehrádek
>
> Hmmm, interesting. Thanks, if that works, it would be exactly what I'm
> looking for!
>
> You are right, speed is not an issue.
>
> -Jon
>
>
>
> --
> View this message in context:
> http://postgresql.nabble.com/Using-the-database-to-validate-data-tp5859046p5859239.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] How to keep pg_largeobject from growing endlessly

2015-04-15 Thread Adam Hooper
On Wed, Apr 15, 2015 at 9:57 AM, Andreas Joseph Krogh
 wrote:
>
> På onsdag 15. april 2015 kl. 15:50:36, skrev Adam Hooper 
> :
>
> On Wed, Apr 15, 2015 at 4:49 AM, Andreas Joseph Krogh
>  wrote:
> >
> > In other words: Does vacuumlo cause diskspace used by pg_largeobject to be 
> > freed to the OS (after eventually vacuumed by autovacuum)?
>
> No.
>
> Ok. Out of curiousity; When does it get freed, when VACUUM FULL'ed?

Yes. VACUUM FULL or CLUSTER will free the space. (Of course, you need
a lot of free disk space to perform those operations.)
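
Concretely, once vacuumlo has unlinked the unreferenced large objects, the
reclaim step is one of:

VACUUM FULL pg_largeobject;

-- or, equivalently for reclaiming the space (also takes an exclusive lock):
CLUSTER pg_largeobject USING pg_largeobject_loid_pn_index;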

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] How to keep pg_largeobject from growing endlessly

2015-04-15 Thread Adam Hooper
On Wed, Apr 15, 2015 at 4:49 AM, Andreas Joseph Krogh
 wrote:
>
>
> In other words: Does vacuumlo cause diskspace used by pg_largeobject to be 
> freed to the OS (after eventually vacuumed by autovacuum)?

No.

But that shouldn't matter in your scenario: if you create more large
objects than you delete, you aren't wasting space anyway.

A longer-term problem that may apply in your scenario: pg_largeobject
can't grow beyond your tablespace's disk size. Unlike other tables,
it's very hard to move pg_largeobject to a new database/tablespace
without downtime. If your table is constantly growing and you're
worrying about how much space it's taking, other storage strategies
(bytea, S3, NFS, etc) might inspire more confidence. I had this
problem a few months ago; since then, I only use pg_largeobject in
prototyping and low-growth situations.
http://www.postgresql.org/message-id/camwjz6gf9tm+vwm_0ymqypi4xk_bv2nyaremwr1ecsqbs40...@mail.gmail.com

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] pgAgent

2015-04-06 Thread Adam Brusselback
Here you are:

do $$
declare
job_id int;
begin

/* add a job and get its id: */
insert into
pgagent.pga_job (
jobjclid
, jobname
)
values
(
1 /*1=Routine Maintenance*/
, 'DELETE_NAMES' /* job name */
)
returning
jobid
into
job_id;


/* add a step to the job: */
insert into
pgagent.pga_jobstep (jstjobid, jstname, jstkind, jstcode,
jstdbname)
values
(
job_id
, 'DELETE_NAMES' /* step name */
, 's'/* sql step */
, 'BEGIN DELETE_NAMES; END;'  /* the sql to run */
, 'somedatabase'  /* the name of the database
to run the step against */
);


/* add a schedule to the job. This one runs every day at midnight: */
insert into pgagent.pga_schedule (
jscjobid
, jscname
, jscdesc
, jscminutes
, jschours
, jscweekdays
, jscmonthdays
, jscmonths
, jscenabled
, jscstart
, jscend)
values(
job_id
, 'schedule name'
, ''
,
'{t,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f}'
, '{t,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f}'
, '{f,f,f,f,f,f,f}'
, '{f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f,f}'
, '{f,f,f,f,f,f,f,f,f,f,f,f}'
, true
, '2015-04-06 00:00:00'
, NULL
);


end $$;


On Thu, Apr 2, 2015 at 12:25 PM, Champion Always  wrote:

> Hi,
>job_name => 'DELETE_NAMES'
>  ,job_type => 'PLSQL_BLOCK'
>  ,job_action => 'BEGIN DELETE_NAMES; END;'
>  ,start_date => sysdate
>  ,repeat_interval => 'FREQ=DAILY'
>  ,enabled => TRUE);
> it is oracle ..
>
> .And i installed pgAgent On postgres ..
> how  above oracle code can convert to postgres and place it in pgagent for
> daily schedule
> any help..?
>


Re: [GENERAL] Hex characters in COPY input

2015-02-27 Thread Adam Hooper
On Fri, Feb 27, 2015 at 9:39 AM, Melvin Call  wrote:
> On 2/26/15, Vick Khera  wrote:
>> On Thu, Feb 26, 2015 at 9:50 PM, Melvin Call 
>> wrote:
>>
>>> I get an error "ERROR:  invalid byte sequence for
>>> encoding "UTF8": 0xe9616c"
>>
>> Character code E9 is not UTF8. Don't tell Postgres you're importing UTF8 if
>> you're not.
>
> Regardless, can you point me to some reading that would have clued
> me in that e9 is not a UTF8 character? Or is the clue the fact that it was not
> preceeded with 0x00?

The error message Postgres gave you is a pretty darned good clue :).

But your question has an error, and it's worth expounding a bit. 0xe9
is not a UTF8 character because 0xe9 is not a _character_. 0xe9 is a
_byte_. Characters are not bytes. Characters can be _encoded_ into
bytes, and that's not the same thing.

UTF-8 encoding is a bit confusing: any byte in the range [0x00-0x7f]
represents the same character as in ASCII encoding (an extremely
popular encoding). Any byte in the range [0x80-0xff] is the "leading
byte" in a sequence of bytes that represents a single character.
"Continuation" bytes are in the range [0x80-0xbf]. (Why not the range
[0x00-0xff]? Because UTF-8 was designed to generate errors when fed
non-UTF8 byte sequences.) The first four bits of the leading byte
describe how many continuation bytes there are. If you care to read up
on the how and why of UTF-8 (a red herring in this discussion), try:
See http://en.wikipedia.org/wiki/UTF-8

Back to 0xe9. 0xe9 is '1110 1001' in binary. Postgres' UTF-8 decoder
sees that initial '1110' and determines that it needs to inspect three
bytes to read one character. The second byte is 0x61, which is not in
the range [0x80-0xbf], so Postgres reports an invalid byte sequence.
Hooray: it produces exactly the error message it should.
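
You can watch both behaviours from psql, if you're curious:

-- decoded as windows-1252, those three bytes are perfectly valid:
SELECT convert_from('\xe9616c'::bytea, 'WIN1252');  -- 'éal'

-- decoded as UTF8, the same bytes raise the error you saw:
SELECT convert_from('\xe9616c'::bytea, 'UTF8');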

You don't need to concern yourself with the complications of UTF-8.
You only need to know that bytes are not characters; if you don't know
the encoding of a sequence of bytes, you've made a logic error.
Postgres told you the error, though it didn't hint at how to fix it.
(Text editors and web browsers use heuristics to guess file encodings,
and they're usually right, though it's impossible to create a perfect
heuristic. See 
http://stackoverflow.com/questions/4198804/how-to-reliably-guess-the-encoding-between-macroman-cp1252-latin1-utf-8-and
for further discussion there.)

If you're looking for take-away lessons, the main one is: "read the
error message" :).

Next time you have the "wrong encoding" problem, you have two options:
1) figure out the encoding and tell Postgres; or 2) regenerate the
file in the correct encoding (UTF-8). The "why" is here:
http://www.joelonsoftware.com/articles/Unicode.html
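
For option 1 that usually just means naming the encoding in the COPY itself
(the table name and path below are made up):

COPY mytable FROM '/path/to/import.csv' WITH (FORMAT csv, ENCODING 'WIN1252');

-- with a client-side \copy, setting client_encoding to match the file
-- accomplishes the same thing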

We on this list jumped straight to option 1.  We've memorized 0xe9 in
particular, because we've been through your pain before. In the
Americas and Western Europe, if a file contains the byte 0xe9 it
probably contains the character "é" encoded as
windows-1252/ISO-8859-1/ISO-8859-15. That's very common. MySQL in
particular is a ghastly Internet denizen, in that it defaults to
ISO-8859-15 in an apparent crusade against globalization and modern
standards.

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] Row-level Security vs Application-level authz

2015-02-24 Thread Adam Hooper
On Tue, Feb 24, 2015 at 8:37 PM, Stephen Frost  wrote:
> * David Steele (da...@pgmasters.net) wrote:
>> So I guess my last question is if you are inserting rows into a table to
>> track user connections, how do you clean them out when the client does
>> not disconnect cleanly?  Or is this table intended to be append-only?
>
> It wouldn't be intended to be append-only but I agree that, ideally,
> there'd be a way to address clients disconnect uncleanly.

This is starting to sound like a web app, which I have experience
with. The cardinal rule: assume everybody disconnects randomly, and
code accordingly :).

The goal here isn't to make the session table reflect the number of
users who are currently logged in. Rather, it's to ensure the session
table doesn't grow infinitely.

* You can set a restriction like "a user can only be logged in once":
during login, delete the other sessions associated with that user (see the
sketch after this list). The session table's maximum size is then the size
of the user table.

* You can use an expiry-check function. Best is a definitive "this
session is disconnected"; if you can't do that, you can try a rule
such as "user cannot be logged in more than 20 times and sessions
older than two weeks are invalid". During login, run the expiry
checker on that user and delete expired rows. With the right function,
you can constrain the session table to a reasonable size.

* You can simply limit the size of the session table. If your limit is
100 and a user starts a 101st session, delete the first session.
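
A sketch of that first option, assuming a session table with (session_id,
user_id, created_at) columns; the names are made up:

-- during login, right after inserting the new session row:
DELETE FROM session
WHERE user_id = $1          -- the user who just logged in
  AND session_id <> $2;     -- keep only the session that was just created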

The world of websites involves lots of users and loads of short-lived
sessions. A website doesn't check whether the user has access to a
row: it checks whether the user has access to an endpoint with the
given parameters. Postgres RLS seems like a bad approach for that use
case.

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] VACUUM FULL pg_largeobject without (much) downtime?

2015-02-04 Thread Adam Hooper
On Tue, Feb 3, 2015 at 3:12 PM, Bill Moran  wrote:
> On Tue, 3 Feb 2015 14:48:17 -0500
> Adam Hooper  wrote:
>
>> It's doable for us to VACUUM FULL and add a notice to our website
>> saying, "you can't upload files for the next two hours." Maybe that's
>> a better idea?
>
> It's really going to depend on what options you have available. Keep
> in mind that users won't be able to read large objects either, so
> you'll need to disable whatever features of the site view the files
> as well. Whether that's easier or harder depends on how much work it
> would be to disable those features of the site.

For the record, this is what we went with.

To those looking to use large objects (who, um, are already searching
for how to VACUUM FULL pg_largeobject), my advice: if we were to do it
all over again, we'd have used a separate database per "bucket".

For instance, imagine you store uploaded files and processed data in
pg_largeobject. Then some day you migrate the processed data
elsewhere. If uploaded files went in one database and processed data
went into the second, then the uploaded-files database's
pg_largeobject table would remain slim, and you could simply DROP
DATABASE on the other after all clients stopped using it. There
wouldn't be any downtime.
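
In other words, at provisioning time, something like:

CREATE DATABASE uploaded_files;  -- its pg_largeobject stays small
CREATE DATABASE processed_data;  -- can be dropped wholesale once the data moves elsewhere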

My take-away, though, is to avoid the pg_largeobject table whenever
possible. You can move BYTEA data with zero downtime using pg_repack,
but the same can't be said for large objects.

Enjoy life,
Adam


-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] VACUUM FULL pg_largeobject without (much) downtime?

2015-02-03 Thread Adam Hooper
On Tue, Feb 3, 2015 at 2:29 PM, Bill Moran  wrote:
> On Tue, 3 Feb 2015 14:17:03 -0500
> Adam Hooper  wrote:
>
> My recommendation here would be to use Slony to replicate the data to a
> new server, then switch to the new server once the data has synchornized.

Looks exciting. But then I notice: "Slony-I does not automatically
replicate changes to large objects (BLOBS)." [1]

Does that still apply?

It's doable for us to VACUUM FULL and add a notice to our website
saying, "you can't upload files for the next two hours." Maybe that's
a better idea?

Enjoy life,
adam

[1] http://slony.info/documentation/2.2/limitations.html

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] VACUUM FULL pg_largeobject without (much) downtime?

2015-02-03 Thread Adam Hooper
On Tue, Feb 3, 2015 at 12:58 PM, Bill Moran  wrote:
> On Tue, 3 Feb 2015 10:53:11 -0500
> Adam Hooper  wrote:
>
>> This plan won't work: Step 2 will be too slow because pg_largeobject
>> still takes 266GB. We tested `VACUUM FULL pg_largeobject` on our
>> staging database: it took two hours, during which pg_largeobject was
>> locked. When pg_largeobject is locked, lots of our website doesn't
>> work.
>
> Sometimes CLUSTER is faster than VACUUM FULL ... have you tested CLUSTERing
> of pg_largeobject on your test system to see if it's fast enough?

On the 30GB that's left on staging, it takes 50min. Unfortunately, our
staging database is now at 30GB because we already completed a VACUUM
FULL on it. It seems difficult to me to revert that operation. But I
need an orders-of-magnitude difference, and this clearly isn't it.

> How big is the non-lo data?

It's 65GB, but I've used pg_repack to move it to a separate tablespace
so it won't affect downtime.

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




[GENERAL] VACUUM FULL pg_largeobject without (much) downtime?

2015-02-03 Thread Adam Hooper
Hi list,

We run a website. We once stored all sorts of files in pg_largeobject,
which grew to 266GB. This is on an m1.large on Amazon EC2 on a single,
magnetic, non-provisioned-IO volume. In that context, 266GB is a lot.

We've since moved all but 60GB of that data to S3. We plan to reduce
that to 1GB by deleting old, unused data. Of course, pg_largeobject
will still take up 266GB because autovacuum doesn't reduce disk space.

We want to move our entire database to an SSD volume, with as little
downtime as possible. My tentative plan:

1. Use CREATE TABLESPACE and pg_repack to move user tables to a temporary volume
2. Take down Postgres, copy system-table files to the new volume, and
start up Postgres from the new volume
3. Use pg_repack to move everything to the new volume

This plan won't work: Step 2 will be too slow because pg_largeobject
still takes 266GB. We tested `VACUUM FULL pg_largeobject` on our
staging database: it took two hours, during which pg_largeobject was
locked. When pg_largeobject is locked, lots of our website doesn't
work.

How can we move our database without much downtime? Is there a way to
`VACUUM FULL` pg_largeobject without locking it for very long? Aside
from that problem, is the rest of my upgrade plan sane?

For what it's worth, here's some info from VACUUM VERBOSE:

overview=# VACUUM (VERBOSE, ANALYZE) pg_largeobject;
INFO:  vacuuming "pg_catalog.pg_largeobject"
INFO:  scanned index "pg_largeobject_loid_pn_index" to remove 1112630
row versions
DETAIL:  CPU 3.38s/9.89u sec elapsed 69.02 sec.
INFO:  "pg_largeobject": removed 1112630 row versions in 374889 pages
DETAIL:  CPU 7.48s/2.22u sec elapsed 150.44 sec.
INFO:  index "pg_largeobject_loid_pn_index" now contains 29373858 row
versions in 370470 pages
DETAIL:  1112630 index row versions were removed.
279489 index pages have been deleted, 276070 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO:  "pg_largeobject": found 1112622 removable, 231974 nonremovable
row versions in 3189820 out of 34522175 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 10318306 unused item pointers.
0 pages are entirely empty.
CPU 69.81s/28.83u sec elapsed 1402.53 sec.
INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 3 of 34522175 pages, containing
25085 live rows and 0 dead rows; 25085 rows in sample, 24203398
estimated total rows
VACUUM

Enjoy life,
Adam

-- 
Adam Hooper
+1-613-986-3339
http://adamhooper.com




Re: [GENERAL] Conflicting function name in dynamically-loaded shared library

2015-01-29 Thread Adam Mackler
On Wed, Jan 28, 2015 at 11:23:15AM +, Albe Laurenz wrote:

> If you are on Linux, you could try the following patch:
> 
> http://www.postgresql.org/message-id/ca+csw_tpdygnzcyw0s4ou0mtuouhz9pc7mrbpxvd-3zbiwn...@mail.gmail.com
> 
> ...
> Renaming one of the functions seems like the best thing to do.

I am using FreeBSD, so RTLD_DEEPBIND is not supported by my dlopen().

However, your suggestion to rename the function seems at least as
good.  Patching PostgreSQL would just mean trading difficulties when
upgrading the third-party library for difficulties when upgrading
PostgreSQL.

Thank you for the response and insight.

-- 
Adam Mackler




[GENERAL] Conflicting function name in dynamically-loaded shared library

2015-01-27 Thread Adam Mackler
Hi:

I am wanting to define some functions as described in section 35.9 of
the manual, "C-Language Functions."  I am compiling pre-existing files
of c-code from another project into object files, and then linking
those object files into a shared library along with my own functions
that follow the posgresql calling convention, invoking other functions
as desired.

This pre-existing c-code I'm compiling happens to contain a function
named "point_add()".  I see the postgresql source file fmgrtab.c also
has a function with the same name.  When my code tries to invoke its
version of point_add() I get a bus error, and when I changed the name
of that function to something else the bus error went away.

Of course, since I'm working with source code I can just keep the
modified function name, but I would like to be able to keep my version
of this c code updated with that project and so to use it unmodified.

So my questions are: first, might I be wrong about the cause of this
bus error?  I cannot think of another reason why changing the name of
a function would have this effect, but maybe there's some other reason
besides the "point_add()" function in fmgrtab.c conflicting.  If so,
I'm interested to know.

If, however, it is probable that this bus error is a result of this
naming conflict, then do I have any options for working around it that
would enable me to use the code from this other library without
changing the name of its "point_add()" function?  I know I could ask
that project's developers to change the function's name, but that
could break other code that currently uses it, and even if it didn't,
I would prefer something less intrusive on that project.

Thanks very much for any ideas about this,
-- 
Adam Mackler




Re: [GENERAL] table versioning approach (not auditing)

2014-10-02 Thread Adam Brusselback
Ended up running for 28 min, but it did work as expected.

On Thu, Oct 2, 2014 at 10:27 AM, Adam Brusselback  wrote:

> Testing that now.  Initial results are not looking too performant.
> I have one single table which had 234575 updates done to it.  I am rolling
> back 13093 of them.  It's been running 20 min now, using 100% of a single
> core, and almost 0 disk.  No idea how long it'll run at this point.
>
> This is on an i5 desktop with 16 gigs of ram and an ssd.
>
> This is a pretty good test though, as it's a real world use case (even if
> the data was generated with PGBench).  We now know that area needs some
> work before it can be used for anything more than a toy database.
>
> Thanks,
> -Adam
>
> On Thu, Oct 2, 2014 at 7:52 AM, Felix Kunde  wrote:
>
>> Hey there
>>
>> Thanks again for the fix. I was able to merge it into my repo.
>> Also thanks for benchmarking audit. Very interesting results.
>> I wonder how the recreation of former database states scales when
>> processing many deltas.
>> Haven’t done a lot of testing in that direction.
>>
>> I will transfer the code soon to a more public repo on GitHub. As far as
>> I see I have to create an organization for that.
>>
>> Cheers
>> Felix
>>
>> *Gesendet:* Mittwoch, 01. Oktober 2014 um 17:09 Uhr
>>
>> *Von:* "Adam Brusselback" 
>> *An:* "Felix Kunde" 
>> *Cc:* "pgsql-general@postgresql.org" 
>> *Betreff:* Re: [GENERAL] table versioning approach (not auditing)
>>   I know we're kinda hijacking this thread, so sorry for that.  If you'd
>> like to do that, i'd be more than happy to use it and push any fixes /
>> changes upstream.  I don't have much of a preference on the name either, as
>> long as it's something that makes sense.
>>
>> I would consider myself far from an expert though! Either way, more
>> people using a single solution is a good thing.
>>
>> As a side note, I did some benchmarking this morning and wanted to share
>> the results:
>> pgbench -i -s 140 -U postgres pgbench
>>
>> pgbench -c 4 -j 4 -T 600 -U postgres pgbench
>> no auditing tps: 2854
>> NOTE: Accounts are audited
>> auditing tps: 1278
>>
>> pgbench -c 2 -j 2 -N -T 300 -U postgres pgbench
>> no auditing tps: 2504
>> NOTE: Accounts are audited
>> auditing tps: 822
>>
>> pgbench -c 2 -j 2 -T 300 -U postgres pgbench
>> no auditing tps: 1836
>> NOTE: branches and tellers are audited, accounts are not
>> auditing tps: 505
>>
>> I'd love to see if there are some easy wins to boost the performance.
>>
>> On Wed, Oct 1, 2014 at 5:19 AM, Felix Kunde  wrote:
>>>
>>> Hey there. Thank you very much for that fix! Thats why I'd like to have
>>> a joint development and joint testing. It's way more convincing for users
>>> to go for a solution that is tested by some experts than just by a random
>>> developer :)
>>>
>>> I'm open to create a new project and push the code there. Don't care
>>> about the name. Then we might figure out which parts are already good,
>>> which parts could be improved and where to go next. I think switching to
>>> JSONB for example will be easy, as it offers the same functions than JSON
>>> afaik.
>>>
>>>
>>> Gesendet: Dienstag, 30. September 2014 um 21:16 Uhr
>>> Von: "Adam Brusselback" 
>>> An: "Felix Kunde" 
>>> Cc: "pgsql-general@postgresql.org" 
>>> Betreff: Re: [GENERAL] table versioning approach (not auditing)
>>>
>>> Felix, I'd love to see a single, well maintained project. For example, I
>>> just found yours, and gave it a shot today after seeing this post.  I found
>>> a bug when an update command is issued, but the old and new values are all
>>> the same.  The trigger will blow up.  I've got a fix for that, but if we
>>> had one project that more than a handful of people used, stuff like that
>>> would be quashed very quickly.
>>>
>>> I love the design of it by the way. Any idea what it will take to move
>>> to JSONB for 9.4?
>>>
>>>
>>> On Tue, Sep 30, 2014 at 7:22 AM, Felix Kunde 
>>> wrote:Hey
>>>
>>> yes i'm adding an additional key to each of my tables. First i wanted to
>>> use the primary key as one column in my audit_log table, but in some of my
>>> tables the PK consists of more than one column. Plus it's nice to have one
>>> 

Re: [GENERAL] table versioning approach (not auditing)

2014-10-02 Thread Adam Brusselback
Testing that now.  Initial results are not looking too performant.
I have one single table which had 234575 updates done to it.  I am rolling
back 13093 of them.  It's been running 20 min now, using 100% of a single
core, and almost 0 disk.  No idea how long it'll run at this point.

This is on an i5 desktop with 16 gigs of ram and an ssd.

This is a pretty good test though, as it's a real world use case (even if
the data was generated with PGBench).  We now know that area needs some
work before it can be used for anything more than a toy database.

Thanks,
-Adam

On Thu, Oct 2, 2014 at 7:52 AM, Felix Kunde  wrote:

> Hey there
>
> Thanks again for the fix. I was able to merge it into my repo.
> Also thanks for benchmarking audit. Very interesting results.
> I wonder how the recreation of former database states scales when
> processing many deltas.
> Haven’t done a lot of testing in that direction.
>
> I will transfer the code soon to a more public repo on GitHub. As far as I
> see I have to create an organization for that.
>
> Cheers
> Felix
>
> *Gesendet:* Mittwoch, 01. Oktober 2014 um 17:09 Uhr
>
> *Von:* "Adam Brusselback" 
> *An:* "Felix Kunde" 
> *Cc:* "pgsql-general@postgresql.org" 
> *Betreff:* Re: [GENERAL] table versioning approach (not auditing)
>   I know we're kinda hijacking this thread, so sorry for that.  If you'd
> like to do that, i'd be more than happy to use it and push any fixes /
> changes upstream.  I don't have much of a preference on the name either, as
> long as it's something that makes sense.
>
> I would consider myself far from an expert though! Either way, more people
> using a single solution is a good thing.
>
> As a side note, I did some benchmarking this morning and wanted to share
> the results:
> pgbench -i -s 140 -U postgres pgbench
>
> pgbench -c 4 -j 4 -T 600 -U postgres pgbench
> no auditing tps: 2854
> NOTE: Accounts are audited
> auditing tps: 1278
>
> pgbench -c 2 -j 2 -N -T 300 -U postgres pgbench
> no auditing tps: 2504
> NOTE: Accounts are audited
> auditing tps: 822
>
> pgbench -c 2 -j 2 -T 300 -U postgres pgbench
> no auditing tps: 1836
> NOTE: branches and tellers are audited, accounts are not
> auditing tps: 505
>
> I'd love to see if there are some easy wins to boost the performance.
>
> On Wed, Oct 1, 2014 at 5:19 AM, Felix Kunde  wrote:
>>
>> Hey there. Thank you very much for that fix! Thats why I'd like to have a
>> joint development and joint testing. It's way more convincing for users to
>> go for a solution that is tested by some experts than just by a random
>> developer :)
>>
>> I'm open to create a new project and push the code there. Don't care
>> about the name. Then we might figure out which parts are already good,
>> which parts could be improved and where to go next. I think switching to
>> JSONB for example will be easy, as it offers the same functions than JSON
>> afaik.
>>
>>
>> Gesendet: Dienstag, 30. September 2014 um 21:16 Uhr
>> Von: "Adam Brusselback" 
>> An: "Felix Kunde" 
>> Cc: "pgsql-general@postgresql.org" 
>> Betreff: Re: [GENERAL] table versioning approach (not auditing)
>>
>> Felix, I'd love to see a single, well maintained project. For example, I
>> just found yours, and gave it a shot today after seeing this post.  I found
>> a bug when an update command is issued, but the old and new values are all
>> the same.  The trigger will blow up.  I've got a fix for that, but if we
>> had one project that more than a handful of people used, stuff like that
>> would be quashed very quickly.
>>
>> I love the design of it by the way. Any idea what it will take to move to
>> JSONB for 9.4?
>>
>>
>> On Tue, Sep 30, 2014 at 7:22 AM, Felix Kunde 
>> wrote:Hey
>>
>> yes i'm adding an additional key to each of my tables. First i wanted to
>> use the primary key as one column in my audit_log table, but in some of my
>> tables the PK consists of more than one column. Plus it's nice to have one
>> key that is called the same over all tables.
>>
>> To get a former state for one row at date x I need to join the latest
>> delta BEFORE date x with each delta AFTER date x. If I would log complete
>> rows, this joining part would not be neccessary, but as I usually work with
>> spatial databases that have complex geometries and also image files, this
>> strategy is too harddisk consuming.
>>
>> If there are more users following a similar approach, I wonder why we not
>> throw all the

Re: [GENERAL] table versioning approach (not auditing)

2014-10-01 Thread Adam Brusselback
I know we're kinda hijacking this thread, so sorry for that.  If you'd like
to do that, I'd be more than happy to use it and push any fixes / changes
upstream.  I don't have much of a preference on the name either, as long as
it's something that makes sense.

I would consider myself far from an expert though! Either way, more people
using a single solution is a good thing.

As a side note, I did some benchmarking this morning and wanted to share
the results:
pgbench -i -s 140 -U postgres pgbench

pgbench -c 4 -j 4 -T 600 -U postgres pgbench
no auditing tps: 2854
NOTE: Accounts are audited
auditing tps: 1278

pgbench -c 2 -j 2 -N -T 300 -U postgres pgbench
no auditing tps: 2504
NOTE: Accounts are audited
auditing tps: 822

pgbench -c 2 -j 2 -T 300 -U postgres pgbench
no auditing tps: 1836
NOTE: branches and tellers are audited, accounts are not
auditing tps: 505

I'd love to see if there are some easy wins to boost the performance.

On Wed, Oct 1, 2014 at 5:19 AM, Felix Kunde  wrote:

> Hey there. Thank you very much for that fix! Thats why I'd like to have a
> joint development and joint testing. It's way more convincing for users to
> go for a solution that is tested by some experts than just by a random
> developer :)
>
> I'm open to create a new project and push the code there. Don't care about
> the name. Then we might figure out which parts are already good, which
> parts could be improved and where to go next. I think switching to JSONB
> for example will be easy, as it offers the same functions than JSON afaik.
>
>
> Gesendet: Dienstag, 30. September 2014 um 21:16 Uhr
> Von: "Adam Brusselback" 
> An: "Felix Kunde" 
> Cc: "pgsql-general@postgresql.org" 
> Betreff: Re: [GENERAL] table versioning approach (not auditing)
>
> Felix, I'd love to see a single, well maintained project. For example, I
> just found yours, and gave it a shot today after seeing this post.  I found
> a bug when an update command is issued, but the old and new values are all
> the same.  The trigger will blow up.  I've got a fix for that, but if we
> had one project that more than a handful of people used, stuff like that
> would be quashed very quickly.
>
> I love the design of it by the way. Any idea what it will take to move to
> JSONB for 9.4?
>
>
> On Tue, Sep 30, 2014 at 7:22 AM, Felix Kunde 
> wrote:Hey
>
> yes i'm adding an additional key to each of my tables. First i wanted to
> use the primary key as one column in my audit_log table, but in some of my
> tables the PK consists of more than one column. Plus it's nice to have one
> key that is called the same over all tables.
>
> To get a former state for one row at date x I need to join the latest
> delta BEFORE date x with each delta AFTER date x. If I would log complete
> rows, this joining part would not be neccessary, but as I usually work with
> spatial databases that have complex geometries and also image files, this
> strategy is too harddisk consuming.
>
> If there are more users following a similar approach, I wonder why we not
> throw all the good ideas together, to have one solution that is tested,
> maintained and improved by more developpers. This would be great.
>
> Felix
>
>
> Gesendet: Montag, 29. September 2014 um 23:25 Uhr
> Von: "Abelard Hoffman"  ]>
> An: "Felix Kunde" 
> Cc: "pgsql-general@postgresql.org[pgsql-general@postgresql.org]" <
> pgsql-general@postgresql.org[pgsql-general@postgresql.org]>
> Betreff: Re: [GENERAL] table versioning approach (not auditing)
>
> Thank you Felix, Gavin, and Jonathan for your responses.
>
> Felix & Jonathan: both of you mention just storing deltas. But if you do
> that, how do you associate the delta record with the original row? Where's
> the PK stored, if it wasn't part of the delta?
>
> Felix, thank you very much for the example code. I took a look at your
> table schemas. I need to study it more, but it looks like the way you're
> handling the PK, is you're adding a separate synthethic key (audit_id) to
> each table that's being versioned. And then storing that key along with the
> delta.
>
> So then to find all the versions of a given row, you just need to join the
> audit row with the schema_name.table_name.audit_id column. Is that right?
> The only potential drawback there is there's no referential integrity
> between the audit_log.audit_id and the actual table.
>
> I do like that approach very much though, in that it eliminates the need
> to interrogate the json data in order to perform most queries.
>
> AH
>
>
>
> On Mon, Sep 29, 2014 at 12:26 AM, Felix Kunde  wrote: Hey

Re: [GENERAL] table versioning approach (not auditing)

2014-09-30 Thread Adam Brusselback
Felix, I'd love to see a single, well maintained project. For example, I
just found yours and gave it a shot today after seeing this post.  I found
a bug: when an UPDATE is issued but the old and new values are all the
same, the trigger blows up.  I've got a fix for that, but if we
had one project that more than a handful of people used, stuff like that
would be quashed very quickly.
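
The gist of the fix is just an early-exit guard at the top of the trigger
function, something like this sketch (the function name is made up, and the
real function in the audit project obviously does more):

CREATE OR REPLACE FUNCTION log_delta_example() RETURNS trigger AS $$
BEGIN
    -- bail out when an UPDATE didn't actually change anything,
    -- so we don't try to build an empty JSON delta
    IF TG_OP = 'UPDATE' AND ROW(NEW.*) IS NOT DISTINCT FROM ROW(OLD.*) THEN
        RETURN NEW;
    END IF;

    -- ...build and insert the JSON delta of changed columns here...
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;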

I love the design of it by the way. Any idea what it will take to move to
JSONB for 9.4?


On Tue, Sep 30, 2014 at 7:22 AM, Felix Kunde  wrote:

> Hey
>
> Yes, I'm adding an additional key to each of my tables. First I wanted to
> use the primary key as one column in my audit_log table, but in some of my
> tables the PK consists of more than one column. Plus it's nice to have one
> key that is called the same over all tables.
>
> To get a former state for one row at date x I need to join the latest
> delta BEFORE date x with each delta AFTER date x. If I logged complete
> rows, this joining part would not be necessary, but as I usually work with
> spatial databases that have complex geometries and also image files, that
> strategy consumes too much disk space.
>
> If there are more users following a similar approach, I wonder why we do
> not throw all the good ideas together, to have one solution that is tested,
> maintained and improved by more developers. This would be great.
>
> Felix
>
>
> Gesendet: Montag, 29. September 2014 um 23:25 Uhr
> Von: "Abelard Hoffman" 
> An: "Felix Kunde" 
> Cc: "pgsql-general@postgresql.org" 
> Betreff: Re: [GENERAL] table versioning approach (not auditing)
>
> Thank you Felix, Gavin, and Jonathan for your responses.
>
> Felix & Jonathan: both of you mention just storing deltas. But if you do
> that, how do you associate the delta record with the original row? Where's
> the PK stored, if it wasn't part of the delta?
>
> Felix, thank you very much for the example code. I took a look at your
> table schemas. I need to study it more, but it looks like the way you're
> handling the PK, is you're adding a separate synthetic key (audit_id) to
> each table that's being versioned. And then storing that key along with the
> delta.
>
> So then to find all the versions of a given row, you just need to join the
> audit row with the schema_name.table_name.audit_id column. Is that right?
> The only potential drawback there is there's no referential integrity
> between the audit_log.audit_id and the actual table.
>
> I do like that approach very much though, in that it eliminates the need
> to interrogate the json data in order to perform most queries.
>
> AH
>
>
>
> On Mon, Sep 29, 2014 at 12:26 AM, Felix Kunde  wrote:
>
> Hey
>
> I've also tried to implement a database versioning using JSON to log
> changes in tables. Here it is:
> https://github.com/fxku/audit
> I've got two versioning tables, one storing information about all
> transactions that happened and one where I put the JSON logs of row changes
> of each table. I'm only logging old values and not complete rows.
>
> Then I got a function that recreates a database state at a given time into
> a separate schema - either to VIEWs, MVIEWs or TABLES. This database state
> could then be indexed in order to work with it. You can also reset the
> production state to the recreated past state.
>
> Unfortunately I've got no time to further work on it at the moment, and I
> have not done tests with many changes in the database, so I can't say if the
> recreation process scales well. One downside I've realised is that using the
> json_agg function has limits when I've got binary data. It gets too long.
> So I'm really looking forward using JSONB.
>
> There are more plans in my mind. By having a Transaction_Log table it
> should be possible to revert only certain transactions. I'm also thinking
> of parallel versioning, e.g. different users are all working with their
> version of the database and commit their changes to the production state.
> As I've got a unique history ID for each table and each row, I should be
> able to map the affected records.
>
> Have a look and tell me what you think of it.
>
> Cheers
> Felix
>
>
> Gesendet: Montag, 29. September 2014 um 04:00 Uhr
> Von: "Abelard Hoffman" 
> An: "pgsql-general@postgresql.org" 
> Betreff: [GENERAL] table versioning approach (not auditing)
>
> Hi. I need to maintain a record of all changes to certain tables to assist
> in viewing history and reverting changes when necessary (customer service
> makes an incorrect edit, etc.).
>
> I have studied these two audit trigger examples:
>
> https://wiki.postgresql.org/wiki/Audit_trigger
> https://wiki.postgresql.org/wiki/Audit_trigger_91plus
>
> I've also read about two other approaches to versioning:
> 1. maintain all versions in one table, with a flag to indicate which is
> the current version
> 2. have a separate versions table for each real table, and insert into the
> associa

Re: [GENERAL] Re: User-defined operator function: what parameter type to use for uncast character string?

2014-07-31 Thread Adam Mackler
On Thu, Jul 31, 2014 at 10:03:00AM -0400, Tom Lane wrote:
> 2. text is the preferred type among the string class, so any case where
> you have text on one side and some other string type on the other is
> going to get resolved as text vs text.
> Because of #1, domain-specific functions and operators tend to be pretty
> useless; you find yourself always having to cast the other side to get
> 
> If you're intent on having this behavior, the way to go at it is to make
> your own actual datatype (not a domain) and create all your own comparison
> operators for it.  You can add an implicit cast to text for cases where

Thanks to everyone who responded to my question.  Yes, the
case-insensitivity was an example contrivance.  My actual operator
function is using regular expressions to do text replacements, so
citext is not going to solve all my problems.  It looks like 'CREATE
TYPE' is the only way to get what I'm after here, and since there are
regular expressions involved, a C-language solution does not seem like
it is going to be very convenient.  On top of that, the contents of
my regular-expression replacement strings are constructed from data
in my database, so really my SQL-language operator function seemed
ideal except--of course--for the limitations you all have explained to
me.

One final question: the 'CREATE CAST' command got my interest.  I'm
assuming that when the docs say it 'performs a conversion between two
data types,' that the meaning of "data type" includes only those
created using 'CREATE TYPE' and excludes domains.  If I am mistaken on
that point I would be grateful to learn of that mistake.

Thanks again,
-- 
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Re: User-defined operator function: what parameter type to use for uncast character string?

2014-07-31 Thread Adam Mackler
On Wed, Jul 30, 2014 at 10:59:28PM -0700, David G Johnston wrote:
> ISTM that if this was supported you would be doing it correctly.  

Thank you for the quick response.  I'm not understanding you.  Could you 
elaborate?

> The main problem is you are abusing DOMAIN - which is strictly the
> base type with constraints - and trying to add operators specific to
> the DOMAIN (i.e., ones that would not work with the base type).

Can you explain what the abuse is?  Also why the "=" operator does not
work even without the domain?  If I do everything the same, but make
the column type "char(3)" rather than "my_domain", then still I need
to cast the literal in the query to "text".  I'm not understanding (1)
why postgres doesn't use the type of the column--either char(3) or
my_domain--and then choose the operator function that has that type
for its first parameter even where the second is "unknown", and (2)
why using the cast "WHERE val='abc'::text" makes it work.  It seems as
if the determining factor is the type of the literal in the
WHERE clause.

> And so now you have "domain = unknown" and the system is trying to
> figure out what unknown should be and also which operator to pick
> and it decides that since =(text,text) covers the domain and the
> unknown that is what it will pick.

If =(text,text) is chosen where the second argument is "unknown"
rather than "text", then postgres has no problem deciding that a
parameter defined for "text" will handle "unknown", right?  So if I
define =(my_domain,text) or =(char(3),text) then wouldn't those be
preferable to =(text,text) where the first argument type can be known
from the type of the column (as defined in the table) used as the
first argument to the operator function?

If I give my operator the unique name "" then my operator function
is chosen without the cast.  That makes it seem like a matter of
priority, where the built-in "=" operator takes priority over my
user-defined one.  If postgres will accept my operator as being
appropriate, isn't there some way to give it priority over the built-in
=(text,text) operator?

In other words, "\d my_table" shows the type of the column as
"my_domain" or "char(3)" depending on how I define it.  So why isn't
that taken into account when choosing the operator function when the
second argument is "unknown"?

 
> Maybe you should consider using an "enum"

I don't see how I can use an enum.  Wouldn't that require defining all
possible combinations, which even in this limited example would be 26^3
values?  I just used three uppercase letters (and case-insensitive
matching) as an example.  In my actual application I have twelve
characters and the operator function is doing more advanced
regular-expression matching.


Thanks again,
-- 
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] User-defined operator function: what parameter type to use for uncast character string?

2014-07-30 Thread Adam Mackler
(Cross-posted to StackOverflow: 
http://stackoverflow.com/questions/25041100/postgresql-user-defined-operator-function-what-parameter-type-to-use-for-uncast
 )

I'm defining my own domain and an equality operator.  I cannot cause
PostgreSQL to use my operator function in a query without explicitly
casting a character-string argument to the type I used in defining the
operator.  My question is how to cause my custom operator to be used
without the cast.

As an example: first I define a domain and a table column of that
type.  For this example, the type requires its values to be three
uppercase letters A, B or C, and the equality operator will cause
queries to match regardless of case.

CREATE domain my_domain as char(3) check(VALUE similar to '[A-C]{3}');
CREATE TABLE my_table (val my_domain);
INSERT INTO my_table VALUES ('ABC');

The type of the column is my_domain:

sandbox=> \d my_table
Table "public.my_table"
 Column |   Type| Modifiers 
+---+---
 val| my_domain | 

Before defining the custom equality operator, case-sensitive queries
work as I expect.  The row in the table is capital letters, so the
query must contain capital letters to match the row

sandbox=> SELECT * FROM my_table WHERE val='abc';
 val 
-
(0 rows)

sandbox=> SELECT * FROM my_table WHERE val='ABC';
 val 
-
 ABC
(1 row)

Next I create an equality operator to do case-insensitive matching:

CREATE FUNCTION my_equals(this my_domain, that text) RETURNS boolean AS
'SELECT CAST (this AS text) = upper(that)' LANGUAGE SQL;
CREATE OPERATOR = (procedure=my_equals, leftarg=my_domain, rightarg = text);

The new operator is invoked causing a query containing lowercase
letters to match the uppercase column value, but only if I cast the
type of the WHERE clause:

sandbox=> SELECT * FROM my_table WHERE val=CAST ('abc' AS text);
 val 
-
 ABC
(1 row)

sandbox=> SELECT * FROM my_table WHERE val='abc';
 val 
-
(0 rows)

Question: What can I do so my custom equality operator is used without
the cast?  In other words, how to cause the last query above return
the table row (without changing the query)?  I have tried defining
my_equals() so its second parameter type is either varchar or
char(3), but those still require a cast in the WHERE-clause of the
query.  I've also tried anyelement, but that does not work even with a
cast.

Thank you,
-- 
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 8.4 -> 9.1 : ANALYSE VERBOSE; -> out of shared memory

2014-05-12 Thread Souquieres Adam

On 12/05/2014 16:24, Tom Lane wrote:

Souquieres Adam  writes:

When I relaunch my ANALYSE VERBOSE, the pg_locks table grows quickly from 20
lines to more than 1000 lines and keeps growing; all the lines are owned
by the same "virtual transaction" and the same "pid".

Hm.  I experimented a bit and looked at the code, and I find I was not
remembering ANALYZE's behavior exactly right: it only processes all the
tables inside one transaction if you start it inside a transaction block
(ie, after BEGIN, or inside a function).  If you just run it by itself
then it does a transaction per table, just like VACUUM.  So I'm thinking
there's something you're not telling us about exactly how you invoke
ANALYZE.


OK, thank you for your help; with your hints we finally understood what
the problem is.


In our production environment, which is "living", we have 4200 tables 
(97% are generated by our program for BI performance) and not only 500 ( 
the real ones) ... this is the difference between test env and prod env,


Moreover, we launch the ANALYSE VERBOSE using JDBC with an ORM, and I
think it adds BEGIN; and END;.


I just tested it in pgAdmin:

 * without BEGIN/END, there are not so many locks,
 * with BEGIN/END, there is a lock explosion, which is normal given the
   number of tables involved.


The solution must be to define a *better strategy for table
statistics*... we don't really need stats on all the tables because
most of them are static...
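
As a first step, something along these lines could generate per-table
ANALYZE statements for the real tables only (a rough sketch; the schema name
in the filter is an assumption about our layout, not the real one):

    SELECT format('ANALYZE VERBOSE %I.%I;', schemaname, relname)
      FROM pg_stat_user_tables
     WHERE schemaname <> 'bi_generated'   -- assumed schema for generated tables
     ORDER BY schemaname, relname;

Running the generated statements with autocommit on keeps each ANALYZE in its
own transaction, so the lock table never has to hold locks on all 4200 tables
at once.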


Thank you very much

Regards,

Adam



By the time I finished writing this email, I had hit more than 3200 locks
owned by the same transaction!

Could you show us some of those locks (a few dozen lines from pg_locks)?

regards, tom lane




Re: [GENERAL] 8.4 -> 9.1 : ANALYSE VERBOSE; -> out of shared memory

2014-05-12 Thread Souquieres Adam
I just hit 20k locks in pg_locks, on 18k different relations owned
by the same virtual transaction and PID.


I only have about 500 tables and about 2k indexes; I must be missing something.


On 12/05/2014 15:42, Tom Lane wrote:

Souquieres Adam  writes:

Should ANALYSE VERBOSE; use only one transaction, or one transaction per
table it analyses?

ANALYZE is just a simple statement: it doesn't start or stop any
transactions.  So all the locks will be acquired in the calling
transaction.

You might be better off using VACUUM ANALYZE, which although it
does more work will divide the work into a transaction per table.

regards, tom lane






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 8.4 -> 9.1 : ANALYSE VERBOSE; -> out of shared memory

2014-05-12 Thread Souquieres Adam
When I relaunch my ANALYSE VERBOSE, the pg_locks table grows quickly from 20
lines to more than 1000 lines and keeps growing; all the lines are owned
by the same "virtual transaction" and the same "pid".


max_locks_per_transaction is 128, so I don't understand what is happening.

By the time I finished writing this email, I had hit more than 3200 locks
owned by the same transaction!


Can you explain what the difference is between 8.4 and 9.1 on this point,
please?


regards,
Adam

On 12/05/2014 15:33, Souquieres Adam wrote:

Hi,

thanks to you both for your quick answers,




On 12/05/2014 15:29, Tom Lane wrote:

Merlin Moncure  writes:

On Mon, May 12, 2014 at 7:57 AM, Souquieres Adam
 wrote:
when we play: ANALYSE VERBOSE; (stats on all databases, with 500 tables and
1 TB of data in all tables)
we now have this message:
org.postgresql.util.PSQLException: ERROR: out of shared memory
Indice : You might need to increase max_locks_per_transaction.
max_connections = 150
max_locks_per_transaction = 128 # was at default val (64?), we already tried
to increase it without success

How high did you increase it?  It's not uncommon to have to raise that
parameter significantly if you have a lot of tables.  Try 2048.

We changed the parameter from the default value (64) to 128.


It's unsurprising for analyze across 500 tables to require 500 locks.
However, with those settings you should already have 150*128 = 19200
slots in the shared lock table, so there's no way that the analyze
is eating them all.  What else is going on in the system?  How many
entries do you see in pg_locks while this is happening?

regards, tom lane


Should ANALYSE VERBOSE; use only one transaction, or one transaction
per table it analyses?


Anyway, I will try to list the pg_locks table during this issue and I'll
post you the result.



Adam






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 8.4 -> 9.1 : ANALYSE VERBOSE; -> out of shared memory

2014-05-12 Thread Souquieres Adam

Hi,

thanks to you both for your quick answers,




On 12/05/2014 15:29, Tom Lane wrote:

Merlin Moncure  writes:

On Mon, May 12, 2014 at 7:57 AM, Souquieres Adam
 wrote:

when we play: ANALYSE VERBOSE; (stats on all databases, with 500 tables and
1 TB of data in all tables)
we now have this message:
org.postgresql.util.PSQLException: ERROR: out of shared memory Indice : You
might need to increase max_locks_per_transaction.
max_connections = 150
max_locks_per_transaction = 128 # was at default val (64?), we already tried
to increase it without success

How high did you increase it?  It's not uncommon to have to raise that
parameter significantly if you have a lot of tables.  Try 2048.

We changed the parameter from the default value (64) to 128.


It's unsurprising for analyze across 500 tables to require 500 locks.
However, with those settings you should already have 150*128 = 19200
slots in the shared lock table, so there's no way that the analyze
is eating them all.  What else is going on in the system?  How many
entries do you see in pg_locks while this is happening?

regards, tom lane


Should ANALYSE VERBOSE; use only one transaction, or one transaction per
table it analyses?


Anyway, I will try to list the pg_locks table during this issue and I'll
post you the result.
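
(For the record, a minimal sketch of the kind of query I mean, not
necessarily the exact one I will run:)

    SELECT pid, virtualtransaction, locktype, count(*)
      FROM pg_locks
     GROUP BY pid, virtualtransaction, locktype
     ORDER BY count(*) DESC;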



Adam


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 8.4 -> 9.1 : ANALYSE VERBOSE; -> out of shared memory

2014-05-12 Thread Souquieres Adam

Hi,

we have a problem since we migrated from 8.4 to 9.1.

when we play: ANALYSE VERBOSE; (stats on all databases, with 500 tables
and 1 TB of data in all tables)


we now have this message :

 org.postgresql.util.PSQLException: ERROR: out of shared memory Indice 
: You might need to increase max_locks_per_transaction.


When we were on 8.4, there was no error.

Here is our specific postgresql.conf configuration on the server:

default_statistics_target = 200
maintenance_work_mem = 1GB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 7GB
work_mem = 48MB
wal_buffers = 32MB
checkpoint_segments = 64
shared_buffers = 2304MB
max_connections = 150
random_page_cost = 2.0
max_locks_per_transaction = 128 # was at default val (64?); we already
tried to increase it without success


Do you have any suggestions?
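
(Context for later readers of the thread: the parameter in question lives in
postgresql.conf and needs a server restart; the value below is illustrative,
not our final setting.)

    # shared lock table capacity is roughly
    #   max_locks_per_transaction * (max_connections + max_prepared_transactions)
    max_locks_per_transaction = 2048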




[GENERAL] json datatype and table bloat?

2013-11-04 Thread Adam Jelinek
Along the lines of the equality operator: I have run into issues trying to
pivot a table/result set with a json type due to what seemed to be no
equality operator.
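
A minimal illustration of the sort of thing that fails, and the usual
workaround (the table and column names are made up):

    -- json has no equality operator, so "GROUP BY json_data" errors out;
    -- casting to text works around it, though text comparison is sensitive
    -- to key order and whitespace (jsonb in 9.4 should handle this properly)
    SELECT json_data::text, count(*)
      FROM my_table
     GROUP BY json_data::text;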



On Nov 4, 2013 10:14 AM, "Merlin Moncure"
>
wrote:

> On Fri, Nov 1, 2013 at 1:50 PM, Tom Lane  wrote:
> > Gregory Haase  writes:
> >> The json_data column is not accounted for in pg_stats:
> >
> > Ah!  I hadn't twigged to the fact that your bloat measurement approach
> > assumed you had pg_stats entries for all the columns.
> >
> >> So I'm not sure if I'd actually qualify this as a "bug", but it appears
> >> that there is no way to currently get stats on a json data type.
> >
> > ANALYZE currently punts on columns that don't have an equality operator,
> > which json does not.  There isn't that much in the way of stats that we
> > could collect, though I suppose we could still compute average datum
> width
> > and null fraction.
> >
> > I'm not sure whether there are plans to invent an equality operator for
> > json.
>
> IMNSO, this may not be a bug, but it's pretty close.  All base types
> should have equality operator as well as other supporting
> infrastructure that the database itself depends on (in/out and
> send/receive for example).  This is a pretty good example of why.
>
> merlin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


-- 
Sent from Gmail Mobile


Re: [GENERAL] Explanantion on pgbouncer please

2013-10-31 Thread Adam Brusselback
For where you are measuring, everything looks normal to me.


Your application will make connections to the pooler as needed, and the
pooler will assign the application connection to a database connection it
has available in its pool.  This gets rid of the overhead of creating a
brand new connection to the DB right from the application for a small
query, as connecting to the pooler is much less overhead.

The pooler should eventually get to the maximum number of allowed
connections.  That is just how it's supposed to work.  It is not supposed
to close any connections after your application no longer needs them; it
just keeps them open and adds them back to the available pool for your
application to use.
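
For illustration, a minimal pgbouncer.ini fragment showing the knobs
involved (the values here are assumptions, not taken from your setup):

    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_port = 6432
    ; "session" or "transaction", depending on what the app can tolerate
    pool_mode = session
    ; server connections kept per user/database pair
    default_pool_size = 20
    ; client-side connections PgBouncer will accept
    max_client_conn = 600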


On Thu, Oct 31, 2013 at 10:49 AM, si24  wrote:

> I use a check in postgres to give the active connections being used.
>
> But what happens if the connection pooler goes all the way to 100, for
> example, and say that 100 is your Postgres maximum connections at the time?
> I know I can change the maximum connections in Postgres, but am not too sure
> on the right balance. If I have, say, 600 people looking at my map, is that
> not then several calls to Postgres to show them the map that each person is
> looking at on the computer?
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Explanantion-on-pgbouncer-please-tp5776515p5776524.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Explanantion on pgbouncer please

2013-10-31 Thread Adam Brusselback
Where are you measuring the connections? From your app to PGBouncer, or
from PGBouncer to PostgreSQL?

If it is from your app to PGBouncer, that sounds strange, and like the app
is not properly releasing connections as it should.  If it is from
PGBouncer to PostgreSQL, that sounds normal.  I haven't used PGBouncer, but
i've used other connection poolers in the past.  They would start out with
a set number of minimum connections, and scale up to the max number
specified as needed.  The pooler wouldn't release a connection to the DB
once it has made one, and just kept it available in the pool.




On Thu, Oct 31, 2013 at 10:25 AM, si24  wrote:

> Can someone please give me a bit better explanation of how exactly
> pgbouncer works, as I am now lost.
>
> I'm not sure if it is pooling the connections, because surely if it's not
> being used the connections should go down, not up. I.e. I run the webpage
> which has my map running, which is an OpenLayers map reading GeoServer, and
> all my data on GeoServer is from a database in Postgres. When you start the
> web page it goes to 46 connections, and after moving around for a while and
> selecting the different overlays that I have on the map it goes up to 75
> connections. After not touching it for a while nothing happens; the
> connections don't go up or down. But when I move the map around and zoom,
> the connections increase again to 84 connections.
>
> Please help I'm stuck
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Explanantion-on-pgbouncer-please-tp5776515.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Bug? Function with side effects not evaluated in CTE

2013-10-18 Thread Adam Jelinek
>Would help to include the explain(s).  Did you ANALYZE after the insert; if
>not the planner probably still thought the table was empty (thus the
>matching explain) but upon execution realized it had records and thus needed
>to run the CTE.

I did not do an ANALYZE after the insert; I think the plan would still be
the same either way.  I did what I should have done to start with and ran
EXPLAIN ANALYZE on the query, which showed that it found (or did not find)
matching rows.  After reading the remaining emails I think I understand.
Thanks for explaining.

Here are my two cents (take them for what they are worth). I agree with Merlin
on this.  I work as a developer at a large corporation, and in my experience
very few of the developers can write "good" SQL/data access, and only
a fraction of them even try to understand the planner.  Although the
behavior makes sense (from what was explained above), it does not do what one
would expect (the same thing every time).  Then when you read the manual
on VOLATILE it states the optimizer makes no assumptions about such functions,
which results in people asking why this is happening.



On Fri, Oct 18, 2013 at 3:39 PM, David Johnston  wrote:

> ajeli...@gmail.com wrote
> > but if I insert one row before I run the sql the CTE is
> > executed and I get a new row in the table.  I was hoping that I would see
> > a
> > difference in the explain, but the explain with an empty table where the
> > CTE is *not* executed is identical to the explain where there is one row
> > in
> > the table already and the CTE *is* executed resulting in a new row.
>
> Would help to include the explain(s).  Did you ANALYZE after the insert; if
> not the planner probably still thought the table was empty (thus the
> matching explain) but upon execution realized it had records and thus
> needed
> to run the CTE.
>
> Since the executor cannot fully trust the statistics, and a full scan of an
> empty table would be very fast, scanning the table to delete would be a
> necessary first step before running the CTE for the secondary conditions
> (where clause).  An implicit first-condition/result is that a DELETE on an
> empty table is effectively a No-Op.  The only reason to override that no-op
> would be if a CTE needs to be run by policy as Tom noted.
>
>
> > I thought maybe Postgres was not executing the CTE because it knows that
> > there are no rows in the table for it to delete, however if I change the
> > CTE to be an insert returning instead of a function I get different
> > results.  Even when the table is empty I get new row created.
>
> Like Tom said, if you don't hide the INSERT inside a function the CTE will
> always be executed.
>
> David J.
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Bug-Function-with-side-effects-not-evaluated-in-CTE-tp5774792p5775095.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Bug? Function with side effects not evaluated in CTE

2013-10-18 Thread Adam Jelinek
I thought this was interesting, and wanted to make sure I understood what
is going on, but the more tests I run the more confused I get.

If I take the exact setup outlined by Mosche I get the same results in 9.3
(as expected), but if I insert one row before I run the SQL the CTE is
executed and I get a new row in the table.  I was hoping that I would see a
difference in the explain, but the explain with an empty table where the
CTE is *not* executed is identical to the explain where there is one row in
the table already and the CTE *is* executed resulting in a new row.  I
thought maybe Postgres was not executing the CTE because it knows that
there are no rows in the table for it to delete; however, if I change the
CTE to be an INSERT ... RETURNING instead of a function I get different
results.  Even when the table is empty I get a new row created.

I would really like to know "why" it is working like this so something
similar does not come back and bite me in the future.

Thanks


On Wed, Oct 16, 2013 at 7:14 PM, Rowan Collins wrote:

>  On 17/10/2013 00:06, Merlin Moncure wrote:
>
> That being said, I do think it might be better behavior (and still
> technically correct per the documentation) if volatile query
> expressions were force-evaluated.
>
>
> This sounds reasonable for a "yes or no" case like this, but wouldn't it
> raise the question of *how many times* the function should be evaluated?
>
> What if the query looked more like this:
>
> with tt_created as
> (
> select fn_new_item(foo) as item
> from some_huge_table
> )
> select item
> from tt_created
> limit 10
>
>
> Should the CTE be calculated in its entirety, running the function for
> every row in some_huge_table? Or should it run at most 10 times?
>
> Which is desired would depend on the situation, but there's no real way to
> indicate in the syntax.
>
> --
> Rowan Collins
> [IMSoP]
>
>


[GENERAL] A view representing a schedule for all days into the future

2013-10-10 Thread Adam Mackler
Hi:

I recently posted a question on stackoverflow, but I suspect it may
require specific PostgreSQL knowledge, so I'm cross-posting a
reference to it here.

http://stackoverflow.com/questions/19237450/can-sql-view-have-infinite-number-of-rows-repeating-schedule-each-row-a-day

The gist is that I have information representing companies' daily
schedules in some tables, and I want to be able to do a SELECT to get
the business hours for any given day, arbitrarily far into the future
(and past, but I can live with a limit in that direction).  I want to
encapsulate any messy SQL in a single location where it can be hidden
from client usage, presumably in a view.

I currently have a user-defined function that returns the results I
want, but the problem is in the invocation: Some host-language client
libraries aren't so graceful with user-defined functions, especially
when they return multiple rows of multiple columns.  I would like to
be able to do a simple table-style SELECT query, presumably on a view.
But if I can put any date into the WHERE clause, then that means the
view would represent a table with an infinite number of rows, would it
not?
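
For concreteness, one possible shape of such a view (purely illustrative; it
assumes a weekly_schedule table with a day-of-week column, which is not my
real schema) is to expand the weekly pattern over a wide but finite date
range and let the WHERE clause pick out the day:

    CREATE VIEW daily_hours AS
    SELECT d::date AS day, w.opens, w.closes
      FROM generate_series(date '2000-01-01',
                           date '2100-01-01',
                           interval '1 day') AS g(d)
      JOIN weekly_schedule AS w
        ON w.dow = extract(dow FROM g.d)::int;

The range is finite, so it is not truly "all days into the future", and the
series is generated before the date filter is applied, but it does hide the
messy SQL behind a plain SELECT.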

The posting on SO clarifies the specifics of what I'm trying to do.  It
seems like there ought to be a way, but I haven't figured it out.

Thanks very much.
-- 
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can a view represent a schedule for all days into the future?

2013-10-10 Thread Adam Mackler
On Thu, Oct 10, 2013 at 10:42:47AM -0700, David Johnston wrote:
> Adam Mackler-3 wrote
> > http://stackoverflow.com/questions/19237450/can-sql-view-have-infinite-number-of-rows-repeating-schedule-each-row-a-day
> > 
> Not sure how you can state "But I'm willing to agree never to query such a
> view without a WHERE clause that restricts the number of rows." when you
> cannot even guarantee which host-language client libraries you need to
> support.

I'm willing to agree that if I ever query such a view without a WHERE
clause that restricts the number of rows, then I won't blame anyone
but myself for the consequences.  If someone can answer this challenge
without imposing that requirement, then I shall be all the more
impressed.

The library I'm using is SLICK:

http://slick.typesafe.com/

As far as I can tell it can't handle user-defined functions that
return multiple rows nor multiple columns, except perhaps through some
low-level SQL interface that would defeat my idea of the purpose of
using a database interface library.

As I mention in the post on SO, my understanding of relational
database theory tells me that tables/relations ARE functions, in the
sense of being a mapping between a primary key and a row.  So there's
nothing ambiguous about issuing a SELECT query for some particular
date, and if I'm only querying for one date, then there's only one day
in the resulting table.

It wouldn't surprise me if there were some way to create a view based
on underlying tables that contain schedules for each day of the week
that I could query in such a fashion for any day arbitrarily far into
the future.  If this is possible, then I would be interested in
knowing what the creation of such a view looks like, independently of
my immediate practical needs.  Call me academic.

--
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Can a view represent a schedule for all days into the future?

2013-10-10 Thread Adam Mackler
Hi:

I recently posted a question on stackoverflow, but I suspect it may
require specific PostgreSQL knowledge, so I'm cross-posting a
reference to it here.

http://stackoverflow.com/questions/19237450/can-sql-view-have-infinite-number-of-rows-repeating-schedule-each-row-a-day

The gist is that I have information representing companies' daily
schedules in some tables, and I want to be able to do a SELECT to get
the business hours for any given day, arbitrarily far into the future
(and past, but I can live with a limit in that direction).  I want to
encapsulate any messy SQL in a single location where it can be hidden
from client usage, presumably in a view.

I currently have a user-defined function that returns the results I
want, but the problem is in the invocation: Some host-language client
libraries aren't so graceful with user-defined functions, especially
when they return multiple rows of multiple columns.  I would like to
be able to do a simple table-style SELECT query, presumably on a view.
But if I can put any date into the WHERE clause, then that means the
view would represent a table with an infinite number of rows, would it
not?

The posting on SO clarifies the specifics of what I'm trying to do.  It
seems like there ought to be a way, but I haven't figured it out.

Thanks very much.
-- 
Adam Mackler


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hi, Friends, are there any ETL tools (free or commercial) available for PostgreSQL?

2013-10-08 Thread Adam Jelinek
http://kettle.pentaho.com/ works pretty well, too.


On Mon, Oct 7, 2013 at 11:39 AM, Michal TOMA wrote:

> Talend?
> http://talend.com/
> But usually all major ETL tools do work with any database including
> PostgreSQL
>
> On Monday 07 October 2013 17:54:36 Vick Khera wrote:
> > http://lmgtfy.com/?q=postgres+ETL+tools
> >
> > On Mon, Oct 7, 2013 at 11:02 AM, sunpeng  wrote:
> > > Hi, Friends, are there any ETL tools (free or commercial) available for
> > > PostgreSQL?
> > > Thanks!
> > > peng
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] [Q] Table aliasing

2013-10-04 Thread Adam Jelinek
I almost always alias my tables by default with something short (usually 1-3
characters), but not my subselects for an IN list.  In this case I would use
d1, d2, ps, and p for the different tables.  I then do my best to use the
same alias in all my queries.  I am also big on formatting the SQL; here is
how I would write what you have, for readability:


SELECT *
  FROM deal AS d1
 WHERE d1.id IN (
   SELECT DISTINCT d2.id
 FROM deal AS d2
INNER
 JOIN partner_share AS ps
   ON d2.id = ps.deal_id
INNER
 JOIN partner AS p
   ON ps.partner_id = p.id
WHERE p.team_id = 12345
  AND (ps.type = 1 AND d2.external_id IS NOT NULL
   OR ps.type = 2 AND d2.external_id IS NULL)
 )




On Fri, Oct 4, 2013 at 5:59 AM, Ladislav Lenart  wrote:

> Hello.
>
> I have a noob question about table aliases in SQL.
>
> Suppose the following query:
>
> SELECT *
> FROM deal
> WHERE
> deal.id IN (
> SELECT DISTINCT deal.id
> FROM
> deal
> JOIN partner_share ON deal.id = partner_share.deal_id
> JOIN partner ON partner_share.partner_id = partner.id
> WHERE
> partner.team_id = 12345
> AND (
> partner_share.type = 1 AND deal.external_id IS NOT NULL
> OR partner_share.type = 2 AND deal.external_id IS NULL
> )
> )
>
> As you can see, the IN (...) references the deal table too.
>
> My questions:
> * Should I alias one of the references to deal?
> * The above query works the same both with and without an alias, so is it
> simply
> a matter of taste / good practice?
> * Where can I find more info about this, i.e. when the alias is mandatory
> and
> when it is only a convenience? I've scanned through
>
>
> http://www.postgresql.org/docs/9.1/static/queries-table-expressions.html
>
> but it only mentions that subselect in FROM must have an alias. I would
> like to
> know about IN (...) and EXISTS (...).
>
>
> Thank you,
>
> L.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Using ODBC and VBA to pull data from a large object

2013-09-19 Thread Adam C Falkenberg
I'm using Excel.  I needed to set the MAXVARCHARSIZE parameter in the 
connection string to take care of my issue (MAXVARCHARSIZE=2048 for me). 
That allowed the defined size of the field to equal the actual size. 
Thanks everyone for your help!
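
(For anyone else debugging this later: a server-side sanity check, using a
placeholder loid, is to reassemble the object into a single bytea and compare
its length with what actually reaches the client:)

    SELECT octet_length(string_agg(data, ''::bytea ORDER BY pageno)) AS total_bytes
      FROM pg_largeobject
     WHERE loid = 12345;   -- placeholder OID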

Adam



From:   Vincent Veyron 
To: Adam C Falkenberg , 
Cc: pgsql-general@postgresql.org
Date:   09/19/2013 04:59 AM
Subject:Re: [GENERAL] Using ODBC and VBA to pull data from a large 
object
Sent by:pgsql-general-ow...@postgresql.org



On Tuesday 17 September 2013 at 12:25 -0400, Adam C Falkenberg wrote:
> Sorry about that.  Here's the driver information and some code. 


> Driver Name: PostgreSQL ANSI 
> Version: 9.02.01.00 
> 
> constr = "Driver={PostgreSQL ANSI}; Server=servername; Port=5432;
> Database=databasename; Uid=username; Pwd=password;" 
> With conn 
> .ConnectionString = (constr) 
> .Open 
> End With 
> 
> SQL = "SELECT data FROM pg_largeobject WHERE loid = " & id & " ORDER
> BY pageno" 
> rs.Open SQL, conn 
> 
> stream.Type = adTypeBinary 
> stream.Open 
> 
> ' Loop through the recordset and write the binary data to the stream 
> While Not rs.EOF 
> stream.Write rs.Fields("data").Value 
> rs.MoveNext 
> Wend 


You don't say where you use that recordset (Excel, Access?)

A google search will return several discussions about this problem; here
is an example :

http://social.msdn.microsoft.com/Forums/en-US/32b64a3f-3e7a-4e02-a7ef-824cacfea57a/256-char-limit-on-ado-recordset-field




-- 
Salutations, Vincent Veyron
http://gdlc.fr/logiciels
Software for managing legal disputes and insurance claim files



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Using ODBC and VBA to pull data from a large object

2013-09-17 Thread Adam C Falkenberg
Sorry about that.  Here's the driver information and some code.  Thanks.

Driver Name: PostgreSQL ANSI
Version: 9.02.01.00

constr = "Driver={PostgreSQL ANSI}; Server=servername; Port=5432; 
Database=databasename; Uid=username; Pwd=password;"
With conn
.ConnectionString = (constr)
.Open
End With

SQL = "SELECT data FROM pg_largeobject WHERE loid = " & id & " ORDER BY 
pageno"
rs.Open SQL, conn

stream.Type = adTypeBinary
stream.Open
 
' Loop through the recordset and write the binary data to the stream
While Not rs.EOF
stream.Write rs.Fields("data").Value
    rs.MoveNext
Wend

Adam



From:   Andrew Satori 
To: Adam C Falkenberg , 
Date:   09/17/2013 12:02 PM
Subject:Re: [GENERAL] Using ODBC and VBA to pull data from a large 
object



You don't say with which driver. 

ODBC can be a bit twitchy with data types, and I have seen several drivers
fail when they attempt to read the .Value; some of the drivers don't pass
through the adTypeBinary and allocate a MAX_LENGTH string of 255 for the
read buffer. I haven't tested the current driver from pg.org, but when I 
did a few months ago, it correctly handled the .Type field and allocated 
the length appropriately.

Some version information and source would make this far easier to resolve.

On Sep 17, 2013, at 11:51 AM, Adam C Falkenberg  
wrote:

> Thanks for the response.  The example you sent is what I tried to follow 
when I originally worked on this.  It works great for the first 255 bytes, 
but after that it returns 0's.  Is there any way to get all of the data in 
a large object returned to a recordset (not just the first 255 bytes)? 
Thanks again. 
> 
> Adam
> 
> 
> 
> From:Bret Stern  
> To:Adam C Falkenberg , 
> Cc:pgsql-general@postgresql.org 
> Date:09/17/2013 10:06 AM 
> Subject:Re: [GENERAL] Using ODBC and VBA to pull data from a 
large object 
> 
> 
> 
> On Tue, 2013-09-17 at 08:32 -0400, Adam C Falkenberg wrote:
> > Good Morning, 
> > 
> > I had a question about using ODBC with large objects.  When I query
> > the pg_largeobject table and return data to an ADO recordset, I can
> > only get the first 255 bytes even though the record contains 2048
> > bytes of data (all the bytes after the first 255 show as 0).  When I
> > checked the type of the record, it was a VarBinary.  Is there a way to
> > have all of the data returned to the recordset?  Thanks for any help. 
> > 
> > Adam
> 
> 
> Microsofts sample
> http://support.microsoft.com/kb/258038
> 
> 





Re: [GENERAL] Using ODBC and VBA to pull data from a large object

2013-09-17 Thread Adam C Falkenberg
Thanks for the response.  The example you sent is what I tried to follow 
when I originally worked on this.  It works great for the first 255 bytes, 
but after that it returns 0's.  Is there any way to get all of the data in 
a large object returned to a recordset (not just the first 255 bytes)? 
Thanks again.

Adam C. Falkenberg
Quality Engineer
Great Lakes Works
Phone:  (313) 749 - 3758
Cell:   (313) 910 - 3195



From:   Bret Stern 
To: Adam C Falkenberg , 
Cc: pgsql-general@postgresql.org
Date:   09/17/2013 10:06 AM
Subject:Re: [GENERAL] Using ODBC and VBA to pull data from a large 
object



On Tue, 2013-09-17 at 08:32 -0400, Adam C Falkenberg wrote:
> Good Morning, 
> 
> I had a question about using ODBC with large objects.  When I query
> the pg_largeobject table and return data to an ADO recordset, I can
> only get the first 255 bytes even though the record contains 2048
> bytes of data (all the bytes after the first 255 show as 0).  When I
> checked the type of the record, it was a VarBinary.  Is there a way to
> have all of the data returned to the recordset?  Thanks for any help. 
> 
> Adam


Microsofts sample
http://support.microsoft.com/kb/258038




[GENERAL] Using ODBC and VBA to pull data from a large object

2013-09-17 Thread Adam C Falkenberg
Good Morning,

I had a question about using ODBC with large objects.  When I query the 
pg_largeobject table and return data to an ADO recordset, I can only get 
the first 255 bytes even though the record contains 2048 bytes of data 
(all the bytes after the first 255 show as 0).  When I checked the type of 
the record, it was a VarBinary.  Is there a way to have all of the data 
returned to the recordset?  Thanks for any help.

Adam

Re: [GENERAL] Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

2013-03-24 Thread Adam Zegelin
Thanks for your assistance Tom.

On 19/03/2013, at 12:40 PM, Tom Lane  wrote:

> I think you missed my point: you should not be insisting on a maximal
> set of required outer rels.

I’ve started generating multiple paths for the web services that support it, 
which works great.

> In this particular case, it won't generate a cross-product join of l1
> and l2 because there's a heuristic that says that's unlikely to be a
> good idea.  But in related cases, there could be join order restrictions
> that *require* us not to do the joins in that order; so even if you
> could talk us out of applying that heuristic, this code is still subject
> to undesirable failures.  You really need to provide three paths using
> the three possible combinations of outer rels.


Certain web services I’m connecting to explicitly require parameters, often 
multiple ones.
I check the quals and if any required values are not present I simply raise an 
error.
So really, it is a “good idea” in my case.
Queries that require different join orders will still fail because all required 
quals are not present.

Is this heuristic a tuneable parameter, or something that would require a logic 
change inside the planner itself?

Regards,
Adam






Re: [GENERAL] Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

2013-03-18 Thread Adam Zegelin
Tom,

Thank you for your prompt reply. Your advice has pointed me in the right 
direction.

I now have the wrapper identifying columns that are inputs to the web service, 
and thus parameterisable. The eq_classes, left_join_clauses and 
right_join_clauses trees are scanned for Var exprs that match these attributes. 
If they are present, the relid is added to the required list of outer rels for 
the path -- this is done as an extension to the logic I posted previously.

In all cases this seems to work, except one. A join between 3 tables. The 
foreign table has 2 parameterised columns, each given a restriction based on 
one of the other two tables:

adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and 
foreign1.b = l2.a;
 QUERY PLAN


 Merge Join  (cost=5000704.96..5001278.44 rows=37822 width=168)
   Merge Cond: (l2.a = foreign1.b)
   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
 Sort Key: l2.a
 ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Sort  (cost=5000619.54..5000634.91 rows=6150 width=132)
 Sort Key: foreign1.b
 ->  Merge Join  (cost=5000135.26..5000232.51 rows=6150 width=132)
   Merge Cond: (foreign1.a = l1.a)
   ->  Sort  (cost=549.83..552.33 rows=1000 width=96)
 Sort Key: foreign1.a
 ->  Foreign Scan on foreign1  
(cost=500.00..500.00 rows=1000 width=96)
   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
 Sort Key: l1.a
 ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 
width=36)

My path generation logic seems to work:

baserel->cheapest_parameterized_paths = (
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 500.00
   :total_cost 500.00
   :pathkeys <>
   :fdw_private <>
   }
)

Yet the planner picks the non-parameterised path:

ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 500.00
   :total_cost 500.00
   :pathkeys <>
   :fdw_private <>
   }

I’ve tried adjusting planner tuneables to disable all join types except nested 
loop, and setting `join_collapse_limit` to 1 with no desirable outcome.

Yet, adding a restriction clause between the other two tables forces them to be 
scanned first:

adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and 
foreign1.b = l2.a and l1.b > l2.b;
   QUERY PLAN
-
 Nested Loop  (cost=0.00..2544241.17 rows=12608 width=168)
   ->  Nested Loop  (cost=0.00..22741.17 rows=504300 width=72)
 Join Filter: (l1.b > l2.b)
 ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 width=36)
 ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
   ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Foreign Scan on foreign1  (cost=0.00..0.00 rows=500 width=96)
 Filter: ((a = l1.a) AND (b = l2.a))


ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }


On 18/03/2013, at 4:09 PM, Tom Lane  wrote:

> Adam Zegelin  writes:
>> Some service endpoints have no concept of unqualified queries. In the 
>> example above, a ‘sequence scan’ of Bing is a not possible.
> 
> In that case, you shouldn't be generating such a path.  But keep in mind
> that this may lead to failure to produce any plan at all for some
> queries.  If the foreign data source is really so broken that it can't
> do that, then you have little choice ... but you shouldn't be thinking
> of that as anything but a broken design decision on their part.

I tried adding a condition that would prevent the non-parameterised path from 
being generated if the service only supported parameterised scans. Postgres 
refuses to generate a plan: "ERROR:  could not devise a query plan for the 
given query". I did a bit of digging and this error is generated by 
pathnode.c:set_cheapest. As there is no non-parameterised
`cheapest_total_path` the error is raised (line 253).

For now, I just add an expensive non-parameterised path and let the FDW throw
an error if no qual is found involving the required columns.

Regards,
Adam
  



[GENERAL] Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

2013-03-17 Thread Adam Zegelin
Hello,

I’m in the process of writing a Postgres FDW that can interface with web 
service endpoints. Certain FDW columns would act as web service parameters, 
while others would be the output.

For example:

adam=# select * from bing where query = 'xbox';
 query |   url   | description
---+-+-
 xbox  | http://www.xbox.com/en-AU/index | Xbox Australia is
   ⋮   |⋮|   ⋮

For the simple cases, extracting the quals (such as var [query] = const “xbox”) 
works perfectly.

I’d like to join FDW tables with other tables, possibly local or foreign.
ex: `select * from search_terms, bing where bing.query = search_terms.term`, 
where `search_terms` is a local table.

Without any parameterised paths, Postgres, as expected, will attempt to perform 
unfiltered foreign sequence scans followed by a hash join of the two tables. 
Some service endpoints have no concept of unqualified queries. In the example 
above, a ‘sequence scan’ of Bing is not possible.

I generate parameterised paths inside the FDW handler function 
`GetForeignPaths`. I call `create_foreignscan_path` with a set of req_outer 
relids found by scanning PlannerInfo’s eq_classes, left_join_clauses and 
right_join_clauses.

/* Collect the relids of every other rel that appears together with this
 * baserel in an equivalence class or in an outer-join clause. */
Bitmapset* outer_relids = NULL;

foreach(lc, root->eq_classes) {
    EquivalenceClass* ec = (EquivalenceClass *) lfirst(lc);
    if (ec->ec_members->length > 1)
        outer_relids = bms_union(outer_relids, ec->ec_relids);
}

foreach(lc, list_union(root->left_join_clauses, root->right_join_clauses)) {
    RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
    outer_relids = bms_union(outer_relids, ri->outer_relids);
}

/* The path's required outer rels are everything collected above except
 * the foreign rel itself. */
Bitmapset* req_outer = bms_difference(outer_relids,
                                      bms_make_singleton(baserel->relid));

foreignPath = create_foreignscan_path(root, baserel, nbrows, 0, 0, NIL,
                                      req_outer, NULL);


For simple joins this works. `BeginForeignScan` can access a list of quals with 
params, and the planner generates an appropriate nested loop join over the 
foreign table.

adam=# explain select * from search_terms, bing where bing.query = 
search_terms.term;
  QUERY PLAN
---
 Nested Loop  (cost=0.00..49.30 rows=6550 width=96)
   ->  Seq Scan on search_terms  (cost=0.00..23.10 rows=1310 width=32)
   ->  Foreign Scan on bing  (cost=0.00..0.00 rows=2 width=64)
 Filter: (search_terms.term = query)

Even a query over two foreign tables works correctly:

adam=# create foreign table ft1 (inp text, out text) server test;
adam=# create foreign table ft2 (inp text, out text) server test;

adam=# explain select * from ft1, ft2 where ft1.inp = ‘hello’ and ft2.inp = 
ft1.out;
  QUERY PLAN
--
 Nested Loop  (cost=0.00..500020.00 rows=5000 width=128)
   ->  Foreign Scan on ft1  (cost=0.00..50.00 rows=1000 width=64)
 Filter: (inp = 'hello'::text)
   ->  Foreign Scan on ft2  (cost=0.00..0.00 rows=2 width=64)
 Filter: (ft1."out" = inp)


But, on a more complex query consisting of multiple foreign tables the planner 
generates something a little less desirable:

adam=# create foreign table ft3 (inp text, out text) server test;

adam=# explain select * from ft1, ft2, ft3 where ft1.inp = 'hello' and 
ft2.inp = ft1.out and ft3.inp = ft2.out;
QUERY PLAN

--
 Nested Loop  (cost=500012.50..1000290.00 rows=25000 width=192)
   ->  Hash Join  (cost=500012.50..1000190.00 rows=5000 width=128)
 Hash Cond: (ft1."out" = ft2.inp)
 ->  Foreign Scan on ft1  (cost=0.00..50.00 rows=1000 width=64)
   Filter: (inp = 'hello'::text)
 ->  Hash  (cost=50.00..50.00 rows=1000 width=64)
   ->  Foreign Scan on ft2  (cost=0.00..50.00 rows=1000 
width=64)
   ->  Foreign Scan on ft3  (cost=0.00..0.00 rows=2 width=64)
 Filter: (ft2."out" = inp)

The high total costs are the result of my attempts to coerce the planner to 
select the parameterised paths and generate filtered foreign scans rather than 
preferring unfiltered foreign scans.

I’ve tried adjusting the query planner tuneables (enable_hashjoin, et al) and 
the path costs with some degree of success, but often the generated plans will 
filter the tables in the wrong order -- the output column of table 1 will be 
fi

Re: [GENERAL] Using PostgreSQL for NSS databases

2012-11-05 Thread Adam Tauno Williams
On Thu, 2012-11-01 at 14:28 -0400, Daniel Popowich wrote:
> I'm making this post here in hopes I may save someone from beating
> their head against the wall like I did...
> I am writing a custom Name Service Switch (NSS) module to take
> advantage of already existing account information in a pg database.
> Under certain circumstances, processes will hang due to non-recursive
> mutex locking during PG connection creation.  It goes something like
> this:
> 
> /etc/nsswitch.conf:
>   passwd:   files mypgmod
>   group:files mypgmod
> 

As an old sys-admin who has been using LDAP NSS for decades I'd
recommend you look at the design of the newer nss_ldapd/sssd scheme
[vs. the old nss_ldap scheme].  This runs a simple daemon that serves
the responses to NSS over a local socket and manages a small pool of
connections back to the DSA [or in your case the PostgreSQL server].
This really improves performance, both for the client and the server as
well as avoiding many concurrency issues and intermittent network
issues.

nscd has numerous problems of its own.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

