Re: [GENERAL] [PERFORM] Very slow inner join query Unacceptable latency.

2013-05-23 Thread Amit Kapila

On Thursday, May 23, 2013 10:51 PM fburgess wrote:
> serverdb=# set enable_hashjoin=off;
> SET
> serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join 
> SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1_.ALGORITHM='SMAT';

> QUERY PLAN
> --
> Aggregate  (cost=7765563.69..7765563.70 rows=1 width=0) 
>  Nested Loop  (cost=0.00..776.35 rows=3336 width=0)
> -> Index Scan using idx_sars_acts_run_algorithm on sars_acts_run tr1_ 
>  (cost=0.00..44.32 rows=650 width=8) 
>  Index Cond:  ((algorithm)::text = 'SMAT'::text)
> -> Index Scan using idx_sars_acts_run_id_end_time on sars_acts this_  
> (cost=0.00..11891.29 rows=4452 width=8) 
>  Index Cond:  (SARS_RUN_ID=tr1_.ID)
>(6 rows)

>serverdb=# \timing
>Timing is on.

>serverdb=# select count(*) as y0_ from SARS_ACTS this_ inner join 
>SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1_.ALGORITHM='SMAT';
> y0_
>--
>1481710
>(1 row)

> Time: 85069.416 ms < 1.4 minutes <-- not great, but much better!

> Subsequent runs complete in milliseconds once cached.

Looking at the plan from your other mail (below), where a Hash Join is 
selected, the estimated cost of the Nested Loop was much higher; that is the 
reason the optimizer would have selected the Hash Join.

serverdb=# explain analyze select count(*) as y0_ from SARS_ACTS this_ inner 
join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where 
tr1_.ALGORITHM='SMAT';
QUERY PLAN
--
Aggregate  (cost=3983424.05..3983424.06 rows=1 width=0) (actual 
time=1358298.003..1358298.004 rows=1 loops=1)
  -> Hash Join  (cost=44.93..3983415.81 rows=3297 width=0) (actual 
time=2593.768..1358041.205 rows=1481710 loops=1)


It is quite surprising that the optimizer estimated one plan (Hash Join) to be 
cheaper, yet its actual execution cost turned out to be higher. 
There might be a problem with the Hash Join cost model for some cases.

By the way, which version of PostgreSQL are you using?

> But what negative impact does disabling hash joins have?

I think using it as a temporary fix might be okay, but keeping such code in 
your application could be risky, because as the data in your tables changes, 
it is quite possible that in future the Hash Join will be the best and 
cheapest plan.
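
If you do go that route, a narrower approach (a sketch on my part, not 
something from your mail) is to disable hash joins only for the one 
transaction, so the planner setting reverts automatically at commit:

BEGIN;
SET LOCAL enable_hashjoin = off;   -- reverts when the transaction ends
SELECT count(*) AS y0_
  FROM SARS_ACTS this_
 INNER JOIN SARS_ACTS_RUN tr1_ ON this_.SARS_RUN_ID = tr1_.ID
 WHERE tr1_.ALGORITHM = 'SMAT';
COMMIT;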

Can you try reproducing it with a small data set, or else attach your schema 
and data for the tables/indexes used in the query?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] Very slow inner join query Unacceptable latency.

2013-05-21 Thread Amit Kapila
On Wednesday, May 22, 2013 3:24 AM fburgess wrote:

> The SARS_ACTS table currently has 37,115,515 rows

> we have indexed: idx_sars_acts_acts_run_id ON SARS_ACTS USING btree 
> (sars_run_id)
> we have a pk constraint on the SARS_ACTS_RUN table: sars_acts_run_pkey PRIMARY 
> KEY (id)

> serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join 
> SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1_.ALGORITHM='SMAT';
>QUERY PLAN
> --
> Aggregate  (cost=4213952.17..4213952.18 rows=1 width=0)
>  -> Hash Join  (cost=230573.06..4213943.93 rows=3296 width=0)
>   Hash Cond:  (this_.SARS_RUN_ID=tr1_.ID)
>   ->  Seq Scan on sars_acts this_  (cost=0.00..3844241.84 rows=37092284 
> width=8)
>   ->  Hash  (cost=230565.81..230565.81 rows=580 width=8)
>  -> Seq Scan on sars_acts_run tr1_  (cost=0.00..230565.81 
> rows=580 width=8)
>   Filter:  ((algorithm)::text = 'SMAT'::text)
> (7 rows)



> This query takes approximately 5.3 minutes to complete, which is very slow; 
> our users are not happy.

> I did add an index on the SARS_ACTS_RUN.ALGORITHM column, but it didn't 
> improve the run time. 
> The planner just changed the Seq Scan on the sars_acts_run table to an Index 
> Scan, improving that node's cost, 
> but the overall run time remained the same. It seems like the bottleneck 
> is the Seq Scan on the sars_acts table.

>  -> Seq Scan on sars_acts_run tr1_  (cost=0.00..230565.81 
> rows=580 width=8)
>   Filter:  ((algorithm)::text = 'SMAT'::text)

> Does anyone have suggestions about how to speed it up?

Could you please try analyzing both tables once, and then run the query to 
check which plan it uses:

ANALYZE SARS_ACTS;
ANALYZE SARS_ACTS_RUN;
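
For example (restating the query from your earlier mail), you could then 
compare the planner's estimates with the actual row counts using:

EXPLAIN ANALYZE
SELECT count(*) AS y0_
  FROM SARS_ACTS this_
 INNER JOIN SARS_ACTS_RUN tr1_ ON this_.SARS_RUN_ID = tr1_.ID
 WHERE tr1_.ALGORITHM = 'SMAT';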


With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_stat_get_last_vacuum_time(): why non-FULL?

2013-04-07 Thread Amit Kapila
On Monday, April 08, 2013 4:40 AM CR Lender wrote:
> On 2013-03-31 18:31, CR Lender wrote:
> > On 2013-03-28 20:44, Kevin Grittner wrote:
> >> CR Lender  wrote:
> >>> I've read the manual more carefully now, and I can't see any
> mention of
> >>> what VACUUM does that VACUUM FULL does not. The point about extreme
> >>> maintenance is taken, but from what I read, VACUUM FULL should include
> >>> everything a normal VACUUM does.
> >>
> >> Prior to release 9.0 that is probably true.
> >
> > Hm, I can't find it, even in the manual for 9.2.
> > http://www.postgresql.org/docs/current/static/sql-vacuum.html
> >
> > If VACUUM FULL is just a more aggressive VACUUM (including writing new
> > data files), then I don't understand the "non-FULL" restriction in
> > pg_stat_get_last_vacuum_time()... unless that information is somehow
> > lost when table files are rewritten.
> 
> I don't mean to be pushy, but I have a meeting with the admin of that
> database tomorrow, and it would be nice if I had something concrete to
> tell him. I still don't know what it is that VACUUM does but VACUUM FULL
> doesn't do. There's nothing in the manual about that.

One important difference is that while VACUUM FULL is operating on a relation, 
no other operations are allowed on that relation. Most admins care about this 
point, because they don't want to halt operations for background garbage 
collection.
VACUUM FULL is only done in rare cases, when the relation has grown much 
larger than its actual contents.
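
To make the difference concrete (a sketch, with a hypothetical table name):

VACUUM my_table;        -- reclaims dead space for reuse; concurrent reads
                        -- and writes on my_table are still allowed
VACUUM FULL my_table;   -- rewrites the table into new files; takes an
                        -- ACCESS EXCLUSIVE lock, blocking all other access
                        -- to my_table until it finishes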


With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Questions: How to configure PostgreSQL at the beginning ....

2013-02-19 Thread Amit Kapila
On Tuesday, February 19, 2013 4:37 PM Tomas Pasterak wrote:

> I have downloaded and installed the latest PostgreSQL version 9.2 on my
> Windows 7 PC.
> I want to have it running on my PC, as local host.

> Now I am facing some problems.

> 1.) I do not know how to fill in the properties tab for the server: as
> name and host (what shall the host be, shall I put there localhost or
> 127.0.0.1?), the service field, the Maintenance DB field (can it be named
> postgres?)? And what shall I put as user name? And password? Is there
> something predefined? Or how and where shall I change it? In which conf
> file? It does not accept what I add there ...

> I want to have it as simple as possible; I want to connect my PHP scripts
> with the PostgreSQL DB on my PC.


> 2.) Via the psql console, I can not log in ... I want to create
> tables etc., but I need to get past the connection and
> login procedure, and again I do not know what to put there, as it does not
> accept what I enter there ...

> Server [localhost]: What to put here?
> Database [postgres]: What to put here?
> Port [5432]: What to put here?
> Username [postgres]: What to put here?
> Password: What to put here?

> The message returned is for example this: 
> psql: could not translate host name "Tomas" to address: Unknown server
> error
> Press any key to continue ...

Have you tried psql.exe -d postgres?
If your server was started with the default configuration, it should work.

> 3.) There are also some .conf files, like postgresql.conf, pg_hba.conf,
> pgpass.conf. What shall I put there?
> I played with that, but it does not work, I want to have it simple.

You don't need to change anything there unless you want something specific. 
After initdb, the server should start with:

pg_ctl.exe start -D data_dir

where data_dir is the path where you have created your database cluster.
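
Once the server is running, a minimal connection (assuming the default port 
and the postgres superuser created by the installer) looks like:

psql.exe -h localhost -p 5432 -U postgres -d postgres

The password is whatever you entered for the postgres user during 
installation.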

With Regards,
Amit Kapila.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] bug, bad memory, or bad disk?

2013-02-15 Thread Amit Kapila


> -Original Message-
> From: Merlin Moncure [mailto:mmonc...@gmail.com]
> Sent: Friday, February 15, 2013 7:56 PM
> To: Amit Kapila
> Cc: Ben Chobot; PostgreSQL General
> Subject: Re: [GENERAL] bug, bad memory, or bad disk?
> 
> On Fri, Feb 15, 2013 at 8:08 AM, Amit Kapila 
> wrote:
> > On Friday, February 15, 2013 1:33 AM Ben Chobot wrote:
> >
> >> 2013-02-13T23:13:18.042875+00:00 pgdb18-vpc postgres[20555]: [76-1]
> >  ERROR:  invalid memory alloc request size
> >> 1968078400 
> >> 2013-02-13T23:13:18.956173+00:00 pgdb18-vpc postgres[23880]: [58-1]
> >  ERROR:  invalid page header in block 2948 of
> >> relation pg_tblspc/16435/PG_9.1_201105231/188417/56951641
> >> 2013-02-13T23:13:19.025971+00:00 pgdb18-vpc postgres[25027]: [36-1]
> >  ERROR:  could not open file
> >> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
> > 3936767042): No such file or directory
> >> 2013-02-13T23:13:19.847422+00:00 pgdb18-vpc postgres[28333]: [8-1]
> ERROR:
> >  could not open file
> >> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
> > 3936767042): No such file or directory
> >> 2013-02-13T23:13:19.913595+00:00 pgdb18-vpc postgres[28894]: [8-1]
> ERROR:
> >  could not open file
> >> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
> > 3936767042): No such file or directory
> >> 2013-02-13T23:13:20.043527+00:00 pgdb18-vpc postgres[20917]: [72-1]
> >  ERROR:  invalid memory alloc request size
> >> 1968078400
> >> 2013-02-13T23:13:21.548259+00:00 pgdb18-vpc postgres[23318]: [54-1]
> >  ERROR:  could not open file
> >> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
> > 3936767042): No such file or directory
> >> 2013-02-13T23:13:28.405529+00:00 pgdb18-vpc postgres[28055]: [12-1]
> >  ERROR:  invalid page header in block 38887 of
> >> relation pg_tblspc/16435/PG_9.1_201105231/188417/58206627
> >> 2013-02-13T23:13:29.199447+00:00 pgdb18-vpc postgres[25513]: [46-1]
> >  ERROR:  invalid page header in block 2368 of
> >> relation pg_tblspc/16435/PG_9.1_201105231/188417/60418945
> >
> >> There didn't seem to be much correlation to which files were
> affected, and
> > this was a critical server, so once we
> >> realized a simple reindex wasn't going to solve things, we shut it
> down
> > and brought up a slave as the new master db.
> >
> >> While that seemed to fix these issues, we soon noticed problems with
> > missing clog files. The missing clogs were outside > the range of the
> > existing clogs, so we tried using dummy clog files. It didn't help,
> and
> > running pg_check we found that > one block of one table was
> definitely
> > corrupt. Worse, that corruption had spread to all our replicas.
> >
> > Can you check that corrupted block is from one of the relations
> mentioned in
> > your errors. This is just to reconfirm.
> >
> >> I know this is a little sparse on details, but my questions are:
> >
> >> 1. What kind of fault should I be looking to fix? Because it spread
> to all
> > the replicas, both those that stream and
> >> those that replicate by replaying wals in the wal archive, I assume
> it's
> > not a storage issue. (My understanding is that > streaming replicas
> stream
> > their changes from memory, not from wals.)
> >
> >   Streaming replication stream their changes from wals.
> 
> Yeah.  This smells like disk corruption to me, but it really could be
> anything.  Unfortunately it can spread to the replicas especially if
> you're not timely about taking the master down.  page checksums (a
> proposed feature) are a way of dealing with this problem.

Yes, that can be one of the reasons. 

One more thing I observed: the block number it is trying to access in the file
is very large (see the log message below). Is it really feasible to have such a
large block number?

2013-02-13T23:13:19.025971+00:00 pgdb18-vpc postgres[25027]: [36-1]  ERROR:
could not open file "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1"
(target block 3936767042): No such file or directory
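
As a rough sanity check (my own back-of-the-envelope arithmetic, assuming the 
default 8 KB block size), a target block of 3936767042 would imply a relation 
of roughly 29 TB:

SELECT pg_size_pretty(3936767042::bigint * 8192) AS implied_size;
-- => about 29 TB, far beyond any plausible relation here, which suggests
-- a garbage block reference rather than a real block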

Also, regarding the first error message:
"invalid memory alloc request size 1968078400"
why, and from which operation, would such a large memory request come?

I am not sure whether there is any chance of memory corruption here.

> The biggest issue is the missing clog files -- did you have more than
> one replica? Were they missing on all of them?

Couldn't it be possible that there are fewer clog files on a replica? I am not
sure whether that can happen in normal scenarios.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] bug, bad memory, or bad disk?

2013-02-15 Thread Amit Kapila
On Friday, February 15, 2013 1:33 AM Ben Chobot wrote:

> 2013-02-13T23:13:18.042875+00:00 pgdb18-vpc postgres[20555]: [76-1]
 ERROR:  invalid memory alloc request size 
> 1968078400
> 2013-02-13T23:13:18.956173+00:00 pgdb18-vpc postgres[23880]: [58-1]
 ERROR:  invalid page header in block 2948 of 
> relation pg_tblspc/16435/PG_9.1_201105231/188417/56951641
> 2013-02-13T23:13:19.025971+00:00 pgdb18-vpc postgres[25027]: [36-1]
 ERROR:  could not open file 
> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
3936767042): No such file or directory
> 2013-02-13T23:13:19.847422+00:00 pgdb18-vpc postgres[28333]: [8-1]  ERROR:
 could not open file 
> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
3936767042): No such file or directory
> 2013-02-13T23:13:19.913595+00:00 pgdb18-vpc postgres[28894]: [8-1]  ERROR:
 could not open file 
> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
3936767042): No such file or directory
> 2013-02-13T23:13:20.043527+00:00 pgdb18-vpc postgres[20917]: [72-1]
 ERROR:  invalid memory alloc request size 
> 1968078400
> 2013-02-13T23:13:21.548259+00:00 pgdb18-vpc postgres[23318]: [54-1]
 ERROR:  could not open file 
> "pg_tblspc/16435/PG_9.1_201105231/188417/58206627.1" (target block
3936767042): No such file or directory
> 2013-02-13T23:13:28.405529+00:00 pgdb18-vpc postgres[28055]: [12-1]
 ERROR:  invalid page header in block 38887 of 
> relation pg_tblspc/16435/PG_9.1_201105231/188417/58206627
> 2013-02-13T23:13:29.199447+00:00 pgdb18-vpc postgres[25513]: [46-1]
 ERROR:  invalid page header in block 2368 of 
> relation pg_tblspc/16435/PG_9.1_201105231/188417/60418945

> There didn't seem to be much correlation to which files were affected, and
> this was a critical server, so once we realized a simple reindex wasn't
> going to solve things, we shut it down and brought up a slave as the new
> master db.

> While that seemed to fix these issues, we soon noticed problems with
> missing clog files. The missing clogs were outside the range of the
> existing clogs, so we tried using dummy clog files. It didn't help, and
> running pg_check we found that one block of one table was definitely
> corrupt. Worse, that corruption had spread to all our replicas.

Can you check that corrupted block is from one of the relations mentioned in
your errors. This is just to reconfirm.

> I know this is a little sparse on details, but my questions are:

> 1. What kind of fault should I be looking to fix? Because it spread to all
> the replicas, both those that stream and those that replicate by replaying
> wals in the wal archive, I assume it's not a storage issue. (My
> understanding is that streaming replicas stream their changes from memory,
> not from wals.)

  Streaming replication streams its changes from WALs.

> 2. Is it possible that the corruption that was on the master got
> replicated to the slaves when I tried to cleanly shut down the master
> before bringing up a new slave as the new master and switching the other
> slaves over to replicating from that?

At shutdown, the master will send all WAL (up to the shutdown checkpoint).

I think there are 2 issues in your mail:
1. Access to corrupted blocks - there are 2 things here: how the blocks got
corrupted on the master, and why the corruption was replicated to the other
servers.

The corrupted blocks can be replicated through the WAL, because the WAL
contains backup copies of blocks when full_page_writes=on, which is the
default configuration.
So I think the main remaining question is how the block(s) got corrupted on
the master. For that, some more information is required, such as what kind of
operations are being done on the relation that has the corrupted block.
If you drop and recreate that relation, does the problem remain?
Is there any chance that the block got corrupted due to a hardware problem?

2. Missing clog files - how did you find the missing clog files? Did some
operation fail, or is it just an observation?
Do you see any problems in the system due to it?

With Regards,
Amit Kapila.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Join query query

2013-02-14 Thread Amit Kapila
On Thursday, February 14, 2013 4:31 AM Andrew Taylor wrote:
> Hi,

> I'd like to do something which I think should be quite easy - that is join
2 tables and create a new table.

> Table A postcode_input has columns which include postcode, eastings,
northings. there are 1,687,605 rows.
> Table B bng_lat_long has columns lat, lon, e, n. There are 1,687,605 rows.

> eastings = e and northings = n so there should be a 1 to 1 match. The
eastings northings pair should be unique in 
> aggregate.

I think the mapping is actually m to n (many-to-many), not 1 to 1.

>So I tried doing this:

> SELECT A.postcode, A.eastings, A.northings, B.lat, B.lon INTO
postcode_lat_long
> FROM postcode_input AS A
> LEFT JOIN bng_lat_long AS B On A.eastings = B.e AND A.northings = B.n

> And ended up with a table 13,708,233 rows long with what looks like plenty
of duplicated rows. Some but not all are 
> duplicated. What can I do to sort this out? 

What is your exact expectation of the data in postcode_lat_long?

From the above, it seems you want distinct rows that match between
postcode_input and bng_lat_long; where there is no match, take the values
from postcode_input and NULLs for the bng_lat_long columns.

If I am right, then you can try using the DISTINCT operator:
http://www.postgresql.org/docs/9.2/static/sql-select.html#SQL-DISTINCT
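
For example (a sketch based on your query above; whether DISTINCT is the right
fix depends on whether the duplicates are truly identical rows):

SELECT DISTINCT A.postcode, A.eastings, A.northings, B.lat, B.lon
INTO postcode_lat_long
FROM postcode_input AS A
LEFT JOIN bng_lat_long AS B ON A.eastings = B.e AND A.northings = B.n;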

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [BUGS] BUG #7850: left outer join is not working or I didn't contruct the query properly

2013-02-05 Thread Amit Kapila
Tuesday, February 05, 2013 12:40 PM nvardar wrote:
> I am trying to run below query;
> 
> SELECT
> tbl.id,
> tbl.day,
> tbl.week,
> tbl.appfailure,
> tbl.appname,
> tbl.deviceid,
> ext6.DIM_NAME,
> ext7.DIM_NAME,
> ext8.DIM_NAME,
> ext9.DIM_NAME,
> tbl.starttime
> FROM
> APPSTARTEVT tbl
> left outer join DIM_deviceMakeModel ext6 on
> (ext6.DIM_ID=tbl.devicemakemodel)
> left outer join DIM_devicePlatformType ext7 on
> (ext7.DIM_ID=tbl.deviceplatformtype)
> left outer join DIM_deviceVersion ext8 on
> (ext8.DIM_ID=tbl.deviceversion)
> left outer join DIM_operator ext9 on (ext9.DIM_ID=tbl.operator)
> ORDER BY
> tbl.id ASC
> 
> throws;
> 
> >[Error] Script lines: 1-20 -
>  ERROR: Node 2 has aborted execution, cause is:
> com.edb.gridsql.exception.XDBServerException :
> java.lang.NullPointerException
> QUERY: SELECT "TMPTT6624_1"."deviceplatformtype" AS
> "deviceplatformtype","TMPTT6624_1"."XSERIALID0" AS
> "XSERIALID0","TMPTT6624_1"."XONODEID0" AS "XONODEID0" FROM
> "TMPTT6624_1"
> LEFT OUTER JOIN "dim_deviceplatformtype" "ext7" ON ("ext7"."dim_id" =
> "TMPTT6624_1"."deviceplatformtype")  WHERE "ext7"."dim_id" IS NULL
>  Line: 1 
> 
>  [Executed: 2/4/13 11:07:08 PM PST ] [Execution: 0/ms]
> 
> 
> exception.
> 
> Is there anything I am missing?

Why is the query in the exception different from the one you ran?
Can you try with:
SELECT
 tbl.id,
 tbl.day,
 tbl.week,
 tbl.appfailure,
 tbl.appname,
 tbl.deviceid,
 ext6.DIM_NAME,
 tbl.starttime
 FROM
 APPSTARTEVT tbl
 left outer join DIM_deviceMakeModel ext6 on
(ext6.DIM_ID=tbl.devicemakemodel)
ORDER BY   tbl.id ASC

Also, as Pavel has pointed out, please send this to the General mailing list. 

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Passing dynamic parameters to a table-returning function

2013-02-05 Thread Amit Kapila
On Tuesday, February 05, 2013 2:22 AM Moshe Jacobson wrote:
> Hi all,


> I'm having trouble because I can't figure out how to select the primary
> key from that table, as well as all the columns from the function's
> output. I'm unsure of the syntax for passing in values from columns in the
> database as the parameters of a set-returning function from which I want
> to select *.

Can you try the RETURN NEXT syntax? For example, refer to the link below:
http://www.postgresql.org/docs/9.2/static/plpgsql-control-structures.html#PLPGSQL-STATEMENTS-RETURNING
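
A minimal sketch of the idea, with hypothetical table/function names (my_table
with primary key id, and my_set_func returning a set of text values):

CREATE OR REPLACE FUNCTION my_rows_with_key()
RETURNS TABLE (pk integer, out_val text) AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT t.id, t.arg FROM my_table t LOOP
        FOR out_val IN SELECT * FROM my_set_func(r.arg) LOOP
            pk := r.id;
            RETURN NEXT;  -- emits one (pk, out_val) row
        END LOOP;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM my_rows_with_key();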

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] reducing number of ANDs speeds up query

2013-01-11 Thread Amit kapila

On Saturday, January 12, 2013 7:17 AM T. E. Lawrence wrote:
> Hello,

> I have a pretty standard query with two tables:

> SELECT table_a.id FROM table_a a, table_b b WHERE ... AND ... AND b.value=...;

> With the last "AND b.value=..." the query is extremely slow (did not wait for 
> it to end, but more than a minute), because the value column is not indexed 
> (contains items longer than 8K).

> However the previous conditions "WHERE ... AND ... AND" should have already 
> reduced the candidate rows to just a few (table_b contains over 50m rows). 
> And indeed, removing the last "AND b.value=..." speeds the query to just a 
> millisecond.

> Is there a way to instruct PostgreSQL to do first the initial "WHERE ... AND 
> ... AND" and then the last "AND b.value=..." on the (very small) result?

You can try once with the query below:
Select * from (SELECT a.id,b.value FROM table_a a, table_b b WHERE ... AND ... 
) X where X.value=...;
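
One caveat (my own note, not from the original mail): the planner may simply
flatten such a subquery back into the original form. If that happens, adding
OFFSET 0 to the subquery acts as an optimization fence, forcing the inner
conditions (the "..." stand for your original conditions) to be evaluated
first:

Select * from (SELECT a.id, b.value FROM table_a a, table_b b
               WHERE ... AND ... OFFSET 0) X where X.value=...;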

If this doesn't work, can you send the EXPLAIN output for both queries (the 
query you are using and the query I have suggested)?


With Regards,
Amit Kapila.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Counterintuitive locking behavior

2013-01-05 Thread Amit kapila

On Sunday, January 06, 2013 11:10 AM Amit kapila wrote:
On Sunday, January 06, 2013 7:48 AM Chris Travers wrote:




>> Is there a reason why we don't do locking this way?  (i.e. where on UPDATE 
>> foo, all rows selected from foo during the
>> update are locked unless the subselect specifically states otherwise.)

>The reason for this behavior is that if it locked all rows during the select, 
>it could happen that the UPDATE does not actually touch a row but the row is 
>still locked.
> For example

> UPDATE foo SET test = 3 WHERE test in (select test from foo where test = 1) 
> and FALSE;

> Now in this case, if it locked the rows during the subselect, the rows would 
> be locked for the whole transaction
> irrespective of the fact that they will not be updated.

By "and FALSE" in the above example, I mean any other subquery or condition 
that causes the overall predicate to return no rows.
Similarly, there can be many more scenarios where only some of the rows 
selected by one of the conditions are actual candidates for the UPDATE.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Counterintuitive locking behavior

2013-01-05 Thread Amit kapila
On Sunday, January 06, 2013 7:48 AM Chris Travers wrote:

> I recently discovered that subselects in update statements don't assume that 
> the select is for update of the updating table.


> For example, if I do this:


> CREATE TABLE foo (
>   test int primary key
> );


> INSERT INTO foo VALUES (1);


> then in one session:


> BEGIN;
> UPDATE foo SET test = 2 WHERE test in (select test from foo where test = 1);


> and then in the other session


> BEGIN;
> UPDATE foo SET test = 3 WHERE test in (select test from foo where test = 1);

The behavior will be the same even for: UPDATE foo SET test = 3 WHERE test = 1;



> Is there a reason why we don't do locking this way?  (i.e. where on UPDATE 
> foo, all rows selected from foo during the 
> update are locked unless the subselect specifically states otherwise.)

The reason for this behavior is that if it locked all rows during the select, it 
could happen that the UPDATE does not actually touch a row but the row is still locked.
For example 

UPDATE foo SET test = 3 WHERE test in (select test from foo where test = 1) and 
FALSE;

Now in this case, if it locked the rows during the subselect, the rows would be 
locked for the whole transaction
irrespective of the fact that they will not be updated.
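
If the opposite behavior is wanted, the locking can be requested explicitly in
the subselect (a sketch; FOR UPDATE in a sub-SELECT locks the rows as they are
read):

UPDATE foo SET test = 3
WHERE test IN (SELECT test FROM foo WHERE test = 1 FOR UPDATE);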

With Regards,
Amit Kapila.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unable to reload postgresql.conf without restarting

2013-01-04 Thread Amit Kapila
On Thursday, January 03, 2013 9:56 PM Jose Martinez wrote:
>Hi,

>I made some changes to postgresql.conf and I want them to take effect
without having to restart the server.

>I tried 

>select pg_reload_conf();
>/usr/pgsql-9.1/bin/pg_ctl reload

>but when I do 'show all', I see no changes take effect. 

>There settings I tried to change are:
> -effective_cache_size 
> -work_mem

I tried this on the latest HEAD code (Windows OS) and it is working: after 
SELECT pg_reload_conf(), the changed value is shown.
Can you try just the SET command and verify the behavior? These parameters 
are USERSET parameters, 
so there is no need to change the config file and reload or restart.
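
For example (the values here are only illustrative):

SET work_mem = '64MB';
SHOW work_mem;
SET effective_cache_size = '4GB';
SHOW effective_cache_size;

Note that SET only affects the current session; changing the config file and 
reloading is still the way to change the defaults for new sessions.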

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cursor fetch Problem.

2012-12-30 Thread Amit kapila

On Friday, December 28, 2012 10:58 AM Harry wrote:
> Hi,
> Kindly see the statements attached below related to the cursor fetch issue; it's
> still residing as a process.

>500  20222 31036 79 Dec27 ?16:22:31 postgres: user1 sampledb 
>192.168.0.40[36022] FETCH
>500  20829 31036 81 Dec27 ?16:18:48 postgres: user1 sampledb 
>192.168.0.40[57591] FETCH
>500  20867 31036 81 Dec27 ?16:09:33 postgres: user1 sampledb 
>192.168.0.40[45316] FETCH
>500  20870 31036 81 Dec27 ?16:09:12 postgres: user1 sampledb 
>192.168.0.40[45343] FETCH


I am not sure whether FETCH has hung for some reason, or whether some other 
problem is preventing cancel/terminate backend from working.
Can you provide a stack trace of the hung backends? Also, please let me know 
your PostgreSQL version and OS.

Just as a test, can you connect a new backend with psql, verify its entry in 
pg_stat_activity, call the terminate API for that session, and then check 
whether the entry has gone from pg_stat_activity?
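
A sketch of that test (the pid value is hypothetical; in pre-9.2 releases the 
columns are named procpid/current_query instead of pid/query):

SELECT pid, query FROM pg_stat_activity;            -- note the new session's pid
SELECT pg_terminate_backend(12345);                 -- should return true
SELECT pid FROM pg_stat_activity WHERE pid = 12345; -- should now return no rows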

With Regards,
Amit Kapila.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cursor fetch Problem.

2012-12-28 Thread Amit Kapila
On Thursday, December 27, 2012 2:44 PM Harry wrote:
> Below is the Linux ps -ef | grep postgres output :-
> 
> 501  12163  5473  0 Dec19 ?00:00:00 postgres: enterprisedb
> sampledb 192.168.0.231[53991] ?EDB-SPL Procedure successfully completed
> 501  12167  5473  0 Dec19 ?00:00:00 postgres: enterprisedb
> sampledb 192.168.0.231[53995] ?EDB-SPL Procedure successfully
> completed.
> 
> Also, if i try to kill from OS the whole database gets shut down.

Have you checked the server logs for specific messages like the ones below, or 
any other message, after you call the cancel/terminate
backend API?

FATAL:  terminating connection due to administrator command
ERROR:  canceling statement due to user request

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cursor fetch Problem.

2012-12-27 Thread Amit Kapila
On Thursday, December 27, 2012 11:51 AM Harry wrote:
> Hi Amit,
> Thanks for Reply.
> Kindly see my below output.
> 
> Also, tried to Kill it Firstly by using Cancel Backend and then
> Terminate
> Backend output showing "True" but still remaining as a process (i.e. in
> pg_stat_activity).

Can you check the server log and see whether either of the statements below
appears in it:

FATAL:  terminating connection due to administrator command
ERROR:  canceling statement due to user request

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cursor fetch Problem.

2012-12-27 Thread Amit Kapila
On Thursday, December 27, 2012 11:51 AM Harry wrote:
> Hi Amit,
> Thanks for Reply.
> Kindly see my below output.
> 16650;"sampledb";11965;10;"enterprisedb";"";"192.168.0.231";"";53897;"*
> 2012-12-19
> 11:39:48.234799+05:30";"2012-12-19 11:39:53.288441+05:30";"2012-12-19
> 11:39:53.288441+05:30*";f;"DECLARE
> BEGIN
> EXEC
> 16650;"sampledb";12156;10;"enterprisedb";"";"192.168.0.231";"";53983;*"
> 2012-12-19
> 12:18:38.57709+05:30";"2012-12-19 12:18:43.922301+05:30";"2012-12-19
> 12:18:43.922301+05:30"*;f;"DECLARE
> BEGIN
> EXEC
> 16650;"sampledb";13243;10;"enterprisedb";"Postgres Studio -
> Browser";"192.168.0.180";"";3907;"2012-12-26
> 16:35:45.753172+05:30";"";"2012-12-26 16:35:46.577723+05:30";f;""

The above shows that the first two sessions have been running for the last few 
days. 
I am interested to know the transaction state of the first 2 sessions.
In the current version that information is part of pg_stat_activity, but I
don't know how to get it in the version you are using.
If possible, get this information. If you are using a Linux system, then
try ps ax | grep postgres and show the output.

 
> Also, tried to Kill it Firstly by using Cancel Backend and then
> Terminate
> Backend output showing "True" but still remaining as a process (i.e. in
> pg_stat_activity).

Are you aware whether such a long query is actually running in the first 2
sessions?
If you are not interested in the first 2 sessions, you can even use the OS kill
command.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Cursor fetch Problem.

2012-12-26 Thread Amit Kapila
On Wednesday, December 26, 2012 5:12 PM Harry wrote:
> Hi All,
> 
> I am facing a problem: in the database, while checking pg_stat_activity, I
> found a statement that has been residing as a process
> for a couple of days,

How have you concluded that it stays for a couple of days? 
pg_stat_activity shows the last statement executed in a backend. What is the
value of 'state' for that backend? 
It would be better if you could send the pg_stat_activity output for that backend.

> not able to kill them through the pg_terminate_backend function.
Can you try pg_cancel_backend(pid) once, and then pg_terminate_backend(pid)?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] In one of negative test row-level trigger results into loop

2012-09-25 Thread Amit Kapila
On Wednesday, September 26, 2012 9:58 AM Alban Hertroys wrote:
> > But some other databases like Oracle handles the scenario reported
> but not
> > loop.
> > To handle for After triggers, there is mutation table concept in
> Oracle due
> > to which it errors out
> > and for Before triggers, it errors out with "maximum number of
> recursive SQL
> > levels(50) exceeded".
> 
> 
> Oracle uses some arbitrary number to prevent you from looping (50
> apparently). A limit I've run into for perfectly valid situations.
> Thank you for preventing me from doing my job, Oracle.
> 
> Both databases have an upper limit. If you reach that limit with
> Postgres, you made a programming error that is easy to catch in
> development (before it reaches production). With Oracle, not so much.

As a developer, the PostgreSQL behavior is perfectly valid for me. 

But the users claim that for their scenarios that kind of limit (50) is valid
and they get a proper error. Also, Oracle doesn't raise an error merely because
someone calls recursive functions; it applies to recursive SQL. For example:

insert into tbl values(func(..));

Now func in turn contains an insert/update/delete/select which calls func or
some other function; only if it reaches depth level 50 in such a case does it
throw the error.

Also, I am not very sure we can always catch such an error in the development
environment before it reaches production, because if such SQL statements are
inside some application logic that was missed during testing, the problem
might reach the production stage.

With Regards,
Amit Kapila.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] In one of negative test row-level trigger results into loop

2012-09-25 Thread Amit Kapila
  On Monday, September 24, 2012 8:19 PM Tom Lane wrote:
> Amit Kapila  writes:
> > Below test results into Loop:
> 
> > [ AFTER INSERT trigger does another insert into its target table ]
> 
> Well, of course.  The INSERT results in scheduling another AFTER event.
> 
> > I understand that user can change his code to make it proper.
> 
> > However shouldn't PostgreSQL also throw errors in such cases
> for recursion
> > level or something related?
> 
> No.  In the first place, there is no recursion here: the triggers fire
> sequentially, not in a nested way.  In the second place, this sort of
> thing is not necessarily wrong --- it's okay for a trigger to do
> something like that, so long as it doesn't repeat it indefinitely.
 
  But in the current case it will repeat until the max stack depth is reached.

> (A human can see that this function will never stop adding rows, but
> Postgres' trigger mechanism doesn't have that much insight.)  In the
> third place, we don't attempt to prevent queries from taking
> unreasonable amounts of time, and a loop in a trigger is not very
> different from anything else in that line.  Use statement_timeout if
> you're concerned about that type of mistake.

I agree with you that such scenarios can be compared with a loop in a
trigger. 
But some other databases, like Oracle, handle the reported scenario without
looping.
For AFTER triggers, Oracle has the mutating-table concept, due to which it
errors out,
and for BEFORE triggers, it errors out with "maximum number of recursive SQL
levels (50) exceeded".
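
For completeness (my own sketch, not part of the original discussion): in
PostgreSQL 9.2 and later, such re-entrant firing can be suppressed with a WHEN
clause using pg_trigger_depth(), reusing the names from the test case in this
thread:

CREATE TRIGGER TEST_TRIGGER
AFTER INSERT OR UPDATE OR DELETE ON TEST_TABLE
FOR EACH ROW
WHEN (pg_trigger_depth() < 1)   -- skip events raised from inside a trigger
EXECUTE PROCEDURE TRIG_FUNC();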


With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] In one of negative test row-level trigger results into loop

2012-09-24 Thread Amit Kapila
> On Monday, September 24, 2012 7:44 PM Adrian Klaver
> On 09/24/2012 07:03 AM, Amit Kapila wrote:
> > Below test results into Loop:
> >
> > 1.create test table
> >
> > CREATE TABLE TEST_TABLE (NAME VARCHAR2, AGE INT);
> >
> > 2.create trigger function
> >
> > CREATE OR REPLACE FUNCTION TRIG_FUNC () RETURNS TRIGGER AS
> >
> > $$
> >
> > DECLARE
> >
> > PSQL VARCHAR2;
> >
> > BEGIN
> >
> > Raise info 'This is Test!!!';
> >
> > psql:= 'INSERT INTO TEST_TABLE VALUES(''john'', 25);';
> >
> > execute psql;
> >
> > RETURN NEW;
> >
> > END;
> >
> > $$ LANGUAGE plpgsql;
> >
> > 3.create trigger
> >
> > CREATE TRIGGER TEST_TRIGGER AFTER INSERT OR UPDATE OR DELETE ON
> > TEST_TABLE FOR EACH ROW
> >
> > EXECUTE PROCEDURE TRIG_FUNC ();
> >
> > 4.Perform an insert statement
> >
> > INSERT INTO TEST_TABLE VALUES('jack',25);
> >
> > Now, You will see an always loop.
> >
> > I understand that user can change his code to make it proper.
> >
> > However shouldn't PostgreSQL also throw errors in such cases for
> > recursion level or something related?
> 
> What database are you running this on?
>  I get :
> test=> CREATE TABLE TEST_TABLE (NAME VARCHAR2, AGE INT);
> ERROR:  type "varchar2" does not exist
> LINE 1: CREATE TABLE TEST_TABLE (NAME VARCHAR2, AGE INT);
> 

I am sorry; actually I was trying to compare the behavior with Oracle, so I
used it wrongly.
In the CREATE TABLE statement, change VARCHAR2 to VARCHAR(30),
and in the trigger function, change VARCHAR2 to VARCHAR(200).

With Regards,
Amit Kapila.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] In one of negative test row-level trigger results into loop

2012-09-24 Thread Amit Kapila
The test below results in a loop:



1. Create the test table:

CREATE TABLE TEST_TABLE (NAME VARCHAR2, AGE INT);

2. Create the trigger function:

CREATE OR REPLACE FUNCTION TRIG_FUNC () RETURNS TRIGGER AS
$$
DECLARE
PSQL VARCHAR2;
BEGIN
Raise info 'This is Test!!!';
psql:= 'INSERT INTO TEST_TABLE VALUES(''john'', 25);';
execute psql;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;

3. Create the trigger:

CREATE TRIGGER TEST_TRIGGER AFTER INSERT OR UPDATE OR DELETE ON TEST_TABLE
FOR EACH ROW
EXECUTE PROCEDURE TRIG_FUNC ();

4. Perform an insert statement:

INSERT INTO TEST_TABLE VALUES('jack',25);

Now you will see an endless loop.

I understand that the user can change his code to make it proper.

However, shouldn't PostgreSQL also throw an error in such cases, for a
recursion level or something related?



With Regards,

Amit Kapila.





[GENERAL] Behavior difference for walsender and walreceiver for n/w breakdown case

2012-09-10 Thread Amit Kapila
I have observed that currently, in case of a network break between master and
standby, the walsender process gets terminated immediately; however, the 
walreceiver detects the breakage only after a long time. 


I could see that there is a replication_timeout configuration parameter;
the walsender checks replication_timeout and exits after that timeout.



Shouldn't there be a mechanism for the walreceiver as well, so that it can
detect network failure sooner?


Basic steps to observe the above behavior: 
1. Master and standby machines are connected normally. 
2. Bring the network card of master and standby down with: ifconfig <interface> down

Observation: 
the master detects the connection break, but the standby cannot, and it shows
a connected channel for a long time. 
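
One possible mitigation (my assumption, not something I have verified here):
libpq TCP keepalive settings can be added to the standby's primary_conninfo in
recovery.conf, so the walreceiver's connection fails faster, e.g.:

primary_conninfo = 'host=master_host port=5432 keepalives=1 keepalives_idle=30 keepalives_interval=10 keepalives_count=3'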



Note - I sent this to the Hackers list earlier as well; I just wanted to know
whether this is the behavior as defined by PostgreSQL, or a bug, or a new
feature in itself.

  In case it is not clear, I will raise a bug.

  


With Regards, 
Amit Kapila