Re: [SQL] 'locking' the SELECTs based on indices...

2006-02-23 Thread Mario Splivalo
On Wed, 2006-02-22 at 13:58 -0500, Tom Lane wrote:
> Mario Splivalo <[EMAIL PROTECTED]> writes:
> > Now, when I do this from one connection (psql shell, for instance):
> 
> > [A]BEGIN TRANSACTION;
> > [A]SELECT * FROM bla WHERE code_id = 1 FOR UPDATE;
> 
> > and then, from another psql i do:
> > [B]SELECT * FROM bla WHERE code_id = 1 FOR UPDATE
> 
> > the second SELECT will wait untill I rollback or commit first
> > transaction. That is cool. 
> 
> > But, if I do second SELECT like this:
> 
> > [C]SELECT * FROM bla WHERE code_id = 2 FOR UPDATE
> 
> > I will get the rows.
> 
> Well, of course.  Why would you want something different?  Why do you
> think the table's indexes should have anything to do with it?
> 
> If you want a full-table lock then some form of LOCK TABLE seems like
> the proper answer.  SELECT FOR UPDATE is designed to lock the specified
> rows, no more.

Hm. But, is there a way to have just a part of table locked, for
instance:

BEGIN;
SELECT * FROM bla WHERE code_id = 1 AND code_value = 'abla' FOR UPDATE

and then, in second connection:

BEGIN;
SELECT * FROM bla WHERE code_id = 1 AND code_value = 'eble' FOR UPDATE

I'd like to have second SELECT to hold. Now I'm doing it like this:

First connection:
BEGIN;
SELECT * FROM bla WHERE code_id = 1 FOR UPDATE -- this is just for lock
SELECT * FROM bla WHERE code_id = 1 AND code_value = 'abla' FOR UPDATE
...

Second connection:
BEGIN;
SELECT * FROM bla WHERE code_id = 1 FOR UPDATE -- here it'll wait
SELECT * FROM bla WHERE code_id = 1 AND code_value = 'eble' FOR UPDATE

Since I'm actually doing this inside of a function, i'll use PERFORM for
the first select. The problem is that SELECT COUNT(*) FROM bla WHERE
code_id = 1 will return some 10M rows (on live data, my test data has
some 100k rows, and I don't see any performance impact - yet).

Is this a right way to go? 

> > If I erase the index bla_idx1, then [C] select will wait, same as [B]
> > select will wait.
> 
> I don't think so.  If it does, it's a bug; please provide a reproducible
> test case.
> 

It's not a bug. My high fever must have something to do with it. I just
tried it, removed the index, and [C] isn't waiting.

Mike


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] 'locking' the SELECTs based on indices...

2006-02-23 Thread Mario Splivalo
On Thu, 2006-02-23 at 13:56 +0200, Achilleus Mantzios wrote:

> The intersection of  rows that satisfy BOTH 
> "code_id = 1 AND code_value = 'abla'"
> and
> "code_id = 1 AND code_value = 'eble'"
> is ZERO!!!
> 
> Why would you want irrelevant rows to wait for one another??

It was a bit silly representation of what I actually want :) I'll
explain it later, down there.

> > First connection:
> > BEGIN;
> > SELECT * FROM bla WHERE code_id = 1 FOR UPDATE -- this is just for lock
> > SELECT * FROM bla WHERE code_id = 1 AND code_value = 'abla' FOR UPDATE
> > ...
> > 
> > Second connection:
> > BEGIN;
> > SELECT * FROM bla WHERE code_id = 1 FOR UPDATE -- here it'll wait
> > SELECT * FROM bla WHERE code_id = 1 AND code_value = 'eble' FOR UPDATE
> > 
> > Since I'm actually doing this inside of a function, i'll use PERFORM for
> > the first select. The problem is that SELECT COUNT(*) FROM bla WHERE
> > code_id = 1 will return some 10M rows (on live data, my test data has
> > some 100k rows, and I don't see any performance impact - yet).
> > 
> > Is this a right way to go? 
> >
> 
> I'll repeat the first question...
> 
> Lets say you have 100 rows with id=1, and from them 10 have 
> code_value = 'eble' and 20 have code_value = 'able',
> 
> so you got 10 rows with id=1 and code_value = 'eble'
> and 20 rows with id=1 and code_value = 'able'
> 
> So in the first case you deal with 10 rows, in the second with 20 rows.
> Why take into account the rest 90 and 80 rows respectively???
> 
> If for some reason you want INDEED the 
> "SELECT * FROM bla WHERE code_id = 1 FOR UPDATE" effect,
> and the cardinalities are as you describe (more or less),
> then go with the LOCK TABLE solution as Tom said.

I have a actuall table, called code_values. There I have stored codes
for the various prize-winning games. The table looks like this:

CREATE TABLE code_values (
code_id int4,
service_id int4,
code_value int4
)

code_id is, of course, PK for the table.

Now, when a 'player' sends an SMS, or makes a phone call, I need to
check if that code is valid, and then, if that code has allready been
played. If it's invalid, or if it has been played, the user is
discarded. Since the number of prize-winning games (or services) is
around 50, and each service has around 1M codes, I didn't want to lock
entire table, so ALL the services wait, but just the portion of the
table that contains data for the desired service. So, when I check for
the code:

SELECT code_value FROM code_values WHERE service_id = 1 AND code_value =
'KFH1A' FOR UPDATE

I lock just that particular row, which is no good. I need to have all
the codes for the service 1 locked, so if it happens that two users send
the very same code, one has to fail. Therefore, from within plpgsql I
first do:

PERFORM * FROM code_values WHERE service_id = 1 FOR UPDATE

and then, later in code, i check for the actuall code.

Now, If some other user want's his prize, when checking his code, if he
sends code for some other service then service 1, that's ok. If he's
sends code for the service 1 the PERFORM will wait untill I'm finished
with previous user.

I could go with the LOCK TABLE, but that seems to expensive. Or not?

Mario

> 
> P.S. 
> Ti si Hrvatski???

:) Yes! :) Ja sam Hrvat, govorim hrvatski :)

> Dobro Jutro druga!!!

Dobro jutro i tebi! :)


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] 'locking' the SELECTs based on indices...

2006-02-23 Thread PFC




I lock just that particular row, which is no good. I need to have all
the codes for the service 1 locked, so if it happens that two users send
the very same code, one has to fail. Therefore, from within plpgsql I
first do:


I'm a bit tired tonight so I'll simplify your example :

CREATE TABLE stuff ( a INT, b INT );

	Basically you want to lock ALL rows with a certain value of a, in order  
to perform an operation on only one of them.

You could do this :

CREATE TABLE all_as ( a INT PRIMARY KEY )
CREATE TABLE stuff ( a INT REFERENCES all_as(a), b INT );

	Now all the rows in "stuff" that have the same value of "a" reference the  
same row in "all_as".

All you have to do is

SELECT * FROM all_as WHERE a=the value FOR UPDATE

and you lock all rows having that particular value of a in the big 
table.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] 'locking' the SELECTs based on indices...

2006-02-23 Thread PFC




Now, If some other user want's his prize, when checking his code, if he
sends code for some other service then service 1, that's ok. If he's
sends code for the service 1 the PERFORM will wait untill I'm finished
with previous user.


Sorry for the double post.

	If the rows in your table represent associations between codes and  
services that are one-use only, you could simply use UPDATE or DELETE, to  
mark the row in question as having been "consumed".
	Then, you check how many rows were deleted or updated. If it's 1, good.  
If it's 0, the code has been used already.


	If the code itself is one-use only, you should have a codes table and a  
codes_to_services table, with an ON DELETE CASCADE so that, when you use a  
code, you delete it from the codes table and it's "consumed" for all  
services.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[SQL] Sum If

2006-02-23 Thread Daniel Hernandez
 Hi Guys,   I'm new on this group, and I have a question, is there a way to do a "sum if" (kind of)?What i want to do is the following.Select customers.custid, if (sales.itemname = 'candy', sum(sales.count)) as "Sales candies", if (sales.itemname = 'some', sum(sales.count)) as "Sales Some"from ...join ...where .group by customers.custid ...Thanks in advanced, and best regards,Daniel Hernández.Tijuana, BC, México."More you learn, more you earn".Join Excite! - http://www.excite.comThe most personalized portal on the Web!


Re: [SQL] Sum If

2006-02-23 Thread Jim Buttafuoco

try 

select ...,sum(case when sales.itemname = 'some' then sales.count else 0 end) 
as "Sales Candies"
from your_table_here
group by ...



-- Original Message ---
From: "Daniel Hernandez" <[EMAIL PROTECTED]>
To: pgsql-sql@postgresql.org
Sent: Thu, 23 Feb 2006 12:46:44 -0500 (EST)
Subject: [SQL] Sum If

> Hi Guys,   I'm new on this group, and I have a question, is there a way to do 
> a "sum if" (kind of)
> ?What i want to do is the following.Select customers.custid, if 
> (sales.itemname = 'candy', sum(sales.count)) 
> as "Sales candies", if (sales.itemname = 'some', sum(sales.count)) 
> as "Sales Some"from 
> ...join ...where .group by customers.custid ...Thanks in advanced, and 
> best regards,Daniel 
> Hernández.Tijuana, BC, México."More you learn, more you earn".
> 
> ___
> Join Excite! - http://www.excite.com
> The most personalized portal on the Web!
--- End of Original Message ---


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Sum If

2006-02-23 Thread Daniel Hernandez
 
Hi Jim,   Thanks for the tip, It worked!Thanks a lot!!!Daniel Hernández.Tijuana, BC, México."More you learn, more you earn".try select ...,sum(case when sales.itemname = 'some' then sales.count else 0 end) as "Sales Candies"from your_table_heregroup by ...<[EMAIL PROTECTED]>Join Excite! - http://www.excite.comThe most personalized portal on the Web!


Re: [SQL] 'locking' the SELECTs based on indices...

2006-02-23 Thread Mario Splivalo
On Thu, 2006-02-23 at 17:35 +0100, PFC wrote:
> 
> > Now, If some other user want's his prize, when checking his code, if he
> > sends code for some other service then service 1, that's ok. If he's
> > sends code for the service 1 the PERFORM will wait untill I'm finished
> > with previous user.
> 
>   Sorry for the double post.
> 
>   If the rows in your table represent associations between codes and  
> services that are one-use only, you could simply use UPDATE or DELETE, to  
> mark the row in question as having been "consumed".
>   Then, you check how many rows were deleted or updated. If it's 1, good. 
>  
> If it's 0, the code has been used already.
> 
>   If the code itself is one-use only, you should have a codes table and a 
>  
> codes_to_services table, with an ON DELETE CASCADE so that, when you use a  
> code, you delete it from the codes table and it's "consumed" for all  
> services.

Thank you for the advice, I'll consider it too.

The original idea was with UPDATE, so I could mark codes wich are used,
but the table with codes will have 10M rows in the begining, for just
one game. Later we'll have more games, with like 100M rows in the table.
UPDATEing such table, when the server is under normal load (server also
hosts some other games) is between 15 and 40 seconds. That is why I
tried to eliminate UPDATEs and go with two tables, and INSERTS into the
second table, for used codes.

Mike


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Kashmira Patel \(kupatel\)



Hello,
   I am 
using postgres version 7.4, and I read this in the 
documentation:
 
"The aggregate 
functions sum and count always require a sequential scan if applied to the 
entire table."
 
My understanding of 
this statement is that if I use count() without a WHERE clause, then 
essentially, it is applied to the entire table and hence requires a seq 
scan.
But it should not 
require a seq scan if I have a condition. 
 
For example: I 
have a table vm_message with an index on column msgid.
Will the following 
do a sequential scan or an index?
 
select count(*) from 
vm_message where msgid = 3;
 
I used explain, and 
it said it would do a sequential scan. Why is that? 
In fact explain 
select * from vm_message where msgid = 3 also did a sequential 
scan.
 
What am I doing 
wrong here? I want it to use my index.
 
Thanks,
Kashmira


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Andrew Sullivan
On Thu, Feb 23, 2006 at 01:44:43PM -0800, Kashmira Patel (kupatel) wrote:
> My understanding of this statement is that if I use count() without a
> WHERE clause, then essentially, it is applied to the entire table and
> hence requires a seq scan.
> But it should not require a seq scan if I have a condition. 

It may not require it, but it might select it anyway.

> For example: I have a table vm_message with an index on column msgid.
> Will the following do a sequential scan or an index?
>  
> select count(*) from vm_message where msgid = 3;

How much of the table is that?  How many rows?  EXPLAIN ANALYSE will
tell you if you have the right plan (estimate vs. actual).  The real
question is, are you sure an indexscan is faster?

A
-- 
Andrew Sullivan  | [EMAIL PROTECTED]
It is above all style through which power defers to reason.
--J. Robert Oppenheimer

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Kashmira Patel \(kupatel\)

> For example: I have a table vm_message with an index on column msgid.
> Will the following do a sequential scan or an index?
>  
> select count(*) from vm_message where msgid = 3;

How much of the table is that?  How many rows?  EXPLAIN ANALYSE will
tell you if you have the right plan (estimate vs. actual).  The real
question is, are you sure an indexscan is faster?

[Kashmira] I did do an EXPLAIN ANALYZE as well, it also showed a
sequential scan. The table has about 600+ rows, with around 6 of them
matching the given id. Wouldn't an index scan be faster in this case?
Also, I have two more indices defined on this table, for other types of
queries I do on it. Would they be causing a problem? In general, is
there a rule of thumb as to when an index scan would be better than a
sequential scan?

Thanks,
Kashmira

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Kashmira Patel \(kupatel\)
The index has been around since I first created the table. I did VACUUM
ANALYZE, that should have taken care of analyzing this table, right?

-Original Message-
From: Tomas Vondra [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 23, 2006 2:29 PM
To: Kashmira Patel (kupatel)
Subject: Re: [SQL] Question about index scan vs seq scan when using
count()

> What am I doing wrong here? I want it to use my index.

For small tables the sequential scan is faster, that means less disk
reads is required the whole table than to use the index.

If it is a large table, the index should be used. Have you created the
index recently? Have you analyzed the table since that time (ANALYZE
tablename)

t.v.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Andrew Sullivan
On Thu, Feb 23, 2006 at 02:25:34PM -0800, Kashmira Patel (kupatel) wrote:
> 
> [Kashmira] I did do an EXPLAIN ANALYZE as well, it also showed a
> sequential scan. The table has about 600+ rows, with around 6 of them
> matching the given id. Wouldn't an index scan be faster in this case?

EXPLAIN ANALYSE will always choose the same plan as EXPLAIN.  The
difference is that it shows you the estimate and actual.

I am surprised you're getting a seqscan for that, though.  Is there
something about the index you're not telling us?  

Is your system tuned correctly?  Maybe 600 rows is so small that a
seqscan's just as fast.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Kashmira Patel \(kupatel\)

This is how I created the index:

CREATE INDEX msgid_index ON vm_message(msgid);

I guess it doing this because its a small table then. I will try putting
more values.

Thanks,
Kashmira 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Andrew Sullivan
Sent: Thursday, February 23, 2006 2:47 PM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Question about index scan vs seq scan when using
count()

On Thu, Feb 23, 2006 at 02:25:34PM -0800, Kashmira Patel (kupatel)
wrote:
> 
> [Kashmira] I did do an EXPLAIN ANALYZE as well, it also showed a 
> sequential scan. The table has about 600+ rows, with around 6 of them 
> matching the given id. Wouldn't an index scan be faster in this case?

EXPLAIN ANALYSE will always choose the same plan as EXPLAIN.  The
difference is that it shows you the estimate and actual.

I am surprised you're getting a seqscan for that, though.  Is there
something about the index you're not telling us?  

Is your system tuned correctly?  Maybe 600 rows is so small that a
seqscan's just as fast.

A

--
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Question about index scan vs seq scan when using count()

2006-02-23 Thread Owen Jacobson
Kashmira Patel wrote:

> I did do an EXPLAIN ANALYZE as well, it also showed a
> sequential scan. The table has about 600+ rows, with around 6 of them
> matching the given id. Wouldn't an index scan be faster in this case?

Not necessarily.  It's entirely possible, if your rows are small, that 600 rows 
will fit on a single disk page.  The index will be stored on a(t least one) 
separate disk page.  The cost of loading a page from disk pretty much swamps 
the cost of processing rows on a page, so in general the server tries to 
minimize the number of pages used.  To use an index for a one-page table, it'd 
have to load two pages (the table and the index); to do a sequential scan over 
a one-page table it only has to load the table.

Indexes are useful because they allow the DB to reduce the total number of 
pages loaded to complete a query.

-Owen

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq