Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-07-01 Thread Chris Spotts
 
  Wouldn't you just be looking for something like:
 
  BEGIN;
   EXECUTE 'insert into forums_readposts values ('...')';
 EXCEPTION when unique_violation THEN
 EXECUTE 'update forums_readposts set lastpostread =
 '...' ';
  END;

  The logic as i read your post is. If the user's never done a
 catchup operation
  before, this will create the record. If he has, then it will update
 this record
  to reflect the new transid.
 
  Dave
 
 
 Hmm, yeah. I'm new to PostgreSQL, and have little experience with
 MySQL other than basic queries and joins. I didn't get to the part in
 the docs that covers EXECUTE yet, haha. That looks like it would do
 exactly what I want it to.
[Spotts, Christopher] 
There's a good example and some documentation of what Dave said in the manual 
at the bottom of this page.
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html
You should really only need to EXECUTE here if you have some dynamically 
determined columns or tables.  The magic here is a combination of IF found 
and EXCEPTION WHEN unique violation




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-30 Thread APseudoUtopia
On Mon, Jun 29, 2009 at 2:26 PM, David Kerrd...@mr-paradox.net wrote:
 On Sat, Jun 27, 2009 at 08:23:26PM -0400, APseudoUtopia wrote:
 - Hey list,
 -
 - I'm migrating my site away from MySQL to PostgreSQL. So far, it's been
 - going great. However, there's one problem I've been having trouble
 - solving.
 -
 - I have a query which allows users to Catch up on read posts on the
 - forum. It works by either updating or inserting the last post read
 - number from every forum thread into the readposts table (for that
 - userid and threadid combination, of course). Here's the table
 - structure:
 -
 - CREATE TABLE forums_readposts (
 -  userid    INTEGER NOT NULL REFERENCES users_main (id) ON DELETE 
 CASCADE,
 -  threadid    INTEGER NOT NULL REFERENCES forums_topics (id) ON
 - DELETE CASCADE,
 -  lastpostread   INTEGER NOT NULL CHECK (lastpostread = 0),
 -  PRIMARY KEY (userid, threadid)
 - );
 -
 - Here's the original MySQL query that I have (db_string is a php
 - function that escapes the string):
 -
 - INSERT INTO forums_readposts (userid, threadid, lastpostread)
 - SELECT ' . db_string($_SESSION['UserInfo']['id']) . ', id,
 - lastpost FROM forums_topics ON DUPLICATE KEY UPDATE lastpostread
 - = lastpost;

 So regardless of other design issues. (i.e., assuming what you have was 
 working
 in MySQL).

 Wouldn't you just be looking for something like:

 BEGIN;
  EXECUTE 'insert into forums_readposts values ('...')';
        EXCEPTION when unique_violation THEN
                EXECUTE 'update forums_readposts set lastpostread = '...' ';
 END;

 The logic as i read your post is. If the user's never done a catchup 
 operation
 before, this will create the record. If he has, then it will update this 
 record
 to reflect the new transid.

 Dave


Hmm, yeah. I'm new to PostgreSQL, and have little experience with
MySQL other than basic queries and joins. I didn't get to the part in
the docs that covers EXECUTE yet, haha. That looks like it would do
exactly what I want it to.

I appreciate it.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-29 Thread Tguru

To migrate the site, you can use an open source ETL tool.

Talend Open Studio is an open source ETL tool for data integration and
migration experts. It's easy to learn for a non-technical user. What
distinguishes Talend, when it comes to business users, is the tMap
component. It allows the user to get a graphical and functional view of
integration processes. 
For more information: http://www.talend.com/



Justin-95 wrote:
 
 
 
 
   
 
 
 
 
 APseudoUtopia wrote:
 
   thread, then logs out (intending to read all the other forum threads
 at some point in the future when they log in again). If I used a VIEW,
 it would automatically consider all those unread forum posts to be
 read when the user logs out.
 
   
 That wouldn't work. What if a user logs in, reads only one forum 
 
 
 You are keeping a list of all the forums a user has read,  i would not
 worry about making sure the table tracking user activity has duplicate
 key values. The select can be limited to return just on row with the
 highest time stamp then compare this result to figure out what forms
 the user has not read yet.  This eliminates one of problems but creates
 a problem where table tracking user activity is going bloat but in low
 traffic times delete the duplicate values. 
 
 A similar topic was discussed  on the performance  mailing list, where
 updates are hung for several seconds for a similar tracking table... 
 http://archives.postgresql.org/pgsql-performance/2009-06/msg00300.php 
 
   
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Switching-from-MySQL%3A-ON-DUPLICATE-KEY-UPDATE%2C-plpgsql-function-tp24237803p24254206.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-29 Thread Lennin Caro



--- On Mon, 6/29/09, Tguru g...@talend.com wrote:

 From: Tguru g...@talend.com
 Subject: Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql 
 function
 To: pgsql-general@postgresql.org
 Date: Monday, June 29, 2009, 1:33 PM
 
 To migrate the site, you can use an open source ETL tool.
 
 Talend Open Studio is an open source ETL tool for data
 integration and
 migration experts. It's easy to learn for a non-technical
 user. What
 distinguishes Talend, when it comes to business users, is
 the tMap
 component. It allows the user to get a graphical and
 functional view of
 integration processes. 
 For more information: http://www.talend.com/
 

 Justin-95 wrote:

  
  
  APseudoUtopia wrote:
  
    thread, then logs out (intending to
 read all the other forum threads
  at some point in the future when they log in again).
 If I used a VIEW,
  it would automatically consider all those unread forum
 posts to be
  read when the user logs out.
  
    
  That wouldn't work. What if a user logs in, reads only
 one forum 
  
  
  You are keeping a list of all the forums a user has
 read,  i would not
  worry about making sure the table tracking user
 activity has duplicate
  key values. The select can be limited to return just
 on row with the
  highest time stamp then compare this result to figure
 out what forms
  the user has not read yet.  This eliminates one of
 problems but creates
  a problem where table tracking user activity is going
 bloat but in low
  traffic times delete the duplicate values. 
  
  A similar topic was discussed  on the performance 
 mailing list, where
  updates are hung for several seconds for a similar
 tracking table... 
  http://archives.postgresql.org/pgsql-performance/2009-06/msg00300.php
 
  
    
  
  
  
  


another option is Pentaho, is good and easy too http://kettle.pentaho.org/




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-29 Thread David Kerr
On Sat, Jun 27, 2009 at 08:23:26PM -0400, APseudoUtopia wrote:
- Hey list,
- 
- I'm migrating my site away from MySQL to PostgreSQL. So far, it's been
- going great. However, there's one problem I've been having trouble
- solving.
- 
- I have a query which allows users to Catch up on read posts on the
- forum. It works by either updating or inserting the last post read
- number from every forum thread into the readposts table (for that
- userid and threadid combination, of course). Here's the table
- structure:
- 
- CREATE TABLE forums_readposts (
-  useridINTEGER NOT NULL REFERENCES users_main (id) ON DELETE 
CASCADE,
-  threadidINTEGER NOT NULL REFERENCES forums_topics (id) ON
- DELETE CASCADE,
-  lastpostread   INTEGER NOT NULL CHECK (lastpostread = 0),
-  PRIMARY KEY (userid, threadid)
- );
- 
- Here's the original MySQL query that I have (db_string is a php
- function that escapes the string):
- 
- INSERT INTO forums_readposts (userid, threadid, lastpostread)
- SELECT ' . db_string($_SESSION['UserInfo']['id']) . ', id,
- lastpost FROM forums_topics ON DUPLICATE KEY UPDATE lastpostread
- = lastpost;

So regardless of other design issues. (i.e., assuming what you have was working 
in MySQL).

Wouldn't you just be looking for something like:

BEGIN;
  EXECUTE 'insert into forums_readposts values ('...')';
EXCEPTION when unique_violation THEN
EXECUTE 'update forums_readposts set lastpostread = '...' ';
END;

The logic as i read your post is. If the user's never done a catchup operation
before, this will create the record. If he has, then it will update this record
to reflect the new transid.

Dave

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-28 Thread APseudoUtopia
On Sat, Jun 27, 2009 at 9:13 PM, justinjus...@emproshunts.com wrote:
 APseudoUtopia wrote:

 Hey list,

 I have a query which allows users to Catch up on read posts on the
 forum. It works by either updating or inserting the last post read
 number from every forum thread into the readposts table (for that
 userid and threadid combination, of course). Here's the table
 structure:


 Wouldn't a view be better than having a table that is deleted and updated
 all the time.  I would add a field in the user table called  last_login type
 timestamp  then do a select from the forums table to generate this table
 where last_login = FormTimeStamp .

 I don't see the point having this table when a view would work better.


--snip--

 Creating a view would work better and than creating a table to track this.
 I would think this website tracks the last time the user logged in
 correct???  This is going to create allot of over head maintaining this
 table when a simple select statement will work so much better if i
 understand what you are doing.




That wouldn't work. What if a user logs in, reads only one forum
thread, then logs out (intending to read all the other forum threads
at some point in the future when they log in again). If I used a VIEW,
it would automatically consider all those unread forum posts to be
read when the user logs out.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-28 Thread Justin






APseudoUtopia wrote:

  thread, then logs out (intending to read all the other forum threads
at some point in the future when they log in again). If I used a VIEW,
it would automatically consider all those unread forum posts to be
read when the user logs out.

  
That wouldn't work. What if a user logs in, reads only one forum


You are keeping a list of all the forums a user has read,  i would not
worry about making sure the table tracking user activity has duplicate
key values. The select can be limited to return just on row with the
highest time stamp then compare this result to figure out what forms
the user has not read yet.  This eliminates one of problems but creates
a problem where table tracking user activity is going bloat but in low
traffic times delete the duplicate values.

A similar topic was discussed  on the performance  mailing list, where
updates are hung for several seconds for a similar tracking table...
http://archives.postgresql.org/pgsql-performance/2009-06/msg00300.php

 




[GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-27 Thread APseudoUtopia
Hey list,

I'm migrating my site away from MySQL to PostgreSQL. So far, it's been
going great. However, there's one problem I've been having trouble
solving.

I have a query which allows users to Catch up on read posts on the
forum. It works by either updating or inserting the last post read
number from every forum thread into the readposts table (for that
userid and threadid combination, of course). Here's the table
structure:

CREATE TABLE forums_readposts (
 useridINTEGER NOT NULL REFERENCES users_main (id) ON DELETE CASCADE,
 threadidINTEGER NOT NULL REFERENCES forums_topics (id) ON
DELETE CASCADE,
 lastpostread   INTEGER NOT NULL CHECK (lastpostread = 0),
 PRIMARY KEY (userid, threadid)
);

Here's the original MySQL query that I have (db_string is a php
function that escapes the string):

INSERT INTO forums_readposts (userid, threadid, lastpostread)
SELECT ' . db_string($_SESSION['UserInfo']['id']) . ', id,
lastpost FROM forums_topics ON DUPLICATE KEY UPDATE lastpostread
= lastpost;

Obviously this will not work with PostgreSQL. I've googled around a
bit and I decided to create a plpgsql function to handle the task. I
don't have much done, but here's what I have:

-
CREATE FUNCTION FORUM_CATCH_UP_ALL (INTEGER) RETURNS VOID AS
$FuncTag$
BEGIN
LOOP
-- Try to update the record
-- This query is broken. I'm not sure how to do the 
subquery or
whatever I need to do. Maybe FROM? Another loop?
-- UPDATE forums_readposts SET lastpostread = 
(SELECT
lastpost FROM forums_topics WHERE blah blah
IF found THEN
RETURN;
END IF;
-- Not there, try to insert the key
-- If someone else inserts the same key concurrently,
-- We could get a unique-key failure
BEGIN
INSERT INTO forums_readposts (userid, 
threadid,
lastpostread) (SELECT $1, id, lastpost FROM forums_topics)
WHERE userid = $1;
RETURN;
EXCEPTION WHEN unique_violation THEN
-- Do nothing, and loop to try the update again
END;
END LOOP;
END;
$FuncTag$
LANGUAGE plpgsql;
-

I got the structure from the example in the postgresql documentation.
Hopefully it's a step in the right direction.

If anyone can point me in the direction to take another step in, I'd
really appreciate it.

Thanks.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-27 Thread justin




APseudoUtopia wrote:

  Hey list,

I have a query which allows users to "Catch up" on read posts on the
forum. It works by either updating or inserting the "last post read"
number from every forum thread into the readposts table (for that
userid and threadid combination, of course). Here's the table
structure:
  

Wouldn't a view be better than having a table that is deleted and
updated all the time.  I would add a field in the user table called 
last_login type timestamp  then do a select from the forums table to
generate this table where last_login = FormTimeStamp .

I don't see the point having this table when a view would work better. 





  
Obviously this will not work with PostgreSQL. I've googled around a
bit and I decided to create a plpgsql function to handle the task. I
don't have much done, but here's what I have:

-
CREATE FUNCTION FORUM_CATCH_UP_ALL (INTEGER) RETURNS VOID AS
$FuncTag$
	BEGIN
		LOOP
			-- Try to update the record
			-- This query is broken. I'm not sure how to do the subquery or
whatever I need to do. Maybe FROM? Another loop?
			-- UPDATE "forums_readposts" SET "lastpostread" = (SELECT
"lastpost" FROM "forums_topics" WHERE blah blah
			IF found THEN
RETURN;
			END IF;
			-- Not there, try to insert the key
			-- If someone else inserts the same key concurrently
  



  			-- We could get a unique-key failure
			BEGIN
INSERT INTO "forums_readposts" ("userid", "threadid",
"lastpostread") (SELECT $1, "id", "lastpost" FROM "forums_topics")
WHERE "userid" = $1;
RETURN;
			EXCEPTION WHEN unique_violation THEN
-- Do nothing, and loop to try the update again
			END;
		END LOOP;
	END;
$FuncTag$
LANGUAGE plpgsql;
-
  


if you want to do something like this either do a test first to see if
the key is present in the table, update or do an insert like this
There is no reason to do a loop in the function waiting for a lock to
clear.   Postgresql Locks do not work like MySQL.  
CREATE FUNCTION FORUM_CATCH_UP_ALL (pUserID INTEGER) RETURNS VOID AS
$FuncTag$
	BEGIN
                      select lastpostread  from forums_readposts  where
userid = pUserId;
                      if (  found() ) then
                         UPDATE forums_readposts SET "lastpostread" =
(SELECT lastpost FROM forums_topics WHERE blah blah) ; --its helpful to
post the entire function ;
                      else
                         INSERT INTO forums_readposts ( userid,
threadid,
                                    lastpostread) (SELECT $1, id,
lastpost FROM forums_topics)
                                    WHERE userid = pUserID;
                      end;
	END;
$FuncTag$
LANGUAGE plpgsql;


  
I got the structure from the example in the postgresql documentation.
Hopefully it's a step in the right direction.

If anyone can point me in the direction to take another step in, I'd
really appreciate it.

Thanks.
  


Creating a view would work better and than creating a table to track
this.  I would think this website tracks the last time the user logged
in correct???  This is going to create allot of over head maintaining
this table when a simple select statement will work so much better if i
understand what you are doing.








Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-27 Thread Martin Gainty

contempating installing a insert,update trigger on forums_topics table 
something like

SET search_path = public;

CREATE TRIGGER MyTableName_Trig 
AFTER INSERT OR DELETE OR UPDATE ON forum_topics
FOR EACH ROW EXECUTE PROCEDURE FORUM_CATCH_UP_ALL ();

HTH
Martin 
__ 
Verzicht und Vertraulichkeitanmerkung/Note de déni et de confidentialité
 
Diese Nachricht ist vertraulich. Sollten Sie nicht der vorgesehene Empfaenger 
sein, so bitten wir hoeflich um eine Mitteilung. Jede unbefugte Weiterleitung 
oder Fertigung einer Kopie ist unzulaessig. Diese Nachricht dient lediglich dem 
Austausch von Informationen und entfaltet keine rechtliche Bindungswirkung. 
Aufgrund der leichten Manipulierbarkeit von E-Mails koennen wir keine Haftung 
fuer den Inhalt uebernehmen.
Ce message est confidentiel et peut être privilégié. Si vous n'êtes pas le 
destinataire prévu, nous te demandons avec bonté que pour satisfaire informez 
l'expéditeur. N'importe quelle diffusion non autorisée ou la copie de ceci est 
interdite. Ce message sert à l'information seulement et n'aura pas n'importe 
quel effet légalement obligatoire. Étant donné que les email peuvent facilement 
être sujets à la manipulation, nous ne pouvons accepter aucune responsabilité 
pour le contenu fourni.




Date: Sat, 27 Jun 2009 21:13:23 -0400
From: jus...@emproshunts.com
To: apseudouto...@gmail.com; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql 
function






  


APseudoUtopia wrote:

  Hey list,

I have a query which allows users to Catch up on read posts on the
forum. It works by either updating or inserting the last post read
number from every forum thread into the readposts table (for that
userid and threadid combination, of course). Here's the table
structure:
  

Wouldn't a view be better than having a table that is deleted and
updated all the time.  I would add a field in the user table called 
last_login type timestamp  then do a select from the forums table to
generate this table where last_login = FormTimeStamp .



I don't see the point having this table when a view would work better. 









  Obviously this will not work with PostgreSQL. I've googled around a
bit and I decided to create a plpgsql function to handle the task. I
don't have much done, but here's what I have:

-
CREATE FUNCTION FORUM_CATCH_UP_ALL (INTEGER) RETURNS VOID AS
$FuncTag$
BEGIN
LOOP
-- Try to update the record
-- This query is broken. I'm not sure how to do the 
subquery or
whatever I need to do. Maybe FROM? Another loop?
-- UPDATE forums_readposts SET lastpostread = 
(SELECT
lastpost FROM forums_topics WHERE blah blah
IF found THEN
RETURN;
END IF;
-- Not there, try to insert the key
-- If someone else inserts the same key concurrently
  




-- We could get a unique-key failure
BEGIN
INSERT INTO forums_readposts (userid, 
threadid,
lastpostread) (SELECT $1, id, lastpost FROM forums_topics)
WHERE userid = $1;
RETURN;
EXCEPTION WHEN unique_violation THEN
-- Do nothing, and loop to try the update again
END;
END LOOP;
END;
$FuncTag$
LANGUAGE plpgsql;
-
  



if you want to do something like this either do a test first to see if
the key is present in the table, update or do an insert like this

There is no reason to do a loop in the function waiting for a lock to
clear.   Postgresql Locks do not work like MySQL.  

CREATE FUNCTION FORUM_CATCH_UP_ALL (pUserID INTEGER) RETURNS VOID AS
$FuncTag$
BEGIN
  select lastpostread  from forums_readposts  where
userid = pUserId;

  if (  found() ) then

 UPDATE forums_readposts SET lastpostread =
(SELECT lastpost FROM forums_topics WHERE blah blah) ; --its helpful to
post the entire function ;

  else

 INSERT INTO forums_readposts ( userid,
threadid,

lastpostread) (SELECT $1, id,
lastpost FROM forums_topics)

WHERE userid = pUserID;

  end;

END;
$FuncTag$
LANGUAGE plpgsql;



  I got the structure from the example in the postgresql documentation.
Hopefully it's a step in the right direction.

If anyone can point me in the direction to take another step in, I'd
really appreciate it.

Thanks.
  



Creating a view would work better and than creating a table to track
this.  I would think this website tracks the last time the user logged

Re: [GENERAL] Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

2009-06-27 Thread Greg Stark
On Sun, Jun 28, 2009 at 2:13 AM, justinjus...@emproshunts.com wrote:

 if you want to do something like this either do a test first to see if the
 key is present in the table, update or do an insert like this
 There is no reason to do a loop in the function waiting for a lock to
 clear.   Postgresql Locks do not work like MySQL.

The latter is exactly what he posted, you *do* have to loop because
two sessions can try to do the update, find no records, and then both
try to insert and fail.


The problem is that the example posted is for a single update/insert.
The problem you're solving is for merging in a whole set of changes.
That's a bit more painful.

I think you're going to want something like;

UPDATE forums_readposts
   SET lastpostread=(
   select lastpost
  from forums_topics
where id=threadid
   )
 WHERE userid=_id

INSERT INTO forums_readposts
(userid,threadid,lastpostread)
(select _userid, id, lastpost
   from forums_topics
 where id not in (
  select threadid
 from forum_readposts existing
   where existing.userid=_userid
 )
)


(you might want to experiment with that as an NOT EXISTS as there are
still cases where one is optimized better than the other due to the
standard's required null behaviour)

You have a few options here. You could just decide concurrency for
this operation really isn't important and use something to serialize
this operation. For example you could lock the user record with an
explicit select for update on the user record and the commit
immediately afterward.

Or you could catch the exception around the insert and assume if that
happened you don't have to bother retrying because the other
transaction you collided with is presumably doing the same thing. That
would break if a user hit catch up and simultaneously clicked on a
new thread he hadn't read before in another window.

Or you could do the same kind of loop around this, just always doing
the insert since it should insert 0 records if there are no missing
threads.

You could skip the insert entirely if the number of records updated in
matches the number of threads and you have that number handy. That
would be an especially good idea if you catch the exception around the
insert since exceptions are moderately expensive. They create a
subtransaction. Probably not a factor for an operation like this which
isn't dominating the workload.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general