Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-30 Thread Josh Berkus
Mohawksoft,

 One of the most common SQL database applications is as the backend of some
 higher level system, like a web site. In these systems, ususlly done in
 Java, PHP, or heaven help us, .NET, you have multiple front end machine
 communicating with a SQL database backend.

Leaving aside that Java, .NET and even PHP5 have much more sophisticated ways 
of sharing variables ...

 These systems use the database as the main data server. The overall
 scalability of their system is based on the scalability of the database.
 At some point, you'll max out the DB and be stuck. The shared variables
 module is designed to allow you to scale better by allowing you to cheapy
 maintain data.

I really don't follow how this makes a lack of transaction-awareness for these 
global variables an advantage. If you're saying that it's not possible to 
make shared variables which are both cheap and transaction-aware, then say so 
and be clear about it.

Please understand that I am not opposed to the idea behind the shared variable 
proposal.   I just think there are some details to be worked out.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-30 Thread pgsql
 Mohawksoft,

 One of the most common SQL database applications is as the backend of
 some
 higher level system, like a web site. In these systems, ususlly done in
 Java, PHP, or heaven help us, .NET, you have multiple front end machine
 communicating with a SQL database backend.

 Leaving aside that Java, .NET and even PHP5 have much more sophisticated
 ways
 of sharing variables ...

Yes, and hows does a developer share between PHP, Java, and .NET? Yes, in
a homogenous system that may work, but not across multiple systems, and
that still doesn't address the data being 'in' the database as opposed to
some other system.


 These systems use the database as the main data server. The overall
 scalability of their system is based on the scalability of the database.
 At some point, you'll max out the DB and be stuck. The shared variables
 module is designed to allow you to scale better by allowing you to
 cheapy
 maintain data.

 I really don't follow how this makes a lack of transaction-awareness for
 these
 global variables an advantage. If you're saying that it's not possible
 to
 make shared variables which are both cheap and transaction-aware, then say
 so
 and be clear about it.

 Please understand that I am not opposed to the idea behind the shared
 variable
 proposal.   I just think there are some details to be worked out.

You won't understand it from what you quoted because the reasoning has
been snipped, and I've posted the example so many times my fingers are
sore.

Some data does not need to be transactional. Data that does not need to be
transactional does not need to be transactional. If overhead can be
reduced or eliminated by eliminating transactional procedures for data
which does not need it, then the database scales better. Isn't that
completely obvious?

If your system is bonking out, every last bit of efficiency is important.
(And this is no small efficiency!) It is these sorts of tricks that can
make the difference between a well working site, and one that looks like
it has been slashdotted.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-30 Thread Robert Treat
On Monday 30 August 2004 09:50, [EMAIL PROTECTED] wrote:
  On Sun, Aug 29, 2004 at 09:24:59AM -0400, [EMAIL PROTECTED] wrote:
  What's the advantage of instead stuffing these in-memory variables
  into PostgreSQL?  That it still works correctly and simply even if you
  have a farm of 10 separate web servers sitting in front of it?  That
  could be useful for certain applications, but is there any more to it
  than that?

 Isn't that could be useful for certain applications enough of a reason
 to add something to contrib?


That's enough of a reason to add something to *gborg*, but I can think of a 
lot of things I would rather see the postgresql major contributors 
maintaining than this.. especially when you consider that we've already 
removed things like jdbc from contrib. 

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread pgsql

 I think the shared variable module is another one of those things. The
 cost overhead of a single variable implemented as a row is too high,
 especially if you want to update it many times a second.

 Question:  How will these system variables behave regarding
 transactions?
 If I update a system variable and roll back the transaction, does it
 change
 back?   Do changes in a running transaction remain invisible until COMMIT?
 Excuse me if you've already answered these; I've not caught up on my
 -hackers
 backlog since I broke my foot.

Actually that are not involved with transactions in any way.

There classes of problems in which proper ACID implementation is not
nessisary. In fact, there are subclasses within a larger system that need
not have the overhead imposed by transactions.

Think about this:

In one part of the system, you update a table:

updated summary_table set sum = sum + $newvalue ;


In another part of the system you perform this:

select sum from summary_table;

You did this to keep from having to do:

select sum(col) from real_table;

Which, as everyone knows, can be millions of records, and a full table
scan would be too expensive.


In an active system there is *no* way that the results of these
transactions can be anything but an estimate or a snap shot. By the time
the results have been reported, the results surely would have changed.

In order to get this inaccurate information, you had to incure many dleted
rows in the summary table for each update or the cost of a full table
scan.

Usng the variable metaphor, the overall cost of doing this is almost nil.
With care, the results can be just as accurate with respect to time.

Like I said, it is *NOT* a general solution, but it *is* a solution a lot
of developers using databases have to implement *somewhere*, why not in
the database in which the data is stored?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread pgsql
 On Sat, 28 Aug 2004 [EMAIL PROTECTED] wrote:


 I don't see how this is different from CREATE TABLE shared_variables
 (name
 VARCHAR PRIMARY KEY, value VARCHAR) and
 inserting/updating/deleting/selecting from that. Perhaps these are
 per-session shared variables? IN which case, what is the utility if
 sharing
 them across shared memory?

 - --
 Jonathan Gardner

 Well, the issues you don't see is this:

 What if you have to update the variables [n] times a second?

 You have to vacuum very frequently. If you update a variable a hundred
 times a second, and vacuum only once every minute, the time it takes to
 update ranges from reading one row from the database to reading 5999
 dead
 rows to get to the live one. Then you vacuum, then you are back to one
 row
 again.

 I think the right approach is to tackle that problem instead of working
 around it with a completely new variable mechanism.

You assume there is a problem with the existing mechanism that needs
fixing.  There isn't. PostgreSQL does the right thing already. The issue
is that the class of problem we are trying to solve does not fit the MVCC
methodology very well.

I've given this example a number of times:

Consider this query:

select sum(col) from table;

If this is an active system, with many inserts and updates a second, and
table has a couple million rows, there is NO way to get any meaningful
answer out of this query. It is only a snapshot, not much better than an
estimate.

The shared variables are a way of expressing, I know this is just an
estimate, but I need to do it FAST.

A LOT of websites can benefit from this, like a quick Number of people
online



 I've been playing with the idea of a quick vacuum that runs through the
 shmem buffers. The idea is that since the pages are already in memory,
 the vacuum runs very quickly. Vacuuming the hot pages frequently
 enough should avoid the problem you describe. It also saves I/O in the
 long run since dirty pages are vacuumed before they are written to
 disk, eliminating the need to read in, vacuum and write the same pages
 again later.

That's fine, but how does it affect performance?


 The problem is of course that to vacuum the heap pages, you have to make
 sure that there is no references to the dead tuples from any indexes.

Exactly.


 The trivial case is that the table has no indexes. But I believe that
 even if the table has ONE index, it's very probable that the corresponding
 index pages of the dead tuples are also in memory, since the tuple was
 probably accessed through the index.

 As the number of indexes gets bigger, the chances of all corresponding
 index pages being in memory gets smaller.

 If the quick vacuum or opportunistic vacuum as I call it is clever
 enough to recognize that there is a dead tuple in memory, and all the
 index pages that references are in memory too, it could reliably vacuum
 just those tuples without scanning through the whole relation and without
 doing any extra I/O.

 I've written some code that implements the trivial case of no indexes. I'm
 hoping to extend it to handle the indexes too if I have time. Then we'll
 see if it's any good. I've attached a patch with my current ugly
 implementation if you want to give it a try.

This is cool, but is it actually more work over the long run than
vacuum? I mean, a wile ago, vacuum had to lock the tables. This
non-locking vacuum is pretty cool and takes away almost all objections
that I can think about with PostgreSQL.


 On top of that, all the WAL logging that has to take place for each
 transaction.

 How is that a bad thing? You don't want to give up ACID do you?

Actually, yes, yes I do. Not for everything, but for classes of problems
that do not need it.

I've some experience with this. I have a session manager for PHP. I wrote
it because databases fail miserably when what you want to do is have
SQL-esque information that neeeds to be FAST and can go away without
worrying too much about it.

Think about a website shopping cart. The session and the things in the
cart don't matter after 10 or 15 minutes. Most of the time, it will just
be abandoned. The only time it is important is at checkout. At which point
it is transfered to an ACID compliant system.

My session manager on a modern dual PIII can handle 8000 full get vars,
set vars cycles a second. This is more or less impossible with a SQL
database without spending a LOT of money on clustering. Mine can do it
with a single standard x86 Linux server box.

Some information can go away without worry. Paying for protecting it with
CPU and disk I/O can be VERY expensive. PostgreSQL does what it does very
well. Sometimes what it does can get in the way of using it.


 - Heikki
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send 

Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread Greg Stark

Josh Berkus [EMAIL PROTECTED] writes:

 Question:  How will these system variables behave regarding transactions?   
 If I update a system variable and roll back the transaction, does it change 
 back?   Do changes in a running transaction remain invisible until COMMIT?  
 Excuse me if you've already answered these; I've not caught up on my -hackers 
 backlog since I broke my foot.

I think this is precisely the objection. He wants something that allows him to
break out of transaction semantics entirely. Other backends would see the
value he sets before he commits. If he rolls back his changes would remain.

I expect to see substantial resistance to incorporating such a feature. One
argument will be that he should simply use a separate system for such data.
For example, memcached would be much faster and more scalable than
communicating via postgres.

The only disadvantage to using something like memcached would be that you
couldn't transparently use values from there inside your queries. You would
have to fetch them and then pass them as parameters to postgres.

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread Andrew Dunstan

Greg Stark wrote:
I expect to see substantial resistance to incorporating such a feature. One
argument will be that he should simply use a separate system for such data.
For example, memcached would be much faster and more scalable than
communicating via postgres.
 

Postgres should do what postgres is good at IMNSHO. Small is beautiful.
The only disadvantage to using something like memcached would be that you
couldn't transparently use values from there inside your queries. You would
have to fetch them and then pass them as parameters to postgres.
 

Could you not overcome that using server-side functions, in, say, C or 
plperlu? In fact I suspect you could implement this whole scheme using 
memcached (which looks quite cool and exactly made for this) plus a very 
small amount of plperlu glue.

cheers
andrew

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread Andrew Piskorski
On Sun, Aug 29, 2004 at 09:24:59AM -0400, [EMAIL PROTECTED] wrote:

 Think about a website shopping cart. The session and the things in the
 cart don't matter after 10 or 15 minutes. Most of the time, it will just
 be abandoned. The only time it is important is at checkout. At which point
 it is transfered to an ACID compliant system.

Much of the rest of what you said makes sense, but this is a TERRIBLE
example.  As a user, I have been burned repeatedly by crappy,
non-persistent shopping carts built by developers who, apparently,
agree with you that, Oh, it's just a shopping cart, no one cares if
it randomly disappears and the user has to start over from scratch.

Nonsense!  Damn it, if put 12 different items in my shopping cart and
then go to lunch, I expect my shopping cart to STILL BE THERE when I
get back!  Note that Amazon does this right, so any ecommerce site
doing it WRONG does so at severe peril of losing its customers.

 My session manager on a modern dual PIII can handle 8000 full get vars,
 set vars cycles a second.

I don't really see why you need or want to do this in PostgreSQL,
though.  Can't you do it in your web or application server?  (I
certainly can in AOLserver with its nsv API, for example.)

What's the advantage of instead stuffing these in-memory variables
into PostgreSQL?  That it still works correctly and simply even if you
have a farm of 10 separate web servers sitting in front of it?  That
could be useful for certain applications, but is there any more to it
than that?

I've occasionally REALLY wanted a small in-memory ACId (no D for
Durability) RDBMS for tracking transient data that doesn't need to go
to disk (and one of these days I will experiment with using SQLite for
that), but that's not what you're talking about, right?  The shared
variables you're talking about here are just simple scalar values, no
tables, no integrity constraints, no MVCC, none of that, right?

-- 
Andrew Piskorski [EMAIL PROTECTED]
http://www.piskorski.com/

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-29 Thread Josh Berkus
Mohawksoft,

 Actually that are not involved with transactions in any way.

 There classes of problems in which proper ACID implementation is not
 nessisary. In fact, there are subclasses within a larger system that need
 not have the overhead imposed by transactions.

Hmmm, wait, are you implementing this, or is Jonathan?  

As a database developer who would use this or a similar feature, I do not 
agree that the lack of transactionality is a benefit.   Quite the contrary.   
Imagine this scenario, once we have error-trapping in place for 8.1 (pardon 
the syntax if it's off the proposal, I think you'll see the point anyway):

BEGIN
SAVEPOINT before_udpate;
UPDATE jehosaphat SET slivy = 60 WHERE status = 4;
SET SHARED jehosaphat_last_update = now();
IF transaction_error  0 THEN 
ROLLBACK TO before_update;
ELSE
COMMIT;
END IF;
...

Now assume that the reason I'm using a shared variable is that jehosaphat 
needs to be updated every 500ms, and thus we don't want to be storing it in a 
table.   In that case, if shared variables are not transactional, the above 
would be deceptive; the system would think that jeshosaphat had been updated 
when it had not.   

Ah, you say, but you could work around that if you coded correctly.   Maybe 
so, although I find the idea that transactions don't apply to some objects 
very liable to cause confusion.   Also consider this case:

CREATE FUNCTION update_jeshosaphat_last () RETURNS TRIGGER AS '
BEGIN
SET SHARED jehosaphat_last_update = now();
END; ' language plpgsql;

CREATE TRIGGER tg_jehosaphat_update
BEFORE UPDATE OR DELETE ON jehosaphat 
FOR EACH STATEMENT EXECUTE PROCEDURE update_jehosaphat_last();

NOW imagine if your update gets rolled back; the shared variable will be 
misreporting the last update and there isn't any coding around it.   

My personal perspective is that shared variables should be transactional, just 
like everything else in PostgreSQL except sequences.Just because they are 
transactional does not mean they have to be saved on disk; that's where most 
of your dead tuple overhead comes in.   If you could place the variables in 
reserved shared memory and never sync, then needing to have several copies 
which split and collapse to remain current with the existing transactions 
should not be a major performance cost. Of course, it will make the code a 
lot harder, but that's one of the reasons why nobody has done it before now.

Now, if you're telling me that it's too difficult for you to implement 
transactions at this time, and that you want to push this patch without 
transactions for them to be added sometime later, then we can have it out on 
that basis.   But you're going to have a hard time convincing me or several 
other people that ignoring transactions is somehow a benefit.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-28 Thread Heikki Linnakangas
On Sat, 28 Aug 2004 [EMAIL PROTECTED] wrote:

I don't see how this is different from CREATE TABLE shared_variables
(name
VARCHAR PRIMARY KEY, value VARCHAR) and
inserting/updating/deleting/selecting from that. Perhaps these are
per-session shared variables? IN which case, what is the utility if
sharing
them across shared memory?
- --
Jonathan Gardner
Well, the issues you don't see is this:
What if you have to update the variables [n] times a second?
You have to vacuum very frequently. If you update a variable a hundred
times a second, and vacuum only once every minute, the time it takes to
update ranges from reading one row from the database to reading 5999 dead
rows to get to the live one. Then you vacuum, then you are back to one row
again.
I think the right approach is to tackle that problem instead of working 
around it with a completely new variable mechanism.

I've been playing with the idea of a quick vacuum that runs through the 
shmem buffers. The idea is that since the pages are already in memory, 
the vacuum runs very quickly. Vacuuming the hot pages frequently 
enough should avoid the problem you describe. It also saves I/O in the 
long run since dirty pages are vacuumed before they are written to 
disk, eliminating the need to read in, vacuum and write the same pages 
again later.

The problem is of course that to vacuum the heap pages, you have to make 
sure that there is no references to the dead tuples from any indexes.

The trivial case is that the table has no indexes. But I believe that 
even if the table has ONE index, it's very probable that the corresponding 
index pages of the dead tuples are also in memory, since the tuple was 
probably accessed through the index.

As the number of indexes gets bigger, the chances of all corresponding 
index pages being in memory gets smaller.

If the quick vacuum or opportunistic vacuum as I call it is clever 
enough to recognize that there is a dead tuple in memory, and all the 
index pages that references are in memory too, it could reliably vacuum 
just those tuples without scanning through the whole relation and without 
doing any extra I/O.

I've written some code that implements the trivial case of no indexes. I'm 
hoping to extend it to handle the indexes too if I have time. Then we'll 
see if it's any good. I've attached a patch with my current ugly 
implementation if you want to give it a try.

On top of that, all the WAL logging that has to take place for each
transaction.
How is that a bad thing? You don't want to give up ACID do you?
- HeikkiIndex: src/backend/commands/vacuumlazy.c
===
RCS file: /projects/cvsroot/pgsql-server/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.43
diff -r1.43 vacuumlazy.c
65,85d64
 typedef struct LVRelStats
 {
   /* Overall statistics about rel */
   BlockNumber rel_pages;
   double  rel_tuples;
   BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
   Sizethreshold;  /* minimum interesting free space */
   /* List of TIDs of tuples we intend to delete */
   /* NB: this list is ordered by TID address */
   int num_dead_tuples;/* current # of entries */
   int max_dead_tuples;/* # slots allocated in array 
*/
   ItemPointer dead_tuples;/* array of ItemPointerData */
   /* Array or heap of per-page info about free space */
   /* We use a simple array until it fills up, then convert to heap */
   boolfs_is_heap; /* are we using heap organization? */
   int num_free_pages; /* current # of entries */
   int max_free_pages; /* # slots allocated in array */
   PageFreeSpaceInfo *free_pages;  /* array or heap of blkno/avail */
 } LVRelStats;
 
 
176a156,275
 
 bool lazy_scan_heap_page(Relation onerel, LVRelStats *vacrelstats, 
   BlockNumber blkno, 
 Page page)
 {
   bool pgchanged;
   bool tupgone;
   
   OffsetNumber offnum,
   maxoff;
 
   HeapTupleData tuple;
 
   double nkeep = 0;
   double nunused = 0;
 int tups_vacuumed = 0;
   
   /* KLUDGE! */
   if(onerel == NULL)
   OldestXmin = GetOldestXmin(true);
 
   pgchanged = false;
   maxoff = PageGetMaxOffsetNumber(page);
   for (offnum = FirstOffsetNumber;
offnum = maxoff;
offnum = OffsetNumberNext(offnum))
   {
   ItemId  itemid;
   uint16  sv_infomask;
 
   itemid = PageGetItemId(page, offnum);
 
   if (!ItemIdIsUsed(itemid))
   {
   nunused += 1;
   continue;
   }
 
   tuple.t_datamcxt = NULL;
   

Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-28 Thread Josh Berkus
Jonathan,

 The stuff that I contribute is often met with resistence, and that may or
 may not be a good thing, but over the years, things I've said they NEEDED
 to do, have been done and PostgreSQL is better for it.

 Now don't think I'm talking trash to Tom either. These guys do a lot of
 hard work. I have the luxury of working on other projects and noticing the
 sorts of things that are a problem or could be great features. I have
 great respect for them, even if not they for me.

I think you're reading too much into the objections.  One of the things that 
keeps PostgreSQL fast, accessable, and open to new OSS developers is a high 
barrier of entry to new code proposed for the core.   A certain minimalism is 
called for if we don't want to end up like a Microsoft product, bloated, 
self-conflicted, and utterly impenetrable even to its own experts.   Often 
it's Tom and Peter expressing that minimalism but others would have to were 
they to go away (gods forfend).

This means that, for a project to get into the core, any proposer will have to 
spend a lot of time convincing the people on this list of its worthiness, 
both through lobbying and through code.  Thus the very first piece of 
criticism you'll see is Why do we need it?.   But you can overcome these 
sorts of objections -- others have, as you yourself pointed out.  

Unfortunately, /contrib is no better than core in this way as our code gets 
larger and mainenance gets harder.  I believe our eventual goal is/should be 
to *remove* contrib entirely, replacing it with a build environment of 
pluggable components which Peter is working on.   This will have the 
beneficial effect of erasing the difficult-to-cross gap which currently 
exists between external (pgFoundry, GBorg, Sourceforge) projects and contrib.   
Go Peter!

Also, keep in mind that Tom just completed a 300-hour marathon of reviewing 
features for 8.0.I can't imagine that he'll take a positive additude 
toward new features of any kind for about a month.

 I think the shared variable module is another one of those things. The
 cost overhead of a single variable implemented as a row is too high,
 especially if you want to update it many times a second.

Question:  How will these system variables behave regarding transactions?   
If I update a system variable and roll back the transaction, does it change 
back?   Do changes in a running transaction remain invisible until COMMIT?  
Excuse me if you've already answered these; I've not caught up on my -hackers 
backlog since I broke my foot.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-27 Thread Jonathan Gardner
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 25 August 2004 02:15 pm, [EMAIL PROTECTED] wrote:
 This is a first pass on a simple shared memory variable system for
 PostgreSQL. I would appriciate anyone interested in this functionality to
 rip it apart.

 It basically adds this functionality:

 SetShared('name', value);
 GetSharedInt('name');
 SetSharedText('name);
 RemovedShared('name');

 I also added two extra functions that are sort of a kludge, but could be
 very helpful.

 AddSharedInt('name', value);
 SubSharedInt('name', value);

 These add or subtect the 'value' from the variable and return the result.

I don't see how this is different from CREATE TABLE shared_variables (name 
VARCHAR PRIMARY KEY, value VARCHAR) and 
inserting/updating/deleting/selecting from that. Perhaps these are 
per-session shared variables? IN which case, what is the utility if sharing 
them across shared memory?

- -- 
Jonathan Gardner
[EMAIL PROTECTED]
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFBL8FMqp6r/MVGlwwRAl4KAJ4oC02uGs5IjO0basUCxNPd3/CCqQCfcepL
NLDzKJPINHKK9adjKue2qlQ=
=Z8mP
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-27 Thread pgsql

 I don't see how this is different from CREATE TABLE shared_variables
 (name
 VARCHAR PRIMARY KEY, value VARCHAR) and
 inserting/updating/deleting/selecting from that. Perhaps these are
 per-session shared variables? IN which case, what is the utility if
 sharing
 them across shared memory?

 - --
 Jonathan Gardner

Well, the issues you don't see is this:

What if you have to update the variables [n] times a second?

You have to vacuum very frequently. If you update a variable a hundred
times a second, and vacuum only once every minute, the time it takes to
update ranges from reading one row from the database to reading 5999 dead
rows to get to the live one. Then you vacuum, then you are back to one row
again.

On top of that, all the WAL logging that has to take place for each
transaction.

I hope no one misunderstands my motives here. I think PostgreSQL is a
GREAT project. I differ with the core team on a frequent basis. Maybe the
problem is maybe I've been corrupted by Windows programming or something.
If you develop software on Windows, the trick is knowing how to do things
not intended to be done by you in order to accomplish your task. These
small little tricks are common knowledge in the M$ campuses, but in the
outside world they are the street cred that gets you paid. I may play a
little too fast and loose with standards.

PostgreSQL is a GREAT SQL database, however the database is no longer
the product. The product is the set of functions and infrastructure on top
of the database. The SQL database is a commodity, it is what you can do
with it that makes it interresting.

The stuff that I contribute is often met with resistence, and that may or
may not be a good thing, but over the years, things I've said they NEEDED
to do, have been done and PostgreSQL is better for it.

I remember an exchange with Tom about four or five years ago, about the
need to return a number of rows from a PostgreSQL function. I don't
remember it too well, but my memory tells me he said it wasn't possible,
or maybe it was possible but not worth it, or something like that.
PostgreSQL returns sets of rows now. It took a few years, but no one would
say, now, they didn't need to do it.

Now don't think I'm talking trash to Tom either. These guys do a lot of
hard work. I have the luxury of working on other projects and noticing the
sorts of things that are a problem or could be great features. I have
great respect for them, even if not they for me.

I think the shared variable module is another one of those things. The
cost overhead of a single variable implemented as a row is too high,
especially if you want to update it many times a second. This is a simple
module. It will work for a good number of deployments and make PostgreSQL
more viable for them. I fully expect that the module will grow into
something better, ala rserv, be folded into PostgreSQL, ala toast, or just
languish in contrib. (if it is added at all.)

I believe in Open Source, I have needed this functionality in the past, I
will probably need it in the future. If I can share it with the project,
why not?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] Contrib -- PostgreSQL shared variables

2004-08-25 Thread pgsql
This is a first pass on a simple shared memory variable system for
PostgreSQL. I would appriciate anyone interested in this functionality to
rip it apart.

It basically adds this functionality:

SetShared('name', value);
GetSharedInt('name');
SetSharedText('name);
RemovedShared('name');

I also added two extra functions that are sort of a kludge, but could be
very helpful.

AddSharedInt('name', value);
SubSharedInt('name', value);

These add or subtect the 'value' from the variable and return the result.

sharedvar.tgz
Description: application/unix-tar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-25 Thread Rick Gigger
LockShared('name');
[EMAIL PROTECTED] wrote:
This is a first pass on a simple shared memory variable system for
PostgreSQL. I would appriciate anyone interested in this functionality to
rip it apart.
It basically adds this functionality:
SetShared('name', value);
GetSharedInt('name');
SetSharedText('name);
RemovedShared('name');
I also added two extra functions that are sort of a kludge, but could be
very helpful.
AddSharedInt('name', value);
SubSharedInt('name', value);
These add or subtect the 'value' from the variable and return the result.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-25 Thread pgsql
 LockShared('name');

Hmmm, I thought about that, but it is *WAY* more complicated than it
looks. What if after a Lock the process crashes before it can call
Unlock? It is this problem that inspired the add and sub calls.




 [EMAIL PROTECTED] wrote:
 This is a first pass on a simple shared memory variable system for
 PostgreSQL. I would appriciate anyone interested in this functionality
 to
 rip it apart.

 It basically adds this functionality:

 SetShared('name', value);
 GetSharedInt('name');
 SetSharedText('name);
 RemovedShared('name');

 I also added two extra functions that are sort of a kludge, but could be
 very helpful.

 AddSharedInt('name', value);
 SubSharedInt('name', value);

 These add or subtect the 'value' from the variable and return the
 result.


 


 ---(end of broadcast)---
 TIP 9: the planner will ignore your desire to choose an index scan if
 your
   joining column's datatypes do not match



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Contrib -- PostgreSQL shared variables

2004-08-25 Thread Thomas Hallgren
[EMAIL PROTECTED] wrote:
This is a first pass on a simple shared memory variable system for
PostgreSQL. I would appriciate anyone interested in this functionality to
rip it apart.
It basically adds this functionality:
SetShared('name', value);
GetSharedInt('name');
SetSharedText('name);
RemovedShared('name');
I also added two extra functions that are sort of a kludge, but could be
very helpful.
AddSharedInt('name', value);
SubSharedInt('name', value);
These add or subtect the 'value' from the variable and return the result.
Something that I've found very useful when dealing with shared memory is 
the ability to do atomic exchange of values. With that in place, its 
possible to perform atomic operations involving several variables.

Perhaps it could be as simple as splitting your SetShared into 
SetSharedInt and SetSharedString and then have them return the old value?

Here's an example of what I mean (I use C syntax for clarity, I know the 
intended use is from SQL).

/* Loop until something other than LOCKED is returned. When
 * that happens, we have the lock.
 */
while(SetSharedInt('lock', LOCKED) == LOCKED)
usleep(WAIT_PERIOD);
Set a group of variables here.
/* Unlock
 */
SetSharedInt('lock', UNLOCKED);
Regards,
Thomas Hallgren
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Contrib -- PostgreSQL shared variables -with swap

2004-08-25 Thread pgsql
This new version contains, in addition to the previous version,
SwapShared(..) which allows you to swap a value in a variable.

sharedvar-040825.tgz
Description: application/unix-tar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])