[SQL] Question on string value expression wildcarding

2000-08-24 Thread Steve Wampler


I have LIKE expressions:

(a) name LIKE 'kp.dhs.%'
(b) name LIKE 'kp.dhs%'

where the name column contains strings prefixed with "kp.dhs.".

I'm using PostgreSQL 7.0.2.

Expression (a) fails to match any names while (b) matches
all strings prefixed with "kp.dhs", including (as expected)
those prefixed with "kp.dhs.".

So I take it that ".%" has some special meaning in wildcarding,
but my (limited) SQL references don't mention this case.
Is this To Be Expected SQL behavior?  If so, what
expression can be used to match only strings prefixed with
"kp.dhs."?

Thanks!
--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]



[SQL] Using a postgres table to maintain unique id?

2000-11-13 Thread Steve Wampler


Hi,

I have a distributed application that needs unique 64-bit
"id" values.  The current implementation keeps the current
value for this id in a file on one system and uses
a process to provide access to this file via CORBA.

However, I also use Postgres (7.0.2) throughout this
application and it seems cleaner to me to keep the current
id value in a table and just use postgres to provide access
(with a trigger function to increment the id on access).

Is this reasonable?  Is it fast?  (I need 10 or more IDs
generated each second.)  Can I avoid having the table
gradually fill with "old" rows for this entry, and thus
avoid the need to run VACUUM ANALYZE periodically?

Any tips on how to implement the trigger would be
appreciated.  (Is it possible to generate an int8 sequence
value?)
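On 7.0.x sequences were int4-based, but on modern PostgreSQL a plain sequence is 64-bit by default and covers the int8 case directly (names here are illustrative):

```sql
-- Modern PostgreSQL: sequences are int8-based; MAXVALUE defaults to 2^63-1.
CREATE SEQUENCE item_id_seq;

-- nextval() returns a bigint and is safe under concurrent access.
CREATE TABLE items (
    id      int8 DEFAULT nextval('item_id_seq'),
    payload text
);

SELECT nextval('item_id_seq');
```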

Thanks!
--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]



Re: [SQL] Using a postgres table to maintain unique id?

2000-11-13 Thread Steve Wampler

Poet/Joshua Drake wrote:
> 
> >However, I also use Postgres (7.0.2) throughout this
> >application and it seems cleaner to me to keep the current
> >id value in a table and just use postgres to provide access
> >(with a trigger function to increment the id on access).
> 
> Why not a sequence?

Can someone show me how to create (and use) an int8 sequence?

> >Is this reasonable?  Is it fast?  (I need 10 or more IDs
> >generated each second.)  Can I avoid having the table
> >gradually fill with "old" rows for this entry, and this
> >avoid the need to run VACUUM ANALYZE periodically?
> 
> The only problem I have had with this type of thing is when a number gets
> deleted, it does not get recycled.

Fortunately, I don't want any number to ever get recycled - the id needs to
be unique throughout the 25+ year lifetime of the project.  The table
would have a single row with a single column.  Selecting that table cell
would return the current value, but leave the value incremented in the
table cell (it's ok if it increments the value before returning).
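A minimal sketch of that single-row counter (names are hypothetical; RETURNING needs PostgreSQL 8.2 or later, so on 7.0 the same effect takes a trigger function or a short transaction):

```sql
CREATE TABLE id_source (next_id int8 NOT NULL);
INSERT INTO id_source VALUES (0);

-- Claim the current value and leave it incremented, atomically.
UPDATE id_source SET next_id = next_id + 1
RETURNING next_id - 1 AS claimed_id;
```

Note that every UPDATE leaves behind a dead row version, so this table does accumulate "old" rows until it is vacuumed - the concern raised above is real.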


--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]



Re: [SQL] Using a postgres table to maintain unique id?

2000-11-13 Thread Steve Wampler

Marten Feldtmann wrote:
> 
>  Throw away all the "hardwired"-stuff and do it with software. I
> once described an algorithm in one of this lists how to create
> unique values for clients without minimum interaction with the
> database.
> 
>  The result: query once in the beginning of your application,
> generate your id's "offline" at the maximum speed you may
> have and store your last generated id when your client
> finished. Superior to all the "hardwired"-database solutions !

Yes, but...

(1) The application I have is composed of about 50 processes
running on 3 different OS/architectures (Linux/intel,
Solaris/sparc, and VxWorks/ppc).  The IDs I need must be
unique across all processes (I suppose one solution would
be to provide each ID with a unique prefix based on the
process that is running, but...)

(2) Some of these systems are real-time boxes that might get
rebooted at any moment, or might hang for hardware-related
reasons [I'd like to be able to say that all of the processes
could detect imminent failure, but unfortunately, I can't].
So determining when a client "finishes" is not always possible,
which prevents (he claims) the above solution from claiming
ID uniqueness.

However, it might be sufficient to provide a process on the
postgres DB machine (if *that* machine dies, *everything* stops...)
that serves IDs via CORBA to all the other applications and
(internally) uses the "software" approach given above.  This
process could "sync" with the database every N seconds or so
(where N might be < 1.0).  This, while still not guaranteeing
uniqueness, would at least come pretty close...  It would still be
nice to avoid having to VACUUM ANALYZE this table, though, and it
"feels" as though it is duplicating functionality already provided
by postgres DB backends.

I'll think about this solution - thanks!



--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]



Re: [SQL] Using a postgres table to maintain unique id?

2000-11-14 Thread Steve Wampler

Thomas Swan wrote:
> 
> At 11/13/2000 06:22 PM -0800, Michael Teter wrote:
> >
> > From what I can tell (both from docs and doing a
> > describe on sequences in my database), a postgresql
> > sequence is an int4, not an int8, and thus you are
> > limited to a max of 2.1 billion values.
> >
> > If you require an int8 sequence, you'll probably have
> > to manage your own and just use an int8 column.
> >
> I had originally started using int8 and creating custom sequences.   However,
> as mentioned in a previous post, there is an inherent performance penalty in
> using int8 over int4.   Tom Lane advised me that the int8 routines are an
> emulated or synthesized data type.  In the test I did on our 7.0.2 server I
> noticed about a 25-30% decrease in performance when using complex joins on
> tables containing referential keys, primary keys all in the int8 data type.
> 
> This might be something to think about as well.

Thanks.  Because of these and other comments people have made, I've gone back
to using a flat_file-with-server approach instead of adding a table to my
postgres DB.  While an int4 *might* work, it doesn't handle the "worst-case"
scenario (which is up around 15 billion values).

Thanks to everyone for your comments and suggestions!

--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]



[SQL] Replacing a simple nested query?

2003-07-13 Thread Steve Wampler

I've got a simple nested query:

  select * from attributes where id in (select id from
 attributes where (name='obsid') and (value='oid00066'));

that performs abysmally.  I've heard this described as the
'classic WHERE IN' problem.

Is there a better way to obtain the same results?  The inner
select identifies a set of ids (2049 of them, to be exact)
that are then used to locate records that have the same id
(about 30-40K of those, including the aforementioned 2049).

Thanks!
-Steve

-- 
Steve Wampler -- [EMAIL PROTECTED]
Quantum materiae materietur marmota monax si marmota
monax materiam possit materiari?

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] Replacing a simple nested query?

2003-07-14 Thread Steve Wampler
On Sun, 2003-07-13 at 14:50, Steve Wampler wrote:
> I've got a simple nested query:
> 
>   select * from attributes where id in (select id from
>  attributes where (name='obsid') and (value='oid00066'));
> 
> that performs abysmally.  I've heard this described as the
> 'classic WHERE IN' problem.
> 
> Is there a better way to obtain the same results?  The inner
> select identifies a set of ids (2049 of them, to be exact)
> that are then used to locate records that have the same id
> (about 30-40K of those, including the aforementioned 2049).

For the record, Joe Conway and Hannu Krosing both provided
the same solution:

   select at.* from attributes_table at, attributes a
  where at.id = a.id and a.name='obsid' and a.value='oid00066';

which is several orders of infinity faster than my naive
approach above:
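An equivalent formulation with EXISTS also avoids the repeated subplan scan on planners of that era (modern versions optimize the IN form as well):

```sql
SELECT at.*
FROM attributes_table at
WHERE EXISTS (SELECT 1 FROM attributes a
              WHERE a.id = at.id
                AND a.name = 'obsid'
                AND a.value = 'oid00066');
```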
-
lab.devel.configdb=# explain analyze select * from
attributes_table where id in (select id from attributes
where (name='obsid') and (value='oid00066')) order by id;
NOTICE:  QUERY PLAN:

Index Scan using id_index on attributes_table  (cost=0.00..8773703316.10
rows=241201 width=59) (actual time=136297.91..3418016.04 rows=32799
loops=1)
  SubPlan
->  Materialize  (cost=18187.48..18187.48 rows=15 width=25) (actual
time=0.01..1.68 rows=1979 loops=482402)
  ->  Index Scan using name_index on attributes_table 
(cost=0.00..18187.48 rows=15 width=25) (actual time=0.27..251.95
rows=2049 loops=1)
Total runtime: 3418035.38 msec
--
lab.devel.configdb=# explain analyze select at.* from
attributes_table at, attributes a
where at.id = a.id and a.name='obsid' and a.value='oid00066';
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..18739.44 rows=217 width=84) (actual
time=0.76..1220.65 rows=32799 loops=1)
  ->  Index Scan using name_index on attributes_table 
(cost=0.00..18187.48 rows=15 width=25) (actual time=0.47..507.31
rows=2049 loops=1)
  ->  Index Scan using id_index on attributes_table at 
(cost=0.00..35.80 rows=12 width=59) (actual time=0.11..0.31 rows=16
loops=2049)
Total runtime: 1235.42 msec
---

My thanks to both Joe and Hannu!
Steve
-- 
Steve Wampler -- [EMAIL PROTECTED]
Quantum materiae materietur marmota monax si marmota
monax materiam possit materiari?



[SQL] Copying rows between tables?

2004-01-13 Thread Steve Wampler

I know I can select from one table into a new table, but is
there a way to select from one table into an existing table
(aside from copying everything to a file, editing the file
and then copying from that file)?  Without destroying existing
entries, of course...

I have an application where old records are archived into
'archive' tables.  Occasionally there is a need to copy
some of these old records into the 'active' table.
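The usual answer is INSERT ... SELECT, which adds rows to an existing table without disturbing what is already there (table and column names below are made up):

```sql
INSERT INTO active_items (id, name, value)
SELECT id, name, value
FROM archive_items
WHERE archived_on < '2003-01-01';
```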

Thanks for any pointers!
Steve
-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] Make COUNT(*) Faster?

2005-07-08 Thread Steve Wampler
Chris Browne wrote:

> None of those transactions have COMMITted, so there are some 78 tuples
> "in limbo" spread across 16 transactions.
> 
> If there were some "single secret place" with a count, how would you
> suggest it address those 78 tuples and 16 transactions that aren't yet
> (and maybe never will be) part of the count?

Hmmm, I understand this and don't doubt it, but out of curiosity, how
does the current SELECT COUNT(*) handle this?  It doesn't lock the entire
table while counting (I assume) so the current implementation is really
just an approximate count in the above scenario anyway.  Or even when
not, since the true 'count' is likely to have changed by the time the
user does anything with the result of SELECT COUNT(*) on any active table
(and on an inactive table, pg_class.reltuples is nearly as good as
SELECT COUNT(*) and far faster to get to.)

I assume this has been beaten well past death, but I don't see why it
wouldn't be possible to keep pg_class.reltuples a bit more up-to-date
instead of updating it only on vacuums.

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] Make COUNT(*) Faster?

2005-07-08 Thread Steve Wampler
Bruno Wolff III wrote:
> No, it is an exact count.

Yes, for the transaction, but it's an approximation of the number of
tuples in the table - which is probably what the people who worry about
its cost are more interested in (an approximate count for the table).
I'm also claiming that a true count for any active table is meaningless and
am *not* suggesting that effort be spent on trying to produce such
a true count.

>>I assume this has been beaten well past death, but I don't see why it
>>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date
>>instead of updating it only on vacuums.
> 
> 
> Because it costs resources to keep track of that and people don't usually need
> exact tuple counts for whole tables. 

Yes, we agree completely! (Which is why I said 'a bit more' instead of
'exactly' above.)  My uses for COUNT(*) are to get 'reasonable' approximate
counts of the table sizes - not true counts, but approximate values.
Unfortunately, pg_class.reltuples gets too far off too fast for me to use
it as a consistent guide to current table size.  If you Folks Who Know
believe that simply keeping
pg_class.reltuples 'closer' to the actual table size is too expensive, I'll
accept that [after all, I have to right now anyway], but I'm surprised that
it is, given all the other work that must go on at the start/close of a 
transaction.

I also understand that 'reasonable' and 'closer' are vague terms.
In the example scenario where there were around 80 rows in an indeterminate
state, my claim is that, in a table of around a million rows, it doesn't
matter whether some portion of those indeterminate rows are included in
an approximation of the table size or not (though it might in a table of
100 'true' rows - but the decision to ask for a true 'transaction' count (slow)
or an approximate table size (fast) should be left to the user in either case).

So, leave COUNT(*) alone.  But it would be very handy to have a
way to get an approximate table size that is more accurate than is
provided by a pg_class.reltuples that is only updated on vacuums.

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] Make COUNT(*) Faster?

2005-07-08 Thread Steve Wampler
Tom Lane wrote:
> Steve Wampler <[EMAIL PROTECTED]> writes:
> 
>>So, leave COUNT(*) alone.  But it would be very handy to have a
>>way to get an approximate table size that is more accurate than is
>>provided by a pg_class.reltuples that is only updated on vacuums.
> 
> If you want something cheap, you could use the same technique the
> planner uses nowadays: take RelationGetNumberOfBlocks() (which is
> guaranteed accurate) and multiply by reltuples/relpages.  I don't
> see anyplace where RelationGetNumberOfBlocks is directly exposed to
> users now, but it'd be trivial to code up a couple of C functions to
> provide this functionality.

Yes - this would be an excellent approximation for my needs!  The
solution that Dawid Kuroczko suggested (just call "explain select *
on ..." and parse the result) would be equivalent these days, right?
(I think in the 7.x versions the planner was just using pg_class.reltuples,
which wouldn't have helped.)  If true, I can handle that parsing myself
easily enough without exposing RelationGetNumberOfBlocks.
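For the record, the planner-style estimate can be written directly in SQL on a modern server (pg_relation_size postdates this thread; the table name is illustrative):

```sql
-- Scale the rows-per-page ratio by the table's current size in blocks.
SELECT (reltuples / relpages) *
       (pg_relation_size(oid) / current_setting('block_size')::int)
       AS estimated_rows
FROM pg_class
WHERE relname = 'attributes_table'
  AND relpages > 0;
```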

Thanks (Tom and Dawid)!

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.



Re: [SQL] Make COUNT(*) Faster?

2005-07-08 Thread Steve Wampler
Christopher Browne wrote:
>>I'm also claiming that a true count for any active table is
>>meaningless and am *not* suggesting that effort be spent on trying
>>to produce such a true count.
> 
> 
> That's a pretty big assumption that would in fact be WRONG.

Please reread the message from Bruno and reconcile the above statement
with his assertion (which I believe) that there is *no* single true count
for an active table.  [I'm defining 'active' as currently
undergoing insert/copy/delete/update actions].

> We have managers interested in counting the number of objects we have
> around (As a domain registry, what objects would you imagine those
> might be :-)), and they're keen on possibly even being able to
> reconcile those counts from day to day based on transaction activity.

If Bruno is correct, then they need to do this reconcilation from
within a single transaction (the same one that does the COUNT(*)) - or
else they are working on an 'inactive' table [one not currently
accepting changes].  If neither condition holds, then isn't the result
they are getting from COUNT(*) *already* an approximation?

> Leaping into some sort of vague guesstimation would destroy the
> ability to do any kind of analysis of activity, and I daresay enrage
> them.

No doubt!  Let's hope the above conditions hold.

> There may be times that a really rough guess can suffice; there are
> other times when exactness is absolutely vital.

But, as others have said, COUNT(*) does not return a true count for
a table, but rather just a true count for the *current transaction*.
So COUNT(*)'s from different simultaneous transactions may very well
produce different values.

> Creating a "fast but WRONG COUNT(*)" which prevented getting the exact
> answer that the present implementation provides would be a severe
> misfeature.

Agreed - note that I did not suggest replacing the current COUNT(*)
with an inexact version, but wanted (and now have) a quick way to get
a reasonable approximation of the current table size.

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] What does PostgreSQL do when time goes backward?

2010-08-04 Thread Steve Wampler

John Hasler wrote:


Frank writes:

My ntp client changes clock (by small amount) at any time:
Jul 25 05:29:38 bax ntpd[10269]: adjusting local clock by 0.098724s
Jul 25 05:31:43 bax ntpd[10269]: adjusting local clock by 0.038991s
Jul 25 06:13:38 bax ntpd[10269]: adjusting local clock by -0.037131s
Jul 25 15:01:52 bax ntpd[10269]: adjusting local clock by -0.112429s


Ken writes:

These do seem to be larger values than you might expect from a clock
conditioned with ntpd.  Is it a VM or is there something going on that
would stop or suspend your system?



John writes:

There is certainly something wrong there.


I saw very bad clock performance on one Linux box I had (dual-single core
AMD cpus, no VMs), even with NTP, until I changed the clocksource kernel
parameter to hpet.  Unfortunately (or fortunately) I no longer have that box.

--
Steve Wampler -- swamp...@noao.edu
The gods that smiled on your birth are now laughing out loud.

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] Insert with replace?

2000-05-30 Thread Steve Wampler


What is the normal way to perform an "insert with replace" on
a row in a relational database (specifically postgresql)?

I've got a database that (logically) represents a set of
items, where each item has a primary key represented by two
columns (id and name).  If that key isn't in the database,
I want to insert the item.  If that key is in the database, I want
to update the item.  I have no need to retain the previous
values.

The interface is through jdbc.

I assume I could try the update first and then, if the update
generates an exception, perform an insert, but is there a way
to do the same with a single command?  (The data is such that
the update will fail in the majority of cases.)
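PostgreSQL 9.5, long after this thread, added a single-statement form; it requires a unique constraint on the key columns (names below are illustrative):

```sql
INSERT INTO items (id, name, value)
VALUES (42, 'widget', 'new value')
ON CONFLICT (id, name) DO UPDATE
SET value = EXCLUDED.value;
```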

Thanks! 
--
Steve Wampler-  SOLIS Project, National Solar Observatory
[EMAIL PROTECTED]